Waller et al. J Transl Genet Genom 2021;5:112-23 | http://dx.doi.org/10.20517/jtgg.2021.09 Page 116
where µ(X) is the genome-wide false-positive rate, C is the number of chromosomes, α(X) is the probability
of exceeding X, and G is the genome length in Morgans [26]. The false-positive rate is set to 0.05 for
the genome-wide significant threshold and 1.0 for the genome-wide suggestive threshold. After solving for
X, the threshold, T is determined by . Thresholds are specific to each fixed pedigree to assess
their duo-SGS results.
MM high-risk pedigrees
The statewide Utah Cancer Registry (UCR) has been an NCI-supported Surveillance, Epidemiology,
and End Results (SEER) Program registry since its inception in 1966. The UCR was utilized to invite all
individuals with myeloma in the state to participate. Peripheral blood was collected for DNA extraction
from individuals who completed informed consent.
The Utah Population Database (UPDB) is a unique resource [27]. It includes a 16-generation genealogy of
approximately 5 million people with at least one event in Utah that is record-linked to the UCR and state
vital records. Using the UPDB, ancestors whose descendants have an excess of disease based on internal
cancer rates and years at risk can be identified and studied as HRPs. The UPDB was used to identify
ancestors whose descendants showed a statistical excess of MM (P < 0.05). The expectation was derived
from internal disease rates by birth cohort, sex, birthplace (in/outside Utah), and years at risk. The
total number of myeloma cases in each HRP identified ranged from 4 to 37. After annotating the
pedigrees with the cases who had DNA available, 11 pedigrees were found to contain 3 or 4 myeloma cases with DNA
(28 individuals; 8 individuals were in more than one pedigree). In each pedigree, the cases were separated
by 8 to 23 meioses.
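As an illustration of how an excess of cases among an ancestor's descendants might be assessed, the sketch below sums stratum-specific rates over years at risk to obtain an expected count and then computes a one-sided tail probability. The Poisson test, the function names, and all rates and person-years are assumptions made for illustration; the text does not state which test statistic was used.

```python
import math

def expected_cases(person_years, rates):
    """Expected case count: sum of (years at risk x rate) over strata.

    Strata here are hypothetical (birth cohort, sex, birthplace) tuples.
    """
    return sum(years * rates[stratum] for stratum, years in person_years.items())

def poisson_upper_tail(observed, expected):
    """P(N >= observed) for N ~ Poisson(expected). Assumed test, not from the source."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below

# Hypothetical internal rates (cases per person-year) and years at risk.
rates = {("1920s", "M", "Utah"): 2e-4, ("1920s", "F", "Utah"): 1e-4}
person_years = {("1920s", "M", "Utah"): 5000.0, ("1920s", "F", "Utah"): 5000.0}

expected = expected_cases(person_years, rates)
p_excess = poisson_upper_tail(5, expected)  # 5 observed cases among descendants
is_high_risk = p_excess < 0.05
```

Under these toy inputs, an ancestor with five observed cases against an expectation of about 1.5 would pass the P < 0.05 criterion.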
DNA from the 28 cases was genotyped on the Illumina Omni Express high-density SNP array at the
University of Utah. Only high-quality bi-allelic SNPs and individuals with adequate call rates across
the genome were included. The PLINK software [28] was used for quality control. SNPs with < 95% call
rate across the 28 individuals were removed. After filtering, 678,447 SNPs remained. These SNPs were
transformed to match 1000Genomes strand orientation.
Individuals were removed if < 90% of the filtered SNPs were called. One myeloma case had a < 90% call rate
and was eliminated from the study. We also checked for sex inconsistency based on the genotypes; all
cases passed. PLINK relationship estimates were compared with the UPDB pedigree structures, and no issues
were found.
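The two call-rate filters can be sketched as follows: SNPs are filtered first across all individuals, and individuals are then filtered on the SNPs that remain. This is a minimal plain-Python illustration and not the PLINK commands used in the study; the function names, the data layout (None for a missing call), and the toy genotypes are assumptions.

```python
def call_rate(calls):
    """Fraction of non-missing genotype calls; 0.0 for an empty sequence."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)

def qc_filter(genotypes, snp_min=0.95, sample_min=0.90):
    """Apply SNP-level, then individual-level, call-rate filters.

    genotypes: dict mapping sample id -> list of calls in a shared SNP order,
    with None marking a missing call (hypothetical layout).
    Returns (indices of kept SNPs, ids of kept samples).
    """
    samples = list(genotypes)
    n_snps = len(next(iter(genotypes.values())))
    # Step 1: drop SNPs whose call rate across all individuals is below snp_min.
    kept_snps = [
        j for j in range(n_snps)
        if call_rate([genotypes[s][j] for s in samples]) >= snp_min
    ]
    # Step 2: drop individuals with too few calls among the filtered SNPs.
    kept_samples = [
        s for s in samples
        if call_rate([genotypes[s][j] for j in kept_snps]) >= sample_min
    ]
    return kept_snps, kept_samples

# Toy data; snp_min is relaxed to 0.75 only because there are four samples.
toy = {
    "A": [0, 1, 2, None, 1],
    "B": [0, 1, 2, None, 1],
    "C": [1, 1, 0, 2, 0],
    "D": [None, None, 1, 1, None],
}
kept_snps, kept_samples = qc_filter(toy, snp_min=0.75, sample_min=0.90)
```

In the toy data, the fourth SNP is dropped for a low call rate, and sample "D" is then dropped because it is called at only one of the four remaining SNPs.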
The duo-SGS method was applied to the MM pedigrees to identify regions with genome-wide suggestive or
significant evidence. Post-hoc, some duo-SGS regions were removed from consideration. Duplicate regions
occur when the same pair of pedigrees identify the same region in both their fixed-pedigree results. In
these situations, duo-SGS P-values are identical, but thresholds vary by which pedigree is fixed, potentially
leading to different significance levels. The most significant result was reported, and the less significant one was removed.
If an individual resided in two pedigrees and also shared the region in both pedigrees, the region was
removed. If the region spanned a centromere, it was removed. Forty-two suggestive regions were removed
because they were duplicates, involved an overlap individual, or spanned a centromere.
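The post-hoc exclusion rules can be expressed as a single pass over candidate regions. The sketch below is illustrative only: the Region fields, the use of the expected rate per genome to rank duplicates, the definition of "spanning" as fully containing the centromere interval, and all coordinates are assumptions not given in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int
    fixed: str             # pedigree held fixed in this analysis
    partner: str           # the other pedigree in the pair
    p_value: float         # duo-SGS P-value (identical for duplicates)
    mu: float              # expected rate per genome; smaller = more significant (assumed ranking)
    overlap_shared: bool   # an individual in both pedigrees shares the region in both

def posthoc_filter(regions, centromeres):
    """Drop overlap-individual and centromere regions; keep the best of each duplicate."""
    best = {}
    for r in regions:
        if r.overlap_shared:
            continue
        cen_start, cen_end = centromeres[r.chrom]
        # Assumed definition: the region spans the centromere if it contains it.
        if r.start <= cen_start and r.end >= cen_end:
            continue
        # Duplicates: same pedigree pair (in either fixed order) and same region.
        key = (frozenset((r.fixed, r.partner)), r.chrom, r.start, r.end)
        if key not in best or r.mu < best[key].mu:
            best[key] = r
    return list(best.values())

# Toy coordinates and values, purely hypothetical.
centromeres = {"chrA": (100, 120)}
r1 = Region("chrA", 200, 300, "P1", "P2", 1e-6, 0.8, False)
r2 = Region("chrA", 200, 300, "P2", "P1", 1e-6, 0.3, False)   # duplicate of r1, more significant
r3 = Region("chrA", 90, 130, "P1", "P3", 1e-5, 0.9, False)    # spans the centromere
r4 = Region("chrA", 400, 500, "P3", "P4", 1e-5, 0.9, True)    # shared by an overlap individual
r5 = Region("chrA", 10, 50, "P3", "P4", 2e-5, 0.95, False)
kept = posthoc_filter([r1, r2, r3, r4, r5], centromeres)
```

Keying duplicates on an unordered pedigree pair reflects the statement that the same pair can report the same region under either fixed pedigree.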
RESULTS
Duo-SGS findings were identified for each of the eleven MM HRPs. The significance thresholds for each
fixed pedigree are in Table 1. One region at 18q21.33 reached genome-wide significance and 13 regions
were genome-wide suggestive. Table 2 shows the details of the significant or suggestive regions identified,
including the duo-SGS P-value, expected rate per genome µ(t), the two pedigrees involved, each segregating
shared region in the pedigrees, and the overlapping region.