
Waller et al. J Transl Genet Genom 2021;5:112-23  I  http://dx.doi.org/10.20517/jtgg.2021.09                                      Page 116

               where µ(X) is the genome-wide false positive rate, C is the number of chromosomes, α(X) is the probability
               of exceeding       , and G is the genome length in Morgans[26]. The false-positive rate is set to 0.05 for
               the genome-wide significant threshold and 1.0 for the genome-wide suggestive threshold. After solving for
               X, the threshold T is determined by       . Thresholds are specific to each fixed pedigree and are used
               to assess that pedigree's duo-SGS results.
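
The "solving for X" step can be sketched numerically. The example below is a minimal illustration, assuming only that µ(X) decreases as X increases. The function `placeholder_mu` is a hypothetical stand-in and is not the formula used in this paper; the bracketing interval and tolerance are likewise arbitrary choices.

```python
import math


def solve_decreasing(mu, target, lo, hi, tol=1e-10, max_iter=200):
    """Find X in [lo, hi] with mu(X) == target by bisection.

    Assumes mu is continuous and decreasing on [lo, hi].
    """
    if not (mu(lo) >= target >= mu(hi)):
        raise ValueError("target is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mu(mid) > target:
            lo = mid  # rate still too high: move toward larger X
        else:
            hi = mid  # rate at or below target: move toward smaller X
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def placeholder_mu(x):
    """Hypothetical decreasing rate function, for illustration only."""
    return 100.0 * math.exp(-x)


# Genome-wide rates from the text: 0.05 (significant) and 1.0 (suggestive).
x_significant = solve_decreasing(placeholder_mu, 0.05, 0.0, 50.0)
x_suggestive = solve_decreasing(placeholder_mu, 1.0, 0.0, 50.0)
```

Because the suggestive threshold allows a higher false-positive rate, its solution is smaller than the one for the significant threshold under any decreasing µ(X).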


               MM high-risk pedigrees
               The statewide Utah Cancer Registry (UCR) has been an NCI-supported Surveillance, Epidemiology,
               and End Results (SEER) Program registry since its inception in 1966. The UCR was utilized to invite all
               individuals with myeloma in the state to participate. Peripheral blood was collected for DNA extraction
               from individuals who completed informed consent.

               The Utah Population Database (UPDB) is a unique resource[27]. It includes a 16-generation genealogy of
               approximately 5 million people with at least one event in Utah that is record-linked to the UCR and state
               vital records. Using the UPDB, ancestors whose descendants have an excess of disease based on internal
               cancer rates and years at risk can be identified and studied as HRPs. The UPDB was used to identify
               ancestors whose descendants showed a statistical excess of MM (P < 0.05). The expected number of cases was
               calculated from internal disease rates by birth cohort, sex, and birthplace (in/outside Utah), together with
               years at risk. The total number of myeloma cases in each identified HRP ranged from 4 to 37. After annotating
               the pedigrees with the cases who had DNA available, 11 pedigrees were found to contain 3 or 4 myeloma cases
               with DNA (28 individuals; 8 individuals were in more than one pedigree). In each pedigree, the cases were separated
               by 8 to 23 meioses.
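
The comparison of observed with expected cases can be illustrated with a short sketch. The text does not name the statistical test, so the one-sided Poisson tail used here is an assumption, and the stratum rates and person-years are invented for illustration.

```python
import math


def expected_cases(strata):
    """Sum rate * years at risk over strata.

    Each stratum is (rate_per_person_year, person_years), e.g. one
    combination of birth cohort, sex, and birthplace.
    """
    return sum(rate * years for rate, years in strata)


def poisson_upper_tail(observed, expected):
    """Return P(N >= observed) for N ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(N = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k  # P(N = k) from P(N = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical descendants of one ancestor, split into two strata.
strata = [(1e-4, 10_000), (2e-4, 5_000)]
expected = expected_cases(strata)        # about 2 expected cases
p_value = poisson_upper_tail(6, expected)  # 6 observed cases
is_excess = p_value < 0.05
```

With 6 observed cases against about 2 expected, the tail probability falls below 0.05, so this hypothetical ancestor would be flagged under the stated criterion.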

               DNA from the 28 cases was genotyped on the Illumina Omni Express high-density SNP array at the
               University of Utah. Only high-quality bi-allelic SNPs and individuals with adequate call rates across
               the genome were included. The PLINK software[28] was used for quality control. SNPs with < 95% call
               rate across the 28 individuals were removed. After filtering, 678,447 SNPs remained. These SNPs were
               transformed to match the 1000 Genomes strand orientation.


               Individuals were removed if < 90% of the filtered SNPs were called. One myeloma case had a < 90% call rate
               and was eliminated from the study. We also checked for sex inconsistency based on the genotypes; all
               cases passed. PLINK relationship estimates were compared with the UPDB pedigree structures; no issues
               were found.
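
The two call-rate filters described above can be sketched as follows. This is a plain-Python illustration and not PLINK's implementation; the genotype matrix layout (one row per individual, `None` for a missing call) and the function names are assumptions made for the example.

```python
def snp_call_rates(genotypes):
    """Fraction of individuals with a non-missing call at each SNP."""
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    return [
        sum(row[j] is not None for row in genotypes) / n_ind
        for j in range(n_snp)
    ]


def qc_filter(genotypes, snp_min=0.95, ind_min=0.90):
    """Apply the SNP filter first, then the individual filter.

    Returns (kept_snp_indices, kept_individual_indices). Individual
    call rates are computed over the SNPs that survived the first step.
    """
    rates = snp_call_rates(genotypes)
    kept_snps = [j for j, rate in enumerate(rates) if rate >= snp_min]
    kept_inds = []
    for i, row in enumerate(genotypes):
        called = sum(row[j] is not None for j in kept_snps)
        if kept_snps and called / len(kept_snps) >= ind_min:
            kept_inds.append(i)
    return kept_snps, kept_inds


# Toy data: 20 individuals x 10 SNPs, all called except a few entries.
toy = [[0] * 10 for _ in range(20)]
toy[0][0] = None
toy[1][0] = None    # SNP 0 is called in 18/20 individuals
toy[19][1] = None
toy[19][2] = None   # individual 19 is missing 2 of the remaining SNPs
kept_snps, kept_inds = qc_filter(toy)
```

The order matters: an individual's call rate is evaluated only on SNPs that passed the 95% filter, matching the statement that individuals were assessed on the filtered SNPs.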

               The duo-SGS method was applied to the MM pedigrees to identify regions with genome-wide suggestive or
               significant evidence. Post-hoc, some duo-SGS regions were removed from consideration. Duplicate regions
               occur when the same pair of pedigrees identify the same region in both their fixed-pedigree results. In
               these situations, duo-SGS P-values are identical, but thresholds vary by which pedigree is fixed, potentially
               leading to different significance levels. The most significant result was reported, and the lesser removed.
               If an individual resided in two pedigrees and also shared the region in both pedigrees, the region was
               removed. If the region spanned a centromere, it was removed. Forty-two suggestive regions were removed
               because they were duplicates, involved an overlapping individual, or spanned a centromere.
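
The three post-hoc exclusion rules can be sketched as a single filtering pass. The record layout, the integer coding of significance level, and the centromere coordinates are all hypothetical. The sketch also assumes that duplicates have identical coordinates and reads "spanned a centromere" as covering the whole centromere interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int
    fixed_ped: str
    other_ped: str
    level: int  # hypothetical coding: 2 = significant, 1 = suggestive
    overlap_individual_shares: bool  # a case in both pedigrees shares it


def spans_centromere(region, centromeres):
    """True if the region covers the whole centromere interval."""
    cen_start, cen_end = centromeres[region.chrom]
    return region.start <= cen_start and region.end >= cen_end


def posthoc_filter(regions, centromeres):
    """Drop overlap-individual and centromere regions, then deduplicate."""
    best = {}
    for r in regions:
        if r.overlap_individual_shares or spans_centromere(r, centromeres):
            continue
        # The same pedigree pair is a duplicate whichever pedigree is fixed.
        key = (frozenset((r.fixed_ped, r.other_ped)), r.chrom, r.start, r.end)
        if key not in best or r.level > best[key].level:
            best[key] = r  # keep the most significant copy
    return list(best.values())


centromeres = {"chr1": (1000, 1100), "chr2": (200, 300), "chr3": (500, 600)}
a = Region("chr1", 100, 200, "P1", "P2", 1, False)
b = Region("chr1", 100, 200, "P2", "P1", 2, False)  # duplicate of a
c = Region("chr2", 50, 500, "P1", "P3", 1, False)   # spans centromere
d = Region("chr3", 10, 20, "P4", "P5", 1, True)     # overlap individual
e = Region("chr3", 30, 40, "P4", "P5", 1, False)
kept = posthoc_filter([a, b, c, d, e], centromeres)
```

In this toy input only `b` and `e` survive: `a` loses to its more significant duplicate, `c` crosses the chr2 centromere, and `d` is shared by an individual who belongs to both pedigrees.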


               RESULTS
               Duo-SGS findings were identified for each of the eleven MM HRPs. The significance thresholds for each
               fixed pedigree are in Table 1. One region at 18q21.33 reached genome-wide significance and 13 regions
               were genome-wide suggestive. Table 2 shows the details of the significant or suggestive regions identified,
               including the duo-SGS P-value, the expected rate per genome µ(t), the two pedigrees involved, the shared
               region segregating in each pedigree, and the overlapping region.