
Rhoades et al. J Transl Genet Genom 2019;3:1 | https://doi.org/10.20517/jtgg.2018.26                                           Page 3 of 20

               GWAS and common variants
               The CD-CV hypothesis provides the scientific paradigm for GWAS, which is a powerful tool for
               understanding the etiology of complex disorders [23]. GWAS usually applies SNP arrays to identify, from
               the entire genome, the target genes that may be involved in common diseases [24]. This methodology
               depends heavily on assessing the MAFs of different variants and determining whether they are correlated
               with a set of traits [25]. GWAS has elucidated many common variants in complex illnesses. For example,
               a GWAS for Crohn's disease identified over 30 candidate variants and demonstrated that variants in
               IL23R and IL12B are also associated with psoriasis and other autoimmune disorders [26].
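
The comparison of allele frequencies between cases and controls described above can be illustrated with a single-SNP allelic test. The following is a minimal sketch, not the method of any cited study: the function name `allelic_association` and the allele counts are hypothetical, and real GWAS pipelines add quality control, covariates, and multiple-testing correction.

```python
import math


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Return (odds_ratio, chi_square, p_value) for a 2x2 table of allele counts."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    if min(a, b, c, d) <= 0:
        raise ValueError("all four allele counts must be positive")
    n = a + b + c + d
    # Odds of carrying the minor allele in cases divided by the odds in controls.
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square statistic for a 2x2 table (1 df, no continuity correction).
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


# Hypothetical allele counts, invented for illustration (not data from any cited study).
odds_ratio, chi_square, p_value = allelic_association(60, 140, 40, 160)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```

An odds ratio above 1 indicates that the minor allele is more frequent in cases than in controls. Because a GWAS repeats such a test across hundreds of thousands of SNPs, a nominal p-value of this size would not survive genome-wide correction.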

               GWAS has been used to identify many common variants associated with SCZ. Betcheva et al. [27] performed
               a GWAS analysis to screen 554,496 SNPs in 188 SCZ patients and 376 controls from Bulgaria. One SNP,
               rs7527939, in the HHAT gene demonstrated a significant association with SCZ, with an odds ratio of 2.63.
               Previous work has shown that the microstructure of white matter is altered in the brains of SCZ patients
               relative to unaffected controls, specifically in the left and right anterior cingulate, the left and right
               posterior cingulate, and the inferior parietal cortices. Univariate association analysis showed that one
               variant upstream of the CXCR7 gene was associated with a reduction in white matter, and a multivariate
               analysis revealed an association with the SORCS1 gene [28], which lends support to the polygenic etiology
               of SCZ [29]. Many of the genes affected by these common variants are known to play a role in important
               cellular functions, such as mitotic arrest, signal transduction, voltage-dependent calcium receptors, as
               well as cell growth and differentiation [30]. GWAS have also been instrumental in elucidating pathways
               that are enriched with SNPs associated with SCZ. These pathways include serotonergic signaling,
               ubiquitin-mediated proteolysis, hedgehog signaling, adipocytokine signaling, and renin secretion. It is
               interesting to note that the SNPs that were enriched in the aforementioned pathways were all found in
               regulatory regions [31].
               There are limitations to using GWAS to investigate the role of sequence variants in disease. GWAS are
               based on the concept of linkage disequilibrium (LD), whereby alleles located close together within a
               particular locus are generally more strongly correlated than alleles that are located more distantly. The
               strength of the LD is dependent on the frequency at which alleles appear within the population. The greater
               the allelic frequency, or the more common the variation is, the stronger the association or LD. Thus, many
               of the SNPs identified as being associated with a particular trait are unlikely to be causal themselves and
               may instead tag a nearby causal variant through LD [32]. Increasing the sample size significantly, however,
               will result in the selection of several common variants of small effect. Meanwhile, rare variants will be
               masked or go undetected in GWAS because of low statistical power, caused by small numbers of cases, low
               allelic frequencies, low prevalence rates, etc. [33,34]. In addition, sample size has an important effect on
               the results of common variant analysis in GWAS, such that small sample sizes will often result in the
               identification of few variants with large effects [35].
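
The dependence of LD on allele frequency noted above can be made concrete with the standard pairwise measures D, D' and r². The sketch below assumes two biallelic loci with known haplotype and allele frequencies; the function name `ld_stats` and all input frequencies are hypothetical and chosen only for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D_max is the largest |D| allowed by the allele frequencies, given the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci.
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Two common alleles in partial LD (hypothetical frequencies).
common_pair = ld_stats(p_ab=0.40, p_a=0.50, p_b=0.50)

# A low-frequency allele B found only on haplotypes carrying a common allele A
# (hypothetical frequencies): D' is maximal, yet r^2 remains small.
rare_on_common = ld_stats(p_ab=0.10, p_a=0.50, p_b=0.10)

print(common_pair)
print(rare_on_common)
```

In the second case the rarer allele is in complete LD with the common one by D', but the low r² means a common array SNP would tag it poorly. This is one way to see why lower-frequency variants tend to be missed by array-based GWAS.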
               NGS and rare variants
               Rare variants are generally not detectable by GWAS because their low frequencies make them much more
               difficult to detect than common variants. Multiple NGS technologies have been developed to identify
               rare variants, including single nucleotide variants (SNVs) and copy number variations (CNVs). Targeted
               resequencing uses multiple probes or multiplexed PCR techniques to enrich specific regions of genes and
               is far less costly than using custom arrays [36]. WES uses targeted gene panels to sequence the coding
               regions, approximately 2% of the human genome. It provides a less expensive way to search for sequence
               variants throughout the genome [37]. Finally, WGS can be used to search for variants throughout the
               entire genome [37]. It also captures the non-coding regulatory regions [38], which are important for gene
               expression. However, prior knowledge regarding the functional relevance of sequences found in non-coding
               regions is necessary in order to collapse or aggregate these rare variants in a meaningful way [39].
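
The collapsing of rare variants mentioned above can be sketched as a per-individual burden score over one region. This is an illustrative sketch under stated assumptions, not the procedure of the cited work: the function name `collapse_rare_variants`, the 0.01 MAF cut-off, the genotype matrix, and the `keep` mask standing in for prior functional annotation are all hypothetical.

```python
def collapse_rare_variants(genotypes, mafs, keep=None, maf_threshold=0.01):
    """Sum minor-allele counts over rare, retained variants for each individual.

    genotypes: one row per individual, one column per variant, entries 0, 1 or 2
    mafs: minor allele frequency of each variant (same order as the columns)
    keep: optional booleans marking variants judged functionally relevant
    """
    if keep is None:
        keep = [True] * len(mafs)
    if len(keep) != len(mafs) or any(len(row) != len(mafs) for row in genotypes):
        raise ValueError("genotypes, mafs and keep must describe the same variants")
    # Retain only variants that are both rare and flagged as relevant.
    selected = [j for j, maf in enumerate(mafs) if maf < maf_threshold and keep[j]]
    return [sum(row[j] for j in selected) for row in genotypes]


# Hypothetical data: 4 individuals, 4 variants; the last variant is common.
genotypes = [
    [0, 1, 0, 2],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
mafs = [0.004, 0.002, 0.008, 0.30]

burden_all = collapse_rare_variants(genotypes, mafs)
# Hypothetical annotation: the second variant is treated as not functionally relevant.
burden_annotated = collapse_rare_variants(genotypes, mafs, keep=[True, False, True, True])

print(burden_all)
print(burden_annotated)
```

The resulting scores can then be compared between cases and controls as a single aggregated predictor. Changing the `keep` mask changes which individuals count as carriers, which is why the choice of annotation matters for non-coding regions.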