Rhoades et al. J Transl Genet Genom 2019;3:1. I https://doi.org/10.20517/jtgg.2018.26 Page 7 of 20
variants and multiple phenotypes [53]. Similarity and dissimilarity are assessed for both genotype and
phenotype and a matrix is formed for each variable. Then, the similarity or dissimilarity matrices for each
variable are tested for independence. The calculation of P-values does not require any permutations and the
method can be utilized on WES or WGS [53].
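As a rough illustration of the matrix-based idea described above, the following Python sketch builds one similarity matrix from genotypes and one from phenotypes, centers both, and computes a trace-based dependence statistic. The linear kernels, the function names, and the toy inputs are assumptions made for illustration and are not taken from the cited method; the analytic P-value step is omitted.

```python
import numpy as np

def centered_similarity(X):
    """Linear-kernel similarity between the rows of X, double-centered."""
    X = np.asarray(X, dtype=float)
    K = X @ X.T                              # n x n similarity matrix
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return H @ K @ H

def dependence_statistic(genotypes, phenotypes):
    """trace(Kg @ Kp) / n; larger values suggest stronger dependence."""
    Kg = centered_similarity(genotypes)      # genotype similarity
    Kp = centered_similarity(phenotypes)     # phenotype similarity
    n = Kg.shape[0]
    return float(np.trace(Kg @ Kp) / n)

# Toy data (hypothetical): 4 subjects, 1 variant, 1 phenotype.
G = [[0], [1], [2], [1]]
P = [[0.0], [2.0], [4.0], [2.0]]
print(dependence_statistic(G, P))
```

With linear kernels this statistic is the squared Frobenius norm of the centered cross-product between genotypes and phenotypes, divided by the sample size, so it is zero when the two are uncorrelated in the sample.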
Tools for family studies
Family-based study designs are extremely advantageous for the study of rare variants because the frequency
of rare alleles for a particular illness or disorder will be higher in a pedigree than among unrelated
individuals [54]. Currently, there are very few rare variant analysis tools that are designed to find associations
within sequences from family studies. The sampling of relatives in sequencing studies can also help to avoid
sequencing errors in the analysis [55]. Therefore, the Minimum P-value Optimized Nuisance parameter Score
Test Extended to Relatives (MONSTER) was developed. MONSTER is an extension of SKAT-O that tests for
association between rare variants and a phenotype while accounting for the correlation among related
individuals based on kinship [55]. It combines the SKAT model with a burden test model, where, depending on
the dataset presented, ρ will either be equal to 0, as in family-based SKAT (famSKAT), or equal to 1, as in the
family-based burden test (famBT). famSKAT is a statistical strategy that uses sequence kernel association
testing to evaluate rare variants in samples that contain related individuals [56]. famBT is a burden analysis
that can be used to evaluate associations between rare variants and phenotypes when samples contain relatives.
MONSTER, however, is capable of adaptively switching between models, performing like either famSKAT or
famBT depending on the data supplied [55].
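The role of ρ can be sketched with the SKAT-O style combination Q_ρ = (1 − ρ)·Q_SKAT + ρ·Q_burden. The Python below is a minimal illustration, not MONSTER's implementation: the function names are hypothetical, the variance components are assumed to be known rather than estimated, the phenotype is centered with a simple mean, and the P-value calculation and the search for the optimal ρ are omitted.

```python
import numpy as np

def kinship_adjusted_scores(G, y, kinship, sigma_g2, sigma_e2):
    """Per-variant scores G^T Sigma^{-1} (y - mean(y)).

    Assumed covariance: Sigma = 2 * sigma_g2 * kinship + sigma_e2 * I.
    """
    G = np.asarray(G, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    Sigma = 2.0 * sigma_g2 * np.asarray(kinship, dtype=float) + sigma_e2 * np.eye(n)
    resid = y - y.mean()                      # simplification: ordinary mean
    return G.T @ np.linalg.solve(Sigma, resid)

def combined_statistic(scores, weights, rho):
    """Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden."""
    s = np.asarray(scores, dtype=float) * np.asarray(weights, dtype=float)
    q_skat = np.sum(s ** 2)                   # SKAT-type statistic (rho = 0)
    q_burden = np.sum(s) ** 2                 # burden-type statistic (rho = 1)
    return float((1.0 - rho) * q_skat + rho * q_burden)
```

Setting `rho=0` recovers the SKAT-type statistic and `rho=1` the burden-type statistic. When variant effects point in opposite directions the burden term can cancel to zero while the SKAT term does not, which is the motivation for adapting ρ to the data.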
A particular challenge in conducting a rare variant analysis of pedigree sequencing data is identifying de
novo mutations [57]. Pedigree Variant Annotation, Analysis, and Search Tool is one of the tools available
for rare variant analysis of familial data; it uses both association testing and logarithm of odds (LOD)
scores to identify rare causal variants from familial data [57]. Fampipe is a pipeline that can be used to analyze
rare variant data from association studies; it can calculate identity by descent scores as well as
LOD scores to identify regions that demonstrate association [54]. The pipeline has several modules capable of
calculating allele frequencies, family-specific mutations and more. To analyze binary traits in family-based
studies, the Kernel Machine Generalized Estimating Equations model (GEE-KM) was developed [58]. The
Rare Variant association analysis with Family data (RVFam) package for R analyzes SNPs for associations
with continuous, binary, or survival phenotypes in familial sequencing studies [59]. The family-based
association tests (FBAT) [60] collapse variants using sums of allele frequencies to generate weighted test
statistics. These weighted sums are then tested for association with phenotypes using either
multiple regression, linear regression, or linear combination analyses. Family-based Rare Variant Association
Test [61] is an extension of FBAT, a burden test with a variance component that can be used for rare variant
association testing within extended families. The RVAS approaches can also be used to investigate rare and
de novo noncoding variants in family studies. An analytical framework has been developed to investigate
de novo variations from WGS data in autism spectrum disorder (ASD) families [62]. The SNVs and indels
are annotated and grouped by variant type, gene, species conservation, gene set, and regulatory region.
The number of de novo mutations located in these regions in cases was compared to the number in sibling
controls. Burden analyses were then performed to compute the significance of these comparisons. A similar
procedure was used to detect associations of de novo structural variants in different annotation groups.
The authors analyzed rare variants in 519 ASD families and did not detect a significant association
between rare de novo mutations in non-coding regions and ASD. However, they observed some biologically
plausible associations that might warrant further investigation [62].
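The case-versus-sibling comparison described above can be sketched as a per-category burden test. The example below assumes equal numbers of cases and sibling controls and uses a one-sided exact binomial test, which is one common choice and not necessarily the test used in the cited study; the function name, category names, and counts are invented for illustration.

```python
from math import comb

def denovo_burden_pvalue(case_count, sibling_count):
    """One-sided exact binomial P-value for an excess of de novo mutations in cases.

    Under the null hypothesis, each mutation in the category is equally likely
    to occur in a case or in a sibling control (probability 0.5).
    """
    n = case_count + sibling_count
    if n == 0:
        return 1.0                            # no mutations, no evidence
    # P(X >= case_count) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(case_count, n + 1))
    return tail / 2 ** n

# Hypothetical counts per annotation category: (cases, sibling controls).
counts = {"promoter": (12, 9), "conserved_noncoding": (30, 31)}
for category, (cases, sibs) in counts.items():
    print(category, round(denovo_burden_pvalue(cases, sibs), 3))
```

Because many annotation categories are tested in such a framework, the resulting P-values would normally need correction for multiple testing before any category is called significant.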
TARGET RESEQUENCING OF CANDIDATE GENES
Targeted resequencing was developed to sequence the target genes or regions of interest [63]. The primary
advantage of the technology is that it allows for more targeted sequencing of specific portions of the