Fichera et al. J Transl Genet Genom 2020;4:114-32 I http://dx.doi.org/10.20517/jtgg.2020.16 Page 123
ANNOVAR software tool [13] and then filtered according to their predicted effects and allele frequencies
in the available public databases (dbSNP http://www.ncbi.nlm.nih.gov/projects/SNP/, ExAC http://exac.broadinstitute.org/,
1000 Genomes http://www.1000genomes.org/, and ESP6500 http://evs.gs.washington.edu/EVS/).
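As an illustration of this kind of filtering step, the following minimal Python sketch keeps variants that have a potentially damaging predicted effect and are rare in every population database. The frequency cutoff, the field names (`effect`, `af`), and the effect labels are hypothetical placeholders: the text does not state the thresholds used, and the labels are not ANNOVAR's exact output vocabulary.

```python
# Hypothetical cutoff: the source does not report the threshold that was applied.
MAX_POPULATION_AF = 0.01

# Illustrative effect labels only; not ANNOVAR's exact annotation strings.
DAMAGING_EFFECTS = frozenset({"nonsynonymous", "stopgain", "frameshift", "splicing"})


def is_rare(variant, databases=("ExAC", "1000G", "ESP6500"), max_af=MAX_POPULATION_AF):
    """Return True when the allele frequency is below max_af in every database.

    A missing frequency (absent key or None) is treated as "not observed".
    """
    frequencies = variant.get("af", {})
    for db in databases:
        af = frequencies.get(db)
        if af is not None and af >= max_af:
            return False
    return True


def filter_variants(variants, effects=DAMAGING_EFFECTS, max_af=MAX_POPULATION_AF):
    """Keep variants with a selected predicted effect that are rare everywhere."""
    return [
        v for v in variants
        if v["effect"] in effects and is_rare(v, max_af=max_af)
    ]


example_variants = [
    {"id": "v1", "effect": "stopgain", "af": {"ExAC": 0.0001, "1000G": None}},
    {"id": "v2", "effect": "nonsynonymous", "af": {"ExAC": 0.12}},
    {"id": "v3", "effect": "synonymous", "af": {}},
]
kept_ids = [v["id"] for v in filter_variants(example_variants)]
```

Treating a missing frequency as "not observed" is a design choice of this sketch; a stricter pipeline might instead flag such variants for manual review.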
Probability profiling of genomic regions linked to selected traits
To computationally infer the genomic segments most likely associated with selected clinical features,
we assumed that a specific trait was predominantly the outcome of the hemizygosity of a specific DL,
either a protein-coding gene or a putative regulatory element, rather than the synergistic effect of the
haploinsufficiency of several genomic elements.
Given this assumption, the probability that a DL maps to a given genomic location depends essentially on
the penetrance of its haploinsufficiency and on the causative and non-causative deletions that overlap that
genomic position.
For a detailed description of the mathematical model, see the Supplementary Materials (Mathematical
Model).
Briefly, molecular data from patients in whom the clinical status for a specific trait had been assessed were
grouped and analyzed independently. Because not all patients were evaluated for every trait, the
number of individuals in each group varied. In the first step of the procedure, we identified SRO regions,
taking into account only overlaps between deletions associated with the trait. By definition, these SROs
contain the DL with probability 1. The next step was to estimate the probability distribution inside the
SRO(s). To this end, we used a Bayesian approach to calculate, for each non-overlapping sliding
window (Δ) of 1 kb within the SRO, the posterior probability that the window intersects the DL, conditioned on the
experimental data (i.e., all the deletions overlapping the specific window inside the SRO). In this regard, we
assumed that the a priori probability P(Δ overlaps DL) was inversely proportional to the SRO size and that
the best estimator for the penetrance of the DL was the value that maximizes the likelihood function
P(Experimental data | Δ overlaps DL) (see Supplementary Materials, Mathematical Model).
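The two steps described above can be sketched in Python under simplifying assumptions. This is not the model from the Supplementary Materials, which is not reproduced here. The sketch assumes a single SRO, half-open `(start, end)` coordinates, and a binomial likelihood in which every trait-associated deletion covers the window and each deletion from an unaffected, evaluated patient that overlaps the window counts as a non-penetrant event. The function names and the form of the likelihood are illustrative choices.

```python
WINDOW = 1_000  # 1-kb windows, as stated in the text


def smallest_region_of_overlap(affected):
    """Intersection of all trait-associated deletions, or None if they share no region."""
    start = max(s for s, _ in affected)
    end = min(e for _, e in affected)
    return (start, end) if start < end else None


def window_posteriors(affected, unaffected, window=WINDOW):
    """Posterior probability that each window in the SRO intersects the DL.

    affected / unaffected: lists of (start, end) deletions from patients with
    and without the trait. Assumed likelihood (not the authors' exact model):
    pen**n_aff * (1 - pen)**n_unaff, with pen set to its maximum-likelihood value.
    """
    sro = smallest_region_of_overlap(affected)
    if sro is None:
        raise ValueError("trait-associated deletions share no common region")
    sro_start, sro_end = sro
    sro_size = sro_end - sro_start
    n_aff = len(affected)

    weighted = []
    for w_start in range(sro_start, sro_end, window):
        w_end = min(w_start + window, sro_end)
        # Deletions from unaffected patients that overlap this window.
        n_unaff = sum(1 for s, e in unaffected if s < w_end and e > w_start)
        pen = n_aff / (n_aff + n_unaff)  # maximum-likelihood penetrance
        likelihood = pen ** n_aff * (1 - pen) ** n_unaff
        prior = (w_end - w_start) / sro_size  # inversely proportional to SRO size
        weighted.append(((w_start, w_end), prior * likelihood))

    total = sum(w for _, w in weighted)
    return [(win, w / total) for win, w in weighted]


# Toy data: two deletions with the trait, one without.
demo = window_posteriors(
    affected=[(0, 5_000), (2_000, 8_000)],
    unaffected=[(0, 3_000)],
)
```

In the toy data the SRO is 2,000 to 5,000, and the first window is down-weighted because a deletion from an unaffected patient overlaps it, while the posterior values still sum to 1 across the SRO.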
In the last phase of the procedure, for each clinical feature (intellectual disability, microcephaly,
kidney malformations, dysplastic ears, hypertelorism, short hands and feet, hypotonia, brachydactyly,
microretrognathia, speech delay, and walking delay), custom UCSC tracks were automatically built to visualize
the set of deletions and the probability profiles in their genomic context, calculated on either an absolute or a
logarithmic scale. The software is available on request.
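A probability profile of this kind can be displayed in the UCSC Genome Browser as a bedGraph custom track. The following Python sketch is not the authors' software. It assumes the per-window probabilities are available as `((start, end), probability)` pairs with zero-based, half-open coordinates, and the function name, track name, and the floor applied before taking logarithms are hypothetical.

```python
import math


def bedgraph_track(chrom, windows, name, log_scale=False, floor=1e-12):
    """Return bedGraph text for a list of ((start, end), probability) windows.

    With log_scale=True the values are log10-transformed; probabilities are
    clipped at `floor` first so that zero does not produce a math error.
    """
    header = f'track type=bedGraph name="{name}" description="{name}" visibility=full'
    lines = [header]
    for (start, end), prob in windows:
        value = math.log10(max(prob, floor)) if log_scale else prob
        lines.append(f"{chrom}\t{start}\t{end}\t{value:.6g}")
    return "\n".join(lines) + "\n"


profile = [((164358000, 164359000), 0.25), ((164359000, 164360000), 0.01)]
absolute_track = bedgraph_track("chr1", profile, "trait_probability")
log_track = bedgraph_track("chr1", profile, "trait_log10_probability", log_scale=True)
```

The returned text can be pasted into the browser's custom track form or saved to a file and uploaded; the deletions themselves would go in a separate BED track.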
RESULTS
Chromosomal microarray analysis
The deletions were within bands 1q23.3-1q31.3 and ranged in size from 4.1 to 22.5 Mb [Figure 4]. One
deletion (Case 6) was inherited from the unaffected father (Case 5, Figure 3).
The chromosomal breakpoints defined by CMA [Supplementary Figure 1] in the six cases were:
Case 1:
arr[GRCh37] 1q23.3q24.1(164343729x2,164358165_168556004x1,168586532x2)dn
Case 2:
arr[GRCh37] 1q24.1q25.2(166270638x2,166325047_176709133x1,176724229x2)dn