
Fichera et al. J Transl Genet Genom 2020;4:114-32  I  http://dx.doi.org/10.20517/jtgg.2020.16                                       Page 123
               ANNOVAR software tool[13] and then filtered according to their predicted effects and allele frequencies
               in the available public databases (dbSNP http://www.ncbi.nlm.nih.gov/projects/SNP/, ExAC
               http://exac.broadinstitute.org/, 1000 Genomes http://www.1000genomes.org/, and ESP6500
               http://evs.gs.washington.edu/EVS/).
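
               The filtering step can be illustrated with a minimal sketch. It is not the authors' pipeline: the
               record layout, the frequency cutoff, and the list of retained effect classes are all hypothetical,
               since the text does not state them.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical values for illustration only; the source does not state them.
MAX_ALLELE_FREQUENCY = 0.01
RETAINED_EFFECTS = {"nonsynonymous", "stopgain", "frameshift", "splicing"}


def filter_variants(
    variants: Iterable[Dict],
    max_af: float = MAX_ALLELE_FREQUENCY,
    effects: Set[str] = RETAINED_EFFECTS,
) -> List[Dict]:
    """Keep variants with a retained predicted effect that are rare in every database.

    Each variant is a dict with an "effect" string and a "frequencies" dict
    mapping a database name to the allele frequency reported there
    (a hypothetical layout, not ANNOVAR's own output format).
    """
    kept = []
    for variant in variants:
        # Databases that do not report the variant are absent from the dict;
        # an empty dict therefore yields a highest frequency of 0.0.
        highest_af = max(variant["frequencies"].values(), default=0.0)
        if variant["effect"] in effects and highest_af <= max_af:
            kept.append(variant)
    return kept
```

               Taking the highest frequency across databases is the conservative choice: a variant that is common
               in any one population resource is discarded.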

               Probability profiling of genomic regions linked to selected traits
               To infer computationally which genomic segments were most likely associated with the selected clinical
               features, we assumed that a specific trait was predominantly the outcome of the hemizygosity of a
               specific DL, either a protein-coding gene or a putative regulatory element, rather than the synergistic
               effect of the haploinsufficiency of several genomic elements.

               Given this assumption, the probability that a DL maps to a given genomic location depends essentially on
               the penetrance of its haploinsufficiency and on the causative and non-causative deletions that overlap
               that genomic position.


               For a detailed description of the mathematical model, see the Supplementary Materials (Mathematical
               Model).

               Briefly, molecular data from patients in whom the clinical status for a specific trait had been assessed
               were grouped and analyzed independently. Because not all patients were evaluated for every trait, the
               number of individuals in each group varied. In the first step of the procedure, we identified SRO regions,
               taking into account only overlaps between deletions associated with the trait. By definition, these SROs
               contain the DL with probability 1. The next step was to estimate the probability distribution inside
               the SRO(s). For this purpose, we used a Bayesian approach to calculate, for each non-overlapping sliding
               window (Δ) of 1 kb within the SRO, the posterior probability that the window intersects the DL,
               conditioned on the experimental data (i.e., all the deletions overlapping the specific window inside the
               SRO). We assumed that the a priori probability P(Δ overlaps DL) was inversely proportional to the SRO
               size and that the best estimator for the penetrance of the DL was the value that maximizes the likelihood
               function P(Experimental data given that Δ overlaps DL) (see Supplementary Materials, Mathematical Model).
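
               The procedure can be sketched in a few lines of Python. This is a simplified reading of the
               description above, not the model given in the Supplementary Materials. It assumes a single SRO, a
               uniform prior over its 1 kb windows, and a binomial-style likelihood p^k (1 - p)^m, where k is the
               number of trait-associated deletions and m the number of deletions without the trait that overlap
               the window. The function names and the likelihood form are assumptions.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates in bp


def smallest_region_of_overlap(deletions: List[Interval]) -> Optional[Interval]:
    """Return the region shared by all deletions, or None if they do not all overlap."""
    start = max(s for s, _ in deletions)
    end = min(e for _, e in deletions)
    return (start, end) if start < end else None


def window_posteriors(
    affected: List[Interval],
    unaffected: List[Interval],
    window: int = 1000,
) -> List[Tuple[Interval, float]]:
    """Posterior probability, per window of the SRO, that the window overlaps the DL.

    Assumed likelihood (not taken from the source): p**k * (1 - p)**m, with the
    penetrance p replaced by its maximum-likelihood estimate k / (k + m).
    """
    sro = smallest_region_of_overlap(affected)
    if sro is None:
        raise ValueError("trait-associated deletions share no common region")
    sro_start, sro_end = sro
    k = len(affected)  # every trait-associated deletion spans the whole SRO

    windows = [(w, min(w + window, sro_end)) for w in range(sro_start, sro_end, window)]
    prior = 1.0 / len(windows)  # uniform prior, inversely proportional to SRO size

    weights = []
    for w_start, w_end in windows:
        # Deletions from patients without the trait that overlap this window.
        m = sum(1 for s, e in unaffected if s < w_end and e > w_start)
        p_hat = k / (k + m)  # maximum-likelihood penetrance for this window
        likelihood = p_hat**k * (1.0 - p_hat) ** m
        weights.append(prior * likelihood)

    total = sum(weights)  # normalize: the SRO contains the DL with probability 1
    return [(w, x / total) for w, x in zip(windows, weights)]
```

               Under these assumptions, windows covered by deletions from patients who lack the trait receive a
               lower posterior probability, so the profile concentrates on the parts of the SRO that are deleted
               only in affected individuals.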

               In the last phase of the procedure, for each clinical feature (intellectual disability, microcephaly,
               kidney malformations, dysplastic ears, hypertelorism, short hands and feet, hypotonia, brachydactyly,
               microretrognathia, speech delay, and walk delay), custom UCSC tracks were built automatically to display,
               in their genomic context, the set of deletions and the probability profiles, calculated on either an
               absolute or a log scale. The software is available on request.
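
               As an illustration of how a probability profile might be exported, the sketch below writes
               per-window values as a UCSC bedGraph custom track. It is not the authors' software. The function
               name, the track label, and the floor applied before taking logarithms are assumptions.

```python
import math
from typing import Iterable, Tuple


def bedgraph_track(
    name: str,
    chrom: str,
    profile: Iterable[Tuple[int, int, float]],
    description: str = "",
    log_scale: bool = False,
    floor: float = 1e-12,  # hypothetical floor that avoids log10(0)
) -> str:
    """Format (start, end, probability) windows as a bedGraph custom track.

    Coordinates are expected to be 0-based and half-open, as bedGraph requires.
    """
    header = (
        f'track type=bedGraph name="{name}" '
        f'description="{description}" visibility=full'
    )
    lines = [header]
    for start, end, prob in profile:
        value = math.log10(max(prob, floor)) if log_scale else prob
        lines.append(f"{chrom}\t{start}\t{end}\t{value:.6g}")
    return "\n".join(lines) + "\n"
```

               The returned text can be pasted into the UCSC Genome Browser custom track form or saved to a file
               and uploaded. One track per clinical feature keeps the profiles visually separate.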


               RESULTS
               Chromosomal microarray analysis
               The deletions were within bands 1q23.3-1q31.3 and ranged in size from 4.1 to 22.5 Mb [Figure 4]. One
               deletion (Case 6) was inherited from an unaffected father (Case 5, Figure 3).

               The chromosomal breakpoints defined by CMA [Supplementary Figure 1] in the six cases were:


               Case 1:
               arr[GRCh37] 1q23.3q24.1(164343729x2,164358165_168556004x1,168586532x2)dn


               Case 2:
               arr[GRCh37] 1q24.1q25.2(166270638x2,166325047_176709133x1,176724229x2)dn