

Chu et al. J Transl Genet Genom 2023;7:196-212 https://dx.doi.org/10.20517/jtgg.2023.22 Page 102

Figure 1. The operational workflow of HKGP using the main data managers: clinical FrontEnd stores all clinical-related data and
documents, connected with Sample Manager using de-identified sample IDs. Sample Manager manages the biobank, records the GS
journey of the sample, and works as a reagent inventory.

The GS library insert size was determined using the 4200 TapeStation and D1000 ScreenTape assay
(Agilent). The library concentration was determined using the dsDNA HS assay kit and measured with the
Qubit 4 Fluorometer (Thermo Fisher Scientific). The libraries were quantified by quantitative PCR using the
KAPA Library Quantification kit (Roche) and either the QuantStudio 5 Real-Time PCR system, 384-well, or the
StepOnePlus Real-Time PCR system (Thermo Fisher Scientific). An equimolar library pool containing 24
dual-indexed GS libraries was prepared prior to sequencing on the Illumina NovaSeq 6000 sequencer using the
NovaSeq 6000 S4 Reagent kit v1.5 (300 cycles), with 1% spike-in PhiX control (Illumina).
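
As an illustration of the equimolar pooling step, the following minimal Python sketch converts a mass concentration and mean fragment size into molarity and derives the volume each library contributes to the pool. It is not the authors' procedure: the function names, the 660 g/mol per base pair approximation for double-stranded DNA, and the example values are assumptions made for illustration, and molarities obtained directly from the qPCR quantification could be substituted for the converted values.

```python
# Sketch only: names and example values are hypothetical, not from the paper.
AVG_BP_MASS = 660.0  # assumed average mass of one dsDNA base pair, g/mol


def molarity_nM(conc_ng_per_ul, mean_size_bp):
    """Convert ng/uL and mean fragment size (bp) to nmol/L."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_size_bp)


def pooling_volumes(libraries, fmol_per_library):
    """Return the volume (uL) of each library that delivers the same molar amount.

    libraries maps a library name to (concentration in ng/uL, mean size in bp).
    1 nM equals 1 fmol/uL, so volume = target fmol / molarity in nM.
    """
    return {
        name: fmol_per_library / molarity_nM(conc, size)
        for name, (conc, size) in libraries.items()
    }


if __name__ == "__main__":
    example = {"lib01": (6.6, 500), "lib02": (13.2, 500)}
    for name, volume in pooling_volumes(example, fmol_per_library=40.0).items():
        print(f"{name}: {volume:.2f} uL")
```

Under these assumptions, a library at twice the molarity contributes half the volume, so every library adds the same number of molecules to the pool.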

Sequence data analysis and validation
Base-calling was performed using DRAGEN version 4.1.5. The secondary analysis workflow followed the best
practice guidelines provided by the Genome Analysis Toolkit (GATK) [35]. Reads were aligned to the
GATK-provided reference genome Homo_sapiens_assembly38.fasta using BWA version 0.7.17 [36], and duplicates
were removed using Picard version 2.27.4 [37]. Base quality score recalibration, variant calling, and variant
filtering were performed using GATK version 4.2.6.1 and in-house tools. Annotation was performed using
Variant Effect Predictor version 104, BCFtools version 1.13, and in-house tools [38,39].
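
The order of the secondary analysis steps can be sketched as a list of command lines built in Python. This is a hedged outline rather than the HKGP pipeline itself: the use of a `picard` wrapper executable, the coordinate-sorting step, the choice of HaplotypeCaller in GVCF mode, the known-sites file, the thread count, and all file names are assumptions, and the in-house tools and variant filtering steps are not represented.

```python
# Sketch only: file names, thread count and tool choices marked below are assumptions.
REFERENCE = "Homo_sapiens_assembly38.fasta"  # reference genome named in the text


def build_commands(sample, fastq1, fastq2, known_sites, threads=8):
    """Return the command lines for one sample, in execution order."""
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    recal_table = f"{sample}.recal.table"
    recal_bam = f"{sample}.recal.bam"
    gvcf = f"{sample}.g.vcf.gz"
    return [
        # Alignment with BWA-MEM
        ["bwa", "mem", "-t", str(threads), "-R", read_group, "-o", sam,
         REFERENCE, fastq1, fastq2],
        # Coordinate sorting (assumed step; not described in the text)
        ["picard", "SortSam", f"I={sam}", f"O={sorted_bam}",
         "SORT_ORDER=coordinate"],
        # Duplicate removal with Picard
        ["picard", "MarkDuplicates", f"I={sorted_bam}", f"O={dedup_bam}",
         f"M={sample}.dup_metrics.txt", "REMOVE_DUPLICATES=true",
         "CREATE_INDEX=true"],
        # Base quality score recalibration with GATK
        ["gatk", "BaseRecalibrator", "-R", REFERENCE, "-I", dedup_bam,
         "--known-sites", known_sites, "-O", recal_table],
        ["gatk", "ApplyBQSR", "-R", REFERENCE, "-I", dedup_bam,
         "--bqsr-recal-file", recal_table, "-O", recal_bam],
        # Variant calling (HaplotypeCaller in GVCF mode is an assumption)
        ["gatk", "HaplotypeCaller", "-R", REFERENCE, "-I", recal_bam,
         "-O", gvcf, "-ERC", "GVCF"],
    ]


if __name__ == "__main__":
    for cmd in build_commands("sample01", "s01_R1.fastq.gz", "s01_R2.fastq.gz",
                              "known_sites.vcf.gz"):
        print(" ".join(cmd))
```

The commands are only printed here; each list could be passed to `subprocess.run(cmd, check=True)` once the paths and tool versions have been confirmed for the local environment.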

Following sequence data quality control steps, the bioinformatic pipelines identify and filter a list of variants
for each GS sample. Candidate variants are prioritised using the phenotype-based tool Exomiser [40] and the
crowdsourced, expert-reviewed PanelApp software [41]. Sequence variants are classified according to the
