
Pham et al. Microbiome Res Rep 2024;3:25  https://dx.doi.org/10.20517/mrr.2024.01  Page 11 of 16

                                                CLARK            0.833           0.259      0.395
                                                KrakenUniq       0.800           0.135      0.231
                                                Kraken2          0.778           0.079      0.143
                                                Centrifuge       0.833           0.169      0.280
                              RL_S001           MetaBIDx         1.000           0.421      0.593
                                                CLARK            1.000           0.421      0.593
                                                KrakenUniq       1.000           0.269      0.424
                                                Kraken2          0.889           0.308      0.457
                                                Centrifuge       1.000           0.308      0.471

               Bolded numbers in the table are the best scores in the comparison.

               showing high precision and recall, resulting in F1-scores ranging from 0.889 to 0.985. CLARK, KrakenUniq,
               and Kraken2 also showed notable improvements in precision and F1-scores compared to their performance
               without clustering, particularly in samples with 100 and 400 species.
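               The F1-scores in the table above are consistent with the standard harmonic mean of precision and recall. The following is a minimal Python sketch of that calculation, not code from MetaBIDx; the function name `f1_score` is illustrative. Because F1 is symmetric in its two arguments, the check does not depend on which of the two leading numeric columns is precision and which is recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Value pairs copied from the CLARK and Kraken2 rows of the table above.
print(round(f1_score(0.833, 0.259), 3))  # 0.395
print(round(f1_score(0.778, 0.079), 3))  # 0.143
```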


               CAMI Dataset: In the CAMI dataset, the application of clustering also enhanced precision for all methods.
               MetaBIDx consistently demonstrated high precision and recall, with F1-scores ranging from 0.535 to 0.807
               across different samples. Other tools, including CLARK and KrakenUniq, exhibited considerable
               improvements in precision, leading to higher F1-scores compared to their initial performance without
               clustering. However, MetaBIDx maintained an edge in terms of overall accuracy.

               In conclusion, once clustering of “approximate” coverage was applied, all methods showed an increase in
               precision, thereby reducing false positives. This approach demonstrates that integrating coverage-based
               clustering can significantly enhance the accuracy of species prediction in metagenomic analysis. MetaBIDx,
               with its inherent design to utilize this technique, consistently outperformed or matched the performance of
               other tools under this enhanced comparison framework.
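
               The idea of filtering predictions by coverage can be sketched as follows. This is an illustration under stated assumptions, not the clustering procedure used in the study: it assumes approximate coverage is estimated as identified reads × read length / genome length, and it swaps in a simple one-dimensional two-means split on log-scaled coverage to separate a high-coverage group (kept) from a low-coverage group (treated as likely false positives). All function names, species labels, read counts, and genome lengths are hypothetical.

```python
import math


def approximate_coverage(read_count, read_length, genome_length):
    """Assumed estimate: total identified bases divided by genome length."""
    return read_count * read_length / genome_length


def two_means_1d(values, max_iterations=50):
    """Split 1-D values into two clusters; True marks the high cluster."""
    low_center, high_center = min(values), max(values)
    if low_center == high_center:
        return [True] * len(values)  # nothing to separate
    labels = None
    for _ in range(max_iterations):
        new_labels = [abs(v - high_center) < abs(v - low_center) for v in values]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        high = [v for v, is_high in zip(values, labels) if is_high]
        low = [v for v, is_high in zip(values, labels) if not is_high]
        high_center = sum(high) / len(high)
        low_center = sum(low) / len(low)
    return labels


def keep_high_coverage_species(read_counts, genome_lengths, read_length=150):
    """Return the species whose log10 coverage falls in the high cluster."""
    names = [name for name, count in read_counts.items() if count > 0]
    log_coverage = [
        math.log10(approximate_coverage(read_counts[n], read_length, genome_lengths[n]))
        for n in names
    ]
    labels = two_means_1d(log_coverage)
    return [n for n, is_high in zip(names, labels) if is_high]


# Hypothetical inputs: two well-covered species and two with only stray reads.
read_counts = {"species_a": 40000, "species_b": 35000, "species_c": 30, "species_d": 12}
genome_lengths = {"species_a": 4_400_000, "species_b": 5_000_000,
                  "species_c": 3_000_000, "species_d": 2_000_000}
print(keep_high_coverage_species(read_counts, genome_lengths))
```

               With these made-up inputs, the two species supported by only a handful of reads fall into the low-coverage cluster and are dropped, which is the kind of false-positive reduction described above.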

               Identification of pathogens in human samples
               We evaluated how well each tool identified the pathogen in the human sample PT-8 (S2) at the species
               level, using an index built from 2,850 reference genomes. This sample came from a patient in whom a
               pathogen had been diagnosed, and we took that diagnosis as the ground truth.


               We found that MetaBIDx had the highest rate of identified reads at 83%, followed by Kraken2, KrakenUniq,
               and Centrifuge with similar rates. CLARK had the lowest rate of identified reads, only reaching 42%. All
               tools assigned approximately 70% of identified reads to Mycobacterium tuberculosis and the remaining 30%
               to other species.

               When clustering of species based on coverage derived from identified reads was used, all tools identified
               Mycobacterium tuberculosis as the predicted species. It is important to note that PT-8 (S2) was used in a
               prior study[38] and was derived from brain tissue biopsies of a 67-year-old patient with osteomyelitis, lung
               disease, and multifocal brain and spinal lesions. The patient was diagnosed with Mycobacterium tuberculosis
               infection and responded promptly to anti-tuberculous treatment. This agreement suggests that our approach
               to reducing false positives via clustering based on genome coverage was effective and could be clinically beneficial.

               The impact of using high-quality k-mers
               Sequencing errors can lead to false positives, reducing the precision of species prediction. We evaluated the
               impact of k-mer quality on the accuracy of bacterial prediction using MetaBIDx. K-mer quality was
               determined by averaging the quality scores of its constituent bases. For the Mende dataset, thresholds of 33