Pham et al. Microbiome Res Rep 2024;3:25 https://dx.doi.org/10.20517/mrr.2024.01 Page 11 of 16

         CLARK       0.833  0.259  0.395
         KrakenUniq  0.800  0.135  0.231
         Kraken2     0.778  0.079  0.143
         Centrifuge  0.833  0.169  0.280
RL_S001  MetaBIDx    1.000  0.421  0.593
         CLARK       1.000  0.421  0.593
         KrakenUniq  1.000  0.269  0.424
         Kraken2     0.889  0.308  0.457
         Centrifuge  1.000  0.308  0.471

Bolded numbers in the table are the best scores in the comparison.
showing high precision and recall, resulting in F1-scores ranging from 0.889 to 0.985. CLARK, KrakenUniq,
and Kraken2 also showed notable improvements in precision and F1-scores compared to their performance
without clustering, particularly in samples with 100 and 400 species.
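As a quick consistency check on the table above, the third numeric column can be recomputed from the first two, assuming those two columns are precision and recall (their order does not matter, because the F1-score is symmetric in the two). The sketch below is illustrative only; the function name `f1_score` and the tolerance are our own choices, and small differences are expected because the table reports rounded values.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rows copied from the table above: (tool, first column, second column, reported F1).
# Assumption: the first two columns are precision and recall in some order.
rows = [
    ("CLARK", 0.833, 0.259, 0.395),
    ("KrakenUniq", 0.800, 0.135, 0.231),
    ("Kraken2", 0.778, 0.079, 0.143),
    ("Centrifuge", 0.833, 0.169, 0.280),
    ("MetaBIDx (RL_S001)", 1.000, 0.421, 0.593),
    ("CLARK (RL_S001)", 1.000, 0.421, 0.593),
    ("KrakenUniq (RL_S001)", 1.000, 0.269, 0.424),
    ("Kraken2 (RL_S001)", 0.889, 0.308, 0.457),
    ("Centrifuge (RL_S001)", 1.000, 0.308, 0.471),
]

# Tolerance allows for the inputs having been rounded to three decimals.
TOLERANCE = 0.002

for tool, a, b, reported in rows:
    recomputed = f1_score(a, b)
    status = "ok" if abs(recomputed - reported) < TOLERANCE else "mismatch"
    print(f"{tool:22s} reported={reported:.3f} recomputed={recomputed:.3f} {status}")
```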

CAMI Dataset: In the CAMI dataset, the application of clustering also enhanced precision for all methods.
MetaBIDx consistently demonstrated high precision and recall, with F1-scores ranging from 0.535 to 0.807
across different samples. Other tools, including CLARK and KrakenUniq, exhibited considerable
improvements in precision, leading to higher F1-scores compared to their initial performance without
clustering. However, MetaBIDx maintained an edge in terms of overall accuracy.

In conclusion, with clustering of “approximate” coverage applied, all methods showed an increase in
precision, that is, fewer false positives. These results indicate that integrating coverage-based
clustering can markedly enhance the accuracy of species prediction in metagenomic analysis. MetaBIDx,
which is designed to use this technique, consistently outperformed or matched the performance of
other tools under this enhanced comparison framework.
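To make the idea of coverage-based filtering concrete, the following is a minimal sketch and not the implementation used by MetaBIDx. It assumes that approximate coverage can be estimated as identified reads × read length / genome length, and it swaps in a plain one-dimensional two-means clustering to separate high-coverage species from low-coverage ones; the clustering algorithm actually used is not specified in this section. All names (`approximate_coverage`, `high_coverage_species`) and the example values are hypothetical.

```python
def approximate_coverage(read_count: int, read_length: int, genome_length: int) -> float:
    """Assumed estimate: total bases in identified reads divided by genome length."""
    return read_count * read_length / genome_length


def high_coverage_species(coverage: dict[str, float], max_iterations: int = 100) -> set[str]:
    """Split species into two groups with 1-D two-means and return the
    group with the higher mean coverage (the species kept as predictions)."""
    values = list(coverage.values())
    if not values:
        return set()
    low, high = min(values), max(values)
    if low == high:
        # All species have the same coverage; nothing to separate.
        return set(coverage)
    for _ in range(max_iterations):
        # Assign each value to the nearer centre (ties go to the high group).
        high_group = [v for v in values if abs(v - high) <= abs(v - low)]
        low_group = [v for v in values if abs(v - high) > abs(v - low)]
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if (new_low, new_high) == (low, high):
            break  # centres stopped moving
        low, high = new_low, new_high
    return {name for name, v in coverage.items() if abs(v - high) <= abs(v - low)}


# Hypothetical coverage values, for illustration only (not data from the study).
example = {"species_A": 12.0, "species_B": 10.5, "species_C": 0.2, "species_D": 0.05}
kept = high_coverage_species(example)
print(sorted(kept))
```

In this toy input, the two low-coverage species fall into the discarded cluster, which is how such a step would remove likely false positives while keeping species supported by reads spread across their genomes.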

Identification of pathogens in human samples
We evaluated how well each tool identified the pathogen in the human sample dataset PT-8 (S2) at the
species level, using an index built from 2,850 reference genomes. The causative organism had been
diagnosed for this sample, and we took that diagnosis as the ground truth.

We found that MetaBIDx had the highest rate of identified reads at 83%, followed by Kraken2, KrakenUniq,
and Centrifuge with similar rates. CLARK had the lowest rate of identified reads, only reaching 42%. All
tools assigned approximately 70% of identified reads to Mycobacterium tuberculosis and the remaining 30%
to other species.
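The two quantities reported above, the rate of identified reads and the share of identified reads assigned to each species, can be computed from per-read assignments as in the sketch below. The input format (one species name per read, `None` for an unidentified read) and the function name `summarize_assignments` are assumptions made for illustration; the toy input is not data from the study.

```python
from collections import Counter
from typing import Optional


def summarize_assignments(assignments: list[Optional[str]]) -> tuple[float, dict[str, float]]:
    """Return (rate of identified reads, share of identified reads per species)."""
    identified = [species for species in assignments if species is not None]
    rate = len(identified) / len(assignments) if assignments else 0.0
    counts = Counter(identified)
    shares = {species: n / len(identified) for species, n in counts.items()}
    return rate, shares


# Toy input: 10 reads, 2 unidentified, 6 assigned to one species, 2 to another.
toy_reads = (
    [None] * 2
    + ["Mycobacterium tuberculosis"] * 6
    + ["other species"] * 2
)
rate, shares = summarize_assignments(toy_reads)
print(f"identified: {rate:.0%}")
for species, share in shares.items():
    print(f"{species}: {share:.0%} of identified reads")
```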

When clustering of species based on coverage derived from identified reads was used, all tools identified
Mycobacterium tuberculosis as the predicted species. It is important to note that PT-8 (S2) was used in a
prior study and was derived from brain tissue biopsies of a 67-year-old patient with osteomyelitis, lung
disease, and multifocal brain and spinal lesions[38]. The patient was diagnosed with Mycobacterium tuberculosis infection
and responded promptly to anti-tuberculous treatment. This suggested that our approach to reducing false
positives via clustering based on genome coverage was effective and could be clinically beneficial.

The impact of using high-quality k-mers
Sequencing errors can lead to false positives, reducing the precision of species prediction. We evaluated the
impact of k-mer quality on the accuracy of bacterial prediction using MetaBIDx. K-mer quality was
determined by averaging the quality scores of its constituent bases. For the Mende dataset, thresholds of 33
