
Lugli et al. Microbiome Res Rep 2023;2:15     https://dx.doi.org/10.20517/mrr.2022.21                                                                                          Page 9 of 16



Table 1. MEGAnnotator2 report of 10 sequentially processed microbial genomes using short-read technology

SRA          Sequencing output  High-quality reads  Filtered reads  16S rRNA gene identity                             ANI screening                       Genome completeness  Genome contamination  Average coverage  Number of contigs  Genome length  Number of genes  Number of rRNA genes  Number of tRNA genes
SRR11910208  1000000            630615              630615          Streptococcus salivarius subsp. thermophilus 100%  Streptococcus thermophilus 99.2%    99.89%               11.19%                51                913                2,904,834      2,934            7                     50
SRR14415532  1000000            998717              998716          Leuconostoc mesenteroides 100%                     Leuconostoc suionicum 93.9%         100%                 0.18%                 154               13                 2,110,850      2,089            5                     55
SRR15311866  1000000            999947              999947          Bifidobacterium breve JCM 7019 99.7%               Bifidobacterium breve 98.3%         100%                 0.12%                 126               35                 2,374,842      2,011            3                     55
SRR16352010  1000000            997594              996662          Bifidobacterium longum 100%                        Bifidobacterium longum 98.7%        100%                 0%                    254               17                 2,365,405      1,959            3                     57
SRR18214268  1000000            726379              726379          Lactobacillus paracasei 100%                       Lacticaseibacillus paracasei 99.0%  99.46%               0%                    67                90                 3,055,144      2,903            5                     54
SRR22378037  1000000            892220              892220          Lactococcus lactis 99.9%                           Lactococcus cremoris 88.0%          100%                 0%                    72                63                 2,460,545      2,462            4                     58
SRR22543247  1000000            998973              998973          Enterococcus faecium 100%                          Enterococcus faecium 94.6%          99.63%               0.50%                 104               147                3,100,507      3,022            8                     59
SRR22666477  1000000            986064              986062          Shigella sonnei 99.9%                              Shigella boydii 98.7%               99.93%               0.33%                 58                76                 5,089,127      4,834            9                     86
SRR8981643   1000000            997089              997089          Clostridium botulinum 100%                         Clostridium cagae 97.6%             100%                 0%                    140               47                 3,825,030      3,529            14                    77
SRR9222459   1000000            841930              841930          Faecalibacterium prausnitzii 99.9%                 Faecalibacterium duncaniae 85.8%    100%                 0.14%                 72                87                 3,356,538      3,213            9                     63
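
As a rough illustration of how the read counts and contamination values in Table 1 can be screened, the following Python sketch computes the percentage of high-quality reads retained for a few of the listed accessions and flags genomes whose contamination exceeds a cut-off. The 5% threshold and the function name `summarize` are assumptions introduced here for illustration only; they are not MEGAnnotator2 settings or part of its code.

```python
# Values copied from Table 1:
# (SRA accession, sequencing output, high-quality reads, genome contamination in %)
rows = [
    ("SRR11910208", 1_000_000, 630_615, 11.19),
    ("SRR14415532", 1_000_000, 998_717, 0.18),
    ("SRR18214268", 1_000_000, 726_379, 0.0),
    ("SRR22378037", 1_000_000, 892_220, 0.0),
]

# Hypothetical cut-off chosen only for this example; not a MEGAnnotator2 parameter.
CONTAMINATION_THRESHOLD = 5.0


def summarize(rows, threshold=CONTAMINATION_THRESHOLD):
    """Return (accession, % high-quality reads, contamination flag) per genome."""
    summary = []
    for accession, total, high_quality, contamination in rows:
        retained_pct = round(100 * high_quality / total, 2)
        summary.append((accession, retained_pct, contamination > threshold))
    return summary


for accession, retained_pct, flagged in summarize(rows):
    note = "check contamination" if flagged else "ok"
    print(f"{accession}: {retained_pct}% high-quality reads, {note}")
```

With these inputs, only SRR11910208 (11.19% contamination, 913 contigs) is flagged, which matches the row that stands out in the table.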




using 500,000 long reads coupled with one million short reads for the hybrid approach. The average execution time of the complete pipeline using long reads was 56.5 min, with the assembly step managed by CANU being the most time-consuming (median of 2,761 sec) [Figure 2B]. By contrast, using a combination of long and short reads, MEGAnnotator2 takes an average of 53.5 min, confirming that the assembly of long reads remains the most complex procedure [Figure 2C]. Furthermore, in the hybrid approach, filtering long reads using short-read information takes approximately five times longer than long-read filtering alone, while polishing the assembled data takes an additional 3.4 min [Supplementary Tables 2 and 3].
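
To put the timings quoted above on a common scale, the short Python sketch below converts the CANU median from seconds to minutes and expresses it, and the polishing step, as a share of the reported pipeline averages. Note the assumption involved: a median step time is divided by a mean pipeline time, so the resulting percentages are only indicative. The helper names are hypothetical.

```python
# Figures quoted in the text.
CANU_MEDIAN_SEC = 2761       # median CANU assembly time (seconds)
LONG_READ_AVG_MIN = 56.5     # average full-pipeline time, long reads (minutes)
HYBRID_AVG_MIN = 53.5        # average full-pipeline time, hybrid reads (minutes)
POLISHING_MIN = 3.4          # additional polishing time, hybrid approach (minutes)


def seconds_to_minutes(seconds):
    """Convert a duration in seconds to minutes."""
    return seconds / 60


def share_percent(part_min, total_min):
    """Express one step as a percentage of the total run time."""
    return 100 * part_min / total_min


canu_min = seconds_to_minutes(CANU_MEDIAN_SEC)
canu_share = share_percent(canu_min, LONG_READ_AVG_MIN)
polishing_share = share_percent(POLISHING_MIN, HYBRID_AVG_MIN)

print(f"CANU assembly: {canu_min:.1f} min (~{canu_share:.1f}% of the long-read run)")
print(f"Polishing: {POLISHING_MIN} min (~{polishing_share:.1f}% of the hybrid run)")
```

Under these assumptions, the assembly accounts for roughly 46 of the 56.5 min, which is consistent with the statement that it dominates the long-read pipeline.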



Thus, based on these results, MEGAnnotator2 can complete all its functions in approximately 14.5 min for short reads, 56.5 min for long reads, and 53.5 min for hybrid reads. Even though the hybrid pipeline introduces two additional analyses, namely long-read filtering with short-read data and genome sequence polishing, the average computing time of the pipeline is comparable, highlighting high variability in the capability of the assembler to manage long reads