
Renzi et al. Microbiome Res Rep 2024;3:2  https://dx.doi.org/10.20517/mrr.2023.27  Page 9 of 16

               sequence with a single SNP with respect to another) from sequencing errors[60]. Since these processes rely on
               the single nucleotide variation of amplicons for defining taxonomy, they usually lead to an increased
               estimation of alpha diversity, mainly due to their higher sensitivity with respect to identity-based
               approaches. One of the main assumptions of these methods is that the amplicon sequence does not
               vary in length, an assumption that fungal ITS sequences do not satisfy. This may bias the
               discriminatory potential of these methods, although no extensive survey has been performed to date[122].
               To reduce these biases, a number of ITS sequencing-based systems have been created to identify different
               fungal species. Some of them are able to examine both 16S rRNA (from bacteria) and ITS (from fungi), such
               as Kraken[123], Mothur[115], Qiime[119,124], Vsearch[117], and DADA2[60]; others are specialized on fungi only, such
               as Plutof[125], Clotu[126], PIPITS[116], CloVR-ITS[127], MICCA[128], and BioMaS[129]. Despite these well-known
               issues, standardized pipelines are not yet available, leaving the choice of the analysis method in the hands of
               researchers. Researchers are therefore responsible for the pipeline they use (which, in most cases, is
               published and freely available), and this choice may alter the research outcomes[130], paving the way for
               contrasting conclusions. Although pipelines based on the bacterial 16S rRNA gene (or part of it) have been
               extensively used in the last three decades, the “yeast world” remains largely unexplored, and the effect of
               choosing one pipeline over another is unpredictable. A summary of the main pipelines available is
               reported in Table 3[60,115-117,119,123,124,126-129,131,132].
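
The contrast drawn above between single-nucleotide resolution and identity-based clustering can be illustrated with a toy example. The sketch below is not the algorithm of DADA2 or of any pipeline cited here: it omits error modelling entirely, and the function names, the 97% threshold, and the synthetic reads are all assumptions made for illustration. It only shows why treating every distinct sequence as its own unit tends to yield a higher richness estimate than greedy clustering at an identity threshold.

```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences.

    Assumption: no alignment is performed, so lengths must match. This is
    exactly the restriction that length-variable ITS amplicons violate.
    """
    if len(a) != len(b):
        raise ValueError("toy identity is defined only for equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def exact_variants(reads: List[str]) -> List[str]:
    """Single-nucleotide view: every distinct sequence is its own unit."""
    return sorted(set(reads))


def greedy_clusters(reads: List[str], threshold: float = 0.97) -> List[str]:
    """Identity-based view: greedy centroid clustering at a fixed threshold."""
    centroids: List[str] = []
    for seq in sorted(set(reads)):
        # A sequence joins the first centroid it matches; otherwise it
        # becomes a new centroid.
        if not any(identity(seq, c) >= threshold for c in centroids):
            centroids.append(seq)
    return centroids


# Hypothetical 100-nt amplicons (synthetic, for illustration only).
base = "ACGT" * 25                 # reference sequence
snp = "T" + base[1:]               # one mismatch: 99% identity to base
distant = "T" * 10 + base[10:]     # eight mismatches: 92% identity to base
reads = [base] * 3 + [snp] * 2 + [distant]

if __name__ == "__main__":
    print("units at single-nucleotide resolution:", len(exact_variants(reads)))
    print("units at 97% identity:", len(greedy_clusters(reads)))
```

With these synthetic reads, the single-nucleotide view reports three units and the 97% identity view reports two, because the one-SNP variant is absorbed into the cluster of the reference sequence. Passing sequences of different lengths raises an error in this sketch, which mirrors the fixed-length assumption discussed above.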
               In the context of metagenomic WGS, two primary strategies are commonly employed to analyze raw data:
               the alignment-based approach and the assembly-based approach. The first one involves mapping individual
               sequencing reads to a reference database or a reference genome. On the other hand, the second approach
               assembles reads de novo to form contigs, which are then clustered into so-called genome bins during a
               binning process. Combining both approaches is frequently advocated for result accuracy[84]. By now, many
               bioinformatic tools are available. Alignment-based tools are strong in taxonomic profiling and identifying
               known microorganisms. They include a step of fragment recruitment in order to map all the reads to one or
               more selected references. Among taxonomic profilers, MetaPhlAn2[133], Kraken2[134], and DIAMOND[135]
               stand out for different strengths. When high specificity and rapid analysis are needed, MetaPhlAn2 may be a good
               choice. For comprehensive database coverage and strain-level resolution, Kraken2 is valuable. DIAMOND
               allows customization and offers fast alignment capabilities, but it requires additional steps for taxonomic
               profiling. Assembly-based tools, instead, are essential for discovering novel organisms and in-depth
               functional analysis within metagenomic communities. Their workflow includes an assembler[136] that is well
               suited for the reconstruction of long contigs and a genome binner to cluster contigs originating from the same
               organism[137]. When selecting an assembler for WGS data, the type of sequencing technology used, the
               genome size, the desired level of assembly completeness, and the availability of computational resources
               should be taken into consideration. MetaSPAdes[138], MegaHit[139], and IDBA-UD[140] are the most popular
               metagenome assemblers, including for fungal genomes. As with assemblers, no binning tool has been
               designed exclusively for fungal sequences, so general metagenomic binners are used, such as
               METABAT2[141], CONCOCT[142], MaxBin 2.0[143], and MetaWrap[144], to name a few of the most efficient. Many
               researchers also employ hybrid assembly strategies that combine short-read and long-read data to achieve
               more accurate and complete genome assemblies[95]. To delve deeper into the metagenomic data beyond
               taxonomic composition, functional annotation becomes necessary. Fragment recruitment, as previously
               described, involves leveraging a database of functionally annotated genes or proteins. This approach
               provides a straightforward means to achieve functional annotation. Subsequently, annotations showing a
               specific level of coverage can be linked to various aspects, such as metabolic pathways, with tools like
               KEGG[145]. Metagenomic WGS of fungi offers valuable insights into complex fungal communities, but it also
               comes with several drawbacks and challenges. Bioinformatic complexity, functional annotation, short-read
               sequencing, non-standardized pipelines, and data volume and processing are probably the main ones. Addressing
               these drawbacks often requires a combination of improved sequencing technologies, more comprehensive