Page 8 of 18 Fabbrini et al. Microbiome Res Rep 2023;2:25 https://dx.doi.org/10.20517/mrr.2023.25
In addition to the tools mentioned above, there are others that allow for the estimation of separate networks
for groups defined by a binary variable, allowing the differences between the networks to be recovered,
while also providing interval estimates for each parameter and evaluating the impact of the covariates on the
network properties. Examples of such tools are MDiNE[50], which makes use of a Bayesian graphical model
fit with Markov Chain Monte Carlo (MCMC) methods, and NetCoMi[51], an all-around tool for single and
differential network construction, analysis and comparison that incorporates most of the aforementioned
methods (e.g., Pearson correlation, Spearman correlation, SparCC, CCLasso, SPIEC-EASI, SPRING) as well
as association and dissimilarity methods, combined in a modular and supervised fashion.
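The idea of estimating one network per group and then comparing them can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: it uses plain Pearson correlation with a hard threshold, it does not reproduce the MDiNE or NetCoMi implementations (both are R packages), and it provides no interval estimates. The function names, the threshold of 0.6 and the toy abundances are all hypothetical. Real microbiome counts would also need a compositionality-aware transformation or estimator before this step.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def correlation_network(abundances, threshold=0.6):
    """Return {(taxon_a, taxon_b): r} for pairs with |r| >= threshold.

    `abundances` maps a taxon name to a list of (already transformed)
    abundances, one value per sample. The threshold is an arbitrary choice.
    """
    edges = {}
    for a, b in combinations(sorted(abundances), 2):
        r = pearson(abundances[a], abundances[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


def differential_edges(net_1, net_2):
    """Edges present in exactly one of the two group-specific networks."""
    return {
        "only_group_1": sorted(set(net_1) - set(net_2)),
        "only_group_2": sorted(set(net_2) - set(net_1)),
    }


# Toy data invented for illustration: 3 taxa, 5 samples per group.
group_1 = {
    "taxon_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "taxon_B": [2.1, 3.9, 6.2, 8.0, 9.9],  # tracks taxon_A in group 1
    "taxon_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
group_2 = {
    "taxon_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "taxon_B": [4.0, 1.0, 5.0, 2.0, 3.0],
    "taxon_C": [9.8, 8.1, 6.0, 3.9, 2.0],  # inversely tracks taxon_A in group 2
}

net_1 = correlation_network(group_1)
net_2 = correlation_network(group_2)
diff = differential_edges(net_1, net_2)
print(diff)
```

In this toy example the taxon_A/taxon_B edge appears only in the first group and the taxon_A/taxon_C edge only in the second, which is the kind of group-specific difference that differential network tools aim to detect with proper statistical support.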
Multiomics data integration
Networking analysis has thus emerged as a powerful approach for modeling microbiome data, oftentimes
by integrating these data with other omics data to evaluate functional linkages. Microbiome multi-omics
requires collecting multiple types of high-dimensional biological data, including those from amplicon (e.g.,
16S rRNA) sequencing, shotgun metagenomics, metatranscriptomics, metabolomics, etc., from a
microbiome sample and its environment or host. This kind of integration holds the potential to resolve
functional mechanisms of the microbiome[52]; consequently, tools and methods have been developed to
support these procedures.
Multi-omics integration mostly exploits correlation-based methods, such as the Patient Similarity Networks
(PSN) and Weighted Gene Co-expression Network Analysis (WGCNA)[53], and dimension reduction methods
such as Principal Component Analysis (PCA), Partial Least Squares regression (PLS) or Co-inertia Analysis
(CIA). Dimension reduction techniques aim to reduce the high dimensionality of multi-omics datasets
while preserving as much relevant information as possible. By reducing dimensionality, these methods
facilitate the visualization, interpretation, and analysis of integrated multi-omics data. Canonical correlation
analysis can identify linear relationships between multi-omics datasets by finding the canonical variates that
maximize the correlation between datasets. It is often used to reveal shared biological signals across different
omics layers. Network-based integration, on the other hand, combines multi-omics data by constructing
and analyzing molecular networks. Network-based methods utilize graph theory and network analysis
techniques to identify modules or communities of interconnected genes, proteins, or metabolites that are
functionally related. Packages providing this type of analysis have been released and allow for easy
implementation of such approaches. Examples include DIABLO[54] (part of mixOmics[55]) and MiBiOmics[56],
both available as R packages.
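As a rough illustration of how correlation-based and network-based integration fit together, the sketch below links features from two omics layers whenever their profiles across the same samples are strongly correlated, and then reports the connected components of the resulting graph as candidate modules. It is a simplified stand-in under stated assumptions, not the procedure used by DIABLO, MiBiOmics or WGCNA: the feature names, the toy values and the 0.8 threshold are hypothetical, and connected components are only the simplest possible notion of a module.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def cross_omics_edges(layer_a, layer_b, threshold=0.8):
    """Link features across two omics layers when |r| >= threshold.

    Both layers map a feature name to values measured on the same samples,
    in the same order. The threshold is an arbitrary illustrative choice.
    """
    edges = []
    for name_a, values_a in layer_a.items():
        for name_b, values_b in layer_b.items():
            if abs(pearson(values_a, values_b)) >= threshold:
                edges.append((name_a, name_b))
    return edges


def modules(edges):
    """Connected components of the undirected graph defined by `edges`."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, found = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        found.append(sorted(component))
    return found


# Toy data invented for illustration: 2 taxa and 3 metabolites, 5 samples.
taxa = {
    "taxon_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "taxon_B": [3.0, 5.0, 1.0, 4.0, 2.0],
}
metabolites = {
    "metab_X": [2.0, 4.0, 6.1, 8.0, 10.2],  # tracks taxon_A
    "metab_Y": [6.1, 9.9, 2.0, 8.1, 4.0],   # tracks taxon_B
    "metab_Z": [4.0, 4.0, 5.0, 3.0, 4.0],   # not strongly linked to either
}

edges = cross_omics_edges(taxa, metabolites)
found_modules = modules(edges)
print(found_modules)
```

Features with no edge above the threshold (here metab_Z) do not appear in any module. Dedicated packages replace both the hard threshold and the component search with more principled steps, such as sparse latent-component models or hierarchical clustering of a weighted network.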
In recent years, machine learning-based integration has become increasingly relevant in data science,
including multi-omics data integration. Machine learning algorithms, such as random forests, support
vector machines, or deep learning models, can be used to integrate and analyze multi-omics data. These
algorithms can capture complex relationships and patterns across multiple omics layers, enabling predictive
modeling or classification tasks.
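One common way to feed several omics layers to a classifier is to concatenate the per-sample feature vectors after scaling, sometimes called early integration. The sketch below shows that pattern with a nearest-centroid classifier, chosen only because it fits in a few lines of standard-library Python. It is a deliberately simple stand-in for the random forests, support vector machines or deep learning models mentioned above, and all names, labels and values are hypothetical.

```python
from math import sqrt


def concatenate(block_a, block_b):
    """Join two omics blocks sample by sample (rows must be aligned)."""
    return [row_a + row_b for row_a, row_b in zip(block_a, block_b)]


def fit_scaler(rows):
    """Per-column mean and standard deviation (constant columns get sd 1)."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    sds = [sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(columns, means)]
    return means, sds


def scale(row, means, sds):
    """Standardize one sample with statistics learned on the training set."""
    return [(v - m) / s for v, m, s in zip(row, means, sds)]


def fit_centroids(rows, labels):
    """Mean feature vector of each class."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {label: [sum(c) / len(c) for c in zip(*members)]
            for label, members in grouped.items()}


def predict(row, centroids):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda label: sum((v - c) ** 2
                                     for v, c in zip(row, centroids[label])))


# Toy training data invented for illustration: 4 samples, 2 features per layer.
taxa_train = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
metab_train = [[10.0, 1.0], [9.0, 2.0], [2.0, 9.0], [1.0, 10.0]]
labels = ["case", "case", "control", "control"]

train = concatenate(taxa_train, metab_train)
means, sds = fit_scaler(train)
centroids = fit_centroids([scale(r, means, sds) for r in train], labels)

# Two new samples, each with both layers measured.
new_samples = concatenate([[0.85, 0.15], [0.15, 0.85]],
                          [[9.5, 1.5], [1.5, 9.5]])
predictions = [predict(scale(r, means, sds), centroids) for r in new_samples]
print(predictions)
```

Scaling each column before concatenation keeps a layer with large numeric ranges (here the metabolite block) from dominating the distance. With real data, the scaler and the classifier should be fitted on training samples only and evaluated on held-out samples.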
ASPECTS TO CONSIDER WHEN CONSTRUCTING NETWORKS FROM MICROBIOME DATA
The first aspect to consider before starting a networking analysis using microbiome data is the sample size.
Sample size in network analysis refers to the number of individual entities (i.e., nodes representing variables
or taxa) for which data are available. In network analysis, the sample size can be determined based on
various considerations, including the number of samples that will be included in the smallest network, so
that even the smallest network computed in the study is statistically robust and reliable.
Determining a priori the minimum sample size required for co-occurrence microbiome networking analysis