Comertpay et al. J Transl Genet Genom 2022;6:84-94 https://dx.doi.org/10.20517/jtgg.2021.44 Page 86

Although various studies are being conducted to better understand the association between obesity and
breast cancer, integrative analysis is needed to detect novel molecular signatures and pathways and to
identify obesity-related breast cancer risk biomarkers.

In the present study, a gene expression dataset was analyzed to compare obesity-associated breast cancer
samples with non-obesity-associated breast cancer samples. The co-expression network and protein-protein
interaction (PPI) network of differentially expressed genes (DEGs) were determined. Seed genes, defined
as the DEGs common to the co-expression gene network and the hub genes of the PPI network, were then
identified. Next, to examine the molecular mechanisms of obesity-associated breast cancer, statistically
significant pathways were determined. A Ridge penalty regression model was then fitted using the p-values
of the enriched pathways and the seed gene-pathway association scores to assess the potential of each
seed gene as a molecular signature in obese patients with breast cancer and thereby obtain the most
relevant molecular signatures. Finally, we identified several candidate genes and pathways in obese
patients with breast cancer.

METHODS
Gene expression datasets and identification of differentially expressed genes
To characterize gene expression profiles of obesity in breast cancer, raw data of the obesity-related
high-throughput gene expression dataset GSE24185 [15] in breast cancer were obtained from the Gene
Expression Omnibus [16]. In total, 74 samples were analyzed, including those from 36 historically normal
(BMI ≤ 24.9) breast cancer patients as control samples and 38 obese patients with breast cancer
(BMI ≥ 30). The affy package of the R/Bioconductor platform (version 3.6) was used. Normalization for
each dataset was performed with robust multiarray techniques [17]. Normalized log-expression values were
used in the statistical analysis of each dataset to contrast obese vs. non-obese breast cancer patients,
and DEGs were defined using the multiple-testing options of linear models for microarray data [18]. DEGs
were selected using a P-value below the significance level (P-value < 0.05) and a fold change of 1.5 as
statistical threshold parameters.
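
The selection step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R/Bioconductor pipeline: the function name `select_degs`, the gene labels, and the example values are hypothetical, and the sketch assumes that the fold-change threshold of 1.5 is applied symmetrically (up- or down-regulation) on the log2 scale.

```python
import math

def select_degs(stats, p_cutoff=0.05, fold_change=1.5):
    """Return the sorted gene IDs that pass both thresholds.

    stats maps gene ID -> (p_value, log2_fold_change).
    Assumption: the fold-change cutoff is symmetric, so a gene passes
    when |log2 FC| >= log2(fold_change).
    """
    log2_fc_cutoff = math.log2(fold_change)
    return sorted(
        gene
        for gene, (p_value, log2_fc) in stats.items()
        if p_value < p_cutoff and abs(log2_fc) >= log2_fc_cutoff
    )

# Hypothetical per-gene statistics, for illustration only.
example_stats = {
    "GENE_A": (0.001, 1.20),   # significant, up-regulated
    "GENE_B": (0.030, -0.70),  # significant, down-regulated
    "GENE_C": (0.200, 2.00),   # large change, but not significant
    "GENE_D": (0.010, 0.30),   # significant, but change too small
}
degs = select_degs(example_stats)
```

With these illustrative inputs only GENE_A and GENE_B satisfy both criteria.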

Construction of co-expression networks in breast cancer and obese states
By separating the expression profiles of non-obesity-associated and obesity-associated breast cancer
samples, two new data subsets were generated using the expression profiles of the resultant DEGs. The
co-expression network of DEGs was reconstructed by calculating the Pearson correlation coefficients of
the mean expression values of DEGs in samples from obese patients with breast cancer and non-obese
patients with breast cancer. To assess the statistical significance of pairwise gene correlations, the
obtained correlation coefficients were normally distributed (P-value < 0.05), and positive and negative
correlation cutoffs (> 0.47 and ≤ -0.47, respectively) were selected. An obesity-associated breast
cancer-specific co-expression network, comprising 15 nodes and 17 edges, was reconstructed from the
significant pairwise gene correlations.
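
As a rough illustration of this step, the following Python sketch computes pairwise Pearson correlations and keeps the pairs that pass the cutoffs. It is a sketch under stated assumptions rather than the authors' implementation: the function names and expression values are hypothetical, and the negative cutoff is read as -0.47 (that is, an edge is kept when r > 0.47 or r ≤ -0.47).

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Note: raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def coexpression_edges(expression, cutoff=0.47):
    """Return (gene1, gene2, r) for every pair passing the cutoffs.

    expression maps gene ID -> list of expression values across samples.
    Assumption: an edge is kept when r > cutoff or r <= -cutoff.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if r > cutoff or r <= -cutoff:
            edges.append((g1, g2, r))
    return edges

# Hypothetical expression profiles, for illustration only.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],    # rises with GENE_A
    "GENE_C": [4.0, 3.0, 2.0, 1.0],    # falls as GENE_A rises
    "GENE_D": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with the others
}
edges = coexpression_edges(expression)
```

In this toy example GENE_D is excluded from the network because none of its correlations reaches either cutoff.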

PPI network reconstruction and identification of seed genes
The physical protein-protein interaction information was obtained from the BioGRID database [19], which
includes 43,219 physical interactions associated with proteins. PPI networks of the resultant DEGs were
reconstructed using Cytoscape [20]. Seed genes were obtained from the intersection of DEGs, co-expressed
genes, and hub genes of the PPI network.
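
The intersection described above can be illustrated with a short Python sketch. The hub criterion is an assumption: the text does not state how hub genes were defined, so the sketch uses a simple minimum-degree threshold. All names, edges, and gene sets below are hypothetical.

```python
from collections import Counter

def hub_genes(ppi_edges, min_degree):
    """Return nodes whose degree in the PPI network is at least min_degree.

    Assumption: hubs are defined by a plain degree threshold; the source
    does not specify the criterion actually used.
    """
    degree = Counter()
    for a, b in ppi_edges:
        degree[a] += 1
        degree[b] += 1
    return {gene for gene, d in degree.items() if d >= min_degree}

def seed_genes(degs, coexpressed, hubs):
    """Seed genes: the intersection of DEGs, co-expressed genes, and hubs."""
    return set(degs) & set(coexpressed) & set(hubs)

# Hypothetical inputs, for illustration only.
ppi_edges = [
    ("GENE_A", "P1"), ("GENE_A", "P2"), ("GENE_A", "P3"),
    ("GENE_B", "P1"),
    ("GENE_C", "P2"), ("GENE_C", "P3"), ("GENE_C", "P4"),
]
degs = {"GENE_A", "GENE_B", "GENE_C"}
coexpressed = {"GENE_A", "GENE_B"}

hubs = hub_genes(ppi_edges, min_degree=3)
seeds = seed_genes(degs, coexpressed, hubs)
```

Here GENE_A is the only gene that is differentially expressed, co-expressed, and a hub at the same time.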

Gene set overrepresentation analyses
Overrepresentation analyses were performed using the ConsensusPathDB bioinformatics tool [21] to
determine biological processes, molecular functions, metabolic pathways, and signaling information
crucially associated with DEGs of obese patients with breast cancer and seed genes. The Kyoto Encyclopedia of
