Identification of novel candidate indicators for assessing zinc status during pregnancy in mice from microarray data

Background This study aimed to identify potential zinc status indicators and to clarify the mechanisms underlying zinc deficiency-induced organ damage and mortality in mice. Methods The dataset GSE97112, including placental tissues of mice fed diets containing normal and low concentrations of zinc, was downloaded and preprocessed. Differentially expressed genes (DEGs) were calculated and identified for zinc deficiency-related gene clusters by using the weighed gene co-expression network analysis (WGCNA) algorithm. The Gene Ontology (GO)-Biological Process (BP) and KEGG pathway of genes in the zinc deficiency-related WGCNA modules were analyzed, and the protein-protein interaction (PPI) network was constructed. In addition, modules of the PPI network were identified, and transcription factors (TFs) and miRNAs regulating DEGs were predicted. Finally, drug-gene interactions were selected. Results A total of 1055 DEGs containing 586 up- and 469 down-regulated genes were obtained. Three modules based on WGCNA had high correlation with degree of zinc deficiency. Annexin A1 (ANXA1), C-C motif chemokine receptor 3 (CCR3), C-X-C motif chemokine receptor 2 (CXCR2), and interleukin 2 (IL-2) were hub nodes in the PPI network. Three modules in the PPI network were identified, including module 1 associated with olfactory conduction and module 2 associated with inflammatory response. ANXA1, CCR3, and IL-2 were regulated by TFs. In addition, CXCR2, ANXA, and IL-2 were drug targets. Conclusion CXCR2, ANXA1, and CCR3 as well as olfactory receptor-related genes (proteins) may be used as biomarkers to assess zinc status in mice.


Background
Zinc as an important metal is involved in numerous metabolic processes. Inadequate zinc intake is highly prevalent worldwide. Wessells and Brown have estimated that at least 17% of the world's population suffers from inadequate zinc intake [1]. Zinc is also an indispensable trace element during pregnancy, and zinc deficiency because of maternal nutritional deficiency can result in severe fetal growth restriction [2]. Severe zinc deficiency in pregnancy can result in increased fetal loss and high rates of congenital malformations in several organs of surviving fetuses [3,4]. Furthermore, inadequate zinc intake is thought to be a leading cause of infant mortality [5].
Plasma zinc is widely used as a biomarker of zinc status [6,7]. However, decrease of serum zinc concentration is only detectable after long-term or severe depletion, making serum zinc an unreliable biomarker of zinc status [6]. Novel zinc biomarkers, such as the erythrocyte linoleic acid:dihomo-γ-linolenic acid ratio, negatively correlate with plasma Zn status [8][9][10]. However, there is no valid, sensitive, and reliable biomarker for zinc deficiency, particularly for marginal zinc deficiency. Previous studies have shown that changes in expression of several zinc transporters (ZnTs) influence zinc homeostasis and metabolism and subsequently change zinc status [11][12][13][14].
Moreover, dysregulation of other molecules, such as cytokines [15], ProSAP/Shank family members [16], antioxidant enzymes, and heat shock proteins [17] are associated with zinc status. Therefore, abnormal expression of genes could be a potential biomarker for zinc deficiency during pregnancy in the placenta.
In this study, microarray technology and bioinformatics methods were used to identify genes whose expression was influenced by zinc deficiency using the microarray data GSE97112 [18], which has been used to illustrate the relationship between abnormal gene expression and mean arterial pressure changes during gestation in mice. A systematic bioinformatics analysis of GSE97112 was conducted in the study. The results may help to identify potential zinc status indicators and to clarify the mechanisms underlying zinc deficiency-induced organ damage and mortality.

Data downloading and pre-processing
The original CEL file of the dataset GSE97112 [18] was downloaded from Gene Expression Omnibus (GEO, http:// www.ncbi.nlm.nih.gov/geo/). The samples were placental tissues of seven-week-old C57Bl/6J female mice fed a diet containing different concentrations of zinc from 6 weeks prior to gestation to the conclusion of experiments. The animals were divided into the following groups: control (40 mg/kg zinc, 20 cases) and zinc deficient (10 mg/kg zinc, 12 cases). The chip platform was Affymetrix Mouse Gene 2.1 ST Array, and the data were downloaded in June 2018.
The raw data were read using the R 3.4.0 extension package oligo [19] (Version 1.44.0, http://www.bioconductor.org/packages/release/bioc/html/oligo.html), preprocessed using the robust multi-array average (RMA) method [20,21] for data normalization, and were annotated using R package mogene21sttranscriptcluster.db (version 8.7.0, http://bioconductor.org/packages/release/data/annotation /html/mogene21sttranscriptcluster.db.html) to remove probes that did not match the transcript (Gene symbol). For different probes mapped to the same gene, the mean value of the different probes was taken as the final expression value of the gene.

Differential expression analysis of genes
Differentially expressed genes (DEGs) were calculated using the empirical Bayes linear model in the R package limma [22] (Version 3.32.5, http://bioconductor.org/packages/release/bioc/html/limma.html) for the P value of all genes. The significance threshold for DEGs was a P value < 0.05.
Disease related modules and genes by weighed gene co-expression network analysis (WGCNA) The WGCNA algorithm was used to discover gene clusters (or modules) in high-throughput data that were highly correlated with the sample phenotype. Modular characteristic genes in these modules were summarized, and the modules that were significantly associated with the phenotype were further evaluated.
The R package WGCNA [23] (Version 1.61, https:// cran.r-project.org/web/packages/WGCNA/) was used to identify gene sets that were significantly associated with zinc deficiency from DEGs. By setting a series of soft-thresholding power values, the correlation coefficient and the average connection degree of the connection degrees k and p(k) under each power value were calculated. The threshold was a correlation coefficient > 0.85. Based on clustering and dynamic pruning, the highly correlated genes were aggregated into modules. Finally, the WGCNA modules associated with the disease (zinc deficiency) were identified.

Functional enrichment analysis
The commonly used enrichment analysis tool Database for Annotation Visualization and Integrated Discovery (DAVID) [24] (version 6.8, https://david-d.ncifcrf.gov/) that was based on hypergeometric distribution was used to analyze the Gene Ontology (GO)-Biological Process (BP) [25] and KEGG pathway [26] of genes in the zinc deficiency-related WGCNA modules. Results with a P value < 0.05 were considered to be significantly enriched.

Protein-protein interaction (PPI) network construction
PPI network is available for identification of cellular functions of proteins in various organisms [27], facilitating to identification of key proteins associated with zinc deficiency. The interactions between gene-encoded proteins in the disease-related WGCNA modules were predicted based on the STRING [28] (version: 10.0, http:// www.string-db.org/) database. The input gene set was the genes in WGCNA modules which were significantly associated with zinc deficiency. The species was Mus musculus, and the parameter of PPI score was set to 0.7 (high confidence). Protein nodes were all included in the disease-related WGCNA modules.
After obtaining the PPI relationship, a network diagram was constructed using Cytoscape software [29]. Then the CytoNCA [30] plugin (version 2.1.6, http://apps.cytoscape.org/apps/cytonca) was used to analyze the topological properties of the network without weighting. The Degree Centrality (DC), Betweenness Centrality (BC), and Closeness Centrality (CC) were obtained and nodes were ranked based on topological properties of the network. The most important node of the PPI network involved in protein interaction was identified and designated as the hub protein.

Modules of the PPI network
Using the MCODE plugin [31] of Cytoscape software, the function module was identified, and the relationship between network topology and network components was obtained using default parameters (Degree Cutoff = 2, Node Score Cutoff = 0.2, K-core = 2, Max.Depth = 100). DAVID was also used to perform GO-BP and KEGG pathway enrichment analysis for genes in modules with threshold scores > 4.

Predicting transcription factors (TFs) and miRNAs regulating DEGs
TFs and miRNAs can play important regulatory roles in gene expression. To better understand the regulatory mechanism affected by zinc deficiency, TFs and miRNAs that could regulate key genes/proteins associated with zinc deficiency were predicted. In this study, the WebGestalt GAST [32] (http://www.webgestalt.org/option.php) tool was used for prediction of miRNA and TFs in the PPI network. The Overrepresentation Enrichment Analysis (ORA) enrichment method was used to predict miRNAtarget and TF-target with the minimum enriched gene number of 5.

Therapeutic drug prediction
The Drug-Gene Interaction database (DGIdb) database [33] (http://www.dgidb.org/) was used to predict genes targeted by the therapeutic drug, which will provide a new perspective for designing effective targeted drugs for prevention of zinc deficiency. Literature-supported drug-gene interactions were selected to construct network maps.

Differential expression analysis
After data preprocessing, a gene expression matrix containing 32 samples and 23,992 genes was obtained. The distribution of gene expression levels before and after standardization showed that the gene expression levels between samples were nearly a straight line after normalization, which was suitable for further analysis.
Based on the determined thresholds, 1055 DEGs containing 586 up-and 469 down-regulated genes were obtained. The bidirectional clustering heat map of DEGs was shown in Fig. 1. The results showed that DEGs could Fig. 1 Heatmap of differentially expressed genes (DEGs) in 12 zinc deficient samples and 20 control samples from placenta tissues of mice. Red and green represent high and low expression, respectively, and white refers to missing expression values clearly distinguish the samples according to the zinc concentration, indicating that these genes showed significant changes following treatment with different concentrations of zinc, which suggested that these were genes potentially associated with degree of zinc deficiency.

Modules and genes screened by WGCNA
To further screen genes associated with zinc deficiency, we performed a WGCNA analysis of the differential gene expression matrices obtained in the previous step. According to the method, the power value when the correlation coefficient squared value of connection degree k and p(k) for the first time to reach 0.85 (green line) were selected, that was, power = 7; under this power parameter, the average connectivity of the constructed co-expressing network nodes was 0.875, as shown in Fig. 2 (right). In addition, k was negatively correlated with p(k), indicating that the selected power value could be used for establishing a gene-scale network.
Based on clustering and dynamic pruning, 1055 highly correlated genes were clustered into 5 modules, where the grey module was a collection of genes that could not be aggregated to other modules. The 5 modules were clustered when the correlation coefficient was greater than 0.8, that was, the module with the dissimilarity coefficient less than 0.2 was merged. As a result, 4 WGCNA modules were constructed.
Two methods were used to mine modules related to degree of zinc deficiency: the correlation between each module's feature vector gene and the degree of zinc deficiency was calculated; the correlation between the traits and the expression of each gene in each module as the significance of the trait in the module, with greater significance signifying greater relevance between the module and the trait (Fig. 2, left). As a result, three modules (except for the gray) had high correlation with degree of zinc deficiency (Fig. 2, right). The yellow module contained 160 genes including 81 up-regulated and 79 down-regulated genes. The blue module contained 469 differential genes, of which 292 were up-regulated and 177 were down-regulated. The brown module contained 185 genes, of which 104 were up-regulated and 81 were down-regulated.

Functional terms and pathways enriched by DEGs in WGCNA modules
GO-BP enrichment analysis and KEGG pathway enrichment analysis results showed that the genes in the blue module were mainly associated with chloroplast transmembrane transport and cell meiosis. The genes in the brown module were mainly enriched in GO-BP terms of transcriptional regulation, and multicellular organism development, as well as pathways related to glycerophospholipid metabolism, and the transcription factor regulatory of the FOXO family. The yellow module gene in GO-BP terms was related to negative regulation of myoblast differentiation, multicellular organism development, and cell differentiation, as well as pathways of cytokines, and transport and catabolic of peroxidases.

PPI network constructed by DEGs in WGCNA modules
PPI analysis of DEGs in three WGCNA modules was performed. Two hundred forty-six PPI relationship pairs for 150 DEGs were obtained. Among these, 75 genes were up-regulated and 75 genes were down-regulated. The topological properties of the PPI network were analyzed, and the DC, BC, and CC scores of the top 30  Table 1. DEGs of Annexin A1 (ANXA1), C-C motif chemokine receptor 3 (CCR3), C-X-C motif chemokine receptor 2 (CXCR2), dynein light chain, LC8-type 2 (DYNLL2), interleukin 2 (IL-2), SEC13 homolog, nuclear pore and COPII coat complex component (SEC13), and transforming growth factor beta 1 (TGFB1) were in three ranks, which might represent hub nodes in the network.
In addition, 7 modules were identified from the PPI network. As shown in Fig. 3, there were 3 modules with score > 4, where module 1 contained 8 up-regulated genes and 6 down-regulated genes, all of which were olfactory receptor-related genes. Module 2 included 4 up-regulated genes and 4 down-regulated genes, including ANXA1, CCR3 and CXCR2. Module 3 contained 4 up-regulated genes and 2 down-regulated genes.
The GO-BP and KEGG pathway analyses were performed on the genes grouped in the three modules. The results in Table 2 showed that module 1 was mainly involved in olfactory conduction and protein-coupled receptor signaling pathways. Module 2 was mainly enriched in BP terms related to signal transduction, inflammatory response, and angiogenesis. Module 3 was mainly involved in translation, rRNA, and ribosome related pathways.

DEGs were regulated by miRNAs and TFs
Based on Webgestalt prediction, 303 TF-Target relationship pairs were obtained, including 35 TFs and 85 target genes. Forty-one miRNA-target relationship pairs were obtained, including 5 miRNAs and 23 target genes. As shown in Fig. 4, The TFs with the highest number of target genes were summarized. Calcium voltage-gated channel subunit alpha1 a (CACNA1a), CACNA1d, and mitogen-activated protein kinase kinase 6 (MAP2K6) were simultaneously regulated by multiple TFs and    miRNAs. According to the results of functional and pathway analysis, these genes were involved in solute transmembrane transport, positive regulation of apoptosis, multicellular organism development, and biological process of MAPK activity activation. The hub proteins ANXA1, CCR3, DYNLL2, IL-2, SEC13, and TGFB1 in the above PPI network were regulated by the TFs of P53/ TATA, CCAAT/enhancer binding protein (CEBP), Paired Box 3 (PAX3), Nuclear Factor Kappa B (NFKB)/E74 like ETS transcription factor 2 (ELF2)/heat-shock transcription factor 2 (HSF2), GA-binding protein (GABP), and ELF1/ myeloid zinc finger gene 1 (MZF1), respectively.

Therapeutic drug prediction
Based on the DGIdb database, the interaction network diagram of drug-gene interactions was constructed (Fig. 5). Eighty-three drug-target relationship pairs, including 4 DEGs and 80 interaction drugs were identified. The 4 DEGs were CXCR2, ANXA, TGFB1, and IL-2.

Discussion
Zinc is reported to exert antioxidant activity through guarding sulfhydryl groups and stabilization of cell membranes, and it may play a key role in modulating cell cycle and apoptosis [34,35]. Two zinc transporter families, ZnTs and Zrt-, Irt-related proteins (ZIPs) are shown to function in zinc mobilization across biological membranes [14]. For instance, ZnT 1 is shown to play a key role in zinc homeostasis in adult mice via modulating the transport of maternal zinc into the embryonic environment, and deletion of the Zinc Transporter 1 gene in mice may result in embryonic lethal [36]. ZIP8 can function indispensable effects on both multiple-organ organogenesis and hematopoiesis during early embryogenesis in mice [37]. Moreover, Zinc deficiency during pregnancy is harmful for both the mother and the fetus [38], which is considered as a risk factor for adverse pregnancy outcomes and preterm delivery [39]. As such, effective monitoring for zinc deficiency is very important. In this study, some potential biomarkers for zinc deficiency during pregnancy in the placentas of mice were identified. Zinc is essential for immunity, oxidative stress, and chronic inflammatory response [40]. Zinc deficiency is involved in immune dysfunction and systemic inflammation [41] and was found to exert a significant influence Fig. 4 The transcription factor (TF)-miRNA-target regulatory network. Red circles indicate up-regulated genes, green circles indicate downregulated genes, yellow triangles represent predicted miRNAs, purple diamonds indicate predicted TFs (only TOP5 with higher number of target genes than others), red arrows indicate miRNA-regulated target genes, and gray arrows indicate target genes regulated by TFs on the outcome of inflammatory diseases, such as inflammatory bowel disease [42]. Moreover, maternal zinc supplementation is found to impact beneficial effects on neonatal immune status and infant morbidity from infectious diseases [43]. This study revealed that CXCR2, CCR3, ANXA1, and IL-2 were hub nodes in the PPI network and were regulated by TFs. The chemokine receptor CXCR2, involving in the innate immune system, is a receptor for interleukin 8 (IL-8). It mediates neutrophil migration to sites of inflammation and angiogenic effects [44]. CCR3 encodes a receptor for C-C type chemokines which contributes to accumulation and activation of eosinophils and inflammatory cells [45]. IL-2 regulates proliferation of T and B lymphocytes and plays an essential role in the immune response [46]. ANXA1 has anti-inflammatory activity and is important for innate immune response [47]. The functional enrichment results confirmed the relationship between three genes and immune response. Prasad et al. suggested that IL-2 and IL-2 receptors were down-regulated during zinc deficiency [48]. However, there is no evidence of relationship of abnormal expression of the three genes (CXCR2, CCR3 and ANXA1) and zinc deficiency. Therefore, CXCR2, ANXA1, and CCR3 would be potential biomarkers for zinc deficiency in mice. In addition, therapeutic drug prediction indicated that CXCR2 and ANXA1 may be targets of drugs, suggesting these two genes may be implicated as therapeutic targets to reduce risk of zinc deficiency. Notably, there is no currently available information that supports the routine use of zinc supplementation on improving pregnancy outcome [43]. Moreover, it is challenging to the implementation of targeted interventions for reducing the adverse effects of zinc deficiency through therapeutic and preventive supplementation, fortification, and biofortification [39]. Given the side effects of many drugs, especially to the fetus, the best option is to consume dietary zinc (abundant in meat and beans) for improving mild zinc deficiency, rather than using medicines.
In addition, it is noted that the PPI network module 1 was composed entirely olfactory receptor-related genes. Zinc is highly concentrated in the olfactory bulb of the brain and is important for embryonic neural development of the olfactory system [49,50]. In recent years, zinc research has mainly focused on zinc metal nanoparticles, which could enhance odorant responses of olfactory receptor neurons [51]. These results suggested that zinc deficiency may affect development of the central olfactory system. Therefore, these genes may be used as gene biomarkers for zinc deficiency in pregnant mice.
However, the nutrients in mouse diet were largely unknown because the microarray data were downloaded from a public database, and the presence of other metals such as cadmium in the diet that could compete with Zn absorption was unknown. Moreover, we did not perform experiments or analyze another appropriate dataset to validate the differential expression of key genes associated with zinc deficiency. Furthermore, it is hard to extrapolate mouse data to human biochemistry, as the genetic origin of the placenta differs between mice and humans. Further experiments using human samples will provide stronger evidence for clinical guidance. The drug-target gene interaction network. Blue circles represent the key genes, and the white hexagon represents the predicted drug interactions Conclusion Embryonic olfactory system development, immune dysfunction, and systemic inflammation may be disturbed by zinc deficiency. Expression levels of CXCR2, ANXA1, and CCR3, as well as olfactory receptor-related genes (proteins) may be used as biomarker to assess zinc status in mice.