Analysis of lung cancer-related genetic changes in long-term and low-dose polyhexamethylene guanidine phosphate (PHMG-p) treated human pulmonary alveolar epithelial cells

Background Lung injury elicited by respiratory exposure to humidifier disinfectants (HDs) is known as HD-associated lung injury (HDLI). Current elucidation of the molecular mechanisms related to HDLI is mostly restricted to fibrotic and inflammatory lung diseases. In our previous report, we found that lung tumors were caused by intratracheal instillation of polyhexamethylene guanidine phosphate (PHMG-p) in a rat model. However, the lung cancer-related genetic changes concomitant with the development of these lung tumors have not yet been fully defined. We aimed to discover the effect of long-term exposure of PHMG-p on normal human lung alveolar cells. Methods We investigated whether PHMG-p could increase distorted homeostasis of oncogenes and tumor-suppressor genes, with long-term and low-dose treatment, in human pulmonary alveolar epithelial cells (HPAEpiCs). Total RNA sequencing was performed with cells continuously treated with PHMG-p and harvested after 35 days. Results After PHMG-p treatment, genes with transcriptional expression changes of more than 2.0-fold or less than 0.5-fold were identified. Within 10 days of exposure, 2 protein-coding and 5 non-coding genes were selected, whereas in the group treated for 27–35 days, 24 protein-coding and 5 non-coding genes were identified. Furthermore, in the long-term treatment group, 11 of the 15 upregulated genes and 9 of the 14 downregulated genes were reported as oncogenes and tumor suppressor genes in lung cancer, respectively. We also found that 10 genes of the selected 24 protein-coding genes were clinically significant in lung adenocarcinoma patients. Conclusions Our findings demonstrate that long-term exposure of human pulmonary normal alveolar cells to low-dose PHMG-p caused genetic changes, mainly in lung cancer-associated genes, in a time-dependent manner. Supplementary Information The online version contains supplementary material available at 10.1186/s40360-022-00559-5.

a main constituent of HD, is highly correlated with inflammatory lung fibrosis. These concomitant lung injuries following respiratory exposure to HDs are known as HD-associated lung injuries (HDLI) [1][2][3].
Although there are many reports on the association between PHMG-p and clinical manifestations, including lung fibrosis and inflammation, there are few longterm and low-dose studies on its carcinogenic potential. There is convincing evidence that studies are needed to elucidate the relationship between PHMG-p and carcinogenesis. First, in our previous studies, the possibility of tumorigenesis in PHMG-p-instilled rat lung was confirmed using computed tomography (CT) image analysis and an elevated expression of several cancerrelated genes [4,5]. Indeed, in our recent 52-week follow-up study on PHMG-p toxicity, we suggested the possibility of PHMG-p as a lung carcinogen by confirming that PHMG-p causes squamous cell carcinoma in the rat lung [6]. Second, although no specific pattern was observed in the relationship between PHMG and malignant neoplasms, it has been reported that the difference in the effect marginally appears in some infants, within the malignant neoplasms of the digestive tract, respiratory and intrathoracic organs, and leukemia [7]. Third, it is reported that polyhexamethylene biguanide (PHMB), which is structurally similar to PHMG, develops angiosarcoma in the liver during oral exposure [8].
Although the cancer-associated pathways affected by PHMG-p treatment have already been elucidated in a human alveolar A549 cell line, which originates from lung carcinoma, the fact that the research focus was on the epithelial to mesenchymal transition (EMT) and that the in vitro research was conducted in lung adenocarcinoma cells, not by normal cells, might be considered limitations in terms of carcinogenesis research [9,10]. To overcome these limitations, we introduced human pulmonary alveolar epithelial cells (HPAEpiCs) consisting of type I and type II alveolar cells that occupy more than 99% of the internal surface area of the lung. If the cytotoxicity of PHMG-p itself causes acute cell death, long-term subculture cannot be conducted. Therefore, the cytotoxicity of each concentration was measured over a wide range of concentrations and terms, and a concentration that did not cause superficial damage to cells was set and applied to longterm HPAEpiCs sub-culture.
In this study, we showed that PHMG-p induced changes in the transcriptional expression of oncogenes and tumor suppressor genes in terms of lung carcinogenesis in human pulmonary alveolar normal cells through the analysis of total RNA sequencing data. In addition, we designed this study to secure the reliability of candidate genes by comparing three sets of the PHMG-p short-term treatment group and three sets of the longterm treatment group, either individually or together.

RNA isolation.
Total RNA was isolated using TRIzol reagent according to the manufacturer's instructions. RNA quality assessment was performed using an Agilent 2100 bioanalyzer using the RNA 6000 Nano Chip (Agilent Technologies, Amstelveen, Netherlands), and RNA samples were quantified using an ND-2000 spectrophotometer (Thermo Inc. DE, USA).

Library preparation and sequencing.
Libraries were prepared from total RNA using the NEBNext ® Ultra ™ II Directional RNA-Seq Kit (NEW ENGLAND BioLabs Inc., MA, USA). Ribosomal RNA was eliminated using the RiboCop rRNA depletion kit from LEXOGEN Inc. (Vienna, Austria). RNAs that do not contain rRNA were used for cDNA synthesis and shearing, following the manufacturer's instructions. Indexing was carried out using Illumina indexes 1-12. The enrichment step was performed by PCR. Subsequently, libraries were verified using the Agilent 2100 Bioanalyzer (DNA High Sensitivity Kit) to assess the average fragment size. Quantification was performed using a library quantification kit by applying a StepOne Real-Time PCR System (Life Technologies Inc., CA, USA). High-throughput sequencing was performed as paired-end 100 sequencing using a NovaSeq 6000 (Illumina Inc., CA, USA).

Data analysis.
Quality control of the raw sequencing data was performed using FastQC [11]. Low-quality reads (< Q20) and adapters were eliminated using FASTX_Trimmer [12] and BBMap [13]. Then, the trimmed reads were mapped to the reference genome using TopHat [14]. Gene expression levels were estimated using fragments per kilobase per million reads (FPKM) by Cufflinks [15]. The FPKM values were normalized based on the quantile normalization method using EdgeR within R [16]. Data mining of mRNA expression profiling and graphic visualization, including Venn diagrams, was carried out using ExDEGA from Ebiogen Inc. (Seoul, Korea). Gene ontology analysis was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) bioinformatics resources using input and output data files obtained from ExDEGA [17,18]. A hierarchical clustering (HCL) map was created using MultiExperiment Viewer (MeV).

Kaplan-Meier plot analysis.
To determine the prognostic values of the ISG15, MMP1, TRPA1, KRT19, PLAU, FMO3, COL14A1, FMO2, TIMP3, and SLITRK6 mRNA, transcription levels were measured using a Kaplan-Meier (KM) plotter, an open-source database (www. kmplot. com) that consists of gene expression profiles and survival lifetime data of patients with lung adenocarcinoma. The analysis was performed on a total of 719 lung adenocarcinoma patients. Only the JetSet best probe set was used for analysis. The patients were divided into low (black line) and high (red line) expression levels using the optimal expression cutoff point based on the log-rank test and median split procedure. Other statistical results, including hazard ratios (HRs), 95% confidence intervals, and log-rank P values, were also calculated and presented using this database. Statistical significance was set at P < 0.05.

Statistical analyses.
All data were analyzed using GraphPad Prism v.5.0 (GraphPad Software, CA, USA) and are expressed as the mean ± standard deviation. Statistical significance was set at P < 0.05. The log-rank test was used to evaluate the survival differences.

PHMG-p-induced cytotoxicity in HPAEpiCs.
To determine the long-term exposure concentration that is not superficially reactive, such as cell death by PHMG-p, cell viability assays were carried out in HPAEpiCs (Fig. 1). Cytotoxicity and cell viability assays were performed at 24 h, 48 h, and 72 h. The concentration range was set to under 8 ug/ml through a preliminary experiment. The viability of HPAEpiCs decreased in a time-and dose-dependent manner following PHMG-p exposure. The survival rate was 90% with PHMG-p at a concentration of 5 to 6 µg/mL at 24 h, but the concentration gradually decreased in a time-dependent manner, and with 2 ug/mL at 72 h, the viability was approximately 90%. In contrast, a higher susceptibility to PHMG-p was reported in lung-derived IMR-90, A549, and BEAS-2B cells [19]. Taken together, HPAEpiCs constituting the internal surface area of the lung have lower susceptibility to PHMG-p treatment than the various lung-originated cells mentioned above, so it is a suitable model for identifying genetic changes that are not biased towards cell death signaling in terms of long-term PHMG-p toxicity. Considering the period for long-term sub-culture intervals with PHMG-p exposure, the maximum treatment concentration of 1 ug/ml, presenting cell viability of greater than 90% at 72 h, was selected.

Fig. 2
Schematic of PHMG-p low-dose exposure with long-term subculture and cell harvesting. Odd numbers indicate control samples and even numbers indicate PHMG-p-treated samples. The detailed description is present in the main text a significant number of genes were involved in multiple mechanisms related to lung cancer ( Fig. 3a and Table 2). Of these genes, GO analysis indicated that in PHMG-p long-term treated HPAEpiCs, genes mainly involved in response to external stimuli and cell proliferation were upregulated. On the other hand, the genes involved in cell cycle arrest and negative regulation of the cell cycle were increased in response to short-term treatment with PHMG-p (Fig. 3b).
From the above results, to ensure higher reliability, we selected genes with a common tendency to increase or decrease in all sets, not in each set in the same group.
The numbers of changed genes are shown in a Venn diagram (Fig. 4a), and the number of genes that changed in common within each of the three sets of the short-term and long-term treatment groups are shown graphically (Fig. 4b). When the PHMG-p exposure period was prolonged, the number of overlapping genes between the two sets in each group increased from 45 to 57 genes (1.27-fold), but the number of overlapping genes between the three sets increased from 7 to 29 genes (4.14-fold) ( Fig. 4a and b). These results indicated that the 29 overlapping genes in all sets were, with a high probability, altered due to long-term exposure to PHMG-p.

PHMG-p responsive genes associated with lung cancer in PHMG-p-treated HPAEpiCs in a time-dependent manner.
We found that the expression of SNORD95 (small nucleolar RNA, C/D box 95) was significantly elevated in PHMG-p- ; however, the transcriptional levels of these genes showed a fluctuating pattern over time (Table 1).In contrast to the results of short-term PHMG-p treatment, in the long-term treatment, the    (Table 2).

RT-qPCR validation of altered protein coding genes
To validate the candidate genes selected according to the criteria (more than 2.0-fold or less than 0.5-fold) on total RNA sequencing, RT-qPCR was performed. To confirm the increased transcriptional level in upregulated candidates, the expression of MX1, KRT19, HMGA2, ISG15, IL33, MMP1, TRPA1, IFI6, PLAU, CDKN1A, PLAT, AK5, and NT5E was determined by RT-qPCR. In particular, it was confirmed that MX1 significantly increased from 5.4-fold to less than 13.3-fold in the long-term treated sets (Fig. 5). Based on the total RNA sequencing results, CDKN1A was not selected because it did not meet the criteria in the PHMG-p short-term treatment group (set of 4 days), but in this validation phase, it increased up to 6.7-fold in the short-term group (set of 10 days). As a result, it was confirmed that CDKN1A increased 3.5-fold in the short-term treatment group and 3.3-fold in the long-term treatment group (Fig. 5). All selected proteincoding genes were found to be increased in the long-term treatment group (within 35 days from 27 days). Unlike other candidate genes, it has been reported that AK5 expression was reduced due to methylation of the promoter region within the CpG islands in lung adenocarcinoma [45]. The downregulated expression of NDUFA4L2, COL14A1, SLITRK6, TIMP3, FMO3, HBG1, MGP, HBG2, TBX4, GPX3, and FMO2 was also confirmed (Fig. 6). However, the basal expression of COL14A1 was very low, and the cycle threshold (Ct) value of RT-qPCR was not measurable; therefore, it was excluded from Fig. 6. In addition, contrary to the downregulated tendency, NDUFA4L2 was overexpressed in human NSCLC, reported to occur under hypoxic conditions, one of the characteristics of cancer. Its overexpression is a key factor for maintaining NSCLC growth [69]. The tendency of transcriptional alteration of all candidate genes in RT-qPCR was 100% consistent with the total RNA sequencing results because the candidate genes were selected based on matching in all three sets (#13 vs #14, #15 vs #16, and #17 vs #18).

Discussion
In this study, we confirmed the distorted homeostasis of oncogenes and tumor suppressor genes that change in a time-dependent manner by PHMG-p in normal human lung alveolar cells. Previous in vitro studies have investigated the effect of PHMG-p exposure on lung cells and (i) focused on fibrotic inflammation, (ii) used immortalized cells or lung cells derived from malignant tumors, and (iii) identified a molecular mechanism that changes with short-term PHMG-p treatment [2,9]. To overcome these limitations, we (i) focused on carcinogenesis-related genetic changes, (ii) introduced human type I and II normal alveolar cells that make up most of the inner surface of the lungs, and (iii) obtained results through a longterm exposure procedure. In the case of RNA sequencing, instead of repeating the experiment with same conditions, 3 groups were selected from short-term treatment and 3 groups from long-term treatment to increase  Fig. 4 was selected by reflecting the fold change, p-value, and normalized data (log2) values rather than reflecting only a single variable. Therefore, it is likely that the differences in the selected genes between each group will be increase. However, we think that the number of genes that change in common during long-term culture is more reliable because they were commonly selected despite applying 3 variables mentioned above. In addition, we suggested the possibility that PHMG-p can increase lung carcinogenesis by performing GO, heatmap cluster, differentially expressed gene (DEG) analysis including pre-mRNA, lncRNA, and miRNA, and KM plotter analysis with selected genes. Within 10 days of exposure to PHMG-p, two proteincoding genes and five non-coding genes were altered. In the PHMG-p group treated for 27-35 days, 24 proteincoding and 5 non-coding genes were identified in the PHMG-p group. Interestingly, in our selection criteria for candidate genes, the number of ncRNAs did not change, but the number of protein-coding genes increased by a factor of 12. In addition, it was found that the degree of change in the case of ncRNAs was largely compared to the degree of change in the protein-coding genes (Tables 1 and 2), but it is not clear whether this phenomenon is a transcriptional characteristic of ncRNA itself or a characteristic of reactivity in PHMG-p treatment.
To confirm whether the selected genes are related to lung cancer, we searched all published papers and also listed genes that contradict our hypothesis to eliminate bias. MT1G was upregulated in all three short-term treatment groups. Although the expression of MT1G in breast, thyroid cancer, and hepatocellular carcinoma was downregulated compared to that in non-cancerous tissue, it has been reported that the expression of MT1G in NSCLC is higher than that in non-malignant lung tissues [21,[77][78][79]. Furthermore, MT1G was enriched in the most aggressive large-cell lung carcinoma, and high expression of MT1G correlated with poor prognostic values in 24 lung large-cell lung carcinomas [22]. However, only one group contended that the expression of MT1G was lower in lung cancer tissues than in peri-cancer tissues [23]. In a different trend from altered genes, CDKN1A mRNA gradually increased in the shortterm treatment group and continuously decreased in the long-term treatment group. Inferring from the above results, the possibility exists that expression of CDKN1A was upregulated to inhibit cell growth in the short term, but it is thought that as many oncogenes increase, they lose their ability to regulate homeostasis with respect to growth. IL-33, a member of the IL-1 family that promotes the production of Th2-related cytokines, was increased in our study (Table 2). In addition, reports on the elevated expression levels of IL33 and NT5E have focused on NSCLC [40][41][42][46][47][48]. Indeed, blockage of IL-33 is known to prevent the growth of NSCLC by inhibiting M2 macrophage polarization and reducing the accumulation of Treg cells in the tumor microenvironment [40]. NT5E is a novel target for the treatment of many cancer types, and various NT5E/CD73 inhibitors are currently being tested in clinical trials. In NSCLC, the level of NT5E, a target gene of miR-30a-5p, is increased due to decreased expression of miR-30a-5p. NT5E also contributes to the survival of NSCLC by inhibiting its function by trapping miR-134 [46,48]. IFI6 and MX1 tended to increase serially as the exposure time of PHMG-p increased. The IFI6 protein plays a central role in resistance to apoptosis in various cancer types [80,81], and MX1 protein levels are increased in lung adenocarcinoma [25]. Furthermore, it has been reported that MMP1, HMGA2, TRPA1, and PLAU are highly expressed in various subtypes of lung cancer [26-28, 31-34, 43, 44, 49-51]. However, ISG15, COL14A1 and HBG2 are mainly decreased in lung adenocarcinoma [30,64,70,71].
TBX4, which is involved in the regulation of embryonic developmental processes, is downregulated in lung cancer [56][57][58], and it was decreased in all sets of the longterm treatment group. TIMP3 is also known to play a role in inducing apoptosis and suppressing NSCLC growth [59,60]. The COL14A1 promoter region was confirmed to be hypermethylated, with a probability of 60.4% in 48 NSCLC patient samples [64]. Although the role of FMO2, whose main function is an NADPH-dependent enzyme, in tumorigenesis is still unclear, it has been reported to play a role as a tumor suppressor in lung adenocarcinoma [65,66]. GPX3, a scavenger of reactive oxygen species, is known to inhibit the growth, invasion, and migration of various lung cancer cells, including h157, h460, h1299, h1650 h1975, and A549 [67,68]. HBG1 and HBG2, the gamma globin genes, have been reported to have low expression in NSCLC, including adenocarcinoma and squamous cell carcinoma [70,71]. MGP expression was found to be reduced during the symptomatic illness stage in lung cancer [72].
In malignant tumors, there are reports of distorted homeostasis between oncogenes and tumor suppressor genes, including failure to regulate the expression levels of ncRNAs [82,83]. Most of these ncRNAs cannot directly bind to ribosomes and can be translated into proteins, but they can interact with other coding genes or non-coding genes; by binding to several proteins, they affect the interactions between proteins and serve as sponges for microRNAs [84,85]. The expression of SNORD95 was upregulated in PHMG-p-exposed samples, and it was also reported that SNORD95 was increased in 11 lung squamous cell carcinoma and 11 lung adenocarcinoma samples compared to their matched normal samples [86]. Conversely, the expression of SNORA75, SNORA28, and NARR was downregulated in our study. However, there are no previous reports that the three genes described above are decreased in lung cancer. In the case of MIR3687-2, the FC value was significantly increased, especially in the #16/#15 set. In the case of MIR490, a significant decrease was observed in all sets. Therefore, there is a need to find the putative target genes of microRNA-3687 and microRNA-490-3p and -5p. FGFRL1 (fibroblast growth factor receptor-like 1), NCS1 (neuronal calcium sensor 1), UCN2 (urocortin 2), TMEM167B (transmembrane protein 167B), and CDR1 (cerebellar degeneration-related protein 1) are predicted targets of miR-3687. VDAC1 (voltage-dependent anion channel 1), TMOD3 (tropomodulin 3), COMMD10 (COMM domain containing 10), HNRNPA1 (heterogeneous nuclear ribonucleoprotein A1), PIP4K2B (phosphatidylinositol-5-phosphate 4-kinase, type II, beta) are putative targets of miR-490-3p. ZNF627 (zinc finger protein 627), TTC29 (tetratricopeptide repeat domain 29), BTC (betacellulin), CST8 (cystatin 8), and LILRA3 (leukocyte immunoglobulin-like receptor, subfamily A (without TM domain), member 3) are putative targets of miR-490-5p which were predicted in the microRNA target prediction programs, such as TargetScan and miRDB. Surprisingly, it was recently reported that among the two miRNAs that change in a PHMG-p-dependent manner, miR-3687 is increased in lung squamous cell carcinoma and miR-490 is decreased in lung squamous cell carcinoma and lung adenocarcinoma [52].
Furthermore, since the HPAEpiCs used in this research are known to consist of pulmonary alveolar type I (AT1) and alveolar type II (AT2), we identified AT1 and AT2 specific gene markers. AT1-specific genes, IGFBP2, CAV1, and CAV2, indicated a tendency to increase as the subculture days prolonged. Therefore, we inferred that AT2 cells in HPAEpiCs gradually differentiated into AT1 cells. In addition, considering the normalized data, we confirmed that most HPAEpiCs consisted of AT1 cells, and that AT2 cells were also included (Table S2) [87][88][89][90][91][92].
Our follow-up studies will be focused on delineating the transcriptional regulation of ncRNAs including microRNA, snoRNA, and lncRNA altered by PHMG-p, discovering their interacting genes. Also, there is a need to investigate the molecular mechanisms in human bronchial and tracheal epithelial cells that are not limited to alveolar epithelial cells but are primarily exposed upon inhalation of respirable particles containing PHMG-p.
Taken together, for the first time, we confirmed and validated the distorted regulation of repetitively altered genes in three experimental sets of PHMG-p long-term treatment groups using normal pulmonary alveolar cells, which constitute the majority of the internal surface of the lung. In addition, most of the altered genes are closely related to lung cancer and the survival rate of patients with lung adenocarcinoma.

Conclusions.
Based on our description, we suggest that PHMG-p, as the main gradient of humidifier disinfectants, has a carcinogenic potential in normal human lung alveolar cells in case of long-term exposure.