Modeling skin sensitization potential of mechanistically hard-to-be-classified aniline and phenol compounds with quantum mechanistic properties

Background: Advanced structure-activity relationship (SAR) modeling can serve as an alternative tool for identifying skin sensitizers, improving medical diagnosis, and guiding more effective practical measures to reduce exposure to the causative chemicals. It can also circumvent the ethical concerns of using animals in toxicological tests and reduce time and cost. Compounds with aniline or phenol moieties represent two large classes of frequently skin-sensitizing chemicals, but their potency is highly variable and difficult to predict. The mechanisms of action are not well understood.
Methods: A group of mechanistically hard-to-be-classified aniline and phenol chemicals was collected. An in silico model was established by statistical analysis of quantum descriptors to determine the relationship between chemical structure and skin sensitization potential. The sensitization mechanisms were investigated based on the features of the established model. The model was then used to predict the skin sensitization potential of a subset of FDA approved drugs containing aniline and/or phenol groups.
Results and discussion: A linear discriminant model using the energy of the highest occupied molecular orbital (ϵHOMO) as the descriptor yielded high prediction accuracy. The contribution of ϵHOMO as a major determinant suggests that autoxidation or free radical binding could be involved. The model was further applied to predict the allergic potential of a subset of FDA approved drugs containing an aniline and/or phenol moiety. The predictions imply that similar mechanisms (autoxidation or free radical binding) may also play a role in the skin sensitization caused by these drugs.
Conclusions: An accurate and simple quantum mechanistic model has been developed to predict the skin sensitization potential of mechanistically hard-to-be-classified aniline and phenol chemicals. The model could be useful for predicting the skin sensitization potential of a subset of FDA approved drugs.
Electronic supplementary material: The online version of this article (doi:10.1186/2050-6511-15-76) contains supplementary material, which is available to authorized users.


Background
Dermatitis and rash related to skin sensitization are the most common manifestations of chemical immunotoxicity in humans, costing an estimated $1 billion annually in the USA through lost work, reduced productivity, medical care, and disability payments [1,2]. In addition, as part of the regulatory review process, an increase in the incidence of skin allergies and hypersensitivity-related adverse events associated with the use of FDA regulated products or approved drugs has been observed, suggesting a safety gap between premarket review and postmarket surveillance [2].
Common testing methods to assess the skin sensitization potential of materials include (1) the guinea pig maximization test (GPMT) and (2) the murine local lymph node assay (LLNA). In the GPMT, hazard identification relies on visual observation of erythema and edema reactions, which is subjective, makes it difficult to differentiate contact allergens from strong irritants, and is time consuming [3]. The LLNA is recommended by international regulatory agencies; however, inconsistencies between LLNA results and clinical observations have been documented [4]. Given the vast number of compounds in use today, developing rapid and effective methods for identifying chemical sensitizers and assessing their risk remains a challenge [2].
In silico approaches are an attractive alternative to animal testing: they analyze the structural features of sensitizers and non-sensitizers to derive predictive rules or models [5]. With such approaches, the risks of thousands of commercially available chemicals could be assessed in a cost-effective manner. Among these approaches, mechanism-based rules, which relate the structural characteristics of sensitizers to their reactivity, are promising [6].
Historically, the first study of chemical reactivity and skin sensitization was reported in 1936 [7]. It postulated that small organic molecules cause sensitization by reacting with macromolecules (proteins or others) in the skin to form an immunogenic complex. The currently accepted mechanism involves the formation of covalent bonds between electrophilic allergens and nucleophilic moieties of amino acids in skin proteins (usually side chains) [8]. These nucleophilic groups are mainly the cysteine thiol and the lysine amino group and, to a lesser extent, arginine, histidine, methionine and tyrosine [9]. Based on the well-established principles of mechanistic organic chemistry, the skin sensitization potential of a chemical can in many cases be predicted from its reactivity with these residues [9,10]. However, some compounds need to be activated via either autoxidation outside the skin (prehaptens) or bioactivation inside the skin (prohaptens) before they can form immunogenic complexes with skin proteins [11].
Structure-activity relationship (SAR) studies of skin sensitization potential have been successfully carried out for epoxyaldehydes [12], enones [13], halogenated aromatics [14], benzaldehydes [15], dienes [16], oximes [17], aldehydes [18] and epoxides [19]. Aniline/aromatic amine derivatives and phenol derivatives are two large classes of frequently sensitizing chemicals, and several pilot studies on them have been conducted [20-23]. Roberts et al. specifically investigated the sensitization mechanisms of diaminobenzenes and dihydroxybenzenes [24]. However, the predictability of the skin sensitization potential for these two classes of chemicals remains unsatisfactory [6,25]. Further exploration of novel sensitization mechanisms will be informative for constructing better SAR models and rules. In addition, aniline and phenol moieties are often present in approved drugs and can also cause skin sensitization. For example, contact dermatitis occurred in one individual following prolonged subcutaneous infusion of hydromorphone [26], a cancer pain treatment agent that contains one phenol moiety.
Drug-induced skin reactions may be associated with several biological mechanisms, but in many cases the precise mechanism is unclear [27]. It is well known that the Type IV allergic reaction induced by many chemicals and drugs is a T-cell mediated delayed-type hypersensitivity that can cause skin sensitization and dermatitis [27].
In this study, we aimed to establish an in silico model for a class of mechanistically hard-to-classify anilines and phenols in order to study the relationship between their chemical reactivity and the biological allergic response. We then investigated the sensitization mechanisms of action associated with these compounds based on the features of this model. The model was further used to assess the skin sensitization potential of a subset of FDA approved drugs containing aniline and/or phenol groups. The predicted skin sensitization potential of these drugs was checked against the relevant literature and adverse event reports.

Data sets
A data set of 63 chemicals, including 30 anilines and 33 phenols, was collected from the published literature [11,23,28-36]. Chemicals with well-known allergic mechanisms, i.e. Michael acceptors (MA), SN2 electrophiles, SNAr electrophiles, Schiff base formers, and acylation agents, were excluded from the data set. For example, pentachlorophenol (CAS: 87-86-5) is an SNAr electrophile; benzyl salicylate (CAS: 118-58-1) and 3,3′,4′,5-tetrachlorosalicylanilide (CAS: 1154-59-2) are acylation agents. In addition, chemicals bearing two OH and/or NH2 substituents on the aromatic ring were also excluded because these compounds are known to easily form a benzoquinone (a Michael acceptor) or a nitrogen analogue of benzoquinone (also a Michael acceptor) [24]. Finally, a list of 30 chemicals was obtained (Table 1). They represent a class of mechanistically hard-to-be-classified compounds because they cannot be assigned to any of the five categories above. These compounds were then randomly split into a training set of 15 compounds and a test set of 15 compounds. As shown in Table 1, the training set includes 7 anilines and 8 phenols, while the test set includes 6 anilines and 9 phenols. Detailed information, including the initial screening of the 63 selected chemicals, is available in Additional file 1.
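The random split described above can be sketched as follows. This is only an illustration: the text does not state how the randomization was done, so the function name `split_train_test`, the fixed seed, and the placeholder compound identifiers are all assumptions.

```python
import random


def split_train_test(compounds, n_train, seed=0):
    """Shuffle a copy of the compound list and cut it into two parts.

    The seed is an arbitrary choice made here for reproducibility;
    the paper does not report one.
    """
    shuffled = list(compounds)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 30 compounds of Table 1.
compounds = [f"compound_{i:02d}" for i in range(1, 31)]
train, test = split_train_test(compounds, n_train=15)
print(len(train), len(test))
```

A plain random split like this does not balance anilines against phenols; a stratified split would be needed to enforce a particular class ratio in each subset.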

Quantum mechanics calculations
All geometry optimizations and subsequent orbital analyses were performed with the Gaussian 03 suite of programs [37]. Chemicals were optimized using the AM1 Hamiltonian with the default optimization criteria [38,39]. Calculations of the frontier molecular orbitals, charge distribution and other quantum properties were carried out using the 6-31G(d) basis set. The quantum descriptors used in this study include the energies of the highest occupied molecular orbital (ϵHOMO), the lowest unoccupied molecular orbital (ϵLUMO), the second lowest unoccupied molecular orbital (ϵLUMO+1), and the second highest occupied molecular orbital (ϵHOMO-1); the Mulliken atomic charges of the most negative (Qmin) and most positive (Qmax) atoms; the Mulliken atomic charge of the N atom (QN) in anilines or the O atom (QO) in phenols; the average of the absolute values of the charges on all atoms (Qm); and the molecular dipole moment (μ). The shapes of the resulting orbitals were visualized using the GaussView application within Gaussian 03. All structures were either drawn or converted from SMILES (simplified molecular-input line-entry system) strings using ChemBioDraw Ultra V12.0 (PerkinElmer Informatics Desktop Software).

Statistical analysis
The skin sensitization potency of a compound was encoded as 1 (Yes) or 0 (No). The values of each quantum descriptor were linearly normalized to the same range (0 to 1). Stepwise linear regressions between the quantum properties and the experimental outcomes of the training set were then performed with the R statistical package, version 3.0.0 [40]. The properties with lower weighting factors were discarded in the second step of the linear regression.
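The two numerical steps named above, linear normalization to the range 0 to 1 and a least-squares fit of the encoded outcome against a descriptor, can be sketched in a few lines. The authors used R for the stepwise regression; the Python below is a stand-in for illustration only, it fits a single descriptor rather than performing stepwise selection, and the helper names and toy numbers are assumptions, not data from the study.

```python
def min_max_normalize(values):
    """Linearly rescale a list of numbers to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant descriptor carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy descriptor values and 0/1 labels, invented for this example only.
toy_descriptor = [-0.34, -0.32, -0.29, -0.27]
toy_labels = [0, 0, 1, 1]

scaled = min_max_normalize(toy_descriptor)
slope, intercept = fit_line(scaled, toy_labels)
print(scaled, slope, intercept)
```

Comparing the fitted weights of descriptors that share the same 0 to 1 scale is what makes it meaningful to discard those with lower weighting factors.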

Results and discussion
Compounds with aniline and/or phenol moieties can be treated as a single subclass when considering skin sensitizers. However, not all compounds possessing aniline or phenol groups are sensitizers, suggesting that some can form covalent bonds with skin proteins whereas others cannot. In this study, the sensitization potential of anilines and phenols was modeled using quantum mechanical descriptors.
Modeling the skin sensitization potential by quantum properties of anilines and phenols
The coefficient of ϵHOMO was found to be the highest weighting factor in the linear regression analysis. The skin sensitization potential of anilines and phenols can be formulated as:
$$P = 15.30 \times \epsilon_{\mathrm{HOMO}} + 5.08 \quad (1)$$
The midpoint of the encoded skin sensitization potency values (0 and 1), 0.50, was taken as the threshold for distinguishing sensitizers from non-sensitizers. An aniline or phenol is predicted to be a sensitizer if P is greater than 0.50 and a non-sensitizer otherwise. With a threshold of P = 0.50, Equation 1 implies that a chemical within the applicability domain is predicted to be a skin sensitizer if its HOMO energy is greater than −0.30 Hartree ((0.50 − 5.08)/15.30 = −0.299 ≈ −0.30). The experimental allergenicity categories (sensitizer or non-sensitizer) and the predicted results for the training set are shown in Figure 1, where open red circles mark the experimental categories (1 for well-known sensitizers, 0 for non-sensitizers) and solid blue diamonds mark the predicted values. All of the training compounds were correctly predicted by Equation 1. The same model was then applied to the test set, and all test compounds were also correctly predicted (Figure 2). The total prediction accuracy for the chemicals in the training and test sets was 100% (30/30). The model shows very high accuracy and depends only on the value of ϵHOMO, suggesting that ϵHOMO is a key factor for assessing the skin sensitization potential of these aniline- and phenol-containing compounds.
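The decision rule can be written out directly from the relation P = 15.30 × ϵHOMO + 5.08 and the 0.50 threshold stated in the text and figure captions. The sketch below is illustrative: the function names are chosen here, the example energies are hypothetical rather than taken from Table 1, and the Hartree-to-eV factor is the standard conversion, added only for readers who think in eV.

```python
SLOPE = 15.30        # coefficient of the HOMO energy (per Hartree)
INTERCEPT = 5.08
THRESHOLD = 0.50     # P above this value means "sensitizer"
HARTREE_TO_EV = 27.2114


def predicted_value(e_homo_hartree):
    """Predicted value P for a HOMO energy given in Hartree."""
    return SLOPE * e_homo_hartree + INTERCEPT


def is_sensitizer(e_homo_hartree):
    """True if the model classifies the compound as a sensitizer."""
    return predicted_value(e_homo_hartree) > THRESHOLD


def homo_cutoff_hartree():
    """HOMO energy at which P equals the threshold."""
    return (THRESHOLD - INTERCEPT) / SLOPE


cutoff = homo_cutoff_hartree()
print(f"cutoff: {cutoff:.3f} Hartree, {cutoff * HARTREE_TO_EV:.2f} eV")

# Hypothetical HOMO energies, one on each side of the cutoff.
for e_homo in (-0.28, -0.32):
    print(e_homo, is_sensitizer(e_homo))
```

Because the model has a single descriptor with a positive coefficient, the classifier reduces to one comparison: a less negative HOMO energy (an electron that is easier to remove) pushes a compound toward the sensitizer class.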
Figure 1. Correlation of skin sensitization potential of anilines and phenols in the training set between experimental allergenicity categories and predicted values from the model built with quantum mechanistic properties. Experimental allergenicity categories: 1 for sensitizer and 0 for non-sensitizer; Predicted Value (P) = 15.30 × ϵHOMO + 5.08. A compound with a P greater than 0.50 is predicted as a sensitizer; otherwise, it is predicted as a non-sensitizer.
Figure 2. Correlation of skin sensitization potential of anilines and phenols in the test set between experimental allergenicity categories and predicted values from the model built with quantum mechanistic properties. Experimental allergenicity categories: 1 for sensitizer and 0 for non-sensitizer; Predicted Value (P) = 15.30 × ϵHOMO + 5.08. A compound with a P greater than 0.50 is predicted as a sensitizer; otherwise, it is predicted as a non-sensitizer.
The linear relationship between the predicted values and ϵHOMO also suggests that a chemical with a higher predicted value is more readily oxidized and consequently has a higher skin sensitization potential. The LLNA data, as a quantitative endpoint expressed in a semi-dose-dependent manner, allow potency to be predicted. The EC3 values (effective concentration for a three-fold proliferation of lymph node cells) from the reported LLNA experiments for most of the allergenic phenols were also collected, as shown in Table 1. For most chemicals, the −logEC3 values correlate quite well with the P values, but for aniline the −logEC3 value indicates a much lower potency than its predicted P value suggests. This may indicate that the initial oxidation of aniline, which is quite fast, is not in this case the rate-determining step for protein haptenation. The analysis of the relationship between EC3 and ϵHOMO for these nine chemicals is reported in Additional file 1.
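One way to quantify how well −logEC3 tracks the predicted value P is a Pearson correlation coefficient. The sketch below is an assumption about how such a check could be done, not the analysis in Additional file 1; the logarithm is taken as base 10, the function names are chosen here, and the numbers are invented placeholders rather than EC3 values from Table 1.

```python
import math


def neg_log_ec3(ec3):
    """Return -log10(EC3); a larger value means a more potent sensitizer."""
    if ec3 <= 0:
        raise ValueError("EC3 must be positive")
    return -math.log10(ec3)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented placeholder values, NOT data from Table 1.
toy_ec3 = [20.0, 5.0, 1.0, 0.5]
toy_p = [0.55, 0.70, 0.90, 1.05]

potency = [neg_log_ec3(v) for v in toy_ec3]
print(f"r = {pearson_r(potency, toy_p):.2f}")
```

An outlier such as aniline would show up as a point with a high P but a low −logEC3, pulling the coefficient down.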

Possible reaction mechanisms of aromatic anilines and phenols
Electrophilic-nucleophilic reactions between chemicals and skin proteins are a primary cause of chemically induced skin sensitization [8]. Most chemicals with high skin sensitization potential can be classified as Michael acceptors (MA), SN2 electrophiles, SNAr electrophiles, Schiff base formers, or acylation agents. The reaction mechanisms of anilines and phenols, however, are poorly understood, and very few of them can be classified into these five categories. One proposed mechanism is that sensitization occurs via oxygen attack ortho to an amino group or via oxidative quinone methide formation [25,41]. For example, Roberts et al. reported the mechanistic chemistry of aromatic diamino-, dihydroxy-, and amino-hydroxy compounds [24], describing two parallel chemical mechanisms as the most likely processes: oxidation to electrophilic (protein-reactive) quinones, quinone imines, or quinone di-imines, or formation of protein-reactive free radicals. These mechanisms, unfortunately, are not applicable to all singly NH2- or OH-substituted anilines and phenols. For instance, aniline and 4-butylaniline are sensitizers, whereas 4-aminobenzoic acid, 4-aminobenzenesulfonamide, and 4-aminobenzenesulfonic acid are non-sensitizers. Besides solubility effects and the formation of ions or zwitterions, variation in chemical reactivity caused by substituent effects plays an important role in reducing dermal penetration and the immunogenicity of protein conjugates.
By analyzing the relationship between quantum properties and chemical reactivity, we successfully modeled the skin sensitization potential of two groups of chemicals (aromatic anilines and phenols) with ϵHOMO as the single descriptor, whereas the energy of the lowest unoccupied molecular orbital (ϵLUMO), considered the critical factor for most electrophilic reactions [8,11], was poorly correlated with sensitization potential. These results suggest that the skin sensitization mechanism of these compounds may involve several steps rather than a direct electrophilic reaction.
The dependence on ϵHOMO implies that loss of an electron may be involved in the activation of these sensitizers. The compounds may be activated via an autoxidation mechanism and then interact with skin proteins as prehaptens. In addition, mechanisms in which these chemicals react directly with free radicals of skin proteins should also be considered [42]. In the present study, we propose two potential pathways by which these compounds could cause skin sensitization, as shown in Scheme 1 [42]. In the first pathway (Scheme 1a), an aniline (or a phenol) is readily oxidized to a radical cation through loss of an electron at the aromatic ring [43] and forms two possible reactive intermediates. A protein-associated sulfhydryl radical then attacks the aromatic ring of the radical cation to form a covalent bond at the ortho- or para-position. Alternatively, the reactive intermediates bind to nucleophilic moieties on proteins through Michael addition. The second pathway corresponds to what Lepoittevin defined as a direct haptenation route [44], whereby attack of a protein-associated sulfhydryl radical on the ring gives an intermediate radical (Scheme 1b).

Predicting skin sensitization potentials of a subset of FDA approved drugs with aniline and phenol groups
There are no effective tools to predict the skin sensitization potential of drugs, because drug-induced skin reactions may be caused by several mechanisms, either singly or in combination [27]. Skin sensitization, in this context, refers to T-cell mediated sensitization (type IV allergy). The reaction of chemicals with proteins has been recognized as one of the necessary steps in T-cell mediated sensitization [23]. In silico mechanistic models may therefore offer valuable insight into the initiation of drug-induced allergies.
We collected 53 drugs containing aniline and/or phenol moieties from the DrugBank database. Information on these 53 compounds is also available in Additional file 1. These FDA approved drugs were then analyzed to filter out those with structural alerts for skin sensitization. The sulfonamide drugs were also removed because they have different mechanisms of action. [...] be oxidized to a hydroxylamine metabolite and subsequently form a reactive nitroso intermediate by autoxidation that enables it to react with skin proteins [45]. Finally, twenty-six compounds were obtained and their skin sensitization potential was predicted by our model. Among these 26 compounds, six were reported to induce "allergic dermatitis" according to the side effect information in the MetaADEDB database (Table 2). Interestingly, five of them, namely Clenbuterol, Dapsone, Morphine, Hydromorphone and Raloxifene, were correctly predicted as sensitizers (Table 2), as their P values are greater than the threshold of 0.50. In addition, allergic dermatitis is a rare side effect of Liothyronine according to the information from https://www.universaldrugstore.com/medications/Liothyronine/side-effects. However, users should be aware that drugs are not labeled as not causing "allergic dermatitis"; it is therefore hard to assemble a negative set from FDA approved drugs to further evaluate our model.
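The agreement reported above (five of the six MetaADEDB-flagged drugs predicted as sensitizers) corresponds to a recall of 5/6 on the reported positives. A minimal sketch of that calculation follows; the function name is chosen here, and because the sixth flagged drug is not named unambiguously in the text, a placeholder string stands in for it.

```python
def recall(reported_positive, predicted_positive):
    """Fraction of reported sensitizers that the model also flags."""
    reported = set(reported_positive)
    if not reported:
        raise ValueError("no reported sensitizers to evaluate against")
    return len(reported & set(predicted_positive)) / len(reported)


# Five drugs named in the text as reported and correctly predicted.
correctly_predicted = {
    "Clenbuterol", "Dapsone", "Morphine", "Hydromorphone", "Raloxifene",
}
# Placeholder for the sixth reported drug, which the model did not flag.
reported = correctly_predicted | {"sixth_reported_drug"}

print(f"recall = {recall(reported, correctly_predicted):.2f}")
```

Only recall can be estimated this way. Without a reliable set of drugs known not to cause allergic dermatitis, specificity and overall accuracy on the drug set remain undetermined, which is the limitation noted in the paragraph above.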

Conclusion
This study has demonstrated how quantum chemical calculations can be used to predict skin sensitization potential and to infer the reaction mechanism for a class of mechanistically hard-to-be-classified chemicals containing aniline and phenol moieties. The results show that the energy of the highest occupied molecular orbital plays an important role in predicting the skin sensitization potential of these compounds, suggesting that activation may occur via either autoxidation or direct reaction with free radicals. Our model was further applied to predict the allergenic potential of approved drugs containing aniline and/or phenol moieties. Several of these drugs were identified as sensitizers, and the predictions agreed well with their reported "allergic dermatitis" side effect. Thus, the data indicate that our newly developed in silico algorithm shows promise as a preclinical risk assessment tool for screening allergenic potential.
Again, we should point out that skin allergic reactions are not commonly seen for drugs given via the oral route. Though they may share similar mechanisms, caution should be taken when extrapolating our model from skin sensitization potential for topically applied chemicals to predict "allergic potential" of drugs.

Additional file
Additional file 1: Predicted values and experimental data of reported chemicals and FDA approved drugs that contain aniline and/or phenol moieties.

Abbreviations
ACD: Allergic contact dermatitis; AM1: Austin model 1; CAS: Chemical abstracts service; EC3: Effective concentration for a three-fold proliferation of lymph node cells; FDA: Food and drug administration; GPMT: Guinea pig maximization test; HOMO: Highest occupied molecular orbital; LLNA: The murine-based local lymph node assay; LUMO: Lowest unoccupied molecular orbital; MA: Michael acceptors; SAR: Structure-activity relationship; SMILES: Simplified molecular-input line-entry system; SN2: A kind of nucleophilic substitution reaction mechanism; SNAr: Nucleophilic aromatic substitution.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
OQ and LW carried out the experiments; OQ, LW and X-QX designed the study and carried out the analysis. OQ, LW, YM and X-QX interpreted the data and drafted the manuscript. X-QX supervised the progress and critically revised the manuscript. All authors read and approved the final manuscript.

Acknowledgements
OQ would like to acknowledge the support from the National Natural Science Foundation of China (NSFC21202201).

Disclaimer
The mention of commercial products, their sources, or their use in connection with material reported herein is not to be construed as either an actual or implied endorsement of such products by the Department of Health and Human Services. The findings and conclusions in this article have not been formally disseminated by the Food and Drug Administration and should not be considered to represent any agency determination or policy.

Release of copyright permission
There is no copyright in U.S. Government work (per 17 U.S.C. 105), and the work I am providing is a U.S. Government work.

Table 2 footnotes
a A drug is predicted as a sensitizer if its P value is greater than 0.50; otherwise, as a non-sensitizer.
b Compounds having the keywords "allergic dermatitis" in their side effect reports in the MetaADEDB database are indicated as sensitizers.
Y: Sensitizer; N: Non-sensitizer.