The results show that the majority of the drugs (70–90%) have the potential to cause systemic toxicities (Fig. 2; Table 1; Supplementary Table 1). However, it has also been reported that the occurrence of drug-induced toxicities depends on many factors, such as dosage, metabolite reactivity, metabolism of the structural alert, competition for detoxification pathways, and individual differences between patients [3, 22,23,24]. The analyses also showed that the majority of the drugs (72%) were potentially reactive, unstable or toxic, and that 70% of them were also potentially toxic at normal treatment concentrations, based on evaluation of their doses or LD-50 values (Supplementary Table 1, Fig. 2). The activity of a structural alert depends on the molecule to which it is attached. This is illustrated by the thiophene structural alert in methapyrilene and eprosartan: in methapyrilene the thiophene undergoes bioactivation, whereas in eprosartan the same group is not activated. Methapyrilene was withdrawn from the market because of hepatotoxicity, while eprosartan is safe and is prescribed for hypertension [25, 26]. In the USA, 50% of the 200 most frequently prescribed drugs were found to have at least one structural alert, yet the majority of the drugs with structural alerts were not associated with IADRs. This shows that structural alerts do not predict metabolism and toxicity adequately, so some medicines may be designated as safe or unsafe when they are not [3, 22]. The drug-induced toxicities associated with structural alerts and reactive metabolites may arise from either covalent or noncovalent interactions with cellular macromolecules such as DNA, proteins and lipids, but in many cases the exact mechanism is not known [3, 22].
A substantial proportion of the drugs in use (about 78–86%) have structural alerts linked to toxicity in a particular organ or to unspecified adverse drug reactions, most of which are idiosyncratic adverse drug reactions (IADRs). It is also widely reported that a substantial proportion (62–69%) of drugs in use have the potential to form reactive metabolites. However, not all structural alerts lead to toxicity, as some molecules contain structural alerts but never produce toxic effects.
The results of our study agree with those of Liu et al. and Pizzo et al., who identified structural alerts with the potential to cause drug-induced liver toxicity, together with the drugs that contained them. The structural alerts reported in those studies were compared with the structures of the drugs in the MEML, and some of these drugs were indeed found to have toxicity potential. When these drugs were entered into the StopTox software, it confirmed that each had at least one predicted toxicity (Fig. 2).
Predicting toxicity too often may be regarded as a safe position that reduces the risk of exposing patients to dangerous drugs. However, it can also halt the development of useful drugs that are not actually toxic in clinical use. In silico prediction is currently a useful tool, but it cannot replace in vivo or in vitro toxicity testing.
This study also showed that there was general consensus among the three models for the majority of the medicines studied (Figs. 2 and 3). However, greater consensus was noted between StopTox and each of the other two models. This could be attributed to the fact that StopTox incorporates all the parameters used to develop the other models, i.e., functional groups or structural alerts in the case of ToxAlerts and doses in the case of LD-50 values. The lower consensus between LD-50 values and ToxAlerts could be attributed to the limited overlap between the two models: LD-50 uses only concentration, while ToxAlerts relies largely on the functional groups of structural alerts. Both parameters nonetheless play a significant role in the two models, which explains the significant overlap that was still observed between them. The differences in overlap among the models are worrying because they raise the question of which model to use, and making that choice without substantial evidence is challenging. The literature does not yet offer guidelines for choosing a model for predicting drug toxicity. Furthermore, although the two programmes were compared, it was difficult to attribute their similarities or differences to their limitations or strengths because they mostly had different endpoints for the toxicity predictions, which makes the comparison somewhat inconclusive. It is also difficult to draw a firm conclusion from the overall consensus because, although it is above 50%, it is still on the lower side.
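To make the notion of consensus between models concrete, the following Python sketch shows one possible way to quantify it. It is an illustration only, not the procedure used in this study: it assumes that each model's output has been reduced to a binary toxic/non-toxic call per drug, and it defines consensus as the fraction of drugs on which the models give the same call. The function names and the example calls (drug_A to drug_D) are hypothetical and do not reflect the study data.

```python
from itertools import combinations


def pairwise_agreement(calls):
    """Fraction of shared drugs on which each pair of models gives the same call.

    `calls` maps a model name to a dict of {drug name: True if flagged toxic}.
    Assumption: every model output has been reduced to a binary call.
    """
    result = {}
    for a, b in combinations(sorted(calls), 2):
        shared = set(calls[a]) & set(calls[b])
        if not shared:
            continue  # no drugs in common, agreement is undefined
        same = sum(calls[a][drug] == calls[b][drug] for drug in shared)
        result[(a, b)] = same / len(shared)
    return result


def overall_consensus(calls):
    """Fraction of shared drugs on which all models give the same call."""
    shared = set.intersection(*(set(model_calls) for model_calls in calls.values()))
    if not shared:
        return 0.0
    unanimous = sum(
        len({model_calls[drug] for model_calls in calls.values()}) == 1
        for drug in shared
    )
    return unanimous / len(shared)


# Hypothetical calls for four placeholder drugs (not study data).
example_calls = {
    "StopTox": {"drug_A": True, "drug_B": True, "drug_C": False, "drug_D": True},
    "ToxAlerts": {"drug_A": True, "drug_B": False, "drug_C": False, "drug_D": True},
    "LD50": {"drug_A": True, "drug_B": True, "drug_C": True, "drug_D": True},
}

if __name__ == "__main__":
    for pair, fraction in pairwise_agreement(example_calls).items():
        print(pair, f"{fraction:.0%}")
    print("all three models:", f"{overall_consensus(example_calls):.0%}")
```

Reporting the pairwise figures alongside the overall figure is useful because a pair of models can agree well even when the three-way consensus is low. Simple percentage agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be preferable where most drugs are flagged by every model.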
Therefore, it is important to conduct further studies in Malawi, starting with clinical data, to establish whether the predicted toxicities actually occur for the medicines designated as toxic or non-toxic. It is also important to find out whether the underlying conditions that favour the occurrence of toxicity are present or absent for the medicines found to be toxic or non-toxic. This would serve as an evaluation of the clinical application of the software and would provide further data for its development, so that it can be as reliable as possible for future clinical, research and regulatory use. However, the results can still be useful now, as they would guide health care providers in identifying the targeted toxicities, which may enhance patient safety. Furthermore, either the models should be improved to minimize the discrepancies shown in the analysis of the medicines, or guidelines should be developed for the use of each model. There is therefore a need for further development of the models so that they can replace animal studies while producing reliable computer-generated toxicological screening data.
This study had some limitations. First, it was difficult to identify the structural alerts manually; this was minimised by using software designed to identify structural alerts. Second, full interpretation of the software results relied on the US partners with whom we collaborated, and they have not yet provided the full interpretation, especially regarding the differences in the colours of the highlighted functional groups. However, even without this information, we were able to obtain the results and interpret their meaning using the software. Third, the LD-50 value ranges may have included some values that were not oral LD-50 values, which might have led to some medicines being categorized as toxic when they are not.