AI to Transform the Drug Approval Process and De-Risk Drug Development

By Cherubin Manokaran



Manokaran C. AI to transform the drug approval process and de-risk drug development. HPHR. 2021;40. 10.54111/0001/NN4



The public tends to view high drug prices through the lens of a contentious political argument about the right healthcare model. Scientists might choose to view research as the path to better and cheaper drugs. However, there is a bridge between science and business that might play as great a role in drug prices – the drug approval process. The greatest issue facing the pharmaceutical industry seems to be the risk associated with drug development, because companies invest significant amounts of money in products that frequently fail to gain approval. Though discovering new disease targets has the benefit of increasing the number of ways to attack a disease, it certainly does not offer any new assurance about performance in humans. In fact, very few drugs are effective in humans, which becomes especially crippling for small companies. However, the government and FDA can greatly reduce this risk. The pharmaceutical industry would benefit from an approval system that more effectively decided whether companies should move forward with drug development. And, despite a growing body of data about drugs that fail or succeed, approvals still rely on crude comparisons and arbitrary expectations for performance. For these reasons, researchers should mobilize AI to improve how the government and pharmaceutical companies evaluate their preclinical and clinical data for new drug candidates to ensure, ultimately, that more pharmaceutical companies develop drugs that are likely to succeed.


Drug prices are at the forefront of most political conversations; however, the public and politicians often underestimate the inherent difficulty of science. Every stage of drug research and development is prone to failure. In the same way that physician compensation can be attributed to years of schooling and accumulated debt, high drug prices can be attributed to risk. Regulation of drug prices is important, but addressing the inherent risk of drug development is just as critical. To assist companies, the government has established various avenues of economic support through the FDA. For instance, companies can seek public funds to support clinical investigation and may be able to charge study participants for investigational drugs (1). If a drug is approved, companies can apply for extensions to their drug patents or for exclusivities to keep generics off the market (1). These factors certainly incentivize innovation, but they fail to improve the likelihood of success. In fact, the industry's success rate remains below 10%, despite a historically high number of new drug approvals over the past decade (3,6). Late-stage success has modestly improved, but over one-third of drugs still fail at phase III, which is a major problem considering the resources required to reach that stage (6,8). Therefore, alternative measures are required to attenuate risk.


One way to de-risk drug development is to develop computational methods to find and validate new drug targets and compounds, as well as new drug response markers, for individual diseases. However, a more widely applicable computational approach is to use AI to consolidate all information about a potential new drug into one metric, with a threshold that determines its approval status at each stage of development. In principle, this is a more effective way of determining whether one drug candidate is better than others and likely to be successful. AI has often been discussed and implemented in the context of drug discovery and even medical diagnosis but has not been applied to the more rudimentary problem of drug approvals. Replacing inadvertently crude, inconsistent, and often non-objective drug approval standards is an unmet need that AI has the capacity to address both promptly and effectively. By promoting accuracy, objectivity, and specificity in the drug approval process, AI presents an opportunity to truly de-risk drug development and to ensure that more pharmaceutical companies are developing novel, successful drugs.
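
To make the idea of one metric with a stage-specific threshold concrete, the following is a minimal sketch in Python. It assumes that some model has already consolidated a candidate's data into a single score between 0 and 1. The stage names, the threshold values, and the function `should_advance` are all hypothetical and chosen only for illustration; real thresholds would have to be learned from historical data and validated.

```python
from enum import Enum


class Stage(Enum):
    """Development stages at which a go/no-go decision is made (illustrative)."""
    PRECLINICAL = "preclinical"
    PHASE_I = "phase_1"
    PHASE_II = "phase_2"
    PHASE_III = "phase_3"


# Hypothetical thresholds: the bar rises as a candidate moves toward approval.
# These numbers are placeholders, not values taken from any regulator or study.
STAGE_THRESHOLDS = {
    Stage.PRECLINICAL: 0.30,
    Stage.PHASE_I: 0.40,
    Stage.PHASE_II: 0.55,
    Stage.PHASE_III: 0.70,
}


def should_advance(score: float, stage: Stage) -> bool:
    """Return True if the consolidated score meets the threshold for this stage."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return score >= STAGE_THRESHOLDS[stage]


# The same candidate score can clear an early gate and fail a later one.
early_decision = should_advance(0.5, Stage.PHASE_I)
late_decision = should_advance(0.5, Stage.PHASE_III)
```

In this sketch the decision rule is deliberately trivial; the difficult part, which the rest of this article discusses, is producing a score that deserves to be trusted.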

Drawing Parallels

One rationale for AI-based diagnostics is increased accuracy and objectivity. In fact, the accuracy of physicians at diagnosing disease and coming up with care plans for patients can be highly variable. Some studies have strongly suggested an improvement in diagnostic accuracy with AI (13,16). Though most will be skeptical of AI alone in the clinic, the principle of accuracy is one to keep in mind. The other, more important consideration is the set of criteria that physicians, or computers for that matter, use to assess patients. Presently, the appropriate ranges for various medical measurements originate from small populations. AI and plenty of medical data will allow bioinformaticians to utilize the same measurements from a much larger patient population. This would provide an objective measurement for physicians and patients to use. If combined with the accuracy of AI, patient diagnosis and management would drastically improve. Although not AI-based, disease risk modeling has shown glimpses of this.


The objectivity problem has pushed disease risk modeling toward large patient populations. Derived from nearly 40,000 patients, the MAGGIC score for heart failure risk overtook several other models that relied on small, fairly homogeneous patient cohorts (15). Additionally, because of the large patient population, the developers of the MAGGIC score were able to use several accessible risk factors to robustly predict disease risk (15). This raises two points. First, the most fundamental purpose of a risk score is to obtain a measurement that is more readily interpretable than several individual results. Second, the predictive power of risk models increases when they are derived from large patient populations. By introducing more precision, validity, and reliability in both aspects, risk models such as the MAGGIC score improve how scientists and clinicians evaluate disease risk.


A similar scenario exists in the drug approval process, which is marked by large amounts of data but arbitrary expectations for performance at every stage (9). Researchers will continue to improve the science that propels drug discovery, but improving how the field determines that a company should or should not move forward with a drug is crucial. The truth is that science is inherently difficult, much like disease diagnosis. There is too much room for drug failure, an outcome that is tough to curb purely from a business standpoint (8,12). Instead, both the development and approval processes would benefit from increased accuracy and objectivity.


To our advantage, the path to a new drug is fairly standard, just like the tests used to diagnose disease in patients. Once a potential drug target has been validated via preclinical studies in vitro and in vivo, scientists start with in vitro studies to determine the efficacy and specificity of a new compound. For example, kinases, which are a common cancer drug target, are highly similar to one another; therefore, showing specific activity against the desired target kinase is essential (14). Then, they proceed to characterize drug efficacy and safety in vivo. At this stage, they need to demonstrate that the compound can be administered to humans by showing the desired outcome and few harmful side effects in model organisms. Finally, they perform efficacy and safety tests in patient cohorts. Whether they are numeric biomarker measurements or qualitative imaging, the results of each step attempt to describe the overall potency, specificity, and efficacy of a drug candidate.


However, striking an unambiguous balance among these three metrics using these discrete data points seems challenging. For example, phase III trials involve crudely comparing a new drug to the best-in-class drug, often independently of all other results. Instead, this comparison should seek to incorporate all preclinical and clinical data for various successful and unsuccessful drugs. Returning to disease risk modeling, a functional but simple logistic regression like MAGGIC relies on well-correlated, long-term clinical outcome data, information that cannot always be obtained for a drug (15). This is where deep learning comes in. An inductive model that integrates every available data point for every drug, that handles imbalances in the number of successful and unsuccessful drugs, and that is readily updated seems better suited for drug development (7,10). AI brings more objectivity to the process by consolidating this critical information about a new drug and prior drugs.
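
One common way to handle the imbalance between successful and unsuccessful drugs is to weight each class inversely to its frequency during training. The sketch below illustrates that idea with a small logistic regression written in plain Python, standing in for the deep learning model discussed above. The two features (`potency`, `safety`), the ten records, and every function name are invented for illustration and do not come from any real drug dataset.

```python
import math


def predict(params, x):
    """Probability of success for one feature vector under a logistic model."""
    bias, weights = params
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def balanced_class_weights(labels):
    """Weight each class by n / (number_of_classes * class_count)."""
    classes = sorted(set(labels))
    n = len(labels)
    return {c: n / (len(classes) * labels.count(c)) for c in classes}


def weighted_log_loss(params, rows, labels, class_w):
    """Class-weighted average of the logistic (cross-entropy) loss."""
    total_w = sum(class_w[y] for y in labels)
    loss = 0.0
    for x, y in zip(rows, labels):
        p = predict(params, x)
        loss -= class_w[y] * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss / total_w


def fit(rows, labels, class_w, lr=0.1, steps=500):
    """Gradient descent on the class-weighted loss, starting from zero."""
    bias, weights = 0.0, [0.0] * len(rows[0])
    total_w = sum(class_w[y] for y in labels)
    for _ in range(steps):
        grad_b, grad_w = 0.0, [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = class_w[y] * (predict((bias, weights), x) - y)
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        bias -= lr * grad_b / total_w
        weights = [w - lr * g / total_w for w, g in zip(weights, grad_w)]
    return bias, weights


# Invented records: (potency, safety) scaled to [0, 1]; label 1 = succeeded.
ROWS = [(0.9, 0.8), (0.8, 0.9), (0.2, 0.3), (0.3, 0.1), (0.4, 0.4),
        (0.1, 0.2), (0.5, 0.3), (0.3, 0.5), (0.2, 0.6), (0.6, 0.2)]
LABELS = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # 2 successes versus 8 failures

class_w = balanced_class_weights(LABELS)
initial_loss = weighted_log_loss((0.0, [0.0, 0.0]), ROWS, LABELS, class_w)
model = fit(ROWS, LABELS, class_w)
final_loss = weighted_log_loss(model, ROWS, LABELS, class_w)
```

With these weights, the two successful drugs contribute as much to the loss as the eight failures combined, so the model cannot achieve a low loss simply by predicting failure for everything. Resampling the data is an alternative that serves the same purpose.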


AI also promises highly specific predictive power. Motivated by the fact that science continues to push researchers toward smaller patient populations, landmark studies have developed HLA allele-specific cancer neoantigen prediction pipelines using AI (5). An antigen is any molecule that induces an immune response, and neoantigens are novel antigens that arise from mutations in tumor cells and have not been previously recognized by the immune system. These are often used to develop vaccines. The goal is to come up with personalized vaccines for cancer patients based on the mutations they harbor. For this cancer immunotherapeutic strategy, two patients with chronic myeloid leukemia and the same set of mutations but different HLA alleles may require unique neoantigen targets and vaccines. For this reason, although the therapeutic strategy is the same, the actual therapy will differ. This granularity will enhance an AI-based drug approval process.


Now, immunotherapy is not the only therapy out there. In fact, ineffective neoantigen determinations continue to hold back breakthroughs in the field. Cancer patients still rely mostly on off-the-shelf chemotherapy. Nonetheless, everyone in the field knows that this is often not enough. Patients may or may not respond to off-the-shelf drugs because of their genetics. For example, glioblastoma patients respond poorly to the drug temozolomide if their tumors express the DNA-repair enzyme MGMT, whereas patients whose MGMT gene is silenced benefit more (11). Others with certain breast cancer types have developed resistance to widely used chemotherapeutics through numerous escape pathways and the sheer diversity of tumor cells (2,4).


Therefore, combination therapies are a necessity rather than a supplement and, because of this, the drug landscape, especially for cancer treatment, is increasing in complexity. To meet this need, researchers might explore training drug class-specific models, even fast-track ones, based on preclinical and clinical data to derive objective approval standards for reviewers. The reason for this is that the experiments and results for a specific class of drugs are generally comparable because the drugs have similar mechanisms of action and fall in the same therapeutic area. And, if data is limited for a drug class, a model from a related drug class with sufficient data can be applied, using transfer learning, to the small dataset to develop an effective model that can be updated as more data becomes available (10). Though standards have always been unique for different therapeutic areas, the hierarchy of differences is only going to grow and become more difficult to manage. For this reason, a more objective and specific approach to drug approvals will be especially critical.
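
The transfer learning step described above can be sketched as a warm start: a model is first trained on a data-rich drug class, and its parameters then serve as the starting point for a short round of training on the data-poor class. The example below uses a small logistic regression in plain Python as a stand-in for a neural network. All records, feature meanings, and names such as `SOURCE_ROWS` and `TARGET_ROWS` are hypothetical.

```python
import math


def predict(params, x):
    """Probability of success for one feature vector under a logistic model."""
    bias, weights = params
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(params, rows, labels):
    """Average logistic (cross-entropy) loss over a dataset."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = predict(params, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(rows)


def fit(rows, labels, init=None, lr=0.1, steps=500):
    """Gradient descent; `init` lets training resume from an existing model."""
    if init is None:
        bias, weights = 0.0, [0.0] * len(rows[0])
    else:
        bias, weights = init[0], list(init[1])
    for _ in range(steps):
        grad_b, grad_w = 0.0, [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict((bias, weights), x) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return bias, weights


# Hypothetical data-rich drug class: two scaled features, label 1 = succeeded.
SOURCE_ROWS = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9), (0.6, 0.8),
               (0.2, 0.3), (0.3, 0.1), (0.4, 0.2), (0.1, 0.4)]
SOURCE_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical related class with only four records and a different pattern.
TARGET_ROWS = [(0.7, 0.3), (0.8, 0.2), (0.3, 0.7), (0.2, 0.6)]
TARGET_LABELS = [1, 1, 0, 0]

source_model = fit(SOURCE_ROWS, SOURCE_LABELS)
# Fine-tune: start from the source model and take a few steps on target data.
target_model = fit(TARGET_ROWS, TARGET_LABELS, init=source_model, steps=50)

loss_before = log_loss(source_model, TARGET_ROWS, TARGET_LABELS)
loss_after = log_loss(target_model, TARGET_ROWS, TARGET_LABELS)
```

Because `fit` accepts an existing model through `init`, the same call can be repeated as new records for the small drug class arrive, which is one simple way a model could be kept up to date. With a neural network, the analogous approach would typically reuse the early layers of the source model and retrain only the later ones.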


The overall problem of applying AI to drug development is not too different from EMR-based deep learning for medical diagnosis, which entails training neural networks using electronic medical record data, patient reports, and patient images. Here, researchers will be training a network with various discrete data relating to the drug’s performance, from test tubes to humans. Further, the goal is to integrate all important data for a drug at each approval stage to provide a standardized assessment of the drug’s performance and determine whether the company should continue to pursue development. The best way to reduce the risk that pharmaceutical companies often cite for high drug prices is to stop the process before it fails, something the FDA can more effectively facilitate with the help of AI. In addition to a national approach developed by the FDA, there is room in this information market for proprietary solutions developed by individual companies using their own proprietary data, as well as by individual scientists or AI entrepreneurs using publicly available data sets. Notably, patients would greatly benefit from a prescription drug market in which significantly more companies are developing drugs that are more likely to succeed.


I wish to thank my PI, Dr. Thomas M. Roberts of the Dana-Farber Cancer Institute, for his helpful feedback.

Disclosure Statement

The author(s) have no relevant financial disclosures or conflicts of interest.


  1. Center for Drug Evaluation and Research. (2020, December 22). Economic Assistance and Incentives for Drug Development. FDA.

  2. Chen, X., Zhang, M., Gan, H., Wang, H., Lee, J. H., Fang, D., Kitange, G. J., He, L., Hu, Z., Parney, I. F., Meyer, F. B., Giannini, C., Sarkaria, J. N., & Zhang, Z. (2018). A novel enhancer regulates MGMT expression and promotes temozolomide resistance in glioblastoma. Nature Communications, 9(1), 2949.

  3. Congressional Budget Office. (2021). Research and Development in the Pharmaceutical Industry. CBO.

  4. Dagogo-Jack, I., & Shaw, A. T. (2018). Tumour heterogeneity and resistance to cancer therapies. Nature Reviews Clinical Oncology, 15(2), 81–94.

  5. De Mattos-Arruda, L., Vazquez, M., Finotello, F., Lepore, R., Porta, E., Hundal, J., Amengual-Rigo, P., Ng, C., Valencia, A., Carrillo, J., Chan, T. A., Guallar, V., McGranahan, N., Blanco, J., & Griffith, M. (2020). Neoantigen prediction and computational perspectives towards clinical benefit: recommendations from the ESMO Precision Medicine Working Group. Annals of Oncology, 31(8), 978–990.

  6. Dowden, H., & Munro, J. (2019). Trends in clinical success rates and therapeutic focus. Nature Reviews Drug Discovery, 18(7), 495–496.

  7. Fotouhi, S., Asadi, S., & Kattan, M. W. (2019). A comprehensive data level analysis for cancer diagnosis on imbalanced data. Journal of Biomedical Informatics, 90, 103089.

  8. Galson, S., Austin, C. P., Khandekar, E., Hudson, L. D., DiMasi, J. A., Califf, R., & Wagner, J. A. (2021). The failure to fail smartly. Nature Reviews Drug Discovery, 20(4), 259–260.

  9. Gyawali, B., Hey, S., & Kesselheim, A. (2020). Evaluating the evidence behind the surrogate measures included in the FDA’s table of surrogate endpoints as supporting approval of cancer drugs. EClinicalMedicine, 21, 100332.

  10. Hassanzadeh, H., Nguyen, A., Karimi, S., & Chu, K. (2018). Transferability of artificial neural networks for clinical document classification across hospitals: A case study on abnormality detection from radiology reports. Journal of Biomedical Informatics, 85, 68–79.

  11. Hegi, M. E., Diserens, A. C., Gorlia, T., Hamou, M. F., de Tribolet, N., Weller, M., Kros, J. M., Hainfellner, J. A., Mason, W., Mariani, L., Bromberg, J. E., Hau, P., Mirimanoff, R. O., Cairncross, J. G., Janzer, R. C., & Stupp, R. (2005). MGMT gene silencing and benefit from temozolomide in glioblastoma. The New England Journal of Medicine, 352(10), 997–1003.

  12. Hwang, T. J., Carpenter, D., Lauffenburger, J. C., Wang, B., Franklin, J. M., & Kesselheim, A. S. (2016). Failure of Investigational Drugs in Late-Stage Clinical Development and Publication of Trial Results. JAMA Internal Medicine, 176(12), 1826–1833.

  13. Patel, B. N., Rosenberg, L., Willcox, G., Baltaxe, D., Lyons, M., Irvin, J., Rajpurkar, P., Amrhein, T., Gupta, R., Halabi, S., Langlotz, C., Lo, E., Mammarappallil, J., Mariano, A. J., Riley, G., Seekins, J., Shen, L., Zucker, E., & Lungren, M. (2019). Human-machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ Digital Medicine, 2, 111.

  14. Smyth, L. A., & Collins, I. (2009). Measuring and interpreting the selectivity of protein kinase inhibitors. Journal of Chemical Biology, 2(3), 131–151.

  15. Pocock, S. J., Ariti, C. A., McMurray, J. J., Maggioni, A., Køber, L., Squire, I. B., Swedberg, K., Dobson, J., Poppe, K. K., Whalley, G. A., Doughty, R. N., & Meta-Analysis Global Group in Chronic Heart Failure (2013). Predicting survival in heart failure: a risk score based on 39 372 patients from 30 studies. European Heart Journal, 34(19), 1404–1413.

  16. Liu, X., Faes, L., Kale, A. U., Wagner, S. K., Fu, D. J., Bruynseels, A., Mahendiran, T., Moraes, G., Shamdas, M., Kern, C., Ledsam, J. R., Schmid, M. K., Balaskas, K., Topol, E. J., Bachmann, L. M., Keane, P. A., & Denniston, A. K. (2019). A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. The Lancet Digital Health, 1(6), e271–e297.

About the Author

Cherubin Manokaran

Cherubin Manokaran is a full-time researcher studying the molecular mechanisms of cancer signaling in the lab of Dr. Thomas M. Roberts at the Dana-Farber Cancer Institute.