Harnessing AI in biopharma: Scientific use cases

There is no shortage of data in the life sciences. Scientific data is exploding at an annual growth rate of 34.5 percent and is expected to exceed 30 million petabytes in 2030. Storing all this new data will be difficult, let alone extracting value from it. Biopharma leaders are increasingly looking to artificial intelligence (AI) to sift through the volumes of scientific data to find transformative insights that will bring better therapies to market, faster.

Virtually every stage in the pharmaceutical value chain, from discovery to commercialization, has the potential to be transformed by AI. Below, we highlight several scientific use cases that illustrate the promise of AI.

Figure: Examples of scientific use cases for AI in the biopharma value chain. The list is not exhaustive.

Research and Development

The conventional drug R&D paradigm suffers from profound inefficiencies. The vast majority of leads and candidates fail, due to complex and often poorly understood reasons. The high attrition rate is tremendously expensive, especially for late-stage failures. A phase III trial costs $310 million on average. As such, improving the quality of drug candidates that enter preclinical and clinical development can dramatically reduce costs as well as enhance drug efficacy and safety.

To boost R&D success, biopharma companies are using AI to tackle the main challenges in drug discovery and development. These include picking the right therapeutic target for a disease, designing the right molecule to modulate it, predicting how the drug will behave in the body, and identifying which patients are most likely to benefit. Companies can train AI models on historical data sets first and then feed in new data as it becomes available to improve the models iteratively.

Target discovery

Selecting a drug target requires understanding the molecular mechanisms underlying a disease. This is no small feat, given that most disorders result from a complex interplay of genetic and environmental factors. To tease them apart, researchers need vast amounts of diverse biomedical data. AI algorithms can mine these data sets—including scientific literature, multi-omics (e.g., genomics, transcriptomics, proteomics, metabolomics), clinical records, and public databases—to find proteins, genes, pathways, or microbiomes correlated with specific diseases or conditions. The result could be a ranked list of promising targets for further investigation as well as the corresponding genetic profiles of patients expected to benefit from treatment.
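To make the idea of a ranked target list concrete, here is a minimal sketch in Python. It assumes each candidate target already has an association score between 0 and 1 from each data source, and it combines them with fixed weights. The target names, scores, and weights are all made up for illustration; a real system would learn or tune them from data.

```python
# Hypothetical weights per data source (assumed for illustration only).
WEIGHTS = {"genomics": 0.4, "proteomics": 0.25, "literature": 0.2, "clinical": 0.15}


def rank_targets(evidence: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Return (target, score) pairs sorted from strongest to weakest evidence.

    Each inner dict maps a data source to an association score in [0, 1].
    A source with no data for a target contributes zero.
    """
    scored = []
    for target, sources in evidence.items():
        score = sum(weight * sources.get(source, 0.0) for source, weight in WEIGHTS.items())
        scored.append((target, round(score, 3)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up evidence for made-up targets.
evidence = {
    "GENE_A": {"genomics": 0.9, "literature": 0.6, "proteomics": 0.7, "clinical": 0.5},
    "GENE_B": {"genomics": 0.3, "literature": 0.9},
    "GENE_C": {"genomics": 0.8, "proteomics": 0.9, "clinical": 0.8},
}

ranking = rank_targets(evidence)
for target, score in ranking:
    print(f"{target}: {score}")
```

A weighted sum is the simplest possible aggregation. It is used here only to show the shape of the output, not to suggest how any particular platform scores targets.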

Virtual screening

Once a target (usually a protein) is identified and validated, researchers aim to find molecules that can bind to and modulate the target. Conventional strategies have relied largely on high-throughput screening of large chemical libraries, that is, brute-force trial and error. Laboratory scientists prepare various concentrations of each compound and test them for activity against the target in vitro. These screens are costly, time-consuming, and highly inefficient—with hit rates typically below 1 percent.

Virtual screening, on the other hand, moves the initial rounds of assays in silico. It uses molecular modeling to predict how and where compounds will bind to target proteins. The chemistry and physics behind these interactions are wildly complex. Although computational techniques for simulating drug-target binding have existed for decades, AI has vastly improved the speed and accuracy of these calculations.

The development of AlphaFold, an AI program released in 2021, marked a breakthrough in these efforts. It effectively solved one of the thorniest problems in biochemistry—how amino acid sequences (linear protein chains) fold into functional three-dimensional structures. In initial tests, AlphaFold dramatically outperformed all other computational methods. Its accuracy rivals that of experimental approaches like X-ray crystallography, which are labor-intensive and lack scalability.

With the ability to generate high-quality 3D structures of virtually any target protein as well as extensive chemical and biological data sets, biopharma companies have the raw ingredients for AI-powered virtual screening. Thousands to millions of compounds in an existing library can be analyzed in silico. The most promising compounds are then validated in the lab. The upshot is much faster, less expensive screening.
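The screen-then-validate workflow described above can be sketched in a few lines of Python. The sketch scores every compound in a library and keeps the top few for lab validation. The scoring function is a deliberate stand-in: instead of structure-based binding prediction, it uses Tanimoto similarity to a known active compound, a much simpler ligand-based measure. The fingerprints (sets of "on" bit positions) and compound names are invented for illustration.

```python
from typing import Callable


def tanimoto(a: frozenset[int], b: frozenset[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def screen(
    library: dict[str, frozenset[int]],
    score_fn: Callable[[frozenset[int]], float],
    top_k: int,
) -> list[tuple[str, float]]:
    """Score every compound and return the top_k highest-scoring ones."""
    scored = [(name, score_fn(fp)) for name, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy fingerprint of a known active compound (hypothetical).
reference = frozenset({1, 4, 7, 9, 12})

# Toy library (hypothetical compounds and fingerprints).
library = {
    "cmpd_001": frozenset({1, 4, 7, 9, 13}),
    "cmpd_002": frozenset({2, 3, 5}),
    "cmpd_003": frozenset({1, 4, 7, 9, 12}),
    "cmpd_004": frozenset({4, 9, 20, 21}),
}

# In practice score_fn would wrap a docking program or a trained model.
hits = screen(library, lambda fp: tanimoto(fp, reference), top_k=2)
print(hits)
```

Because `screen` takes the scoring function as a parameter, swapping in a more sophisticated model would not change the surrounding workflow.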

Figure: 3D structure of a protein binding to a small molecule drug. The drug (red) binds to its target protein (blue). Predicting molecular interactions is wildly complex, but the task is well suited for AI.

De novo design

The millions of compounds amassed by biopharma companies in their libraries represent only a sliver of the more than 10⁶⁰ drug-like molecules estimated to be possible. AI algorithms, in contrast, can explore far larger regions of this chemical space to find molecules with potentially favorable pharmacological properties. Both small molecules and biologics can be designed using this approach. Biologics are more complex but more versatile. For example, the larger surface area of protein drugs enables them to interact with targets in ways that small molecules cannot.

Once identified in silico, the drugs are synthesized and validated experimentally. As with virtual screening, de novo design can greatly accelerate drug discovery by minimizing time in the lab. This approach can also yield superior drugs, as it's not limited by human imagination or what can be found in the natural world.

More than 150 AI-designed drugs are already in early-stage pipelines, with more than 15 in clinical trials. None has yet received regulatory approval. However, in 2023, INS018_055 became the first fully AI-generated drug to reach phase II trials. The drugmaker, Insilico Medicine, used its proprietary AI platform to identify a novel target and generate a small molecule inhibitor for idiopathic pulmonary fibrosis.

Drug repurposing

AI can identify drugs for specific targets and, conversely, uncover new targets for drugs that are already approved. Using virtual screening, scientists can model interactions between an existing drug and a collection of protein structures to predict new therapeutic applications. This approach streamlines discovery while allowing companies to reap the downstream benefits of drug repurposing: faster time to market and reduced risk. The drug’s safety and manufacturing are already well established, and it has already gone through the costly and time-consuming approval process of regulatory agencies like the US Food and Drug Administration (FDA).

Knowledge reuse

AI can quickly sift through the vast amount of data accumulated from previous research, clinical trials, and other studies, identifying information that is relevant to current research. This saves time and resources that would otherwise be spent repeating foundational research.

ADMET prediction

A successful drug needs to be effective within the complex environment of the body. It must reach its target cells at sufficient concentration to produce a therapeutic effect while eliciting minimal adverse effects elsewhere. A drug's pharmacokinetic properties (absorption, distribution, metabolism, and excretion) together with its toxicity, collectively known as ADMET, are critical to the safety and efficacy of therapies. Scientists measure these properties with in vitro and animal studies during preclinical development.

However, accurately forecasting how a drug will behave in human subjects remains a major challenge: more than 90 percent of drug candidates that enter clinical trials fail. AI can draw on existing and future data sets, including positive and negative results from clinical studies, to model ADMET properties in patients more reliably. This enables a "fail fast, fail cheap" strategy that eliminates unfavorable candidates much earlier in development. It is also more ethical, since it can significantly reduce the need for animal testing.
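As a rough illustration of early triage, the Python sketch below filters candidates with Lipinski's rule of five, a long-standing rule-based heuristic for oral absorption. It is not an AI model, and it covers only one corner of ADMET. An ML-based approach would replace the fixed thresholds with learned predictions, but the "fail fast" control flow would look similar. The candidate names and property values are invented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(c: Candidate) -> int:
    """Count how many of the four rule-of-five limits a candidate exceeds."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def triage(candidates: list[Candidate], max_violations: int = 1) -> tuple[list[str], list[str]]:
    """Split candidates into (advance, deprioritize) lists by violation count."""
    advance, deprioritize = [], []
    for c in candidates:
        bucket = advance if lipinski_violations(c) <= max_violations else deprioritize
        bucket.append(c.name)
    return advance, deprioritize


# Hypothetical candidates with made-up properties.
candidates = [
    Candidate("cand_A", mol_weight=350.4, log_p=2.1, h_bond_donors=2, h_bond_acceptors=5),
    Candidate("cand_B", mol_weight=720.9, log_p=6.3, h_bond_donors=4, h_bond_acceptors=9),
    Candidate("cand_C", mol_weight=510.0, log_p=4.2, h_bond_donors=3, h_bond_acceptors=8),
]

advance, deprioritize = triage(candidates)
print("advance:", advance)
print("deprioritize:", deprioritize)
```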

Formulation development

Effective therapies combine active pharmaceutical ingredients with excipients to facilitate delivery, enhance efficacy, reduce side effects, and improve shelf life. Traditional formulation development is time-consuming and inefficient, requiring extensive testing and evaluation of candidate formulations. AI tools can significantly improve this process by rapidly probing a large parameter space and recommending optimal formulations to test.

Process development

AI can expedite process development and technology transfer by optimizing drug production for scale-up, reproducibility, and cost. Using historical and newly generated data sets, machine learning (ML) models can rapidly identify critical process parameters (CPPs) and forecast the best conditions for drug synthesis. They do so either by reducing the number of experiments needed to determine the CPPs or by simulating production processes to reveal the impact of different variables.
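One simple way to picture the search for CPPs is to ask which process parameters move together with yield across historical runs. The Python sketch below ranks parameters by the absolute value of their Pearson correlation with yield. This is a crude first-pass screen, assumed here for illustration; real studies typically rely on designed experiments and richer models. The parameter names and run data are invented.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


# Hypothetical historical runs: one value per run for each parameter.
params = {
    "temperature_c": [30, 32, 34, 36, 38],
    "ph": [7.0, 7.2, 6.9, 7.1, 7.0],
    "stir_rpm": [100, 120, 100, 120, 100],
}
yields = [60, 64, 68, 72, 76]  # percent yield for the same five runs

# Rank parameters from most to least strongly associated with yield.
ranked = sorted(params, key=lambda name: abs(pearson(params[name], yields)), reverse=True)
for name in ranked:
    print(f"{name}: r = {pearson(params[name], yields):+.2f}")
```

Correlation only captures linear, one-at-a-time relationships, so a parameter that matters through an interaction with another could be missed by this kind of screen.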

Manufacturing and QC

The production of medicines demands precision and consistency to ensure batch-to-batch uniformity and purity. Minor deviations from specifications can compromise product quality, endanger patient safety, and run afoul of regulators. As biopharma companies increasingly focus on advanced therapeutics, such as cell and gene therapies, manufacturing becomes more complicated and less scalable.

In response, biopharma companies are adopting advanced digital technologies and data-driven approaches to boost the efficiency and robustness of the entire manufacturing process, from raw materials to finished products. This transformation, especially when coupled with AI, can yield substantial time and cost savings while improving quality control (QC).

Figure: A scientist looking into a bioreactor. Manufacturing is becoming more complicated and less scalable as drug portfolios shift toward advanced therapies. AI can help make production more efficient and robust.

Process control and optimization

AI-driven systems can continuously monitor and analyze vast amounts of data from sensors, instruments, and production lines in real time. Predictive analytics and machine learning algorithms can anticipate process deviations, enabling proactive adjustments to maintain proper conditions. Moreover, AI can optimize complex bioprocess parameters to reduce the risk of product variability, increase yields, and ensure compliance.
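The monitoring loop described above can be approximated with a very small statistical detector. The Python sketch below flags sensor readings that sit far from the mean of a trailing window, measured in standard deviations (a rolling z-score). It is a simple stand-in for the predictive models the text refers to, and the pH readings, window size, and limit are assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


def detect_deviations(readings: list[float], window: int = 5, z_limit: float = 3.0) -> list[tuple[int, float]]:
    """Return (index, value) for readings far from the trailing-window mean."""
    history: deque[float] = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(readings):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > z_limit:
                flagged.append((i, value))
                continue  # keep the outlier out of the baseline window
        history.append(value)
    return flagged


# Hypothetical bioreactor pH readings with one excursion.
ph_readings = [7.01, 7.00, 7.02, 6.99, 7.01, 7.00, 7.02, 7.45, 7.01, 7.00]

alerts = detect_deviations(ph_readings)
print(alerts)
```

Excluding flagged readings from the baseline keeps a single excursion from inflating the window's spread and masking the next one.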

Predictive maintenance

With AI, biopharma companies can shift from reactive to proactive maintenance strategies. AI-powered systems continuously collect and analyze data from sensors and equipment in manufacturing as well as in the laboratory, identifying subtle anomalies and wear patterns that may signal impending machinery faults or maintenance needs. This could make unexpected equipment failures far less common.

Many pharmaceutical manufacturing plants operate around the clock, with two-week shutdowns scheduled twice a year for preventative maintenance. This approach, although standard in the industry, may not be optimal. If preventative maintenance is done too early, companies waste resources. If done too late, repairs become overly expensive. Machine learning algorithms can recognize complex patterns and predict when specific components or systems are likely to require attention. As a result, maintenance activities can be scheduled precisely when needed, reducing unplanned downtime, minimizing production interruptions, and optimizing resource allocation.
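To illustrate scheduling maintenance "precisely when needed," the Python sketch below fits a straight line to a wear indicator and estimates how many operating hours remain before it crosses an alarm threshold. Linear extrapolation is a deliberately simple stand-in for the machine learning models mentioned above. The vibration readings and the 4.0 mm/s threshold are invented values.

```python
from typing import Optional


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_threshold(hours: list[float], signal: list[float], threshold: float) -> Optional[float]:
    """Estimate operating hours left before the fitted trend reaches threshold.

    Returns None when the signal is flat or improving, since no crossing is predicted.
    """
    slope, intercept = fit_line(hours, signal)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - hours[-1])


# Hypothetical pump vibration (mm/s) logged every 100 operating hours.
hours = [0, 100, 200, 300, 400]
vibration = [2.0, 2.2, 2.4, 2.6, 2.8]

remaining = hours_until_threshold(hours, vibration, threshold=4.0)
print(f"Estimated hours until maintenance is needed: {remaining:.0f}")
```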

Digital quality control

Consistent and safe production of pharmaceuticals requires rigorous quality control. QC teams must evaluate batches across many dimensions, including potency, purity, and stability, to ensure they meet product specifications and regulatory standards. These processes generate massive amounts of heterogeneous data, especially for more complex products like biologics and cell-based therapies. QC personnel must pore over these data sets to detect and investigate deviations. Done manually, this review is laborious and prone to errors.

Using AI, labs can automate the analysis of complex data sets, helping scientists identify patterns, trends, and anomalies more quickly. Predictive algorithms can preemptively flag potential out-of-specification results. Timely interventions can prevent production disruptions and compliance issues, potentially saving hundreds of millions of dollars. Should deviations occur, AI-enhanced root cause analysis can unearth patterns that may elude human operators. Corrective action, in turn, becomes faster and more effective.
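The idea of flagging trouble before a result actually goes out of specification can be shown with a classic control-limit check. In the Python sketch below, a batch result that is still within the specification range but more than three standard deviations from the historical mean is reported as an early warning. This is a conventional statistical rule, not a predictive AI model, and the potency values and specification limits are invented for illustration.

```python
from statistics import mean, stdev


def classify_result(
    value: float,
    history: list[float],
    spec_low: float,
    spec_high: float,
    k: float = 3.0,
) -> str:
    """Classify a QC result against spec limits and historical control limits."""
    if not spec_low <= value <= spec_high:
        return "out_of_specification"
    mu, sigma = mean(history), stdev(history)
    if abs(value - mu) > k * sigma:
        return "early_warning"  # within spec, but unusual for this process
    return "ok"


# Hypothetical potency results (% of label claim) from previous batches.
history = [99.8, 100.1, 100.0, 99.9, 100.2, 100.0]

results = {
    value: classify_result(value, history, spec_low=95.0, spec_high=105.0)
    for value in (100.1, 101.0, 94.0)
}
print(results)
```

The gap between the tight control limits and the wider specification range is what gives a QC team time to investigate before a batch fails.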

Summary of AI benefits

By unlocking the full value of their scientific data troves, companies can expand their portfolios with drugs that deliver markedly higher ROI. The following sums up the main benefits of AI.

Faster time to market: It usually takes over a decade to bring a drug to market. Much of this time is spent gathering and analyzing scientific data to determine if a potential drug can advance to the next stage. AI can boost the efficiency of many of these steps by orders of magnitude, slashing development time.

Reduced costs: AI can help increase staff productivity and optimize the utilization of resources in labs and manufacturing plants, cutting costs. Moreover, by improving the quality of drug candidates in the pipeline, AI can minimize expensive late-stage failures.

Reduced risks: Organizations can use AI to minimize errors and deviations, address data integrity issues that threaten drug safety, and streamline compliance.

Better scientific outcomes: AI promises to open new avenues of scientific inquiry that can bring a wave of breakthroughs: new therapeutic targets, novel mechanisms of action and delivery, innovative manufacturing processes, and so on.

The Scientific AI gap

Eager to get started with AI? Your scientific data may be holding you back.

Read our white paper to understand why biopharma companies risk falling short of their AI goals. And learn how to close the gap between vision and present-day reality.
