AI-Powered Data Management Is the Key to Vaccine Discovery

As pharmaceutical and clinical research organizations around the world race to discover a vaccine for COVID-19 and inoculate the global population, artificial intelligence (AI) will play a key role in ensuring that trustworthy, relevant, and timely drug-discovery data can be used to understand and eradicate this global threat.

Drug discovery has traditionally been a slow process, but today AI is greatly shortening the time that such life-saving pharmacological research takes—from years to weeks in some cases. An important example: clinical pharmacologists are successfully using the power of AI to detect and interpret the features of cancer molecules in order to make predictions for new drugs that will successfully target those molecules, leading to the most effective treatments against this killer disease in decades.

To reach this stage in cancer treatment, it took a concerted effort to implement the following steps:

  • Enable a seamless data environment for patients, providers, and researchers
  • Unlock science through open computational tools and storage platforms
  • Develop a data science-aware workforce capable of using the connected data environment

Since AI is successfully predicting what might work best for treating a cancer patient—and in far less time than ever before—the world is desperately hopeful that clinical pharmacologists will be able to use the same techniques to solve other really big problems, like stopping global pandemics with an effective vaccine.

Researchers have become adept at using machine learning (ML) to automate work around clinical challenges, such as predicting and proactively eliminating the inefficiencies and process breakdowns that often occur on the way to starting clinical trials, and advising on the best courses of action. This is accelerating the pathway to clinical trials and could lead to a vaccine sooner.

Powering Medical Breakthroughs with High-Quality Data

It is most likely that a combination of drugs will eventually be used to defeat coronavirus, requiring analysis of not only millions of possible drug pairs, but also billions of triple-drug combinations stemming from over 4,000 approved drugs on the market today. AI-driven analytics is the strongest tool we have to surmount this challenge, but this would ultimately fail without access to large, high-quality, clean, and trustworthy data sets.
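The scale of that search space can be checked with a quick calculation. The sketch below is only an illustration: it assumes exactly 4,000 approved drugs (the figure above is "over 4,000", so the results are lower bounds) and counts unordered combinations, ignoring dosage and ordering. The names `APPROVED_DRUGS` and `combination_count` are invented for this example.

```python
from math import comb  # available in Python 3.8+

# Assumption: exactly 4,000 approved drugs. The article says "over 4,000",
# so the counts below are lower bounds.
APPROVED_DRUGS = 4_000


def combination_count(n_drugs: int, k: int) -> int:
    """Number of unordered k-drug combinations drawn from n_drugs distinct drugs."""
    return comb(n_drugs, k)


pairs = combination_count(APPROVED_DRUGS, 2)
triples = combination_count(APPROVED_DRUGS, 3)

print(f"two-drug combinations:   {pairs:,}")
print(f"three-drug combinations: {triples:,}")
```

Under that assumption there are roughly 8 million possible pairs and about 10.7 billion possible triples, which is consistent with the "millions" and "billions" cited above.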

Unfortunately, much of the data that AI models need to accelerate coronavirus research to the clinical trial stage is tied up in silos across individual big pharmaceutical companies, or buried deep inside the intellectual property held in laboratory, university, and disjointed healthcare organization databases all over the world.

As in cancer research, the urgent need is to unify and integrate these disparate data sources in one place. This will enable clinical pharmacologists to apply the novel ML techniques perfected on cancer treatment to generate new drugs to attack and vanquish COVID-19.

An inspiring, real-world example of leveraging AI to gather and share clinical research data across healthcare ecosystems for maximum clinical value, speed, and efficiency is the COVID-19 Open AI Consortium (COAI). The consortium’s mission is to increase collaborative research and bring breakthrough medical discoveries and actionable findings to the fight against the pandemic.

COAI and other global efforts hold great promise, but we must acknowledge the role that trustworthy data plays in their success or failure. These endeavors need successful data cataloging, data integration, data quality, master data management, and data governance so that the AI models are provided with trusted data to work with. If trustworthy, reliable, standardized, and certified data is not provided, the models will simply magnify the impact of poor-quality or poorly understood data many times over.

Automating Pharmacological Data Management

AI models can crunch through millions of pharmaceutical compounds and predict what might work best far faster than a human researcher could. To do so, they need high-quality data sets to learn from, so that their further predictions rest on a foundation of sound science. Pharmacological researchers and clinicians can work smarter and more efficiently by using AI to automate data management tasks, in much the same way they already use AI to automate the predictive analytics that address clinical challenges.

Our fervent hope is that human researchers at organizations like COAI will find trustworthy data to put into the AI models. However, if a COAI researcher must spend valuable time defining, inspecting, profiling, quality-controlling, and otherwise “fixing” each new data set before it can be used, the ability to move forward quickly will be significantly impeded.
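To make the "inspect and profile" step concrete, here is a minimal sketch of the kind of check that can be automated when a new data set arrives. It is a simple rule-based first pass, not an AI model and not any consortium's actual pipeline. The function name `profile_records` and the sample fields (`compound_id`, `assay`, `ic50_nm`) are hypothetical and invented purely for illustration.

```python
from collections import Counter


def profile_records(records, key_field):
    """Return basic quality metrics for a list of dict records.

    Reports the row count, the number of missing values per field
    (None or empty string), and any key values that occur more than once.
    """
    fields = sorted({name for rec in records for name in rec})
    missing = {
        name: sum(1 for rec in records if rec.get(name) in (None, ""))
        for name in fields
    }
    key_counts = Counter(rec.get(key_field) for rec in records)
    duplicate_keys = sorted(
        key for key, count in key_counts.items() if key is not None and count > 1
    )
    return {"rows": len(records), "missing": missing, "duplicate_keys": duplicate_keys}


# Hypothetical sample records, invented for illustration only.
sample = [
    {"compound_id": "C-001", "assay": "binding", "ic50_nm": 12.5},
    {"compound_id": "C-002", "assay": "", "ic50_nm": None},
    {"compound_id": "C-001", "assay": "binding", "ic50_nm": 12.5},
]

report = profile_records(sample, key_field="compound_id")
print(report)
```

A report like this lets a researcher see at a glance which fields are incomplete and which identifiers are duplicated, before deciding whether the data set is fit to feed a model.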

The world has a dearth of skilled pharmacological researchers; rather than have the limited supply of these important minds review and categorize imported data sets, let’s instead use AI to do the repetitive work. Not only is AI more efficient, scalable, and cost-effective; it also makes fewer errors and can complete more extensive cross-checking against more data than a human being ever could.

Let’s also use AI-driven master data management (MDM) to speed up clinical trials by providing 360-degree views of clinical trial locations, names, patients, providers, and products. AI-powered integration and cataloging allow researchers to onboard new data sources faster and provide the capabilities to discover hidden relationships in data and expose powerful insights.

The good news is that the incorporation of AI into data management tools and applications is maturing rapidly. These tools can be integrated with the data management capabilities of the pharmaceutical, healthcare, university, and government organizations that are actively searching for a vaccine. AI-driven enterprise data management is the best way to govern the volumes of structured and unstructured data at the heart of vaccine research, providing researchers with clean, high-quality, and trustworthy data to deliver results that support a fast track to clinical trials.