Automated Anomaly Detection and Correction

An interview with Randall Julian, Ph.D., CEO of Indigo BioAutomation
March 28, 2022
Randall Julian is CEO, president, and founder of Indigo BioAutomation

Randall Julian is CEO, president, and founder of Indigo BioAutomation (originally, Indigo Biosystems) located in Indiana. Randy earned a PhD in chemistry from Purdue University in 1993 and then worked in discovery chemistry at Eli Lilly for 14 years. He founded Indigo based on informatics technology developed in his research group and has led Indigo from its founding to profitability, building a world class management, engineering, and research team to commercialize laboratory data analysis software.

Dr. Julian is a frequent speaker in the mass spectrometry community and teaches short courses in statistics and data analysis. Randall is the past chairman of the Human Proteome Organization’s Proteomics Standards Initiative steering group where he co-authored two international standards for analytical data. He was also the chairman of the ASTM committee on analytical data standards. Dr. Julian also maintains an active research relationship with the faculty at Purdue University where he is an adjunct professor of chemistry.

We recently talked with Randy to find out more about Indigo and the value of the Tetra Partner Network. We’re also hosting a webinar with Randy. Please register for AI/ML Based Anomaly Detection for Improved Scientific and Regulatory Outcomes.

Randall, before we dive into what Indigo accomplishes today, please tell us a bit about your background and why you founded the company.

While working on my analytical chemistry PhD in the early 90s, I noticed people trying to do computational research on new instrumentation using early PCs. Back then the difference in power between a PC and a high-performance computing (HPC) system was huge. Using a PC limited the scope of what could be done, so I ended up using the Cray supercomputers at Purdue and the Connection Machine at Los Alamos National Laboratory to build models that could be used to build new instruments. 

It was the analytical chemistry equivalent of designing an airplane with a computer model rather than with a calculator. It opened my eyes to how little impact our best computers had in chemistry outside of a few physical chemistry problems. At Eli Lilly, my team applied advanced hardware and software to analyze instrumental data to speed up natural products drug discovery. Those projects eventually grew into our present company, Indigo BioAutomation.

Indigo uses advanced algorithms and high-performance computing systems (now based in the cloud) to automatically analyze complex data from clinical and life science laboratories. Our products use models of the physics and chemistry of instrumentation to identify problems with instrument performance and individual samples that are not easy to detect with the human eye. Since we are not limited in computational power or storage, we've developed some very powerful, and explainable AI/machine learning (ML) solutions for high throughput clinical diagnostic tests and drug development processes.

For example, we accepted a customer challenge to help improve turnaround time for COVID-19 PCR testing. Since Indigo's flagship product, ASCENT, successfully uses a mathematical model of liquid chromatography and mass spectrometry (LCMS), we built a model that represents the PCR reaction. Within about a week, we were able to process PCR data with the same degree of automation as we do for LCMS. ASCENT processes about 40 million samples a year with no limit in sight. Now, the PCR product, called ARQ, has the same capability. In addition, ARQ has reduced result review time by 75 percent in one infectious disease laboratory.

We have also developed and delivered a mathematical model for the signals generated by a new multiple myeloma test. Again, using models of chemistry and physics, combined with advances in ways to use models, we believe the new system can detect cancer at sensitivity levels thousands of times better than what is possible today.

In clinical laboratory testing, everything is urgent. A poor laboratory result will affect people that day. How does Indigo help clinical labs and how can your experience with clinical transfer to biopharma research?

We have all learned from the COVID-19 pandemic the value of fast, reliable test results. The need has always been there, but now vast numbers of people are being directly affected by testing. Historically, if you were automating something, the focus was on cost savings which automatically makes people think of eliminating jobs. That’s not really the case in today’s labor market. It’s more important to help labs use their limited staff to deliver on unprecedented volume spikes. 

Automation makes tests we all need faster, better, and cheaper while making the lives of people in the lab better by shifting tedious, repetitive work to machines. Automation in health-related laboratories frees people to do what humans do best: work in teams to accomplish what no one person can do by themselves. That requires critical thinking, problem-solving abilities, and above all, the ability to communicate and collaborate across the lab with other experts. 

Automation also allows for data-driven improvement programs to solve problems in the lab that affect everything from the quality of the results to the quality of life in the lab. By having data that can be analyzed for overall system performance, labs can continuously improve in all the dimensions that are important to them.

Figure 1. Transparent data collection at each step in the laboratory process combined with automated anomaly detection allows problems to be handled locally in real-time, and helps drive overall process improvement.

It seems like you understood early in your career how important computational science and technology would be to advancing healthcare. Has there been any downside to the digital revolution?

Analytical measurements are critical to every aspect of the life sciences, from basic research to primary health care. The proliferation of computers has led to significant advances in measurement technology and automation. We have since moved away from dependence on paper recordings of measurements and documentation of everything from standard operating procedures to data analysis and summarization in paper notebooks. While not complete (the use of sticky notes at the bench is alive and well), the shift away from paper-only records is a good thing. 

"The digital revolution, however, is decidedly double-edged....We are now at a point where it has become challenging to tell what happened during an experiment or what an observation means."

Digital analysis of digital data is far superior to hand analysis with calculators. Recording results in digital files, storing them in databases accessed by electronic notebooks or laboratory information systems, and using powerful algorithms for data analysis has revolutionized all aspects of healthcare. The digital revolution, however, is decidedly double-edged.

Given the complexity of biology, it was inevitable that this complexity would show up in the diversity of data, methods of analysis, and types of records we keep while working on biological systems. Data storage, computing power, and measurement complexity all grew to match the complexity of our questions. The more difficult the problem, the more complicated the disease, then the more complex the chain of events leading to any conclusion will be. 

We are now at a point where it has become challenging to tell what happened during an experiment or what an observation means. Experiments and measurements are so computerized that, ironically, it has become tough for scientists to record all the context needed for anyone to understand what they have done.

The stakes are high:

  • Diagnostic laboratories are detecting terrible diseases.
  • Drug companies are designing critical treatments and preventions.
  • Healthcare professionals work overtime to provide patients with the best care and advice.
  • Regulators are working to ensure the quality of diagnosis, treatment, and care, because lives are on the line.

Let me illustrate the situation using the measurement of the quantity of a compound in a complex mixture as an example. In the diagnostic setting, this could be a marker of the recurrence of cancer measured in blood. In drug discovery, it could be the drug levels in the body measured over time. In manufacturing, it could be the presence of a contaminant or side reaction product. 

All these measurements are critical to the delivery of healthcare today. All are done with complex instruments with sophisticated quality control processes and done under the oversight of internal and external regulatory groups who represent society's interest in getting the results right.

The situation is made much worse when many groups in different locations are part of a team performing this work. With centralized data aggregation and automated anomaly detection, the basic processes of clinical diagnostics and drug discovery and development can be streamlined.

Figure 2. The audit process can now be streamlined by aggregating not just data but also the actions taken to address issues, no matter where they happen.

How do Indigo’s products alleviate the burden of decision-making for the scientist?

The difficulties I mentioned have been known for quite some time. They have been the subject of tremendous work by dedicated measurement scientists, computer scientists, and experts on regulation and the use of evidence in law. 

The first necessary condition for understanding what happened during a measurement is to organize a wide range of data types in a highly trusted and easily usable system. The data must include information about all the devices involved, the actions of people, and the timing of everything. 

The second requirement is automatically finding anomalies in this highly diverse data. In real-world settings, anomaly detection is incredibly difficult for the simple reason that anomalies are rare events. That means that having deep expertise in detecting weak signals of trouble in the presence of normal variation (noise) is essential.
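The core difficulty Randy describes — rare events hiding in normal variation — is why robust statistics matter: a handful of anomalies can distort a mean and standard deviation enough to mask themselves. As a minimal illustration (not Indigo's actual algorithm), one common approach flags outliers with a modified z-score built on the median and the median absolute deviation (MAD), which the rare anomalies cannot easily skew:

```python
# Illustrative sketch only: flagging rare anomalies in a series of QC
# measurements with a robust (median/MAD-based) z-score. The thresholds
# and data here are invented for demonstration.
from statistics import median

def robust_z_scores(values):
    """Modified z-score for each value, using median/MAD instead of mean/std."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD to be comparable to a standard deviation
    return [0.6745 * (v - med) / mad for v in values]

def flag_anomalies(values, threshold=3.5):
    """Indices of values whose |modified z-score| exceeds the threshold."""
    return [i for i, z in enumerate(robust_z_scores(values)) if abs(z) > threshold]

qc_readings = [10.1, 9.9, 10.0, 10.2, 9.8, 14.7, 10.0, 10.1]
print(flag_anomalies(qc_readings))  # → [5]: the 14.7 reading stands out
```

Because the baseline is computed from medians, the single aberrant reading is flagged cleanly; a mean/std approach on so few points would be pulled toward the outlier and could miss it.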

Indigo BioAutomation has been performing this type of signal analysis and anomaly detection for clinical diagnostic laboratories for over a decade. One of the elements of a successful system comes from machine learning: converting raw data into features that describe behaviors in a laboratory setting and using them to classify events. 

"One of the elements of a successful system comes from machine learning: converting raw data into features that describe behaviors in a laboratory setting and using them to classify events."

A system can use standard AI techniques to evaluate data against an SOP to determine data reliability. But the input into such a system requires extracting relevant features from sources like text audit logs, timestamps of calibration events on balances, instrument error codes, measurement variations on QC samples, robotic system message logs, and even human actions. 

Statistical analysis of historical data using machine learning algorithms is helpful for some features, while others require natural language processing of electronic lab notebooks, audit trails, and system logs. No single approach will detect rarely occurring problems in a laboratory or manufacturing process.
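To make the feature-extraction step concrete, here is a hypothetical sketch of turning one raw instrument log line into a flat feature dictionary and classifying the event. The log format, field names, and the rule standing in for a trained classifier are all invented for illustration; a production system would combine many such extractors with learned models:

```python
# Hypothetical example: extracting features from a mixed free-text /
# structured instrument log line, then classifying the event.
import re
from datetime import datetime

def extract_features(log_line, last_calibration):
    """Parse one log line into features usable by a classifier.

    Assumed (invented) format: "2022-03-28T09:15:02 BAL-04 ERR:E217 drift detected"
    """
    m = re.match(r"(\S+)\s+(\S+)\s+(?:ERR:(\w+)\s+)?(.*)", log_line)
    ts = datetime.fromisoformat(m.group(1))
    return {
        "device": m.group(2),
        "has_error_code": m.group(3) is not None,
        "hours_since_calibration": (ts - last_calibration).total_seconds() / 3600,
        "mentions_drift": "drift" in m.group(4).lower(),
    }

def classify(features):
    """Toy rule standing in for a model trained on historical events."""
    if features["has_error_code"] or (
        features["mentions_drift"] and features["hours_since_calibration"] > 24
    ):
        return "review"
    return "normal"

feats = extract_features(
    "2022-03-28T09:15:02 BAL-04 ERR:E217 drift detected",
    last_calibration=datetime(2022, 3, 25, 8, 0),
)
print(classify(feats))  # prints "review"
```

The point of the sketch is the shape of the pipeline: heterogeneous raw records (logs, timestamps, error codes) are flattened into named features, and only then does classification happen — which is what lets statistical, rule-based, and NLP-derived features feed the same downstream decision.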

How can anomaly detection improve the drug development process? 

Once an anomaly is detected, the most important thing is to prevent the unexpected situation from causing harm. Actions might need to be immediate, like shutting down a system. Or the problem may be best handled by alerting the appropriate people so they can decide what to do. Hopefully, simply connecting the right people to the right data at the right time can prevent a cascade of damage and cost, because the later a problem is found and addressed, the more damage and cost it usually inflicts.

We now have the tools needed to allow a complex system to monitor itself. First-order automation enables computers to control devices. Second-order automation monitors an entire system. Our new technology gives everyone from the bench technician to scientists to regulators confidence that when we match the complexity of our tools to the complexity of the problems we are trying to solve, we can still trust the answers and move forward with confidence.

How does being a part of the Tetra Partner Network solve some of the problems you’ve outlined?  

Detecting anomalies in any life science laboratory or process is problematic because it requires capturing enough context to detect rare events and distinguish them from noise. By partnering with TetraScience, Indigo’s algorithms now have access to a much richer data collection, allowing for even more precision and accuracy in detecting problems. Further, with the scope of the Tetra R&D cloud, partnering with Tetra broadens the types of issues that we can pick up. Indigo can now support scientists through every phase of their work by automatically checking to ensure the results they record are supported.

What’s the value to customers?

Regardless of where a person works in healthcare, they want their results, decisions, treatments, and processes to be correct. If work isn't done right, someone could get hurt. As a result, there are harsh consequences for bad work or for not being able to explain or reproduce good work credibly. By catching problems early, the immense and painful costs in time, energy, and money to correct them later are avoidable. 

With so much focus on the safety and efficacy of pharmaceuticals, it is critical that companies can answer all the questions along the drug development and manufacturing path before they slow a new drug submission when patients are waiting. Since it can take years to get a drug from development to the approval stages, everything needs to be in order when the work is done, not recreated for submission. Finally, if deviations are corrected and documented in real-time, internal and external auditors will have confidence that all regulated work is being done according to the required procedures. The benefit to the customer can be years of additional revenue from getting to market faster. The benefit to the patient can be years of their life.

"The benefit to the customer can be years of additional revenue from getting to market faster. The benefit to the patient can be years of their life."

Register for the Indigo BioAutomation and TetraScience webinar: AI/ML Based Anomaly Detection for Improved Scientific and Regulatory Outcomes.
