Data Automation for High-Throughput Screening with Dotmatics, Tecan, and PerkinElmer Envision

Spin Wang | June 8, 2020

Automation is more than robots.

High-throughput screening (HTS) methods are used extensively in the pharmaceutical industry, leveraging robotics and automation to quickly test the biological or biochemical activity of large numbers of chemical or biological compounds. The primary goal of HTS is to identify high-quality 'hits' that are active at fairly low concentrations and have novel structures. Hits generated during HTS can then serve as starting points for the subsequent 'hit-to-lead' drug discovery effort.

Automation has played a huge role in the development of HTS to date. Tools like automated liquid handlers and robotic, high-throughput plate readers significantly improve compound screening efficiency and consistency. Robotic automation has transformed the physical aspect of the process. However, experimental data sets remain isolated from one another, requiring manual data acquisition and handling for storage and analysis. Experiments are automated, but data flow is not.

In order to truly reap the benefits of HTS, biopharma companies need to automate the accompanying data flow. They also need to connect the data with the rest of the Digital Lab to make it accessible and actionable. Otherwise, the vast volumes of generated data will be stuck in yet another silo.

This blog post identifies opportunities for improvement in an example HTS data workflow, based on our experience with biopharma customers, and offers our approach to evolving the HTS data flow to be as efficient and consistent as the physical process.

Analyzing Today's Manual Data Flow

The following diagram illustrates an example biopharma customer's HTS assay workflow for small molecule compound libraries (edited for confidentiality), leveraging best-in-class instruments and tools on the market.

Diagram 1: Example High-Throughput Screening Workflow


Let's break down this example HTS data flow in R&D labs today, based on our experience working with top pharma and biotech companies:

Step 1: Scientists register the compounds in a compound registry, like Dotmatics Register, generating the experiment ID and compound ID. Scientists also enter compound information, such as molecular weight and initial amount, into an ELN like Dotmatics Studies.

Step 2: Scientists create child samples by dissolution or aliquoting. Each child sample is identified by a unique barcode in sample inventory software like Titian Mosaic. The inventory system tracks extensive information, such as parent sample ID, batch ID, amount, solvent type, sample volume, and freezer location.
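To make this concrete, here is a minimal sketch of the kind of record such a system tracks for each child sample; the field names are illustrative, not Titian Mosaic's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ChildSample:
    """Illustrative child sample record; field names are hypothetical,
    mirroring the attributes described above rather than Titian Mosaic's
    actual schema."""
    barcode: str           # unique identifier assigned at aliquoting
    parent_sample_id: str  # registry ID of the source compound sample
    batch_id: str
    amount_mg: float
    solvent: str           # e.g., "DMSO"
    volume_ul: float
    freezer_location: str

sample = ChildSample(
    barcode="SMP-000123",
    parent_sample_id="CMPD-4567",
    batch_id="B-001",
    amount_mg=1.2,
    solvent="DMSO",
    volume_ul=100.0,
    freezer_location="Freezer 3 / Rack B / Position 12",
)
```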

Step 3A: After compound registration and child sample preparation, scientists enter the compounds' information into the liquid handler software to set up the assay plate in a liquid handler workstation, like a Tecan. The sample plate output file is saved locally on the lab's Windows computer.

Step 3B: The HTS assay is incubated according to the assay design protocol and read on a plate reader, like the PerkinElmer Envision. The assay results are also saved locally on the lab's Windows computer.

Step 4: After the experiment finishes, the scientist manually moves the files back to the office and analyzes the assay data with GraphPad to generate IC50 values or target binding information.

Step 5: The analysis results are manually entered into the ELN, like Dotmatics Studies, to complete the experiment.

The key takeaway here is that while robots automate the physical component of the experiment, multiple instruments and software systems are involved in the process, and data needs to flow seamlessly into and out of all of them. Today, this is not the case. The Dotmatics components may integrate with each other, since they are part of the same product portfolio. The others may offer point-to-point integrations you can set up, or APIs you can use to write your own integrations, but this takes real work, must be maintained, and ultimately does not scale. We need an easy way to connect all of these instruments and software systems to a common network, knocking down the data silos and getting the data flowing, both within this work process and across the broader R&D data ecosystem.

Optimizing HTS Data Flow

The Tetra Data Platform is that common network. Connecting the instruments and software systems needed to conduct HTS transforms the manual steps of data entry, processing, and transfer into an automated solution, saving time, reducing errors, and increasing throughput. It also harmonizes and transforms the data, preparing it for data science, AI, and other advanced analytics - we'll get to this at the end.

Let's take a look at how it works.

Diagram 2: Automating the High-Throughput Screening Data Workflow


Steps 1 and 2: Scientists register the compounds in the compound registry, as before. Except now, the TetraScience Dotmatics connector automatically detects new or modified compounds in the Dotmatics Register and triggers a pipeline to automatically push the information to the inventory management software. As part of this process, the data is also now available to query via RESTful API in the TetraScience Data Lake.
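Once in the Data Lake, that compound information can also be retrieved programmatically. Here is a minimal sketch of such a query in Python; the base URL, endpoint, parameters, and response shape are illustrative assumptions, not the documented TetraScience API:

```python
import requests

# Hypothetical query against the Data Lake's RESTful API. The base URL,
# endpoint, parameters, and response shape are illustrative assumptions,
# not the documented TetraScience API.
BASE_URL = "https://api.your-tetrascience-tenant.example.com/v1"
TOKEN = "YOUR_API_TOKEN"

response = requests.get(
    f"{BASE_URL}/datalake/search",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"query": "compoundId:CMPD-4567"},
    timeout=30,
)
response.raise_for_status()
for record in response.json().get("hits", []):
    print(record.get("compoundId"), record.get("experimentId"))
```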

Steps 3A, 3B, 4, and 5: After setting up the assay plates with the liquid handler, scientists run the assay and collect the readout on the plate reader. The liquid handler output file contains sample plate information, including the sample concentration in each well; the plate reader file contains the assay readout. The TetraScience File connector automatically detects the files produced by the Tecan and the Envision plate reader, moves the raw instrument files into the Data Lake, and then triggers pipelines to parse them, merge them, and push the results to Dotmatics Studies. IC50 values are then calculated automatically.
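Conceptually, the parse-merge-calculate step joins the two per-well tables and fits a dose-response curve. Here is a minimal sketch in Python using the four-parameter logistic model that tools like GraphPad fit to estimate IC50; the column names and values are illustrative, not a vendor file format:

```python
import pandas as pd
from scipy.optimize import curve_fit

def four_pl(conc, bottom, top, ic50, hill):
    """Four-parameter logistic dose-response model; IC50 is the
    concentration producing a half-maximal response."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)

# Hypothetical parsed outputs: per-well concentrations from the liquid
# handler file and per-well readouts from the plate reader file.
concentrations = pd.DataFrame({
    "well": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "conc_um": [100.0, 10.0, 1.0, 0.1, 0.01, 0.001],
})
readouts = pd.DataFrame({
    "well": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "signal": [5.0, 12.0, 45.0, 88.0, 97.0, 99.0],
})

# Merge the two instrument outputs on well ID, as the pipeline does,
# then fit the curve and report the IC50 estimate.
plate = concentrations.merge(readouts, on="well")
params, _ = curve_fit(
    four_pl, plate["conc_um"], plate["signal"],
    p0=[0.0, 100.0, 1.0, 1.0], maxfev=10000,
)
print(f"IC50 ≈ {params[2]:.3g} µM")
```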

Image: Automatically generated IC50 calculation results


In this optimized workflow, only one manual data workflow step remains - initiating the experiment by registering the compound. Scientists also have to physically set up the experiment, but once the experiment is complete, all the data automatically appears in the ELN, with the calculation results shown above completed, as well as in the Data Lake, ready for further querying and analysis.

The TetraScience Data Integration Platform automates the HTS data workflow, providing greater efficiency by removing painful manual data handling and processing from scientists' daily work and by improving data integrity.

Let's compare the two processes side by side:

Manual workflow: instrument files are saved locally, carried back to the office by hand, analyzed in GraphPad, and the results are retyped into the ELN. Manual data handling touches nearly every step.

Automated workflow: connectors detect new files as they are produced, pipelines parse and merge them and calculate IC50 automatically, and the results land in both the ELN and the Data Lake. Only compound registration remains manual.

Beyond Data Automation: Data Science

Now that the data workflow accompanying the high-throughput screening process is automated, what's next? Compound information, sample information, type of assays performed, and screening results are now centralized and harmonized in the TetraScience Platform. This seems like a prime opportunity to apply some data science! Check out a related blog post about our Intermediate Data Schema (IDS) to learn more about how we harmonize disparate data, knocking down the data silos. IDS is the open standards method we use to seamlessly move data between and across all the different HTS instruments and software systems, unifying the unique data structure and format from each.
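To make that concrete, here is a minimal sketch of schema harmonization: two vendor-shaped records mapped to one common set of fields. Both the vendor field names and the target fields are illustrative assumptions, not the actual IDS definitions:

```python
# Minimal sketch of harmonizing two vendor-specific records into one
# common schema. All field names here are illustrative assumptions,
# not the actual IDS definitions or vendor file formats.

def harmonize_tecan(record: dict) -> dict:
    """Map a hypothetical Tecan liquid handler row to common fields."""
    return {
        "well": record["Well Position"],
        "concentration_um": record["Conc [uM]"],
        "source_instrument": "tecan",
    }

def harmonize_envision(record: dict) -> dict:
    """Map a hypothetical Envision plate reader row to common fields."""
    return {
        "well": record["WellId"],
        "signal": record["Measured Value"],
        "source_instrument": "envision",
    }

rows = [
    harmonize_tecan({"Well Position": "A1", "Conc [uM]": 100.0}),
    harmonize_envision({"WellId": "A1", "Measured Value": 5.0}),
]
print(rows)
```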

A benefit of the centralized, harmonized data is that it is also prepared for use with various data science and data analytics tools such as Spotfire, Tableau, or Dotmatics Vortex. Our open standards approach means that scientists and data scientists can use the software, platforms, and languages they already know and use - no need to install or learn something new.

Diagram 3: Applying Data Science to High-Throughput Screening Data


You can now fully utilize your HTS data, including querying and visualizing data sets, with your existing data science and analytics tools. For example, scientists can easily query and visualize all compounds active below a certain threshold in a particular screen, or the behavior of all compounds of similar structure across different screens. They can then use these insights to develop more efficient HTS assays, design more active compound libraries, and significantly speed up the drug discovery process.
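As a minimal sketch of that first query, assuming a harmonized results table with illustrative column names:

```python
import pandas as pd

# Hypothetical harmonized screening results; column names are illustrative.
results = pd.DataFrame({
    "compound_id": ["CMPD-001", "CMPD-002", "CMPD-003", "CMPD-004"],
    "screen": ["kinase-A", "kinase-A", "kinase-A", "kinase-B"],
    "ic50_um": [0.05, 12.0, 0.8, 0.02],
})

# All compounds in a particular screen active below a 1 µM threshold.
active = results[(results["screen"] == "kinase-A") & (results["ic50_um"] < 1.0)]
print(active.sort_values("ic50_um"))
```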

Watch this video to see the optimized data flow in action, enabled by the TetraScience Platform.
