Escaping the Scientific Data Quagmire

AI/ML and advanced analytics offer biopharma the tantalizing prospect of transforming drug discovery by identifying and bringing game-changing therapeutics to market faster. But these accelerating technologies can't be applied efficiently so long as scientific data remain a quagmire.
Mary-Ann Moore, VP Industry Marketing, TetraScience | April 11, 2022

The scientific data quagmire – siloed, fragmented data in hundreds of proprietary formats, manually retrieved from instruments, edited in spreadsheets – wastes enormous time. Worse, that time is stolen from some of the industry’s most important players: the scientists, data scientists, data engineers, informaticians and analysts who might otherwise use their valuable skills to generate fresh knowledge and improve processes to better discover, develop and deliver new life-saving therapeutics.

Worse yet, even as data volumes continue to grow at astronomical rates, the data quagmire prevents biopharma organizations from effectively applying transformative technologies – like automation, analytics, and AI/ML (artificial intelligence and machine learning). Technologies that can help them keep up, stay ahead, and generate new and valuable insights. Technologies that unblock new and promising avenues of research, like genome studies, whose vast search spaces can only be effectively explored with machine assistance.

The problem, as TetraScience CSO Mike Tarselli made clear in a recent interview published by AZO Life Sciences, is that analytics software can only consume what he calls "arranged data" – data in a standardized format, structured, normalized, and contextualized according to well-understood Big Data rules.

What “Lab of the Future” Would You Build?

Those envisioning, leading, and implementing the next generation of biopharma laboratory evolution are asking themselves how to solve this problem. How do you make scientific data findable, accessible, interoperable, and reusable (FAIR), so that science-accelerators like AI/ML can be put to work in earnest? They're also asking the follow-on question: If we could make the data quagmire go away, what would we build?

Manual No More: Automating the Scientific Data Lifecycle, a new whitepaper from TetraScience CTO Spin Wang and CSO Mike Tarselli, outlines a practical solution to the data quagmire problem. Their proposed method applies well-understood automation and data transformation technology to speed up iterative experimental workflows, increase predictability, eliminate manual data handling, and save scientists’ time. They then expand the strategy – showing how analytics and AI/ML can be inserted into the loop to provide deeper and more complete insights to the organization.

Automating the DMTA Loop with a Data-centric Scientific Platform

The whitepaper opens with this simple question – originally asked of TetraScience customers, partners, and advisors: "If you built a lab from scratch to orchestrate the free flow of information and data across your laboratory instruments and informatics applications to perform data analysis … what would you build?"

Those asked responded with a tight wish-list focused on automating parts of the so-called Design/Make/Test/Analyze loop (Figure 1).

Figure 1 - The DMTA loop, a basic model of experimental iteration, holds clues for those seeking to use automation to accelerate science.

Customers sought to create connectivity among lab informatics systems like ELNs and LIMS, control software for automated instruments (e.g., HPLC, chromatography data systems), robotics, and standalone instruments like balances and pH meters. This would let scientists design experiments in their preferred digital workbench applications, then export these designs to automate experimental runs. Results of experiments, and measurements made on non-networked instruments, would all be returned to the informatics platforms and become accessible from centralized storage.

Customers had requirements for the data, too. They wanted raw data extracted, transformed, and harmonized into a vendor-neutral format, and stored in an “arranged” way: prepared and formatted to simplify modeling and advanced analytics, accessible via REST API, able to flow among and update multiple systems (e.g., sample management and inventory), and perhaps already integrated with popular analytics platforms like Spotfire or Tableau, making custom analytics fast and simple.
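To make the "extract, transform, and harmonize" requirement a little more concrete, here is a minimal Python sketch. It is not taken from the whitepaper or from any TetraScience product: the `Measurement` schema, the two export formats, and the function names are all hypothetical, chosen only to show two vendor-specific records landing in one vendor-neutral shape that could then be served as JSON.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Measurement:
    """Vendor-neutral record (hypothetical schema, for illustration only)."""
    sample_id: str
    instrument: str
    quantity: str
    value: float
    unit: str


def from_balance_csv(line: str) -> Measurement:
    # Assumed balance export format: "sample_id;weight_mg"
    sample_id, weight_mg = line.strip().split(";")
    # Normalize the unit: milligrams in the raw file, grams in the record.
    return Measurement(sample_id, "balance", "mass", float(weight_mg) / 1000.0, "g")


def from_ph_meter(record: dict) -> Measurement:
    # Assumed pH meter export format: {"Sample": ..., "pH": ...}
    return Measurement(record["Sample"], "ph_meter", "pH", float(record["pH"]), "pH")


# Two differently shaped raw inputs become one list of uniform records.
harmonized = [
    from_balance_csv("S-001;1250.0\n"),
    from_ph_meter({"Sample": "S-001", "pH": "7.4"}),
]

# A uniform structure is straightforward to serialize for an API or analytics tool.
payload = json.dumps([asdict(m) for m in harmonized], indent=2)
print(payload)
```

Once every instrument's output is mapped into the same record shape, downstream consumers only need to understand that one shape, rather than each vendor's format.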

As the whitepaper makes clear, this requires a scientific data cloud: a data-centric platform designed to extract, transform, and harmonize data, store it, make it accessible, and deliver it to targets reliably, serving the requirements of complex workflows. Part of the benefit of such a system is that it can be used to break scientific data out of "walled garden" silos (e.g., instruments and their control software, which may integrate with popular ELNs but are seldom easily integrated with other software), separate the data from instrument command/control messaging, and ultimately make the data FAIR, compliant, harmonized, liquid, and actionable.

Two Models for Scientific Data Automation

The whitepaper follows this wish-list to specify two similar models for basic laboratory automation. In the first, simpler model, scientists, perhaps working with visualization and/or analytics tools, make decisions to guide each Design/Make/Test/Analyze iteration. In the second, software accelerates decision-making. In some scenarios, a scientist may interact with a decision tree to choose among alternatives suggested by analytics or AI/ML. In others, AI/ML or heuristic software takes over and runs the DMTA loop to a stopping point.

The models differ, but can be taken to represent successive steps in a lab's progress towards reduced scientist labor and greater speed and efficiency. The first step makes maximal use of scientist participation. In the second, the scientist (or a number of scientists) serves as the model (for decision trees and heuristics) or as the de facto trainer of machine-learning models – in effect, porting essential but low-value work into software to free up staff time for higher-value work.
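The difference between the two models can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the whitepaper's design: `run_dmta`, `toy_assay`, and `step_heuristic` are hypothetical names, and the "experiment" is a toy function standing in for real lab work. The point is that the loop itself stays the same, and only the `decide` step changes depending on who (or what) chooses the next design.

```python
from typing import Callable, List, Tuple

History = List[Tuple[float, float]]  # (design, result) pairs


def run_dmta(make_and_test: Callable[[float], float],
             decide: Callable[[History], float],
             first_design: float,
             target: float,
             max_iterations: int = 20) -> History:
    """Iterate Design/Make/Test/Analyze until the result reaches the
    target or the iteration budget runs out."""
    history: History = []
    design = first_design
    for _ in range(max_iterations):
        result = make_and_test(design)    # Make + Test
        history.append((design, result))  # Analyze: record the outcome
        if result >= target:              # stopping point reached
            break
        design = decide(history)          # Design the next iteration
    return history


def toy_assay(design: float) -> float:
    # Stand-in for a real experiment: the response peaks at design == 5.0.
    return 10.0 - (design - 5.0) ** 2


def step_heuristic(history: History) -> float:
    # Second model: software picks the next design (here, a fixed step up).
    last_design, _ = history[-1]
    return last_design + 1.0


history = run_dmta(toy_assay, step_heuristic, first_design=2.0, target=9.5)
print(history)
```

In the first model, `decide` would instead present the history to a scientist (for example, through a dashboard or prompt) and return whatever design they choose. In the second, a heuristic like the one above, or a trained model, fills that role and the loop runs unattended to its stopping point.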

Moving from Scientific Data Quagmire to the Future of Laboratory Automation

Reimagining scientific data management leads to a new world where FAIR data flows freely between instruments, informatics and analytics systems to accelerate scientific innovation across biopharma discovery, development and delivery. Learning how this works in greater detail can be a first step towards helping your organization navigate out of the data quagmire and into the fast-moving future of laboratory automation.
