Cloud-based Data Management with Lab Automation: HighRes Biosolutions Cellario + TetraScience

Kai Wang
July 29, 2020

HighRes Biosolutions Cellario and Tetra Data Platform optimize data management and data flows in the cloud

HighRes Biosolutions designs and builds innovative laboratory automation systems, dynamic scheduling software, and lab automation instruments. Cellario, its state-of-the-art lab automation software, schedules instruments and robotics in the lab. Lab automation systems and software generate massive volumes of R&D data that can accelerate therapeutic discovery.

The integration between Cellario and TetraScience automates and streamlines the collection of R&D data generated by Cellario and its delivery to downstream data science applications and other informatics applications such as ELN and LIMS.

Cellario, with its coupled API layers, is designed to support a wide range of upstream and downstream data integration requirements. Cellario’s RESTful API can be used to integrate with any other software platform. In this case, TetraScience is leveraging Cellario’s publisher/subscriber event APIs to receive data events. In addition to standard data events that every reader creates, end users can easily customize the data stream by using scripts to create data events.
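To make the publisher/subscriber pattern concrete, here is a minimal in-process sketch in Python. It is not Cellario's API: the `EventBus` class, the event type names (`reader.data`, `script.custom`), and the payload fields are all hypothetical and only illustrate how a subscriber could receive both standard reader events and script-generated custom events.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class EventBus:
    """Tiny in-process publisher/subscriber hub (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        """Register a handler for one event type."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Event) -> int:
        """Deliver the payload to every subscriber; return how many received it."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler({"type": event_type, **payload})
        return len(handlers)


# The subscriber side: collect every event it has registered for.
received: List[Event] = []
bus = EventBus()
bus.subscribe("reader.data", received.append)    # hypothetical standard reader event
bus.subscribe("script.custom", received.append)  # hypothetical script-created event

# The publisher side: events without a subscriber are simply not delivered.
bus.publish("reader.data", {"plate": "P-001", "file": "read_001.csv"})
bus.publish("script.custom", {"plate": "P-001", "note": "QC passed"})
bus.publish("unsubscribed.event", {"plate": "P-002"})

for event in received:
    print(event)
```

In a real deployment the publisher and subscriber would live in separate processes and communicate over the network; the sketch keeps both in one process so the flow is easy to follow.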

Data is centralized in the Tetra Data Platform and harmonized into Intermediate Data Schema (IDS) JSON, a structured, vendor-neutral format. Once R&D data is harmonized, it can be queried directly through a web API or SQL and further transformed into any format needed. Cellario-produced data is then accessible to your favorite data science tools. It can also be combined with other R&D data that customers store in the TetraScience platform for further analysis.
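As a rough illustration of why a harmonized, structured format makes data queryable, the sketch below loads a few JSON records into an in-memory SQLite table and runs a SQL aggregate over them. The field names (`plate_id`, `well`, `instrument`, `absorbance`) are assumptions made for this example and are not the actual IDS schema, and SQLite stands in for whatever SQL engine the platform exposes.

```python
import json
import sqlite3

# Hypothetical harmonized records; real IDS-JSON documents follow their own schema.
ids_documents = json.loads("""
[
  {"plate_id": "P-001", "well": "A1", "instrument": "reader-1", "absorbance": 0.42},
  {"plate_id": "P-001", "well": "A2", "instrument": "reader-1", "absorbance": 0.87},
  {"plate_id": "P-002", "well": "A1", "instrument": "reader-2", "absorbance": 0.15}
]
""")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements "
    "(plate_id TEXT, well TEXT, instrument TEXT, absorbance REAL)"
)
# Named placeholders map directly onto the keys of each JSON record.
conn.executemany(
    "INSERT INTO measurements "
    "VALUES (:plate_id, :well, :instrument, :absorbance)",
    ids_documents,
)

# Because every record shares one structure, a single query spans all instruments.
rows = conn.execute(
    "SELECT plate_id, COUNT(*), MAX(absorbance) "
    "FROM measurements GROUP BY plate_id ORDER BY plate_id"
).fetchall()

for plate_id, well_count, max_absorbance in rows:
    print(plate_id, well_count, max_absorbance)
```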

How the integration works

Step 1: Configuration
Our connector was developed in collaboration with HighRes Biosolutions as part of our integration. You simply configure the connection to the Cellario software in the TetraScience platform web interface. There is no need to kick off a multi-month customization project, write code from scratch, and then spend even more effort and money maintaining the connection [1].

The product roadmap for this integration includes more features in future releases, such as filtering by event or data type and selecting data within a defined time frame. These are based on use cases crowdsourced from Life Sciences companies within the Tetra Network. Platform improvements, including new features and capabilities, are made available to customers regularly.

IMAGE: Configuration of HighRes Biosolutions Cellario software to the Tetra Data Platform


Step 2: Collect RAW Files and Attach Metadata
The first step after configuration is collecting the RAW files. The files are automatically extracted by our Cellario connector and uploaded into the Tetra Data Lake. The connector also collects and attaches important metadata to the files. Metadata and tags are customizable and often include information about the order, request, plate, Cellario protocol, and other relevant details. Tagging files with metadata provides the context needed to integrate data with ELN/LIMS and to perform advanced data science and analytics. Sufficient context is also foundational to the FAIR (Findable, Accessible, Interoperable, Reusable) data principles.
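The idea of attaching metadata and tags to a collected file can be sketched as follows. This is an illustration only, not the connector's implementation: `UploadRecord`, `build_upload_record`, and the metadata keys are hypothetical names, and the SHA-256 checksum is an added assumption about how a file might be identified.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Mapping, Optional


@dataclass
class UploadRecord:
    """A RAW file reference plus the context attached to it (hypothetical shape)."""
    file_name: str
    sha256: str
    metadata: Dict[str, str]
    tags: List[str] = field(default_factory=list)


def build_upload_record(
    file_name: str,
    content: bytes,
    metadata: Mapping[str, Optional[str]],
    tags: Iterable[str],
) -> UploadRecord:
    """Drop empty metadata values, de-duplicate tags, and fingerprint the file."""
    cleaned = {key: str(value) for key, value in metadata.items()
               if value not in (None, "")}
    return UploadRecord(
        file_name=file_name,
        sha256=hashlib.sha256(content).hexdigest(),
        metadata=cleaned,
        tags=sorted(set(tags)),
    )


record = build_upload_record(
    "read_001.csv",
    b"well,absorbance\nA1,0.42\n",
    {"order": "ORD-17", "plate": "P-001", "protocol": "ELISA-v2", "request": None},
    ["cellario", "plate-reader", "cellario"],
)
print(record.metadata)
print(record.tags)
```

Keeping metadata as plain key/value pairs is one way to make files searchable by order, plate, or protocol later on.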

IMAGE: Metadata attached to Cellario extracted files


IMAGE: Select Cellario Metadata & Tags


Step 3: Data Engineering + Data Science in the Cloud
After the automatic data collection process, a data pipeline parses the data into IDS-JSON format. The data is now harmonized in the cloud-native data lake. Two immediate benefits of this data engineering are:

  1. Data is queryable, which means scientists and data scientists can find it.
  2. Once your data is accessible, queryable, and in a common format like JSON, it can be imported into a myriad of data science tools to discover actionable insights.
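The second benefit can be illustrated with a short sketch: a nested JSON document is flattened into uniform rows, the shape most data science tools expect, and a simple statistic is computed. The document layout (`run`, `results`) and field names are assumptions for this example, not the actual IDS structure, and only the Python standard library is used.

```python
import json
from statistics import mean

# Hypothetical parsed document: run-level context plus per-well results.
document = json.loads("""
{
  "run": {"plate_id": "P-001", "protocol": "ELISA-v2"},
  "results": [
    {"well": "A1", "absorbance": 0.40},
    {"well": "A2", "absorbance": 0.80},
    {"well": "A3", "absorbance": 0.60}
  ]
}
""")


def flatten(doc):
    """Repeat the run-level fields on every result so each row stands alone."""
    run = doc["run"]
    return [{**run, **result} for result in doc["results"]]


rows = flatten(document)
mean_absorbance = mean(row["absorbance"] for row in rows)

for row in rows:
    print(row)
print("mean absorbance:", round(mean_absorbance, 3))
```

A list of flat dictionaries like `rows` can be handed to most tabular tools without further reshaping.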

IMAGE: Visualization of HighRes Biosolutions lab automation systems usage


IMAGE: Visualization of data produced during a plate reader run


This process enables data and workflows that are accessible and scalable in a secure, cloud-native environment.

Step 4: Inline Cloud-based Data Analysis and DOE
This integration establishes a closed loop between lab automation and Design of Experiment (DOE) software. Cellario also provides APIs to place orders and to manage automation systems and Cellario-controlled instruments and devices.

IMAGE: Data automation feedback loop between HighRes Biosolutions and TetraScience

To take it one step further, you can introduce cloud-based in-line analysis and calculation using interactive data science tools like Jupyter Notebook, so the lab automation system benefits from in-line cloud computation. You can also leverage historical data sets, which contribute to the model and inform the next point to search in the parameter space.
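A closed loop of this kind can be sketched with a deliberately naive search heuristic: look at the historical results, find the best point so far, and propose the nearest untested candidate. This is not the DOE method used by either product; `suggest_next`, the temperature grid, and the simulated `run_experiment` response are all hypothetical stand-ins for a real model and a real automation run.

```python
def suggest_next(history, candidates):
    """Propose the untested candidate closest to the best observed point.

    history maps a parameter value to its measured response.
    Returns None once every candidate has been tested.
    """
    if not history:
        return candidates[0]
    best = max(history, key=history.get)
    untested = [c for c in candidates if c not in history]
    if not untested:
        return None
    # Ties in distance are broken by preferring the smaller parameter value.
    return min(untested, key=lambda c: (abs(c - best), c))


def run_experiment(temperature):
    """Simulated response with a peak at 37; stands in for a real run."""
    return -(temperature - 37) ** 2


candidates = list(range(30, 45))  # hypothetical temperature grid
history = {30: run_experiment(30), 43: run_experiment(43)}  # historical data

for _ in range(8):  # eight closed-loop iterations
    next_point = suggest_next(history, candidates)
    if next_point is None:
        break
    history[next_point] = run_experiment(next_point)

best_point = max(history, key=history.get)
print("best parameter found:", best_point)
```

A production workflow would replace the heuristic with a proper DOE or optimization model, but the loop structure (analyze, suggest, run, record) stays the same.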


The combined TetraScience + HighRes Biosolutions workflow, control, and orchestration solution is designed to help scientists unlock discoveries in life sciences faster. The manual effort associated with data integration, and the human error that comes with it, is virtually eliminated, which supports high data integrity and consistency. An enterprise-grade, cloud-native platform enables actionable insights across the entire discovery and development process.

