Our objective was to outline a reproducible and scalable process of design and technology utilization to support the automation of cell line development at a major U.S. pharmaceutical company.
The hypothesis was that automation can bring more accurate process modeling at scale and result in better optimization through analytics of that modeled process.
Bioprocess workflows such as cell line development require, on average, 15+ instruments to collect process parameters and result data. We present repeatable steps for implementing a high-throughput system. The key component considerations were connection methods, cloud technology selection, data model design, ontology selection, and data consumption mechanisms.
We review our process mapping workshop, technology architecture, and modeling and throughput analysis.
We used a scientific workshop to outline the existing cell line development processes. After identifying the data available through integration, we engineered a technology solution focused on acquiring, storing, and processing data in the cloud. Overall method throughput was compared with that of the manual process. The general flow is presented in Fig-1.
1. Process Mapping Workshop
2. Technology Architecture
3. Modeling & Throughput Analysis
Fig-1. Stepwise method categories flow chart.
In-person interviews with the scientists responsible for each step in the process helped to build an inventory of systems, identify the data fields important for analysis, and determine where data is currently stored. The output is shown in Fig-2.
Fig-2. Resulting process flow mapping from workshop, identifying stepwise data sources, data formats and storage locations.
Getting data to the cloud for parsing, standardization, and consumption requires a connection to each system. The method used to integrate each system in the process is identified in Fig-3.
Fig-3. Connection mechanisms chosen for each instrument data source.
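As an illustration of the parsing and standardization step, the following is a minimal sketch that assumes one instrument delivers a CSV export. The column headers, the standardized field names, the `parse_export` function, and the sample values are all hypothetical and are not taken from the implemented system.

```python
import csv
import io

# Hypothetical mapping from one instrument's column headers
# to the field names of a shared, standardized schema.
FIELD_MAP = {
    "Sample ID": "sample_id",
    "Viable Cells/mL": "viable_cell_density",
    "Viability (%)": "viability_pct",
}


def parse_export(csv_text, instrument):
    """Parse a raw CSV export into records with standardized field names."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {"instrument": instrument}
        for raw_name, std_name in FIELD_MAP.items():
            value = (row.get(raw_name) or "").strip()
            if std_name == "sample_id":
                # Identifiers stay as text.
                record[std_name] = value
            else:
                # Measurements become floats; blanks become None.
                record[std_name] = float(value) if value else None
        records.append(record)
    return records


# Made-up example export for illustration only.
raw = "Sample ID,Viable Cells/mL,Viability (%)\nCLD-001,1250000,97.5\n"
print(parse_export(raw, "cell_counter"))
```

In a design like this, each connected instrument would need only its own field mapping, while downstream consumers see a single set of field names.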
Fig-4. Implemented architecture diagram including connections, cloud platform and services, and data consumption elements. ML & BI tools were key focus areas for providing the high-throughput analytic capabilities.
During the first several months of operation, only BI tools have been used to process data. In a given week of process development work, the automated system saved 4-6 hours that would otherwise have been spent on manual data manipulation.
The system exposed several new fields via direct integration and saved time by making the data searchable. Standardized ontologies and a unified data model enabled one set of filters executed in BI tools to display data across instrument types, which was not previously possible without extensive manual data manipulation.
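To show why a unified data model lets one filter span instrument types, here is a minimal sketch. The `Measurement` record, its fields, the instrument names, and the values are hypothetical stand-ins and do not represent the actual data model or ontology used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One row of a hypothetical unified data model."""
    sample_id: str
    instrument_type: str
    parameter: str  # a term from the shared ontology (assumed)
    value: float
    unit: str


def filter_measurements(measurements, **criteria):
    """Return the measurements whose fields equal every given criterion."""
    return [
        m for m in measurements
        if all(getattr(m, field) == wanted for field, wanted in criteria.items())
    ]


# Made-up records from two different instrument types.
data = [
    Measurement("CLD-001", "cell_counter", "viability", 97.5, "%"),
    Measurement("CLD-001", "titer_analyzer", "titer", 1.8, "g/L"),
    Measurement("CLD-002", "cell_counter", "viability", 92.0, "%"),
]

# A single filter returns rows from both instrument types.
hits = filter_measurements(data, sample_id="CLD-001")
for m in hits:
    print(m.instrument_type, m.parameter, m.value, m.unit)
```

Because every instrument writes to the same fields, the filter needs no knowledge of the source format.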
AWS services including S3, Lambda, and Athena provide critical performance and cost advantages at scale, offering low-cost storage, on-demand processing, and flattened data via SQL, respectively.
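The on-demand processing role of Lambda can be sketched as a function triggered by an S3 upload. This is a minimal sketch assuming the standard S3 event notification payload; the bucket name, object key, and handler logic are hypothetical, and the download and parsing of the file are deliberately omitted.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    """Lambda-style entry point that lists newly uploaded files."""
    uploaded = objects_from_s3_event(event)
    # A real function would fetch each object here, parse it, and write
    # the flattened result back to S3 for querying. That part is omitted.
    return {"to_process": [f"s3://{bucket}/{key}" for bucket, key in uploaded]}


# Hypothetical event for illustration only.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "raw/cell_counter/run+01.csv"},
            }
        }
    ]
}
print(handler(sample_event, None))
```

Keeping the event parsing separate from the handler makes the logic testable locally without AWS credentials.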