Automate data engineering and integration
Data pipelines are a core feature of the Tetra Data Platform, performing key event-driven extract-transform-load (ETL) functions. The Tetra Data Platform lets you implement complex multi-step processes in the programming language of your choice, quickly configure pipelines using our library of pipeline components and integrations with common informatics applications, and manage everything from a centralized web UI.
How data pipelines work
Triggers are a powerful way to control when and under what circumstances a data pipeline is initiated. Tetra Data Platform supports sophisticated trigger logic allowing you to tailor your data flow based on your business logic.
Tasks are the atomic units of work in each pipeline. You can use any language, package, or binary by configuring your own Docker image. Build your tasks programmatically or directly from a Jupyter Notebook.
Protocols determine which tasks run and in what sequence. They natively support branching, loops, if-else conditions, and complex data flow logic. You can run tasks in parallel and control concurrency to accommodate different data flow requirements.
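The relationship between triggers, tasks, and protocols can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the Tetra Data Platform API: every name below (`FileEvent`, `trigger_matches`, `protocol`, `run_pipeline`, and the individual task functions) is hypothetical and chosen for this example. It assumes a trigger is a predicate over an incoming file event, a task is a function that takes and returns a context dictionary, and a protocol is the code that sequences tasks and branches on their results.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# NOTE: all names here are hypothetical; this is not the Tetra Data Platform API.


@dataclass
class FileEvent:
    """A simplified event describing a newly arrived file."""
    path: str
    tags: Dict[str, str]


def trigger_matches(event: FileEvent) -> bool:
    """Trigger: fire only for CSV files that carry an 'instrument' tag."""
    return event.path.endswith(".csv") and "instrument" in event.tags


def parse_task(ctx: dict) -> dict:
    """Task: split the raw text into rows of comma-separated fields."""
    rows = [line.split(",") for line in ctx["raw"].strip().splitlines()]
    return {**ctx, "rows": rows}


def validate_task(ctx: dict) -> dict:
    """Task: mark the data valid only if every row has exactly two fields."""
    return {**ctx, "valid": all(len(row) == 2 for row in ctx["rows"])}


def publish_task(ctx: dict) -> dict:
    """Task: stand-in for pushing data to a downstream application."""
    return {**ctx, "status": "published"}


def quarantine_task(ctx: dict) -> dict:
    """Task: stand-in for setting bad data aside for review."""
    return {**ctx, "status": "quarantined"}


def protocol(ctx: dict) -> dict:
    """Protocol: run tasks in sequence, then branch on the validation result."""
    ctx = parse_task(ctx)
    ctx = validate_task(ctx)
    if ctx["valid"]:
        return publish_task(ctx)
    return quarantine_task(ctx)


def run_pipeline(event: FileEvent, raw: str) -> Optional[dict]:
    """Pipeline: run the protocol only when the trigger condition is met."""
    if not trigger_matches(event):
        return None
    return protocol({"path": event.path, "raw": raw})
```

In this sketch, a well-formed CSV from a tagged instrument would flow through parse, validate, and publish, a malformed one would be quarantined, and a file that does not satisfy the trigger would never start the pipeline at all.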
Create pipelines, configure triggers, view pipeline statuses and logs, and set up automatic notifications.
Expedite workflows by leveraging common tasks and protocols to process scientific data and integrate with common informatics applications.
Built-in autoscaling provides high throughput for your data flows. Our cloud-native platform dynamically allocates additional compute resources as demand grows and releases them when it subsides.
Choose Your Own Programming Language
Use your favorite programming environment and your tools of choice to run continuous tests, build the artifact, and then deploy it to the Tetra Data Platform.
Don’t reinvent the wheel: the Tetra Data Platform is compatible with your existing pipelines through the TetraScience SDK.