Meet Kestra, a startup that has been working on an open-source project focused on data orchestration across several services, databases, files, repositories and warehouses. The open-source project has attracted thousands of stars on GitHub, a sign that there’s interest and potential behind a new data orchestration platform.
But first, why would you need a data orchestration product in your big company? At some point, big companies start to have data spread out across several storage locations. Some client data can be stored in a legacy ERP, new orders might appear in a database in your cloud infrastructure, etc.
With a data orchestrator, you can extract, transform and load data (or extract, load and transform data) so that all your data is unified and stored in a single location, such as a data warehouse (Snowflake, Google BigQuery, etc.). Many data engineers pair a data integration platform such as Airbyte with an orchestrator that coordinates the jobs and defines the triggers that start them.
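To make the difference between the two orderings concrete, here is a minimal Python sketch. Everything in it is hypothetical: the in-memory lists stand in for a legacy ERP, a cloud database and a warehouse, and the function names are made up for illustration. It has nothing to do with how Kestra or Airbyte are implemented.

```python
# Illustrative only: plain lists stand in for real source systems.
legacy_erp = [{"client": " Alice ", "country": "fr"}]
cloud_orders = [{"client": "Bob", "country": "US"}]


def extract():
    """Pull raw rows from every source."""
    return legacy_erp + cloud_orders


def transform(rows):
    """Normalize client names and country codes."""
    return [
        {"client": r["client"].strip(), "country": r["country"].upper()}
        for r in rows
    ]


def etl():
    """Extract, transform, then load: the warehouse only ever sees clean rows."""
    warehouse = []
    warehouse.extend(transform(extract()))
    return warehouse


def elt():
    """Extract, load raw rows first, then transform inside the warehouse."""
    warehouse = []
    warehouse.extend(extract())   # raw load
    return transform(warehouse)   # transformation happens after loading


etl_result = etl()
elt_result = elt()
```

Both paths end with the same unified rows. What changes is where the transformation runs, which is the kind of sequencing an orchestrator is there to coordinate.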
The best way to describe Kestra is by defining why it is different from what’s out there. If you are familiar with Apache Airflow, Kestra can be used as an alternative to Airflow with some major differences.
Instead of using Python code, Kestra is based on YAML configuration files, and if you’ve played with Docker Compose files at some point, you may already be familiar with YAML.
Kestra’s API is treated as a first-class citizen, meaning that it integrates well with other tools and systems. In other words, Kestra has been designed to be language-agnostic: workflows are described in YAML, and they can be changed and tasks created through Kestra’s API.
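The idea of describing a workflow as data rather than as code can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the field names ("id", "tasks", "type", "input") and the task types are hypothetical and are not Kestra’s actual schema or API.

```python
# Hypothetical flow definition: the field names are illustrative, not Kestra's schema.
flow = {
    "id": "daily_orders",
    "tasks": [
        {"id": "extract", "type": "emit", "value": [3, 1, 2]},
        {"id": "sort", "type": "sort", "input": "extract"},
        {"id": "total", "type": "sum", "input": "sort"},
    ],
}

# A generic engine maps each declared task type to a handler.
TASK_TYPES = {
    "emit": lambda task, outputs: task["value"],
    "sort": lambda task, outputs: sorted(outputs[task["input"]]),
    "sum": lambda task, outputs: sum(outputs[task["input"]]),
}


def run_flow(flow):
    """Run the declared tasks in order, storing each task's output by its id."""
    outputs = {}
    for task in flow["tasks"]:
        handler = TASK_TYPES[task["type"]]
        outputs[task["id"]] = handler(task, outputs)
    return outputs


outputs = run_flow(flow)
```

Because the flow is plain data, it could be written as a YAML file or sent over an HTTP API by a program in any language, which is the sense in which a declarative orchestrator can be language-agnostic.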
Kestra is bringing the infrastructure-as-code model to data orchestration because the startup thinks data management should be handled by all engineers and business users instead of a specific team of data engineers.
On top of this opinionated approach, Kestra has a solid library of integrations with official plugins for major cloud providers (AWS, Azure and Google Cloud), data warehouses (Snowflake and BigQuery), dbt for data transformation, Airbyte for data integration, and more.
Kestra also has a user interface that makes it easy to create scheduled and event-driven workflows (“if this happens, do that”). With this UI, business users can also rely on Kestra to write SQL queries and build tools for internal reporting.
Originally from France and co-founded by Emmanuel Darras and Ludovic Dehon, Kestra has raised $3 million in a seed round co-led by ISAI and Axeleo Capital. Several angel investors also participated in the round, such as Olivier Pomel from Datadog, Stan Christiaens from Collibra, Pierre Burgy from Strapi and Olivier Bonnet from BlaBlaCar.
In addition to the open-source orchestrator, Kestra also has an enterprise edition and several big clients relying on Kestra to handle millions of orchestration events per month, such as Leroy Merlin, Huawei, Acxiom, Tencent, Gorgias, Sophia Genetics and Decathlon.
Kestra’s end goal is to create an orchestration tool that can be used for all orchestration needs, not just data orchestration. Many companies end up creating specific teams specialized in data orchestration, microservice orchestration, infrastructure and more. Kestra wants to build a single platform that can be used in a versatile way by everyone working on those tasks.