Why the Need for Standardizing Data Contracts?
This article describes the genesis of Bitol, the open-source data contract standard and solutions that are an essential part of modern data engineering. In an era where trusting data is becoming increasingly important, data contracts act as an agreement between multiple parties, specifically, a data producer and its consumer(s).
Note that this article focuses on the early standardization process and does not review what a data contract is. To learn more about data contracts and how you can help to build them, check the resources at the end.
The Risk of Fragmentation
Data contracts are a relatively nascent concept for many. However, their origins date back to (at least) the 1990s, when earlier versions were used in CASE tools for code generation. Since then, the lack of standardization and many unfruitful attempts have impaired innovation.
Having disparate and non-regulated formats slows innovation, increases vendor lock-in, and, ultimately, sabotages the standard. Imagine if, back in 1994, Netscape had its proprietary version of HTML; neither Internet Explorer nor Safari could have taken off (I hear the badmouthers in the back). This lack of standardization would have created a fragmented market, and the web would have been vendor-based, which is indeed what CompuServe and AOL were at that time.
Bitol Is All About Open Standards

Bitol is a Linux Foundation AI & Data project that creates and maintains open standards for modern data engineering, starting with data contracts through the Open Data Contract Standard (ODCS).
As our teams built an implementation of Data Mesh, we realized the need for a resource descriptor. The number of elements needed in this descriptor kept growing. That’s when we decided to restructure the format and adopt a data contract approach. A few months later, we open-sourced a version of the template. I later took the template to a broader community, the AIDA User Group, where it started its incubation process. Although the AIDA User Group is a fantastic organization, it is not suited for developing open source and open standards. That’s where the Linux Foundation joined the dance.
Governance Is Key
Building the technical steering committee (TSC) was the next step. I set the bar high to get some of the world’s experts in data contracts and data products. However, the committee also needed people from a variety of backgrounds. We wanted participants to be users, vendors, and service providers to ensure we had good coverage. We also wanted experts and learners from around the world. The TSC has reached those goals. More about the TSC will come shortly.

Bitol Is a God of Creation
A Mayan sky god, Bitol is one of the creator and destroyer deities who participated in the last two attempts at creating humanity. At the beginning of days, they attempted to form sentient creatures with their associates: Alom, Qaholom, and Tzacol. In the third creation (or iteration), Bitol was transformed into Ixmacane.
Bitol, as a god, is a perfect analogy for this project. However, you will never know which iteration we are on.
Join Bitol’s Slack and Help Shape the Future of Data Contracts!
Data contracts are a critical part of modern data engineering, acting as agreements between data producers and consumers to ensure trust, quality, and interoperability. However, the lack of standardization has historically led to fragmentation, vendor lock-in, and slower innovation, much as the early web could have been locked into proprietary formats.
That’s why Bitol, a Linux Foundation AI & Data project, is leading the charge with the Open Data Contract Standard (ODCS). Our mission is to establish open, vendor-neutral standards that enable organizations to implement Data Mesh and modern data architectures with confidence.
Why Join Bitol’s Slack?
- Connect with industry experts and contributors shaping open data standards.
- Stay up-to-date on the latest developments in ODCS and related initiatives.
- Share insights, ask questions, and collaborate with the Technical Steering Committee (TSC) and the broader data community.
- Help drive the next evolution of data engineering by contributing your expertise, whether as a user, vendor, or service provider.
Let’s build the future of data contracts together! Join the Bitol Slack today.
AbeaData Provides Support
You may not have heard much about AbeaData yet, but we are a team of senior data & software professionals; veterans, some may say. One way to learn more is to sign up for the countdown on our website.
Existing & Additional Resources
A lot of resources are popping up to explain the concept and its implementations. As often with a new technology or concept, you have piggybackers who half understand the idea but want in, or commercial ventures touting their products. So be careful; however, I am happy to add to this list.
First, as we say in France, charité bien ordonnée commence par soi-même (charity begins at home). The first time I mentioned data contracts was in the popular article The next generation of Data Platforms is the Data Mesh. However, Data Contract 101 is really where I dive into more details. In What is Data QoS, and Why is it Critical, I dig more into the idea of Data Quality and service levels and how they relate to data contracts. Don’t forget Implementing Data Mesh at O’Reilly, which is being written with my great friend Eric Broda.
Andrew Jones covers data contracts in his book Driving Data Quality with Data Contracts: a comprehensive guide to building reliable, trusted, and effective data platforms. It’s a very good book. I’m afraid I have to disagree with some of the things Andrew writes, but that’s more at the detail level and a discussion for the pub. Buy it confidently.
Chad Sanderson also has a few articles on his Substack. Chad believes strongly in data contracts as well.
Please suggest other resources in the comments, and I will gladly add them here.
Updates:
- 2025-01-31: Dissociation from AIDA. Added Slack link.
Why the need for standardizing Data Contracts? was originally published in AbeaData on Medium, where people are continuing the conversation by highlighting and responding to this story.