The AIDA (Artificial Intelligence, Data, and Analytics) User Group has released version 2.2 of the Open Data Contract Standard (ODCS). Let’s look at how the contract has evolved and at upcoming opportunities to discuss, face to face, the standard and data contracts in general.
The data contract proposes an agreement between data producers and consumers around eight sections. Data product owners own the contract. The former data contract template (discussed in early May) is evolving based on needs & discussions with the community.
Upcoming round table & talk
Let’s meet and discuss data contracts. Two events will occur during IBM TechXchange in Las Vegas, NV, between September 11th and September 14th, 2023.
During Community Day (September 11th, 2023), I will host a round table on data contracts to go more in depth and exchange ideas, in the same fashion as the weekly round tables Scott Hirleman and I run (check out our data contract chat). Community Day is sponsored by AIDA User Group.
I will also have a dedicated session on data contracts, titled Put to work the eight sections of a data contract. In this session, I will explain the various sections and how you can leverage them independently of any product, making them the perfect companion in modern data engineering. I will walk through data contracts as I have been using them in the context of a Data Mesh. The target audience is data and software professionals curious about modern data engineering. There is no need to know what a data mesh is.
Version 2.2 of the Open Data Contract Standard (ODCS) was released on July 27th, 2023. It’s a minor evolution from the previous version, v2.1.1. You can find the updated version at GitHub.com/AIDAUserGroup/open-data-contract-standard.
Let’s go through the main changes.
Authoritative definitions are the methodology to link your data contracts to various external systems, creating authoritative links that you can use for data governance.
Authoritative definitions were previously allowed only at the column level; they are now also available at the table level in the contract.
In the following example, you can see that the table tbl has two definitions. The first one is a business definition, which has a link to the data.gov website. The second is a video tutorial pointing to YouTube. Of course, those links could be pointing to internal systems in an enterprise setting.
- table: tbl
  authoritativeDefinitions:
    - url: https://catalog.data.gov/dataset/air-quality
      type: businessDefinition
    - url: https://youtu.be/jbY1BKFj9ec
      type: videoTutorial
As a reminder, authoritative definitions were already allowed at the column level, for example to point to a column's business definition or to the implementation of its transformation.
- table: tbl
  columns:
    - column: rcvr_cntry_code
      authoritativeDefinitions:
        - url: https://collibra.com/asset/742b358f-71a5-4ab1-bda4-dcdba9418c25
          type: businessDefinition
        - url: https://github.com/myorg/myrepo
          type: transformationImplementation
        - url: jdbc:postgresql://localhost:5432/adventureworks/tbl_1/rcvr_cntry_code
          type: implementation
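To make the two levels concrete, here is a minimal Python sketch that walks a contract fragment and lists every authoritative definition with its scope. It is only an illustration: the fragment is written as the dictionaries a YAML parser would typically return, the helper name iter_definitions is hypothetical, and the sketch assumes that columns are nested under a columns key of each table.

```python
from typing import Iterator

# Hypothetical in-memory form of the two examples above, merged into one table.
# Assumption: columns are nested under a "columns" key of the table.
contract_tables = [
    {
        "table": "tbl",
        "authoritativeDefinitions": [
            {"url": "https://catalog.data.gov/dataset/air-quality",
             "type": "businessDefinition"},
            {"url": "https://youtu.be/jbY1BKFj9ec",
             "type": "videoTutorial"},
        ],
        "columns": [
            {
                "column": "rcvr_cntry_code",
                "authoritativeDefinitions": [
                    {"url": "https://collibra.com/asset/742b358f-71a5-4ab1-bda4-dcdba9418c25",
                     "type": "businessDefinition"},
                    {"url": "https://github.com/myorg/myrepo",
                     "type": "transformationImplementation"},
                    {"url": "jdbc:postgresql://localhost:5432/adventureworks/tbl_1/rcvr_cntry_code",
                     "type": "implementation"},
                ],
            }
        ],
    }
]


def iter_definitions(tables: list[dict]) -> Iterator[tuple[str, str, str]]:
    """Yield (scope, type, url) for each table-level and column-level definition."""
    for table in tables:
        table_name = table["table"]
        # Table-level definitions (new in version 2.2).
        for definition in table.get("authoritativeDefinitions", []):
            yield table_name, definition["type"], definition["url"]
        # Column-level definitions (already available before 2.2).
        for column in table.get("columns", []):
            scope = f"{table_name}.{column['column']}"
            for definition in column.get("authoritativeDefinitions", []):
                yield scope, definition["type"], definition["url"]


definitions = list(iter_definitions(contract_tables))
for scope, kind, url in definitions:
    print(f"{scope}: {kind} -> {url}")
```

A flat listing like this could feed a governance report or a link checker; in a real setting you would load the contract from its YAML file instead of hard-coding the dictionaries.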
The next version, currently in development, will normalize the types of authoritative definitions.
Hosting & governance
As a non-profit organization, AIDA User Group is a perfect vehicle for hosting and developing the data contract standard. The data contract template has been renamed Open Data Contract Standard as part of this fork.
Many examples have been added to a dedicated folder. Check them out!
Calling for contributions
The Open Data Contract Standard is a community effort. You can start a discussion or report issues. Of course, as with any other open source project, feel free to open pull requests when you spot typos or errors, or want to add information. Examples are also welcome.
Featured photo by Pedro Figueras on Pexels.com