There are a few conversations I have had too many times.
A frequent one is about how complete a data contract has to be: “Do I have to fill in every section and property?” Another time, someone shows me their data contract, visibly proud, and asks: “Is this good?” I smile, nod, and think: compared to what?
Until recently, “good” was subjective. Either you had a contract, or you did not. That is not a maturity framework; that is binary thinking dressed up in business casual.
So I built a maturity model: measurable, quantifiable, no subjective nonsense. It is still customizable, because every company is its own little snowflake.
Why Maturity Matters for Data Contracts
The problem is not adoption. Teams are writing data contracts. The problem is consistency. One team produces a rigorous, governed, testable contract. Another produces a named schema with a status field and calls it done. Both claim to “do data contracts.” Only one is actually reducing risk, accelerating trust, and enabling automation.
The ODCS Maturity Model creates a common yardstick. Every criterion is factual and quantifiable. If you cannot measure it, it is not in the model. The criteria can still be customized by your governance and architecture teams.
The Five Levels in Detail
These are my five levels. You can define three or nine, but I usually land on five.
Level 1: Structural
A Level 1 contract is valid, machine-readable, and contains the minimum fields required for any tooling to process it: an identifier, a name, a version, a status, and a schema with typed fields.
This sounds basic. And yet, a surprising number of contracts in the wild fail here: invalid status values, missing identifiers, and unnamed fields. The cost is silent pipeline failures and catalog ingestion errors that take hours to trace back to their source.
Business impact: Prevents silent failures in data pipelines and catalog ingestion.
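As a minimal sketch of what a Level 1 check could look like, assuming the contract has already been parsed into a Python dictionary. The key names (`id`, `name`, `version`, `status`, `schema`, `type`) and the list of allowed statuses are illustrative assumptions, not taken from the ODCS specification; adapt them to the standard you validate against.

```python
# Hypothetical status vocabulary; replace with the values your standard allows.
ALLOWED_STATUSES = {"proposed", "draft", "active", "deprecated", "retired"}
REQUIRED_KEYS = ("id", "name", "version", "status", "schema")


def level1_issues(contract: dict) -> list[str]:
    """Return the structural problems found; an empty list means Level 1 passes."""
    issues = [f"missing {key}" for key in REQUIRED_KEYS if not contract.get(key)]
    status = contract.get("status")
    if status and status not in ALLOWED_STATUSES:
        issues.append(f"invalid status: {status}")
    # Every field in the schema needs a name and a type.
    for index, field in enumerate(contract.get("schema") or []):
        if not field.get("name"):
            issues.append(f"field {index} has no name")
        if not field.get("type"):
            issues.append(f"field {index} has no type")
    return issues


good = {"id": "c-001", "name": "orders", "version": "1.0.0", "status": "active",
        "schema": [{"name": "order_id", "type": "string"}]}
bad = {"name": "orders", "version": "1.0.0", "status": "done",
       "schema": [{"type": "string"}]}

print(level1_issues(good))
print(level1_issues(bad))
```

The second contract shows the three failures mentioned above in one place: a missing identifier, an invalid status value, and an unnamed field.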
Level 2: Descriptive
A Level 2 contract has a business context. Fields are described. Data is classified (as in, organized; the contract is not in the Epstein files). Is this sensitive? PII? Public? The purpose of the dataset is documented. Tags make it discoverable in your data marketplace.
This is where data contracts start delivering value beyond the engineering team. Onboarding accelerates. Tribal knowledge ceases to be the only path to understanding a dataset. A business analyst should be able to open a Level 2 contract and know what the data is, who it is for, and whether it fits their use case, without asking anyone.
Business impact: Quicker onboarding, reduced dependency on tribal knowledge, better discoverability in marketplaces, and faster time-to-market of data science projects.
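A Level 2 check can be just as factual. The sketch below lists what is missing for a contract to count as descriptive. The keys `purpose`, `tags`, `description`, and `classification` are hypothetical names chosen for illustration; map them to wherever your contracts store that information.

```python
def level2_gaps(contract: dict) -> list[str]:
    """List what keeps a contract from being descriptive (empty list = pass)."""
    gaps = []
    if not contract.get("purpose"):
        gaps.append("purpose is not documented")
    if not contract.get("tags"):
        gaps.append("no tags")
    # Every field needs a business description and a classification.
    for field in contract.get("schema", []):
        name = field.get("name", "<unnamed>")
        if not field.get("description"):
            gaps.append(f"{name}: no description")
        if not field.get("classification"):
            gaps.append(f"{name}: no classification")
    return gaps


contract = {
    "purpose": "Orders placed on the web shop, one row per order.",
    "tags": ["sales", "orders"],
    "schema": [
        {"name": "order_id", "description": "Unique order number",
         "classification": "public"},
        {"name": "email", "description": "Customer contact email"},
    ],
}

print(level2_gaps(contract))
```

Here the only gap is the unclassified `email` field, which is exactly the kind of field you want classified first.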
Level 3: Governed
Level 3 is where accountability comes into play. A governed contract defines who owns the data, what roles can access it, where to get support, and what operational commitments exist in terms of service levels and infrastructure.
This is one of the differences between a data asset and a data product. Without Level 3, you have documentation. With it, you have a commitment: one that can be audited, enforced, and pointed to when regulators ask questions. Data in production without an owner and an SLA is a liability that has not yet sent you an invoice.
Business impact: Clear accountability, enforceable SLAs, auditability for regulatory requirements, and reduced escalation cost when data quality degrades.
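The governed criteria translate into a presence check on a handful of sections. This is a rough sketch under the assumption that ownership, roles, support, service levels, and infrastructure each live under a top-level key; the key names below are made up for the example and will differ in a real contract.

```python
# Hypothetical section names mapped to the question each one answers.
GOVERNANCE_SECTIONS = {
    "owner": "who owns the data",
    "roles": "which roles can access it",
    "support": "where to get support",
    "sla": "which service levels are committed",
    "infrastructure": "where the data is served from",
}


def level3_missing(contract: dict) -> list[str]:
    """Return the governance questions the contract leaves unanswered."""
    return [
        f"{key}: {question}"
        for key, question in GOVERNANCE_SECTIONS.items()
        if not contract.get(key)
    ]


contract = {
    "owner": "sales-data-team",
    "roles": [{"role": "analyst", "access": "read"}],
    "support": {"channel": "#sales-data-help"},
}

print(level3_missing(contract))
```

A contract that returns an empty list has an owner and an SLA on paper, which is the minimum needed before anyone can audit or enforce it.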
Level 4: Quality-Enforced
A Level 4 contract does not just describe data. It specifies expectations that quality engines can execute: which fields must never be null, which rules run on a schedule, what constitutes a breach of contract, and how severe each violation is.
This transforms the data contract from a passive document into an active quality gate. Data quality stops being a post-hoc discovery and becomes a designed-in guarantee. A contract without quality rules is a promise with no verification mechanism. The data contract jumps into the Ops world.
Business impact: Fewer incidents reaching consumers, faster detection of upstream issues, foundation for data quality SLAs, measurable data reliability.
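To make "expectations that quality engines can execute" concrete, here is a toy engine with a single rule type. It is a sketch only: `NotNullRule`, the two-step severity scale, and the definition of a breach as "any violation with severity error" are all assumptions for the example, not how any particular quality engine or the ODCS standard defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotNullRule:
    field: str
    severity: str  # hypothetical scale: "warning" or "error"


def run_rules(rows: list[dict], rules: list[NotNullRule]) -> list[dict]:
    """Execute each rule against the rows and report every violation found."""
    violations = []
    for rule in rules:
        null_rows = sum(1 for row in rows if row.get(rule.field) is None)
        if null_rows:
            violations.append(
                {"field": rule.field, "severity": rule.severity, "null_rows": null_rows}
            )
    return violations


def is_breach(violations: list[dict]) -> bool:
    """Assumed policy: any violation of severity 'error' breaches the contract."""
    return any(v["severity"] == "error" for v in violations)


rules = [NotNullRule("order_id", "error"), NotNullRule("coupon", "warning")]
rows = [
    {"order_id": "A1", "coupon": None},
    {"order_id": None, "coupon": "SPRING"},
    {"order_id": "A3", "coupon": None},
]

violations = run_rules(rows, rules)
print(violations)
print("breach:", is_breach(violations))
```

The point of the design is that the rules are data: they live in the contract, and any engine that understands them can run them on a schedule.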
Level 5: Optimized
Level 5 is full coverage. Every field is described. Every field has a quality rule. Quality is measured across multiple dimensions (completeness, accuracy, timeliness, uniqueness, conformity, consistency, and coverage; see Data QoS). The contract is not a snapshot of intent; it is a continuously measured truth.
Is a Level 5 contract your AI-readiness certificate? Probably not, but it gets damn close. Models trained or fed on data governed at this level have a traceable, testable provenance. In a regulated industry, that is not a nice-to-have. It is a requirement. Think of it as the audit trail your legal team dreams about and your data science team actually uses.
Business impact: Continuous data reliability, AI-ready data with traceable provenance, chargeback-ready usage tracking, and reduced compliance risk.
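Full coverage is easy to quantify. The sketch below computes the share of described fields, the share of fields with at least one quality rule, and which of the seven dimensions listed above have no rule yet. The layout (a `quality` list on each field, each rule carrying a `dimension`) is an assumption made for the example.

```python
DIMENSIONS = {
    "completeness", "accuracy", "timeliness", "uniqueness",
    "conformity", "consistency", "coverage",
}


def coverage_report(contract: dict) -> dict:
    """Measure how far a contract is from full description and rule coverage."""
    fields = contract.get("schema", [])
    total = len(fields)
    described = sum(1 for f in fields if f.get("description"))
    with_rules = sum(1 for f in fields if f.get("quality"))
    covered = {
        rule.get("dimension") for f in fields for rule in f.get("quality", [])
    }
    return {
        "described_ratio": described / total if total else 0.0,
        "rule_ratio": with_rules / total if total else 0.0,
        "missing_dimensions": sorted(DIMENSIONS - covered),
    }


contract = {
    "schema": [
        {"name": "order_id", "description": "Unique order number",
         "quality": [{"dimension": "completeness"}]},
        {"name": "email", "description": "Customer contact email"},
    ]
}

print(coverage_report(contract))
```

In this example every field is described, but only half have a rule and six dimensions are still unmeasured, so the contract is well short of Level 5.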
Can I Skip…
And by the way, levels are cumulative. There are no shortcuts.
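Cumulative means a contract's level is the highest level it reaches without failing any level below it. A small sketch, assuming you already have a pass/fail result per level (the `passed` mapping is a hypothetical input):

```python
def maturity_level(passed: dict[int, bool], top: int = 5) -> int:
    """Return the highest level reached with every lower level also passing."""
    level = 0
    for candidate in range(1, top + 1):
        if not passed.get(candidate, False):
            break  # a failed level caps the result, whatever comes after
        level = candidate
    return level


# Quality rules everywhere (levels 4 and 5) but no owner or SLA (level 3).
print(maturity_level({1: True, 2: True, 3: False, 4: True, 5: True}))
```

The contract in the example scores 2, not 5: passing the higher levels does not make up for the missing governance.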
Not every contract needs to be at Level 5. Think of it as a portfolio: not every stock is a blue chip, but your portfolio needs some. You can definitely measure your entire organization’s maturity by assessing the maturity of its contracts. You can average your contracts’ levels and, eventually, weight them by usage.
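The two organization-wide scores differ more than you might expect. A quick sketch, where the usage numbers are invented for the example and could be query counts, consumer counts, or whatever usage measure you track:

```python
def simple_average(levels: list[int]) -> float:
    """Plain mean of the contract levels."""
    return sum(levels) / len(levels)


def usage_weighted_average(portfolio: list[tuple[int, float]]) -> float:
    """Mean of the levels, weighted by each contract's usage."""
    total_usage = sum(usage for _, usage in portfolio)
    if total_usage == 0:
        raise ValueError("total usage must be positive")
    return sum(level * usage for level, usage in portfolio) / total_usage


# (level, usage) pairs: one heavily used Level 5 contract, two rarely used ones.
portfolio = [(5, 900), (2, 50), (1, 50)]

print(round(simple_average([level for level, _ in portfolio]), 2))
print(round(usage_weighted_average(portfolio), 2))
```

With these numbers the plain average is about 2.67 while the usage-weighted average is 4.65, because the contract everyone depends on is the mature one. Swap the usage figures around and the weighted score drops sharply, which is the signal you want.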
You can set some hard rules too, like:
- Level 2 for any contract entering a data catalog or marketplace. If it is not descriptive, it will not be reused.
- Level 3 for anything in production. No exceptions.
- Level 5 for your most critical, most-consumed datasets, the ones your business decisions and AI models actually depend on.
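Rules like these are simple enough to automate as a gate. A minimal sketch, assuming each contract comes with three flags (the parameter names are hypothetical) and that the strictest applicable rule wins:

```python
def required_level(in_catalog: bool = False, in_production: bool = False,
                   critical: bool = False) -> int:
    """Return the minimum level demanded by the strictest applicable rule."""
    required = 0
    if in_catalog:
        required = max(required, 2)
    if in_production:
        required = max(required, 3)
    if critical:
        required = max(required, 5)
    return required


def meets_policy(level: int, **flags: bool) -> bool:
    """Check a contract's measured level against the rules above."""
    return level >= required_level(**flags)


# A Level 2 contract that is published in the catalog and runs in production.
print(required_level(in_catalog=True, in_production=True))
print(meets_policy(2, in_catalog=True, in_production=True))
```

The example contract is good enough for the catalog but fails the production rule, so the gate rejects it.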
Having Contracts Is No Longer the Goal
Knowing how mature your data contracts are is. Maturity turns documentation into accountability, quality into enforcement, and data into something your business (and your AI) can actually trust. So where do your contracts stand?
Let’s measure them.
Let’s raise the bar.
Let’s talk about it.


