With the whirlwind of news surrounding our favorite tools over the last few months, you might have missed a notable announcement: Snowflake launched the Open Semantic Interchange (OSI) initiative in collaboration with companies like dbt Labs, Sigma, Omni, Hex, and a dozen others. On November 13th, the group added 12 more members, including Google and AWS, and held its first working group session.

It's the biggest effort yet to get all our data tools to speak the same language, essentially creating one common semantic standard that works across your entire tech stack.

Before OSI: Fragmented

Before OSI, each data tool had (or didn't have) its own iteration of a semantic layer – also known as a metrics layer, context layer, Headless BI, or my personal favorite, the pedantic layer. If you're not familiar, a semantic layer is essentially the translation layer between your raw data (e.g., a column like int_cust_ltv_m) and the language business users speak (e.g., Customer Lifetime Value). It's where you might define what revenue means, how churn is calculated, and which filters matter for your metrics.
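
To make that translation concrete, here's a minimal sketch of what such a definition might look like. The field names are purely illustrative, not any particular vendor's spec:

```yaml
# Illustrative only: a hypothetical semantic-layer entry mapping a raw
# column to the term business users actually see
metrics:
  - name: customer_lifetime_value
    label: Customer Lifetime Value
    description: Expected total revenue from a customer over the full relationship
    source_column: int_cust_ltv_m   # the raw model column being translated
    agg: average
```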

The trouble is, every vendor built their own. Looker has LookML, dbt has MetricFlow, even Airbnb built its own proprietary metrics platform called Minerva. In the words of a Brooklyn Data analytics engineer, "It makes zero sense for vendors to have their own semantic model. It is simply an artifact from all of these companies moving quickly, at the same time."

This system allowed data companies to own their clients’ business logic, but at the expense of creating “AI’s most fundamental bottleneck.” AI agents need consistent definitions to reason across your data stack. Fragmenting those definitions across proprietary formats makes intelligent automation nearly impossible.

After OSI: Collaborative

[Diagram: OSI architecture models and flow]

In a move that's both pragmatic and strategic, each OSI participant decided they'd rather focus on building the best driving experience than reinvent the wheel. OSI's vendor-neutral semantic layer will let data companies invest in user experience and agentic analytical capabilities. Snowflake is not shy about its enthusiasm for this initiative – it states it will “[promote] unparalleled interoperability, efficiency, and collaboration among all participants.” Core to the initiative is dbt Labs’ announcement that it will open-source MetricFlow under the Apache 2.0 license.

At Brooklyn Data, we’re both very excited and healthily skeptical. As one Brooklyn Data engineer put it, "having fiddled with AI agents some, I'm a huge believer that this is the future of BI. If you were writing a Sci-Fi novel, your characters would be interacting with data exactly this way."

Imagine you’re Tony Stark interacting with your own Jarvis: you ask it questions like “how much energy did my suit use yesterday?” and trust that the answer includes energy from your suit, your arc reactor, and the recharge you did at 12pm EST. In real life, your AI agent might reason across your transformation logic in dbt, your visualization in Sigma, and your experimentation platform in Hex, all using the same understanding of what a metric like ‘active customer’ actually means. That's the promise.

What This Means for You

For now, not much – the standard is still being built. When it arrives, it will look like your typical YAML configuration with space for definitions, type parameters, and filters. It should live wherever you do your data transformations, likely sitting alongside your dbt models or Snowflake transformations.
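
The final OSI schema hasn't been published, but since MetricFlow is being open-sourced as part of the initiative, a dbt MetricFlow-style definition gives a reasonable feel for the shape. Treat the field names below as a sketch in today's MetricFlow YAML, not a finished OSI spec:

```yaml
# Sketch in current MetricFlow-style YAML; the eventual OSI format may differ
metrics:
  - name: active_customers
    label: Active Customers
    description: Distinct customers with a completed order in the period
    type: simple
    type_params:
      measure: customers_with_orders
    filter: |
      {{ Dimension('order__status') }} = 'completed'
```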

But as a data leader, you should start thinking about your V1. If you have the investment to get started now, you can begin the foundational work of metric governance. Document your metrics, identify owners, and start writing one canonical definition for each. This cross-team coordination work won't magically disappear when OSI arrives. Wrangling stakeholders and reconciling competing definitions is often the hardest part. Getting ahead of that now means you'll be ready to adopt OSI when it goes live.
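
Even a lightweight, version-controlled catalog is enough to start. Something like the hypothetical entry below (the field names are ours, not a standard) captures the owner and the one canonical definition per metric:

```yaml
# Hypothetical metric-governance entry; field names are illustrative, not a standard
- metric: active_customer
  owner: growth_analytics_team
  canonical_definition: >
    A customer with at least one completed order in the trailing 30 days,
    excluding internal and test accounts.
  source_model: fct_orders
  status: agreed_across_teams
```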

One note: if your metrics are built from simple sums, averages, or counts, you might not need to concern yourself with a semantic layer at all. Well-documented and well-defined transformations may get you 90% of the way there. But if you're managing complex calculations across multiple teams and tools, eyeing those agentic BI capabilities on the horizon, or require 99.99% AI accuracy, OSI’s semantic layer will be worth watching closely.
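
For that simple case, plain dbt model documentation may be all you need. A schema.yml entry like this (model and column names are made up) already gives both humans and agents a usable definition:

```yaml
# Ordinary dbt schema.yml documentation; often enough when metrics are simple sums and counts
models:
  - name: fct_daily_revenue
    description: Daily gross revenue from completed orders
    columns:
      - name: revenue
        description: Sum of order_total where order status = 'completed'
```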

The timeline for adoption is still unclear, and past standardization efforts in data tooling have had mixed success (cough cough SQL-flavors). But with this level of industry alignment, as well as the pull of AI use cases, this one feels different.

Additional Recommended Reads:

Snowflake Unites Industry Leaders to Unlock AI's Potential with the Open Semantic Interchange Initiative
Author: Snowflake (with Christian Kleinerman, EVP of Product)

The $1 trillion AI problem: Why Snowflake, Tableau and BlackRock are giving away their data secrets
Author: VentureBeat

Why Semantic Layers Matter — and How to Build One with DuckDB
Author: Simon Späti (MotherDuck Blog)

Rise of the Semantic Layer
Author: Simon Späti

[Feature] dbt should know about metrics
Author: dbt Labs

Beyond YAML: Why Semantic Layers Need Real Programming Languages
Author: Carlin Eng & Lloyd Tabb (Malloy Data)

The missing piece of the modern data stack
Author: Benn Stancil

The context layer
Author: Benn Stancil

The Pedantic Layer
Author: Joe Reis

