On the Knife’s Edge in Data and AI: Takeaways from Data Council 2025
Last month in Oakland, Data Council 2025 felt like an early unveiling of the next-gen, AI-native data stack. Over three packed days, database legends, AI researchers, and bleeding-edge startup founders converged on a single storyline: our monolithic warehouses are being deconstructed into Lego bricks, and those bricks are being animated by self-driving logic. In this new era, sub-second responsiveness has gone from luxury to table stakes. Soon, agents won't only patrol BI dashboards but will run entire data pipelines, with ironclad observability and self-healing capabilities. For those who missed this awesome conference, here are some of my major takeaways.
The Future Isn’t Just Real Time, It’s Responsive
There was a big turnout by companies that provide low-latency data infrastructure. As I talked to their reps about features like Firebolt's sub-second queries, ClickHouse's focus on real-time analytics, MotherDuck's instant SQL previews, and local-first DuckDB in the browser, it became clear that users can now expect interactive, immediate results, whether from dashboards, apps, or AI assistants. And not just over analytics data: as Aaron Katz (CEO @ ClickHouse) noted in Day 2's keynote fireside chat, OLAP will remain core, but OLTP and OLAP are unbundling and blending. While hybrid transactional/analytical processing (HTAP) databases are not a new concept, advances in technology and architecture have made them increasingly compelling.
Still, the perception of latency is relative. In one talk, Ethan Brown (Twitch) described how their new internal analytics slackbot handles masses of natural-language analytics requests from several teams, at all hours of day and night. These requests used to be handled manually by analysts and had become an unsustainable drain on resources. Even with 60 seconds of latency, the AI slackbot still feels real time: teams who were used to waiting hours are thrilled, and analysts can finally sleep through the night. To hammer this point home, Nikunj Handa (OpenAI) spoke on OpenAI's evolving Responses API, citing users' observed willingness to trade latency for quality as the rationale for making the API agentic by default. A single API call triggers an agentic loop that plans, iterates, and uses provided (and preloaded) tools to produce the best possible results.
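The plan/iterate/tool-use loop Handa described can be sketched in a few lines. This is a minimal, hypothetical sketch of the pattern, not the actual Responses API: the `plan` stand-in and the tool registry are invented for illustration.

```python
# Minimal sketch of an agentic loop: a "model" plans, calls tools,
# and iterates until it decides it can answer. All names here are
# hypothetical, not OpenAI's actual API.

def run_agent(question, tools, max_steps=5):
    context = [question]
    for _ in range(max_steps):
        action, arg = plan(question, context)   # model decides next step
        if action == "answer":
            return arg
        context.append(tools[action](arg))      # tool result feeds next plan
    return context[-1]                           # best effort after budget

def plan(question, context):
    # Stand-in for an LLM call: answer once we have one tool result.
    if len(context) > 1:
        return "answer", f"{question} -> {context[-1]}"
    return "search", question

tools = {"search": lambda q: f"top hit for {q!r}"}
answer = run_agent("revenue last week", tools)
```

The key property is that the caller makes one call and the loop decides internally how many tool invocations quality requires, which is exactly the latency-for-quality trade mentioned above.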
Overall, developer platforms can now prioritize asynchronous, near-real-time architectures. Whether it’s supporting WebSocket query streams, or building UIs that update incrementally (like streaming SQL results), the tools that make data work feel instantaneous will set the new standard. This also means data engineers should get comfortable with event-driven patterns and local caching.
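The incremental-update pattern is simple to sketch: a generator yields query results in small batches, so a UI or WebSocket handler can render rows as they arrive instead of blocking on the full result set. This uses stdlib sqlite3 purely as a stand-in for a real engine like DuckDB or ClickHouse.

```python
import sqlite3

def stream_query(conn, sql, batch_size=2):
    """Yield query results in small batches so a UI (or WebSocket
    handler) can render incrementally instead of waiting for the
    whole result set. sqlite3 is a stand-in for a real engine."""
    cur = conn.execute(sql)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        yield batch  # e.g. push each batch down a WebSocket

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, kind TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(i, "click") for i in range(5)])

# 5 rows in batches of 2 arrive as three incremental updates.
batches = list(stream_query(conn, "SELECT id, kind FROM events"))
```

In production the consumer would be a frontend subscribed to the stream; the point is that nothing downstream has to wait for the last row.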
The Semantic Layer is Back: AI and Self Service Demand It
In the last decade, the idea of a separate semantic abstraction faded as the industry chased ELT and “schema on read” in data lakes. Sure, tools like Looker had some popularity, but most teams leaned into dbt/Jinja for transformations, treating “metrics” as just another SQL recipe.
(As you can imagine, this resurgence of interest has Looker's creator, Lloyd Tabb, hyped! Infectiously enthusiastic, he was there to show off his new project, Malloy, which takes the semantic layer to the next level by embedding reusable, object-oriented query definitions.)
I promise I won't keep bringing up my company, Twing Data, but it's relevant here. From what we see, even with dbt, data fragmentation happens easily and is extremely difficult to pinpoint and fix. As data consumption diversifies (from humans in a notebook, to AI agents in Slack, and well beyond), the risk of fragmentation increases. So does the need for a unified semantic layer to minimize that risk.
I saw great presentations from the aforementioned Lloyd Tabb (Malloy) and Michael Driscoll (Rill Data), representing the next-gen semantic layer and metric store, respectively.
Semantic Layer: Malloy
When every dashboard or AI bot defines “revenue” in its own SQL, you get drift. Malloy instead lets you declare “revenue” once, with explicit grain and business logic built in, so a report in Looker, a DuckDB notebook, and an LLM-powered Slackbot all compute the identical number.
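The define-once idea can be illustrated without Malloy's actual syntax: a single metric definition, with its business filter baked in, compiles to SQL for every consumer. Everything below (the `REVENUE` dict, `metric_sql`) is a hypothetical sketch of the concept, using stdlib sqlite3 as the engine.

```python
import sqlite3

# One central definition of "revenue": grain and business logic live
# in exactly one place, and every consumer compiles SQL from it.
# A conceptual sketch of a semantic layer, not Malloy's syntax.
REVENUE = {
    "table": "orders",
    "expr": "SUM(amount)",
    "filter": "status = 'complete'",   # refunds excluded once, for everyone
}

def metric_sql(metric, group_by=None):
    dims = f"{group_by}, " if group_by else ""
    sql = (f"SELECT {dims}{metric['expr']} AS value "
           f"FROM {metric['table']} WHERE {metric['filter']}")
    return sql + (f" GROUP BY {group_by}" if group_by else "")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL, status TEXT, region TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (100.0, "complete", "us"), (50.0, "complete", "eu"),
    (75.0, "refunded", "us"),  # must not count toward revenue
])

# Dashboard, notebook, and slackbot all call the same definition:
total = conn.execute(metric_sql(REVENUE)).fetchone()[0]
```

Because the refund filter lives in the definition rather than in each consumer's SQL, there is no query left to drift.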
Metric Store: Rill
Bridging OLTP, OLAP, and AI, Rill Data’s SQL‑based metrics layer similarly centralizes KPI definitions, then automatically monitors them for anomalies and powers real‑time dashboards. That same layer can be queried by AI agents or BI UIs, ensuring every consumer sees the same semantics and alert rules, all without custom code per tool.
Also worth mentioning, and as a segue to the next takeaway, after hearing Julien Le Dem’s (co-creator of Parquet and Arrow) talk on open and composable data systems, it seems plausible to contain semantic definitions within an Arrow-based in-memory layer, or simply an Iceberg table representing the single source of truth. With emerging deep interoperability, these can work with a variety of query engines as needed.
In this AI-native era, semantic layers/metric stores are the glue that holds together various databases, self-service analytics and autonomous agents. Define your metrics once, govern them centrally, and watch as every front‑end, whether human or machine, speaks the same language.
Interoperability, Modularity, and the Deconstructed Stack
Thanks to talks by Julien Le Dem and others, it became clear that the modern data stack is comparable to Lego bricks. Want a new ML feature store? Plug it into your lake via Parquet. Need a faster query engine? Swap it in! Your data is in Iceberg.
This open source modularity is enabling small teams to do big things. It's also shifting value away from proprietary systems and storage formats, and towards workflow orchestration and user experience on top of common building blocks. Here are some examples:
DuckDB is being embedded everywhere: Malloy, Hex, and Rill use it under the hood; Rill also runs it in-browser for local development; and MotherDuck offers DuckDB in the cloud. DuckDB + Parquet/Arrow is like the SQLite of data.
Iceberg, Arrow, and Parquet have hit escape velocity. From Python and Spark to BigQuery and Snowflake, these formats tie every engine and language together. Support for Iceberg isn't universal quite yet, but cloud providers are racing to bake it in natively. As Julien Le Dem noted, “blob storage is becoming table‑aware.” With transactions, schema evolution, time travel, and catalog services, Iceberg turns S3/GCS into a fully-featured table store.
The stack is becoming syntax-agnostic, mixing SQL, Python, and more. With Bauplan, you can define models and pipelines in Python. On the analytics side, Hex's multi-modal workbooks support a combination of languages in a single workflow.
Apache Arrow was the unequivocal hero of nearly every session on modular data architectures. I learned it underpins data movement for products at every scale. On the larger end, Ganesh from Hex did a medium-depth dive on how its DAG‑based “tablestore” persists results as Arrow tables so SQL and Python cells can execute in parallel. Elias from SDF (recently acquired by dbt) showed how Arrow enables the generated query plan to execute against any engine that supports it, so a model validated on Snowflake can be test-run locally in DuckDB, for instance. Earlier in its journey, Okta uses Arrow to transfer data from shards to S3/Iceberg for analytics. And the aforementioned Bauplan also uses Arrow when moving Iceberg tables in and out of a lambda.
One caveat: embracing this new composability doesn't mean ripping out a legacy warehouse overnight. Migrating decades of ETL, dashboards, and access controls into Iceberg tables or a DuckDB‑backed semantic layer takes planning and incremental roll‑outs. But for any new initiative, it certainly seems like the bandwagon to hop on.
Agents All The Way Down: AI Across the Stack
The term “agent” was everywhere. From building reliable agentic systems, to OpenAI’s agentic API, to RAG/testing/benchmarking frameworks, to infrastructure applications, the entire community is clearly excited about the potential, and also sober about the challenges.
AI as the Data Consumption Layer
As a baseline, RAG is the standard for agents: nearly every AI use case grounded the model with retrieved knowledge. The unanimous advice from Tengyu Ma (Voyage AI), Skylar Thomas (Cake.ai), a panel of AI leaders, and several other speakers was to skip naive vector search entirely: go straight to hybrid retrieval (embeddings + keywords), and use rerankers. An agent is only as good as its knowledge base, and building that base is a first-class engineering task. Lots of great techniques for data processing and retrieval were discussed. Skylar from Cake.ai gave an excellent (and fast, and dense) talk on RAG at scale, suggesting beefing up retrieval with LiteRAG and knowledge graphs, and recommending more than two types of retrieval across multiple databases. And Tengyu from Voyage AI suggested ways to process long documents with metadata, like neighboring-chunk summaries, to increase retrieval quality.
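To make "hybrid (embeddings + keywords)" concrete, here is a deliberately toy sketch: a keyword-overlap score blended with a cosine score over pretend embeddings, with the top candidates then handed to a (stubbed) reranking step. The corpus, vectors, and scoring functions are all invented stand-ins for BM25 and a real embedding model.

```python
import math

# Toy hybrid retrieval: blend a keyword score with a vector score,
# then rerank the top candidates. Everything here is a simplistic
# stand-in for BM25, real embeddings, and a cross-encoder reranker.
DOCS = {
    "d1": "quarterly revenue grew in the us region",
    "d2": "how to reset your password",
    "d3": "revenue recognition policy for refunds",
}
VECS = {"d1": [0.9, 0.1], "d2": [0.0, 1.0], "d3": [0.8, 0.3]}  # pretend embeddings

def keyword_score(query, text):
    q, t = set(query.split()), set(text.split())
    return len(q & t) / len(q)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def hybrid_search(query, query_vec, alpha=0.5, k=2):
    scored = [
        (alpha * keyword_score(query, DOCS[d])
         + (1 - alpha) * cosine(query_vec, VECS[d]), d)
        for d in DOCS
    ]
    candidates = sorted(scored, reverse=True)[:k]
    # A reranker (e.g. a cross-encoder) would rescore candidates here;
    # this sketch just returns them in hybrid-score order.
    return [d for _, d in candidates]

results = hybrid_search("revenue refunds", [0.85, 0.2])
```

Notice how the keyword half rescues `d3` (exact match on "refunds") even though its vector is slightly farther from the query than `d1`'s; that complementarity is the whole argument for hybrid.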
On the spectrum from scripted chains to end-to-end agentic models, the trend is moving away from fully-scripted chains of LLM calls and toward giving the model more autonomy to plan, use tools, evaluate, and iterate dynamically. For instance, the internal Twitch slackbot uses a selector agent to pick from several relatively deterministic chains. Meanwhile, OpenAI is rolling out the Responses API, which allows dynamic loops with tools from a single call. And somewhere in between, the presentation on Harvey AI, a specialized agentic platform for legal work, demonstrated a delicate balance between human-in-the-loop review and dynamic agents.
The strongest point of consensus, as ever, was observability and evals. A simple example: run EXPLAIN on a generated query before executing it in a text-to-SQL system. Deeper, more elaborate systems were a major topic of Skylar's talk, like running regression tests against eval datasets; the qualifying metric for “good” results must be defined and tested against, both to measure progress and to alert on issues. Lastly, human review is inevitable, especially for new or domain-specific systems.
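The EXPLAIN check is cheap to implement: ask the engine to plan the query first, so syntax errors and missing tables surface before anything runs. A minimal sketch, using stdlib sqlite3 (which supports EXPLAIN) as a stand-in for the warehouse:

```python
import sqlite3

def safe_run(conn, sql):
    """Guardrail for LLM-generated SQL: EXPLAIN the query first so
    syntax errors and missing tables are caught before execution.
    sqlite3 stands in for a real warehouse here."""
    try:
        conn.execute(f"EXPLAIN {sql}")       # plan only, nothing executes
    except sqlite3.Error as e:
        return None, f"rejected: {e}"
    return conn.execute(sql).fetchall(), None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER)")
conn.execute("INSERT INTO users VALUES (1)")

rows, err = safe_run(conn, "SELECT id FROM users")            # valid
bad_rows, bad_err = safe_run(conn, "SELECT id FROM missing")  # bad table
```

EXPLAIN doesn't validate semantics, of course; it's the first and cheapest layer, beneath regression tests against eval datasets and, ultimately, human review.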
AI as System Architecture
Considering the catastrophic risk of mangled data, the idea of agents building and running pipelines might sound like sci‑fi. After all, AI still hallucinates now and then, right? Believe it or not, we’re closer to solving that than most realize. If you’ve built a RAG system, you know the power of well-targeted context. The challenge has always been acquiring that context.
Historically, your data “context” – the rich metadata, semantic rules, transformation logic, and business definitions that make raw tables meaningful – has been scattered across silos:
Pipeline orchestrators like Airflow capture only task dependencies and schedules.
Transformation logic lives as standalone SQL files in dbt or custom scripts, with no machine-readable declaration of input/output schemas or business intent.
Semantic layers in BI platforms (LookML, Power BI datasets, Tableau models) define dimensions and measures behind proprietary GUIs, with no unified registry.
Cloud warehouses (Redshift, Snowflake, BigQuery) expose schema and lineage via vendor-specific system tables and APIs that know nothing of downstream semantics.
Hand-written documentation, in Confluence pages, data dictionaries, or spreadsheets, quickly grows stale and disconnected from the code that actually runs.
Because all these pieces live in different systems, in different formats, and often behind closed-source UIs or proprietary APIs, an AI agent can't simply “look up” the context it needs.
Today, though, that context can all exist as normal code – version‑controlled, semantically defined, and instantly retrievable by an AI agent. This is because end to end data systems are being fully redesigned as composable code artifacts by several cutting edge companies. At Rill Data, DuckDB and ClickHouse power instant, streaming dashboards, in‑browser, that reflect metrics defined both semantically and as code. Further, Bauplan takes the version‑controlled, data‑as‑code paradigm beyond modeling and metrics to the entire pipeline infrastructure.
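What "context as code" might look like in miniature: a transformation declares its inputs, outputs, and business intent in a machine-readable registry, so an agent can query it directly instead of scraping wikis. All the names here (`Model`, `REGISTRY`, `context_for`) are hypothetical, not any vendor's API.

```python
from dataclasses import dataclass

# "Context as code": a pipeline model declares its schemas and intent
# in a version-controlled, machine-readable form. Hypothetical sketch.
@dataclass
class Model:
    name: str
    sql: str
    inputs: dict            # table -> columns it reads
    output_schema: dict     # column -> type
    intent: str             # the "why", in plain language

REGISTRY = {}

def register(model):
    REGISTRY[model.name] = model

register(Model(
    name="daily_revenue",
    sql="SELECT day, SUM(amount) AS revenue FROM orders GROUP BY day",
    inputs={"orders": ["day", "amount"]},
    output_schema={"day": "DATE", "revenue": "DECIMAL"},
    intent="Completed-order revenue per day, excluding refunds.",
))

def context_for(table):
    """What an agent would retrieve before touching `table`."""
    return [m for m in REGISTRY.values() if table in m.inputs]

deps = context_for("orders")
```

Because these declarations live in git alongside the SQL, they can't silently drift from the code the way a Confluence page does, and an agent can retrieve them as easily as any other source file.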
Pair this with self-healing built on deep observability, plus ever-increasing model quality and tool-use ability, and the future is unmistakable: the tools of tomorrow won't just store or query data, they'll reason over it, monitor themselves, and act in real time. Given the exponential growth we've seen so far, it's clear to me that tomorrow is right around the corner.
The Future is What We Make It
In his talk, Julien Le Dem reminisced about the past excitement around native data applications as the next wave of value creation, and pointed out that AI's emergence as the new consumption layer tempered that wave. It's too early to tell whether native data apps will be swallowed completely by ubiquitously embedded AI. But the industry isn't holding its breath to find out. The trends toward modularity, a unified semantic layer, hyper-fast databases, and responsive front ends all help enable AI to integrate more deeply throughout and across systems.
Of course, many see AI as a threat – to their jobs, to humanity even – but at Data Council 2025, the vibe was infectious, energetic excitement. Perhaps the biggest takeaway of all is that it’s still so early that no one really knows what AI’s role in our world will look like, in any of its applications. I can certainly see the negative viewpoint, but Data Council 2025 was an affirmation that there’s nowhere I’d rather be than on the bleeding edge, helping to shape what this awesome technology becomes.