Retiring legacy BI with an agentic-built Streamlit app on SPCS
Build-vs-buy got rewritten by agentic coding. So I retired the legacy BI tool. 😎
TL;DR: Killing the legacy BI workbooks. Replacement is a Streamlit app on Snowpark Container Services (SPCS), mostly typed by Claude Code on the orchestration side and Cortex Code on the Snowflake side. Build-vs-buy just got rewritten by agentic engineering.
For years, “build a custom BI app” kept losing to “buy a BI license” for one reason: engineering time. Per‑seat licensing is annoying, but maintaining a custom dashboard stack as a side job is worse, so I (like most others) kept paying the per‑seat tax.
Then agentic engineering got good. Suddenly the math flipped. One engineer with Claude Code handling orchestration / Streamlit / dlt / Dockerfile, and Cortex Code introspecting the Snowflake account to generate canonical SQL against semantic views, can ship a multi-domain dashboard app in days. What used to cost three person‑months now costs three person‑days, and the cost keeps tilting further with every model release.
And apparently I am not alone in this vision:

This is the story of what replaced my legacy BI stack: an internal Streamlit app, very creatively called "Dashboards", running as one SPCS service inside our own Snowflake account, behind Snowflake SSO.
Then again: why do we need dashboards at all? If agents are so good at answering questions, do we really still need these rather static artifacts? I believe there is a place for non‑ephemeral data visualizations: for monitoring, as part of operational processes, or for a quick view of the status quo in a board meeting, perhaps as a starting point for further questions and analysis that agentic analytics can then take on. Still, the first piece I added to my dashboards app was a link to Snowflake Intelligence, shown here in action:

Why kill the legacy BI tool?
The pain list before the build was the usual one for standalone BI:
- Per-seat licensing scaling linearly with audience.
- Slow turnaround on changes (workbooks living outside git, personnel bottleneck on the publishing flow).
- Opinionated chart library that doesn't speak our visual language.
- Hard to embed organization‑specific UI conventions: Swiss number formatting with apostrophe thousands separators (CHF 1'234'567), generic feminine across all UI text (Kundin, never Kunde), our brand colors, our logo in the right slot (a formatting sketch follows this list).
- No good story for embedding ad-hoc Python computations next to a chart.
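For concreteness: once you own the rendering layer, the Swiss formatting is a one-liner. A sketch (the helper name is mine, not the app's exact code):

```python
# lib/components.py-style helper -- illustrative sketch
def format_chf(amount: float) -> str:
    """1234567 -> "CHF 1'234'567" (apostrophe thousands separators)."""
    return "CHF " + f"{amount:,.0f}".replace(",", "'")
```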
None of these are individually fatal. Together they're a constant low-grade operational tax.
The agentic-coding shift: one-year-ago-me could not have justified the build. I would have written ~4'000 lines of Streamlit by hand and given up on iteration speed by week three. With agents handling most of the typing, each domain page I add (operational and financial overviews as well as drill-downs on operational processes) is a less-than-one-day affair.
Architectural note: each tool stays in its lane. Claude Code does not touch the Snowflake account directly. Cortex Code does not write Streamlit code (CoCo CLI could, of course, do this just as well, but is priced differently using ❄️ credits 😉). The .sql files Cortex Code emits are committed to the repo for auditability; raw tool traces are gitignored for PII safety. Snowflake’s own guidance applies: if the answer depends on your Snowflake account, use Cortex Code.

Why Streamlit, and not a React app
Honest version: agentic coding doesn't make every framework equally easy. Streamlit hits a sweet spot for this kind of work for a few reasons.
One language, one process. A React app usually means a TypeScript / Vite / probably Tailwind / probably Recharts or D3 frontend, plus a Python or Node API tier, plus a build pipeline, plus auth wiring on both sides. Each of those is a place where the agent can pick a different convention, drift the dependency versions, or fight with a config file. Streamlit collapses all of it: one Python process, server-rendered widgets, no API tier, no bundler, no separate auth flow. The agent only has one mental model to keep coherent, which directly improves reliability and speed. And token usage 😉
State is implicit, not invented. React forces an explicit state model (hooks, context, maybe a store). When an agent writes React, every new feature is a small architectural decision. Streamlit's rerun-the-script-on-event model is dumb in the good sense: there is exactly one way to handle state, and the agent doesn't have to invent one.
The widget vocabulary is small and stable. st.selectbox, st.dataframe, st.plotly_chart, st.cache_data, st.cache_resource. That's most of it. Compare with the React ecosystem, where the agent has to choose between five chart libraries, four form libraries, three data-fetch libraries, two routers, etc. Smaller surface area = fewer wrong turns.
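For a sense of how little that vocabulary is in practice, here is roughly what a page reduces to (a sketch with stubbed data; the real pages query through lib/cache.py):

```python
import pandas as pd
import plotly.express as px
import streamlit as st

@st.cache_data(ttl=3600)  # repeat visitors are served from cache
def load_kpis(domain: str) -> pd.DataFrame:
    # Stubbed for the sketch; the real app routes queries through lib/cache.py
    return pd.DataFrame({"month": ["Jan", "Feb", "Mar"], "value": [3.0, 1.0, 2.0]})

domain = st.selectbox("Domain", ["ABC", "DEF"])
df = load_kpis(domain)
st.plotly_chart(px.bar(df, x="month", y="value"))
st.dataframe(df)
```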
Native Snowflake citizenship. Snowpark sessions, Streamlit-in-Snowflake, SPCS deployments, OAuth via the injected token at /snowflake/session/token: none of this needs glue code. The same is not true for a React app: I would have had to stand up a separate auth-aware backend just to broker queries against Snowflake.
What I lose: pixel-perfect UI control, true component reuse across non-Streamlit surfaces, real-time updates without a manual rerun. None of those matter for an internal dashboard app where the audience is ~50 people behind SSO and the data is usually refreshed daily.
What about Plotly Dash, Panel, Hex, Mode? Dash and Panel are the closest comparables for this use case. I picked Streamlit mostly because the agentic-coding ecosystem around it is the densest right now (Streamlit publishes its own agent-skills repo, the Snowflake guides assume Streamlit, the docs are LLM-friendly). For a tool that lives or dies by how fast I can iterate with an agent, ecosystem density matters more than feature parity. Hex and Mode are notebooks-as-a-service: different shape. They solve "share an analysis", not "ship an internal app".
The shape of the app
One Streamlit app. One container image. One SPCS service. Authentication to data living in Snowflake is effectively free: SPCS injects an OAuth token at /snowflake/session/token, and Snowpark plus the Snowflake connector both pick it up transparently. No external IAM and no third‑party SaaS in the user‑facing path.
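A sketch of what that factory amounts to (env handling simplified; SNOWFLAKE_HOST and SNOWFLAKE_ACCOUNT are injected by SPCS):

```python
# lib/snowflake.py -- sketch of the connector factory
import os
import snowflake.connector

def get_snowflake_connector_connection():
    # SPCS mounts a short-lived OAuth token into every container at this path
    with open("/snowflake/session/token") as f:
        token = f.read()
    return snowflake.connector.connect(
        host=os.environ["SNOWFLAKE_HOST"],        # injected by SPCS
        account=os.environ["SNOWFLAKE_ACCOUNT"],  # injected by SPCS
        authenticator="oauth",
        token=token,
    )
```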
streamlit_app.py
pages/
  ABC_Overview.py
  ABC_Operations.py
  DEF_Overview.py
  DEF_Operations.py
sql/
lib/
  snowflake.py      # snowpark + snowflake-connector factories
  populate.py       # dlt pipeline: snowflake -> local duckdb
  local_duckdb.py   # read-only conn + schema rewrite
  cache.py          # routing + fallback + ttl cache
  boot.py           # cold-start overlay
  components.py     # KPI cards, CHF formatting, sidebar nav, ...
  auth.py           # role-based page gating
Dockerfile
dashboards-app.yaml

A scheduled ❄️ task resumes the service Mon-Fri at 07:00 Europe/Zurich and suspends it at 19:00. Cost guard: if the dashboards-app and the other apps on the same compute pool are all idle, the compute pool itself auto-suspends.
One deployment gotcha: never redeploy with CREATE OR REPLACE SERVICE. That mints a new public-endpoint hash and breaks every bookmarked URL. All deploys go via ALTER SERVICE ... FROM @stage SPECIFICATION_FILE = '...', which is metadata-only and hash-preserving.

The MotherDuck detour, and why I retired it
The first version of the app had a cache tier in front of Snowflake, to avoid spinning up a Snowflake Warehouse every time someone opened a dashboard: a MotherDuck mirror of the relevant objects (tables and views), populated daily by a separate SPCS service running a dlt pipeline. Why? Unless running on an interactive Warehouse, KPI tile aggregations on Snowflake felt sluggish in cold-cache cases, while MotherDuck gave sub-100 ms queries for small data at flat (and significantly lower) pricing, including a free tier, for 24/7 availability.
It worked. But it added another operational tax: a second SPCS service, a network rule, an external access integration, a token rotation chore, two images to maintain, two failure modes to monitor.
Why a cloud DWH as cache, before the rethink
Worth being honest about why MotherDuck was my first instinct: it's a paradigm thing.
Coming from a data-warehouse-first background, "cache layer" defaults to "another database, in the cloud, with replication". MotherDuck fit that mental model perfectly: managed DuckDB, accessible over HTTPS, with a connection string that looks just like a warehouse. Set up a replication pipeline, point the app at it, done. Familiar.
What I missed for too long: that's the wrong shape for a Streamlit app on SPCS. A cloud DWH cache assumes the app is far from the data and many readers share the cache. Neither is true here. The app is one container. The "cache" is read by one process. Network round-trips to a managed DuckDB across the public internet are pure overhead compared to a local file on the same machine.
The latency numbers tell on the paradigm. A query against a local DuckDB file is sub-10 ms for the same KPI tile. The MotherDuck round-trip was sub-100 ms (already great compared to a cold Snowflake warehouse), but a tenth of that, with no token to rotate and no EAI to maintain, is just better.
Paradigm shift: when the application is the only consumer, the cache should live next to the application, not in the cloud. Cloud DWHs make sense when many consumers share state, or when the data exceeds local disk, or when persistence across processes is required. None of those apply to a 1-2 GB internal dashboard app whose container restarts daily anyway.
This is not a knock on MotherDuck. The same DuckDB engine, hosted, is exactly right when readers are distributed (other pipelines might very well still write there because their consumers are external). But for an embedded-cache role inside one container, a /tmp/dashboards.duckdb file just wins on every axis: latency, operational surface, cost, blast radius.
The general lesson: when you adopt a new app shape, default instincts about where state lives can be a generation behind. I had the right tools. I was just composing them in the shape of the previous decade's architecture 🤦
Folding the populate into the app
Then the realization (a duh moment in hindsight). SPCS already restarts the dashboards container every weekday at 07:00. That's a perfect natural sync point. What if the pipeline (using dlt, of course 😎) simply ran inside the dashboards container at startup, writing to a local DuckDB file in /tmp? The container is ephemeral anyway. tmpfs lives exactly as long as I need.
So I folded the populate into the app. Wall time on the SPCS container (resource requests: 1 CPU / 2 Gi):
- 11 modeled tables, ~16 million rows (including a 3.79M fact and an 8.33M bridge): 52.36 seconds.
- 3 reporting views, 301 rows: 0.34 seconds.
Snowflake to local DuckDB in roughly one minute. Streaming via cur.fetch_arrow_batches(), dlt with loader_file_format="parquet", write_disposition="replace" and the same OAuth token at /snowflake/session/token (simply using get_snowflake_connector_connection() from lib.snowflake). No incremental cursor logic, just a full refresh on every container start. For a 1-2 GB dashboard dataset, that trade‑off is well worth the simplicity. And the dashboards are still available at 07:01 in the morning 😜
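The populate step itself stays small. A sketch under those settings (pipeline and dataset names are illustrative):

```python
# lib/populate.py -- sketch of the startup refresh
import dlt
from lib.snowflake import get_snowflake_connector_connection

def populate(tables: list[str]) -> None:
    conn = get_snowflake_connector_connection()  # same injected OAuth token
    pipeline = dlt.pipeline(
        pipeline_name="dashboards_mirror",       # illustrative name
        destination=dlt.destinations.duckdb("/tmp/dashboards.duckdb"),
        dataset_name="domain",
    )
    for fq_name in tables:  # e.g. "CONSUME.DOMAIN.T_FACT_ABC"
        with conn.cursor() as cur:
            cur.execute(f"SELECT * FROM {fq_name}")
            pipeline.run(
                cur.fetch_arrow_batches(),          # stream Arrow batches, flat memory
                table_name=fq_name.split(".")[-1],  # final casing: dlt naming convention, see below
                write_disposition="replace",        # full refresh, no cursor logic
                loader_file_format="parquet",
            )
```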
Routing inside the app is one regex. Queries that hit large fact tables (hundreds of millions of snapshots, not worth mirroring), Snowflake table functions (not quite expressible as a mirrored table), or data out of canonical scope go straight to Snowpark. Everything else hits the local DuckDB file, with a Snowpark fallback on any exception. SQL templates literally say CONSUME.DOMAIN.T_FACT_ABC, and the routing function local_duckdb._rewrite_to_local() rewrites them to domain.t_fact_abc for the local path. One SQL source per query, two backends.
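A sketch of that routing (the Snowflake-only pattern and the run_local/run_snowpark executors are illustrative stand-ins, not the app's exact code):

```python
# lib/cache.py -- illustrative sketch of routing + fallback
import re

# Queries that must stay on Snowflake: huge snapshot facts, table functions, ...
_SNOWFLAKE_ONLY = re.compile(r"\bTABLE\s*\(|T_FACT_SNAPSHOT", re.IGNORECASE)

# Canonical fully qualified names like CONSUME.DOMAIN.T_FACT_ABC
_QUALIFIED = re.compile(r"\bCONSUME\.(\w+)\.(\w+)\b")

def _rewrite_to_local(sql: str) -> str:
    """CONSUME.DOMAIN.T_FACT_ABC -> domain.t_fact_abc for the local file."""
    return _QUALIFIED.sub(lambda m: f"{m.group(1).lower()}.{m.group(2).lower()}", sql)

def run(sql: str):
    if _SNOWFLAKE_ONLY.search(sql):
        return run_snowpark(sql)                  # straight to Snowpark
    try:
        return run_local(_rewrite_to_local(sql))  # local /tmp DuckDB file
    except Exception:
        return run_snowpark(sql)                  # generous fallback; it hides bugs (see below)
```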
What I kept: the same .sql files for both paths. What I deleted: an entire SPCS service, a token, an EAI binding, and ~250 lines of Python.
The casing rabbit hole
Another honest lesson, no marketing gloss.
After deploying the new local-DuckDB build, the Donors page crashed with KeyError: 'LEADSOURCE' on filter_df["LEADSOURCE"].dropna(). The pages were accessing columns by their Snowflake-style UPPERCASE names.
Root cause: dlt's default naming_convention="snake_case" lowercases every identifier on the way to the destination. Storage tables had leadsource, not LEADSOURCE. DuckDB's lookup is case-insensitive, so SELECT d.LEADSOURCE FROM t resolves fine, but result.df(), used to pass the data from DuckDB to Streamlit, returns column names with the case of the underlying storage.
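A minimal repro of the behavior described above:

```python
import duckdb

con = duckdb.connect()  # in-memory is enough for the repro
con.execute("CREATE TABLE t (leadsource VARCHAR)")

# Case-insensitive lookup: this resolves fine ...
df = con.execute("SELECT t.LEADSOURCE FROM t").df()

# ... but the result carries the storage casing, not the query casing
print(df.columns.tolist())  # ['leadsource'] -> KeyError on df["LEADSOURCE"]
```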
The mind-bender: it had been broken on MotherDuck the same way all along. Pages were silently falling back to Snowpark for those specific queries, and Snowpark returned UPPERCASE columns, so the pages were happy. Once the local-DuckDB path executed the rewritten (DuckDB-dialect) queries successfully on a fresh local file (different cache effects, possibly different dialect strictness), the lowercase columns surfaced and the pages broke.
Two fixes were on the table. Either uppercase the columns at the boundary in run_local(), or switch dlt to a case-preserving naming convention (SCHEMA__NAMING=direct env var, set before the dlt import). I picked the latter: the mirrored objects use plain ASCII identifiers, so direct naming is safe. Storage now matches what Snowflake actually has, which makes ad-hoc inspection of the local file via the DuckDB CLI intuitive.
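The env var has to be set before dlt is first imported; in this app that looks roughly like:

```python
# top of lib/populate.py -- must run before the first dlt import
import os
os.environ["SCHEMA__NAMING"] = "direct"  # preserve identifier casing end-to-end

import dlt  # noqa: E402 -- the import order is the point
```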
The line worth quoting: I didn't realize the casing was load-bearing until I removed the dialect-rejection failure mode that had been masking it. The “fall back to Snowpark on any error” pattern is generous, but it hides bugs.
What I got
One thing to deploy, one thing to monitor, one image, no token, no outbound HTTPS from the dashboards container at all.
And none of this would have shipped under the old build‑vs‑buy economics. Roughly three days of part‑time engineer time, with agents typing most of the lines, gave me a BI replacement that I’ll iterate on for years. Adding a new dashboard to the app takes about 1 hour (with 80% of the work happening autonomously while I am sitting in meetings with other humans) and works particularly well if I can provide a previously existing dashboard (XML, JSON, screenshot) as a template. Updating existing assets in the app is usually just a prompt away.
If the dashboard data fits in a DuckDB file and the container restarts daily, a separate sync service might just be over‑engineering 🦆