MotherDuck Flights: pipelines an agent can fly, and why dlt belongs in the cockpit

MotherDuck shipped agent-native Python pipelines. The obvious thing to run inside one is dlt.

Share
MotherDuck Flights: pipelines an agent can fly, and why dlt belongs in the cockpit

TL/DR: πŸ¦† MotherDuck shipped Flights yesterday: agent-native data pipelines that run arbitrary Python next to the warehouse. My early take: dlt should be the library inside them.


For years the data-stack pitch was constraint as a feature: here are forty connectors, write some SQL, stay inside the lines. Jordan Tigani's framing of Flights flips that. LLMs don't need the lines. They thrive with a broader palette, so MotherDuck gave them one: a Flight can run virtually any Python, call out to an LLM, drive a headless browser, shell out to git. The constraint that made pipelines safe for humans is the constraint that makes them boring for agents.

What a Flight is

A Flight is a Python program that MotherDuck schedules and runs with direct access to the database. Under the hood there are two compute planes over the same storage. Ducklings are the per-user SQL compute we already know (they spin up in milliseconds). The Python runtime is new: it's per-run, isolated, and it executes the source as a plain script, then exits. The file ends the boring way:

if __name__ == "__main__":
    main()

That script talks to the data through a normal DuckDB connection to md:, which routes through a Duckling. MotherDuck injects an access token into the run's environment automatically, so there's no token handling in your code. Sounds a lot like ❄️ Snowflake SPCS? That's because this mechanism is very similar there.

"Anything you can pip install, you can build" is the actual design goal, and it mostly holds: list dependencies in a requirements.txt and they're there at runtime.

There are three ways to create and manage Flights, and they're the same surface wearing different clothes:

  • An MCP server. Any MCP agent (Claude, Cursor, OpenCode, ...) gets the full Flights surface as tools: create, run, schedule, update, inspect logs, version, delete. There's a built-in get_flight_guide instruction set so the agent knows the house rules. This is the marquee path, and it's where the "vibe-codeable pipeline" claim comes from.
  • SQL table functions. Every Flight operation has a matching SQL function (MD_CREATE_FLIGHT, MD_RUN_FLIGHT, MD_UPDATE_FLIGHT, MD_FLIGHTS). A pipeline is a SELECT away, callable from the DuckDB client, a BI tool, dbt, or even another Flight.
  • The Flights UI. Paste Python, set a schedule, trigger runs, read logs and run history, manage versions and env vars.
Flights is in public preview, and the Python runtime is shared compute across tenants (databases are not). MotherDuck is explicit: during preview, don't process ePHI, payment-card data, or regulated personal data in a Flight!

How Flights bill

Pricing is runtime consumption: you pay for the time a Flight's Python runtime is actually running. Tigani put a number on it in his SiliconAngle interview: it starts at about 60 cents per hour. He was also candid that revenue isn't the point here. The goal is to get data into MotherDuck with as little friction as possible, and the pricing reflects a feature meant to remove a barrier, not to be a profit center.

The practical consequence: a Flight is billed like a job, not a (always-on) worker (Hello ❄️ Openflow! 😜). It runs to completion and exits, so a tight ingestion script that finishes in ninety seconds costs ninety seconds. That immediately makes me think about scheduling. Frequent, short, idempotent runs are cheap. Long-running in-memory crunching is where the meter hurts - keep heavy compute in the warehouse and let the Flight just move data.

How to build and maintain one

The launch shipped with a set of Flight Plans (templates πŸ˜‰) on GitHub, and the dlt ingest one is the most instructive, because it shows the pattern intended by MotherDuck. The idea: adapt a Flight by changing config, not by editing the code. Every knob (destination database, dataset, table, write disposition, primary key, schedule) is read from Flight config at the top of flight.py. The only thing to rewrite is the source function.

Before deploying anything, smoke-test the exact file locally, because a Flight is just a Python script:

export MOTHERDUCK_TOKEN=<your_token_here>
uv run --with-requirements requirements.txt flight.py

That single run creates the database, loads the demo data, and writes one row to a run ledger. The run ledger is the maintenance story: each run appends an audit row capturing the load-package summary, written with bound parameters. Identifiers that can't be parameterized (database name, ledger table) are validated against ^[A-Za-z_][A-Za-z0-9_]*$ before any SQL runs. It's a small thing, but it's the difference between a template and a toy.

Deployment is a SQL call. Create the Flight without a schedule, trigger one manual run, confirm the tables and ledger row appear, then attach a cron schedule once it's green:

-- create, then verify a manual run, then schedule
call md_create_flight(
  name := 'dlt_ingest',
  source_code := $$ ...contents of flight.py... $$,
  requirements_txt := $$ duckdb
dlt[motherduck]
httpx $$,
  config := { 'DESTINATION_DATABASE': 'analytics', 'WRITE_DISPOSITION': 'merge', 'PRIMARY_KEY': 'repo' }
);

-- one manual run to confirm it's green
select md_run_flight(flight_id := '...');

-- only then add a daily schedule (07:15 UTC)
select md_update_flight(flight_id := '...', schedule_cron := '15 7 * * *');

The token is attached to the Flight automatically and injected as MOTHERDUCK_TOKEN - it's never in the config. Source credentials (for a private APIs or a Postgres) go in a MotherDuck Flights secret, also injected as env vars at runtime, never in the config blob.

Where dlt fits

MotherDuck names dlt as the recommended ingest library, and that's the right call, but the reason runs deeper than "it has a MotherDuck destination."

dlt's whole design premise is that it runs where Python runs. No server, no control plane (unless you want one): it's an imported library and executes whatever process handed to it. I've poked at the edges of that claim before, trying to run dlt inside a Snowflake UDF and then inside SPCS, precisely because "runs where Python runs" is a strong claim worth testing:

Can you run dlt inside Snowflake? (Part 2/2: SPCS)
In my last post I explored whether dlt could run in a Snowflake UDF... and failed. But can it still run inside Snowflake using a Container? TL/DR: yes, it does 😎

A Flight is (second to dltHub Pro, of course) the friendliest possible version of that environment. It's a clean Python process with a writable /tmp, a requirements.txt, and a database connection already wired up. dlt drops straight in. Declarative pipelines, automatic schema evolution, incremental loading, and state tracking, none of which an LLM has to come up with from scratch alongside the raw INSERT statements. The Flight handles the scheduling, retries, and run history; dlt handles the schema and the load. Clean division of labor.

The single-file pattern is pretty tidy:

import os
import dlt

def main():
    os.environ.setdefault("HOME", "/tmp")  # dlt writes working files under HOME
    pipeline = dlt.pipeline(
        pipeline_name="github_stats",
        destination="motherduck",
        dataset_name="analytics",
    )
    pipeline.run(
        repo_rows(),                 # swap this for any dlt source
        table_name="repos",
        write_disposition="merge",
        primary_key="repo",
    )

if __name__ == "__main__":
    main()

Swap repo_rows() for any source dlt supports (a REST API, Postgres, BigQuery, S3, or anything in the verified-source catalog or any custom source alike) and the pipeline ingests it without touching its mechanics. One default worth keeping (in most cases): the template sets loader_file_format="parquet" so larger sources stay on a bulk-loading path instead of row-wise inserts.

And this is where the agent angle stops being marketing. An LLM is good at writing a dlt source function - I've done this dozens of times by now. It's good at picking a write disposition and a primary key. It's much worse at hand-writing idempotent merge SQL with schema drift handling. dlt is the guardrail that lets the agent paint with the broad palette while the boring correctness stays handled. That's a better story than "the agent writes raw SQL and hopes."

Add dltHub's workbench on top and transformations can be shifted left and already happen in-Flight:

Agents can write dlt pipelines. Now they can run & deploy them.
Introducing the dltHub AI Workbench: an infrastructure layer for dltHub that makes AI-generated dlt pipelines trustworthy for production.

Yesterday's launch is a Python runtime bolted onto a warehouse, billed by the second, drivable by an agent... and dlt is the obvious thing to run inside it.