Snowflake Iceberg in 2026: open format, now without the storage chores

Three Iceberg GAs in one week shift the old tradeoff: open format without babysitting a bucket.

TL;DR: in the first week of June 2026, ❄️ Snowflake shipped three 🧊 Iceberg GAs that, taken together, shift the old tradeoff in the user's favor: you can have an open table format without babysitting your own cloud storage. The three are Snowflake-managed Iceberg storage, a connector to Google's BigLake Metastore catalog, and a parameter that makes Iceberg the default table format so you can stop typing the ICEBERG keyword.

For a while, choosing Iceberg on Snowflake meant signing up for ops work. I created an external volume, pointed it at an S3 or ADLS bucket, granted Snowflake access, picked a catalog, and only then got a table. That was fine when I already ran that storage anyway, but it was a lot of yak-shaving when all I wanted was an open format I could read from other engines later. Three GAs in early June change the shape of that decision.

1. Snowflake-managed storage: Iceberg without a bucket

The headline GA (June 1) is Snowflake storage for Iceberg tables. Snowflake stores and manages the Parquet and metadata files for me. No external volume, no IAM grant. I create the table and Snowflake handles the rest:

create iceberg table my_events (
  event_id   string,
  event_ts   timestamp_ntz,
  payload    variant
)
  catalog = snowflake
  external_volume = snowflake_managed;

That external_volume = snowflake_managed is the whole trick. It is a reserved value, not an external volume object that I created; I never run create external volume for this path. If my account or schema default catalog is already Snowflake and the default volume is SNOWFLAKE_MANAGED, I can drop both lines and just write create iceberg table ....
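A minimal sketch of that shorter form, assuming the schema-level CATALOG and EXTERNAL_VOLUME parameters accept the same values as the table options above. The schema name my_db.my_schema is hypothetical, and whether 'SNOWFLAKE_MANAGED' is accepted as a default in a given account is an assumption worth checking before relying on it:

```sql
-- my_db.my_schema is a hypothetical schema name.
-- Set the defaults once at schema level (assumed to accept these values).
alter schema my_db.my_schema set catalog = 'SNOWFLAKE';
alter schema my_db.my_schema set external_volume = 'SNOWFLAKE_MANAGED';

-- Inspect what the schema now resolves to before creating anything.
show parameters like 'CATALOG' in schema my_db.my_schema;
show parameters like 'EXTERNAL_VOLUME' in schema my_db.my_schema;

-- With both defaults in place, the table options can be left out.
create iceberg table my_db.my_schema.my_events (
  event_id   string,
  event_ts   timestamp_ntz,
  payload    variant
);
```

Setting the defaults at schema level rather than account level keeps the change scoped: tables created elsewhere in the account still need explicit options.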

One detail that genuinely matters for production: these tables get fail-safe. A permanent Snowflake-managed Iceberg table has the same fail-safe recovery window as a regular Snowflake table. That has not been true for customer-managed Iceberg, where recovery was the customer's problem. If I want to avoid the fail-safe storage cost, I can always make the table transient:

create transient iceberg table my_scratch (col1 int)
  catalog = snowflake
  external_volume = snowflake_managed;

Transient Iceberg tables are only supported with Snowflake storage - Iceberg tables on a customer-managed volume cannot be made transient.

Availability caveat: Snowflake-managed Iceberg storage is available in AWS and Azure commercial regions only (so no GCP or gov regions). If you are on GCP, the BigLake integration below is your story instead. Also, Tri-Secret Secure accounts may need Snowflake Support to enable the feature, since these tables only support server-side encryption (no customer-managed keys).

2. BigLake Metastore: the GCP interop gap closes

The second GA (June 2) is Snowflake's catalog integration for Google Cloud BigLake Metastore. BigLake Metastore is Google's catalog, not a Snowflake feature. What Snowflake shipped is the connector: an Iceberg REST catalog integration that lets Snowflake discover and query Iceberg tables registered in BigLake. This is the GCP answer to the "no Snowflake-managed storage on GCP" limitation. Set up workload identity federation so Snowflake authenticates to Google without long-lived service account keys, then create a catalog-linked database that reads the BigLake Iceberg tables.

The part worth calling out is the auth model. No service account JSON key files sitting in a secret somewhere. Workload identity federation means Snowflake presents its own identity to Google and Google trusts it. One fewer long-lived credential to rotate or leak.

Configure a catalog integration for Google Cloud BigLake Metastore | Snowflake Documentation


3. Default metadata write format: stop typing ICEBERG

The third GA (June 5) is the quiet but practical one: the DEFAULT_METADATA_WRITE_FORMAT parameter. Set it at account, database, or schema level and plain CREATE TABLE starts producing Iceberg tables. No ICEBERG keyword, no per-statement catalog and volume options.

alter schema icing.stage
  set default_metadata_write_format = 'iceberg';

-- now this is an Iceberg table, no keyword needed
create table skates (
  skate_id    string,
  skated_at   timestamp_ntz
);

Why does this matter beyond saving a keyword? Because it lets me make Iceberg the house style for a schema or a whole database without rewriting every CREATE TABLE in dbt projects or other scripts. Flip the parameter, and existing DDL produces open-format tables. My personal hesitance to migrate all my DDLs blocked quite a few use cases 😅 The release also brings automatic DDL promotion in catalog-linked databases, so CREATE TABLE against those creates Iceberg tables too.
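Before flipping this on a shared schema, it helps to be able to see where the value comes from and to undo it. A minimal sketch, reusing the icing.stage schema and skates table from the example above; it assumes the parameter can be inspected and unset like other object parameters:

```sql
-- Show the effective value and the level it was set at.
show parameters like 'DEFAULT_METADATA_WRITE_FORMAT' in schema icing.stage;

-- Confirm that the plain CREATE TABLE produced an Iceberg table.
show iceberg tables like 'skates' in schema icing.stage;

-- Roll back: the schema falls back to whatever it inherits
-- from the database or account.
alter schema icing.stage unset default_metadata_write_format;
```

The level column in the SHOW PARAMETERS output is the useful part: it tells me whether a schema picked up Iceberg from its own setting or inherited it from the database or account.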

The billing gotcha nobody puts in the demo

Snowflake-managed storage is convenient, but the bill has a shape worth understanding before pointing external engines at it. Querying these tables from the Snowflake engine in the same account is not separately charged. The moment an external query engine reads them through Horizon Catalog, I pay per request.

External engine = per-request fees. Any non-Snowflake engine (Spark, DuckDB, Trino), or even a Snowflake engine in a different account, that reads a Snowflake-managed Iceberg table via Horizon Catalog is billed a per-request fee. PUT/COPY/POST/PATCH/LIST requests are class 1; GET/SELECT requests are class 2. Cross-region or cross-cloud access adds data transfer charges on top. The STORAGE_REQUEST_HISTORY view in Account Usage shows the request counts. The "open format, query from anywhere" pitch is real, but "anywhere" has a meter.

There is a related compaction nuance: while a table is written only by Snowflake, compaction is bundled (zero credits in ICEBERG_STORAGE_OPTIMIZATION_HISTORY). Once an external engine does DML or DDL through the Iceberg REST Catalog, that table starts accruing compaction charges. So the cost model rewards keeping writes in Snowflake and treating external engines as readers.
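Both meters can be watched from SQL. A minimal sketch, assuming both views live in the SNOWFLAKE.ACCOUNT_USAGE schema (stated above for STORAGE_REQUEST_HISTORY, assumed here for ICEBERG_STORAGE_OPTIMIZATION_HISTORY). It uses select * on purpose, because the column names are not spelled out in this post:

```sql
-- Per-request activity from external engines.
-- Inspect the columns first, then narrow the query.
select *
from snowflake.account_usage.storage_request_history
limit 20;

-- Compaction activity; per the note above, tables written only by
-- Snowflake should show zero credits here.
-- The schema location of this view is an assumption.
select *
from snowflake.account_usage.iceberg_storage_optimization_history
limit 20;
```

Account Usage views are not real time, so a burst of external reads may take a while to show up.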

When to use which path

The three GAs do not replace each other; they cover different situations. Roughly how I would pick:

Situation	Path
Iceberg on AWS/Azure and it doesn't matter where the bytes live	Snowflake-managed storage (SNOWFLAKE_MANAGED)
Iceberg on GCP and the catalog of record is BigLake	BigLake Metastore catalog integration
Must keep files in a dedicated bucket (regulatory, existing lake, cost control)	Customer-managed external volume

And of course, when I want a whole schema or database to default to Iceberg, I now use DEFAULT_METADATA_WRITE_FORMAT.

Snowflake-managed storage removes a major reason to avoid Iceberg on Snowflake for teams that wanted open format but did not want to run storage ops. The catch is the external-engine billing, so the convenience is best when Snowflake is the primary engine and other engines are occasional readers, not when you plan to hammer the tables from a fleet of external Spark jobs 😎
