Snowflake Iceberg in 2026: open format, now without the storage chores
Three Iceberg GAs in one week shift the old tradeoff: open format without babysitting a bucket.
TL;DR: in the first week of June 2026, ❄️ Snowflake shipped three 🧊 Iceberg GAs that, taken together, shift the old tradeoff further in the user's favor: you can have an open table format without babysitting your own cloud storage. The three are Snowflake-managed Iceberg storage, a connector to Google's BigLake Metastore catalog, and a parameter that makes Iceberg the default table format without typing the ICEBERG keyword.
For a while, choosing Iceberg on Snowflake meant signing up for ops work. I created an external volume, pointed it at an S3 or ADLS bucket, granted Snowflake access, picked a catalog, and only then got a table. That was fine when I already ran that storage anyway, but it was a lot of yak-shaving when all I wanted was an open format I could read from other engines later. Three GAs in early June change the shape of that decision.
1. Snowflake-managed storage: Iceberg without a bucket
The headline GA (June 1) is Snowflake storage for Iceberg tables. Snowflake stores and manages the Parquet and metadata files for me. No external volume, no IAM grant. I create the table and Snowflake handles the rest:
create iceberg table my_events (
    event_id string,
    event_ts timestamp_ntz,
    payload variant
)
catalog = snowflake
external_volume = snowflake_managed;

That external_volume = snowflake_managed is the whole trick. It is a reserved value, not an external volume object I created. I never run create external volume for this path. If my account or schema default catalog is already Snowflake and the default volume is SNOWFLAKE_MANAGED, I can drop both lines and just write create iceberg table ....
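The shorter form described above can be sketched as follows. This is a minimal sketch, assuming the CATALOG and EXTERNAL_VOLUME defaults can be set at schema level to the values shown; my_db.my_schema is a hypothetical name used only for illustration.

```sql
-- Assumption: my_db.my_schema is a hypothetical schema used only for illustration.
-- Set the schema-level defaults once...
alter schema my_db.my_schema set catalog = 'SNOWFLAKE';
alter schema my_db.my_schema set external_volume = 'SNOWFLAKE_MANAGED';

-- ...then the per-table catalog and external_volume options can be omitted.
create iceberg table my_db.my_schema.my_events (
    event_id string,
    event_ts timestamp_ntz,
    payload variant
);
```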
One detail that genuinely matters for production: these tables get Fail-safe. A permanent Snowflake-managed Iceberg table has the same Fail-safe recovery window as a regular permanent Snowflake table. That has not been true for customer-managed Iceberg, where recovery was the customer's problem. If I want to avoid the Fail-safe storage cost, I can always make the table transient:
create transient iceberg table my_scratch (col1 int)
catalog = snowflake
external_volume = snowflake_managed;

Transient Iceberg tables are only supported with Snowflake storage; Iceberg tables on a customer-managed volume cannot be made transient.
2. BigLake Metastore: the GCP interop gap closes
The second GA (June 2) is Snowflake's catalog integration to Google Cloud BigLake Metastore. BigLake Metastore is Google's catalog, not a Snowflake feature. What Snowflake shipped is the connector: an Iceberg REST catalog integration that lets Snowflake discover and query Iceberg tables registered in BigLake. This is the GCP answer to the "no Snowflake-managed storage on GCP" limitation. Set up workload identity federation so Snowflake authenticates to Google without long-lived service account keys, then create a catalog-linked database that reads the BigLake Iceberg tables.
The part worth calling out is the auth model. No service account JSON key files sitting in a secret somewhere. Workload identity federation means Snowflake presents its own identity to Google and Google trusts it. One fewer long-lived credential to rotate or leak.
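The last step, reading BigLake tables through a catalog-linked database, might look roughly like this. It is a sketch under assumptions: biglake_int stands for a catalog integration that already exists and authenticates via workload identity federation, and biglake_db, sales, and orders are hypothetical names. The options for the integration itself are not shown because they depend on the BigLake setup.

```sql
-- Assumption: biglake_int is an existing Iceberg REST catalog integration
-- for BigLake Metastore; all object names below are hypothetical.
create database biglake_db
  linked_catalog = (
    catalog = 'biglake_int'
  );

-- Tables registered in BigLake should then be queryable under their namespace.
select *
from biglake_db.sales.orders
limit 10;
```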
3. Default metadata write format: stop typing ICEBERG
The third GA (June 5) is the quiet but practical one: the DEFAULT_METADATA_WRITE_FORMAT parameter. Set it at account, database, or schema level and plain CREATE TABLE starts producing Iceberg tables. No ICEBERG keyword, no per-statement catalog and volume options.
alter schema icing.stage
    set default_metadata_write_format = 'iceberg';

-- now this is an Iceberg table, no keyword needed
create table skates (
    skate_id string,
    skated_at timestamp_ntz
);

Why does this matter beyond saving a keyword? Because it lets me make Iceberg the house style for a schema or a whole database without rewriting every CREATE TABLE in dbt projects or other scripts. Flip the parameter, and existing DDL produces open-format tables. My personal hesitance to migrate all my DDLs blocked quite a few use cases.
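Since the parameter can be set at account, database, or schema level, the same pattern should work one level up, and SHOW PARAMETERS can confirm what is in effect. A small sketch, assuming the database-level statement mirrors the schema-level one and reusing the icing database from the example:

```sql
-- Assumption: the database-level form mirrors the schema-level ALTER used for icing.stage.
alter database icing
  set default_metadata_write_format = 'iceberg';

-- Check which value applies to the schema and at what level it was set.
show parameters like 'default_metadata_write_format' in schema icing.stage;

-- Revert a schema to the inherited value if the default turns out to be too broad.
alter schema icing.stage
  unset default_metadata_write_format;
```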
The release also brings automatic DDL promotion in catalog-linked databases, so CREATE TABLE against those creates Iceberg tables too.
The billing gotcha nobody puts in the demo
Snowflake-managed storage is convenient, but the bill has a shape worth understanding before pointing external engines at it. Querying these tables from the Snowflake engine in the same account is not separately charged. The moment an external query engine reads them through Horizon Catalog, I pay per request.
There is a related compaction nuance: while a table is written only by Snowflake, compaction is bundled (zero credits in ICEBERG_STORAGE_OPTIMIZATION_HISTORY). Once an external engine does DML or DDL through the Iceberg REST Catalog, that table starts accruing compaction charges. So the cost model rewards keeping writes in Snowflake and treating external engines as readers.
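To see whether a table has started accruing compaction charges, the history view can be queried directly. A sketch, assuming the view is exposed under SNOWFLAKE.ACCOUNT_USAGE like other usage history views; select * is used because the column names are not given here:

```sql
-- Assumption: the view lives in the SNOWFLAKE.ACCOUNT_USAGE schema.
-- Column names are not assumed, hence select *.
select *
from snowflake.account_usage.iceberg_storage_optimization_history
limit 20;
```

Rows showing zero credits would match the bundled case; non-zero rows would point to tables that an external engine has written to.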
When to use which path
The three GAs do not replace each other; they cover different situations. Roughly how I would pick:
| Situation | Path |
|---|---|
| Iceberg on AWS/Azure and it doesn't matter where the bytes live | Snowflake-managed storage (SNOWFLAKE_MANAGED) |
| Iceberg on GCP and the catalog of record is BigLake | BigLake Metastore catalog integration |
| Must keep files in a dedicated bucket (regulatory, existing lake, cost control) | Customer-managed external volume |
And of course, when I want a whole schema or database to default to Iceberg, I now use DEFAULT_METADATA_WRITE_FORMAT.
Snowflake-managed storage removes a major reason to avoid Iceberg on Snowflake for teams that wanted open format but did not want to run storage ops. The catch is the external-engine billing, so the convenience is best when Snowflake is the primary engine and other engines are occasional readers, not when you plan to hammer the tables from a fleet of external Spark jobs.
