
A PII-safe web search tool for Snowflake agents

A PII-safe alternative to Brave web search for Snowflake Cortex Agents, using Perplexity and AI_REDACT

Martin Seifert

25 Mar 2026 — 4 min read

TL;DR: Snowflake Intelligence can search the web using Brave... but it sends the agent's query out as-is. Here's a custom tool that uses Perplexity and redacts PII before the query leaves Snowflake.

Snowflake Cortex Agents are pretty great at answering questions about data. They can query structured data using Cortex Analyst and semantic views, query unstructured data using Cortex Search, call custom tools, and (as of recently) even search the web using the built-in Brave integration. But here's the thing: sometimes an agent needs external context to answer a "why" question, and the query it constructs may very well contain sensitive information: company names, people's names, account identifiers...

The built-in Brave web search (though the search engine itself is privacy-first) doesn't care about that. It just sends the query. And that's where a custom tool with AI_REDACT comes in.

The problem with unfiltered web search

Say there's a Cortex Agent that helps a team investigate anomalies in customer data. Someone asks: "Why did Acme Corp's billing address change to a PO box in Delaware?"

If the agent decides it needs web context and calls the built-in Brave search, the query might include the company name, the address, maybe even a contact name. That's PII leaving the Snowflake perimeter, sent to an external search engine, logged, cached, who knows.

This may or may not be a dealbreaker depending on compliance requirements. But in a regulated industry (or for the privacy-conscious), it's worth thinking about.

The alternative: Perplexity + AI_REDACT

The idea is simple: instead of using Brave, give the agent a custom web search tool that (1) redacts PII from the query before it leaves Snowflake, and (2) uses Perplexity's search API to get high-quality, source-cited results. (That is no criticism of Brave's result quality: I have no experience with Brave whatsoever!)

I already wrote about using Perplexity in Snowflake in a previous post.

Back then, it was a notebook experiment. Since then, I've turned it into a proper UDF... and added a PII-safe wrapper using AI_REDACT.

The search function

The core function sends a prompt to the Perplexity Search API and returns an array of results. It uses the official perplexityai Python package from PyPI:

create or replace function meta.cortex.f_prompt_to_perplexity_search(
    prompt string
)
returns array
language python
runtime_version = 3.11
artifact_repository = snowflake.snowpark.pypi_shared_repository
packages = ('perplexityai')
handler = 'main'
comment = 'Search Perplexity AI with a given prompt'
external_access_integrations = (i_perplexity)
secrets = ('api_key' = meta.integration.se_perplexity)
as
$$
import _snowflake
from perplexity import Perplexity

def main(prompt: str) -> list:
    client = Perplexity(
        api_key=_snowflake.get_generic_secret_string('api_key')
    )

    search = client.search.create(
        query=prompt,
        country="CH",
        max_results=5,
        max_tokens_per_page=1024
    )
    
    search_results = []
    for result in search.results:
        search_result = {
            'doc_title': result.title,
            'doc_id': result.url,
            'text': result.snippet
        }
        search_results.append(search_result)
    
    return search_results
$$;

Why Python? Because that's what the Perplexity SDK requires. The function returns an array of objects (title, URL, snippet): a format that a Cortex Agent can easily work with.
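
The result-shaping part of the handler can be tried locally without a Perplexity key or a network connection. The following is a minimal sketch, assuming only that the SDK's result objects expose title, url, and snippet attributes as the function above uses them; shape_results and the fake result are hypothetical names made up for this illustration.

```python
from types import SimpleNamespace


def shape_results(results):
    """Map SDK-style result objects to the dicts the UDF returns."""
    return [
        {"doc_title": r.title, "doc_id": r.url, "text": r.snippet}
        for r in results
    ]


# Stand-in for search.results; a real call returns SDK objects instead.
fake_results = [
    SimpleNamespace(
        title="Example page",
        url="https://example.com",
        snippet="Sample snippet",
    ),
]

shaped = shape_results(fake_results)
print(shaped)
```

Keeping the mapping in a small pure function like this makes it easy to change the key names later if the agent works better with a different shape.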

This requires an external access integration for api.perplexity.ai and a secret containing the Perplexity API key. Cf. my previous post for the setup steps.

The PII-safe wrapper

This is the part that makes the difference. The wrapper is dead simple (a one-liner, really):

create or replace function meta.cortex.f_safe_perplexity_search(
    prompt string
)
returns array
as
$$
    select meta.cortex.f_prompt_to_perplexity_search(
        ai_redact(prompt)
    )
$$;

AI_REDACT is a Cortex function that detects and replaces PII in text (names, addresses, phone numbers, emails, and more) with placeholder tokens like [NAME] or [ADDRESS]. So if the agent constructs a query like "Why did Acme Corp change their address to 123 Main St?", what actually leaves Snowflake is something like "Why did [ORGANIZATION] change their address to [ADDRESS]?".
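
What matters in the wrapper is the ordering: redaction runs first, and only its output is handed to the search call. Here is a minimal Python sketch of that pattern. It does not reproduce AI_REDACT: toy_redact is a hypothetical regex stand-in that only masks email addresses, and fake_search is a hypothetical stub that records what it receives instead of calling any API.

```python
import re

# Deliberately simple pattern; real PII detection is far broader than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def toy_redact(text):
    """Hypothetical stand-in for AI_REDACT: masks email addresses only."""
    return EMAIL.sub("[EMAIL]", text)


def make_safe_search(redact, search):
    """Wrap a search function so it only ever sees redacted input."""
    def safe_search(prompt):
        return search(redact(prompt))
    return safe_search


sent = []  # everything that would have left the perimeter


def fake_search(query):
    """Stub search: records the outgoing query and returns no results."""
    sent.append(query)
    return []


safe_search = make_safe_search(toy_redact, fake_search)
safe_search("Why did jane.doe@example.com change her billing address?")
print(sent)
```

Because the search function is only reachable through the wrapper, there is no code path on which the raw prompt is sent. The SQL wrapper above gives the agent the same guarantee, as long as the agent is given the safe function and not the underlying one.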

AI_REDACT has been generally available since December 2025. It works best with English text and supports up to 4,096 tokens per call.

Is the redacted query less precise? Sure, sometimes. But IMHO that's a perfectly acceptable trade-off: web context without leaking sensitive data. And in many cases, the "why" behind an anomaly is generic enough that redaction doesn't hurt the search quality at all.

Using it as a Cortex Agent tool

The beauty of this approach is that f_safe_perplexity_search is just a regular Snowflake function... which means it can be registered as a custom tool in a Cortex Agent. When the agent's internal context (Cortex Analyst/Search) isn't sufficient to answer a question, it can fall back to this tool for external knowledge.
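
For agents defined through a JSON specification rather than the UI, the registration could look roughly like the sketch below, built as a Python dict for readability. Treat the whole shape as an assumption to check against the current Cortex Agents documentation: the keys (tool_spec, input_schema, tool_resources, execution_environment) are written from my understanding of the custom tool format, and the tool name safe_web_search and the warehouse MY_WH are hypothetical placeholders.

```python
import json

FUNCTION_NAME = "META.CORTEX.F_SAFE_PERPLEXITY_SEARCH"
TOOL_NAME = "safe_web_search"  # hypothetical tool name
WAREHOUSE = "MY_WH"            # hypothetical warehouse

# Assumed spec shape; verify against the Cortex Agents docs before use.
agent_tool_config = {
    "tools": [
        {
            "tool_spec": {
                "type": "generic",
                "name": TOOL_NAME,
                "description": (
                    "Searches the web for external context. PII in the "
                    "query is redacted before it is sent, so phrase "
                    "queries around events and causes, not identities."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "prompt": {
                            "type": "string",
                            "description": "The web search query.",
                        }
                    },
                    "required": ["prompt"],
                },
            }
        }
    ],
    "tool_resources": {
        TOOL_NAME: {
            "type": "function",
            "identifier": FUNCTION_NAME,
            "execution_environment": {
                "type": "warehouse",
                "warehouse": WAREHOUSE,
            },
        }
    },
}

print(json.dumps(agent_tool_config, indent=2))
```

The tool description is worth some care: telling the agent that names and addresses get masked nudges it toward generic "why" queries, which are the ones that survive redaction best.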

A typical use case: the agent knows what happened in the data (because it can query the tables), but it doesn't know why. A web search can provide that missing context (a regulatory change, a market event, a product announcement) without exposing the specifics of the data.

Why not just use Brave?

To be fair, the Brave web search integration in Snowflake Intelligence is convenient. It's built-in, requires no setup, and works out of the box. But it has one blind spot: there is no PII redaction layer between the agent and the search engine.

With the custom Perplexity tool:

PII is redacted before the query leaves Snowflake
Full control over which external endpoint is called
Search parameters are tunable (country, result count, token limits)
Perplexity's source-cited results tend to be well-suited for analytical follow-up questions

The trade-off is setup effort (external access integration, API key, the UDF itself) and cost (Perplexity API credits). But if privacy matters (and it probably should), it's worth it.

And that's really it: a web search tool for Snowflake agents that doesn't leak data on the way out 😎
