❄️
Data Flakes

If you read my predictions for 2025, you know I’m bullish on Snowflake Cortex. It represents the democratization of AI for data teams. You don’t need a GPU cluster, you don’t need to manage Python environments (unless you want to!), and you don’t need to be a prompt engineer. You just need SQL.

In this guide, we’ll walk through the basics of enabling and using Snowflake Cortex to perform Generative AI tasks directly on your data.

What is Snowflake Cortex?

Snowflake Cortex is a fully managed service that exposes Large Language Models (LLMs) such as Meta’s Llama 3, Mistral’s models, and Snowflake’s own Arctic through SQL functions. The models are hosted and run inside the Snowflake security perimeter, so your data is not sent to an external API such as OpenAI’s.

Prerequisites

Before we start, ensure:

  1. You are in a region where Cortex is available (check the docs, but by 2025 most major AWS/Azure regions are covered).
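  2. Your role has been granted the SNOWFLAKE.CORTEX_USER database role. If it has not, an administrator can grant it:

-- Grant Cortex access to a role (run as a role allowed to grant it, typically ACCOUNTADMIN)
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE your_role;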
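If you only want to inspect the current state before granting anything, a few read-only statements cover both prerequisites. This is a minimal sketch: it shows which region the account is in (to compare against the availability table in the docs), which role the session is using, and which roles already hold the Cortex database role.

```sql
-- Region of the current account; compare it with the Cortex availability list in the docs.
SELECT CURRENT_REGION();

-- Role the current session is using.
SELECT CURRENT_ROLE();

-- Roles that have already been granted the Cortex database role.
SHOW GRANTS OF DATABASE ROLE SNOWFLAKE.CORTEX_USER;
```

If your role (or a role it inherits from, such as PUBLIC) appears in the last result, the GRANT is probably unnecessary.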

Your First LLM Query

The beauty of Cortex is its simplicity. The primary function you’ll use is SNOWFLAKE.CORTEX.COMPLETE.

Syntax: SNOWFLAKE.CORTEX.COMPLETE(<model>, <prompt>)

Let’s try a simple “Hello World”:

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'snowflake-arctic',
    'Tell me a joke about data engineers.'
) as ai_response;

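With this two-argument form you should get the joke back as a plain string. If you instead pass a conversation array together with an options object, COMPLETE returns a string containing a JSON object that wraps the response along with token usage.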
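To make the difference between the two return shapes concrete, here is a hedged sketch of the chat-style call with an options object. The system message and the temperature and max_tokens values are arbitrary choices for illustration, and the JSON paths (choices[0]:messages, usage:total_tokens) reflect the response layout as I understand it from the docs, so verify them against your own output.

```sql
-- Sketch: chat-style COMPLETE call; with an options object the result is a JSON string.
WITH r AS (
    SELECT SNOWFLAKE.CORTEX.COMPLETE(
        'snowflake-arctic',
        [
            {'role': 'system', 'content': 'You are a concise assistant.'},
            {'role': 'user',   'content': 'Tell me a joke about data engineers.'}
        ],
        {'temperature': 0.2, 'max_tokens': 100}  -- illustrative values, not recommendations
    ) AS raw_response
)
SELECT
    raw_response,
    -- Assumed layout: the generated text sits under choices[0].messages.
    TRY_PARSE_JSON(raw_response):choices[0]:messages::STRING AS joke,
    -- Assumed layout: token counts sit under usage.
    TRY_PARSE_JSON(raw_response):usage:total_tokens::INT     AS total_tokens
FROM r;
```

TRY_PARSE_JSON returns NULL instead of raising an error if the string is not valid JSON, which keeps the query from failing on an unexpected response.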

Practical Use Case: Sentiment Analysis

Generating jokes is fun, but let’s do real work. Imagine you have a table product_reviews with a column review_text. You want to know if customers are happy.

Traditionally, you’d export this data, run it through Python NLTK, and load it back. With Cortex:

SELECT
    review_id,
    review_text,
    SNOWFLAKE.CORTEX.SENTIMENT(review_text) as sentiment_score
FROM product_reviews
LIMIT 10;

Wait, SENTIMENT is its own function, not a COMPLETE prompt! Cortex offers task-specific functions for common jobs:

  • SENTIMENT(): Returns a score from -1 (most negative) to 1 (most positive).
  • SUMMARIZE(): Creates a summary of long text.
  • TRANSLATE(): Translates text from a source language to a target language, given their language codes.
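Here is a rough sketch that uses all three functions on the same product_reviews table. It assumes the reviews are written in English and that you want a German translation; the ±0.3 cut-offs for the label are arbitrary thresholds I picked for illustration, not values from Snowflake.

```sql
-- Sketch: score, label, summarize, and translate a small sample of reviews.
WITH scored AS (
    SELECT
        review_id,
        review_text,
        SNOWFLAKE.CORTEX.SENTIMENT(review_text) AS sentiment_score
    FROM product_reviews
    LIMIT 10
)
SELECT
    review_id,
    sentiment_score,
    CASE
        WHEN sentiment_score >  0.3 THEN 'positive'   -- hypothetical threshold
        WHEN sentiment_score < -0.3 THEN 'negative'   -- hypothetical threshold
        ELSE 'neutral'
    END AS sentiment_label,
    SNOWFLAKE.CORTEX.SUMMARIZE(review_text)             AS review_summary,
    -- Arguments: text, source language code, target language code.
    SNOWFLAKE.CORTEX.TRANSLATE(review_text, 'en', 'de') AS review_text_de
FROM scored;
```

Tune the thresholds against a handful of reviews you have labeled by hand before trusting the buckets.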

Advanced: Prompt Engineering in SQL

For more complex tasks, we go back to COMPLETE. Let’s say we want to extract key entities from unstructured feedback.

SELECT
    review_id,
    SNOWFLAKE.CORTEX.COMPLETE(
        'llama3-70b',
        CONCAT(
            'Analyze the following customer review. Extract the product mentioned and the specific feature they disliked. ',
            'Return the output as JSON with keys "product" and "dislike". ',
            'Review: ', review_text
        )
    ) as analysis_json
FROM product_reviews;

Notice the CONCAT. We are dynamically building the prompt for each row in the table, so every review gets its own prompt and its own model call. This allows you to process millions of rows in batch.
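The model returns its JSON as text, so you still have to turn it into columns. One way to do that, sketched under the assumption that the model mostly follows the instruction, is to parse with TRY_PARSE_JSON and flag rows where parsing fails. The extra "Return only the JSON object" sentence and the needs_review column are my additions for illustration.

```sql
-- Sketch: run the extraction on a small sample and split the JSON into columns.
WITH analyzed AS (
    SELECT
        review_id,
        SNOWFLAKE.CORTEX.COMPLETE(
            'llama3-70b',
            CONCAT(
                'Analyze the following customer review. Extract the product mentioned and the specific feature they disliked. ',
                'Return the output as JSON with keys "product" and "dislike". ',
                'Return only the JSON object, with no extra text. ',
                'Review: ', review_text
            )
        ) AS analysis_json
    FROM product_reviews
    LIMIT 10
),
parsed AS (
    SELECT
        review_id,
        analysis_json,
        TRY_PARSE_JSON(analysis_json) AS analysis  -- NULL when the output is not valid JSON
    FROM analyzed
)
SELECT
    review_id,
    analysis:product::STRING AS product,
    analysis:dislike::STRING AS dislike,
    analysis IS NULL         AS needs_review,      -- hypothetical flag for manual follow-up
    analysis_json
FROM parsed;
```

LLM output is not guaranteed to be valid JSON, which is why the sketch keeps the raw text and flags failures rather than assuming every row parses.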

Performance and Cost Tips

  1. Choose the right model: snowflake-arctic or mistral-7b are cheaper and faster for simple tasks. Save llama3-70b for complex reasoning.
  2. Batch size: While Cortex scales well, sending 100 million rows at once might hit throughput limits or cost budgets. Start with LIMIT clauses to test.
  3. Output token limits: Be specific in your prompt about length (“Summarize in 1 sentence”) to save on generation tokens.
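Before running a big batch, it can help to estimate how many input tokens you are about to send. The sketch below uses SNOWFLAKE.CORTEX.COUNT_TOKENS on a random sample; it assumes the function is available in your region and supports the model you name, and it counts only the review text, not the prompt wrapper or the generated output.

```sql
-- Sketch: rough input-token estimate from a 1,000-row random sample.
SELECT
    COUNT(*) AS sampled_rows,
    SUM(SNOWFLAKE.CORTEX.COUNT_TOKENS('snowflake-arctic', review_text)) AS input_tokens_in_sample,
    AVG(SNOWFLAKE.CORTEX.COUNT_TOKENS('snowflake-arctic', review_text)) AS avg_input_tokens_per_row
FROM product_reviews SAMPLE (1000 ROWS);
```

Multiply the per-row average by your total row count for a ballpark figure, then compare it with the per-model credit rates Snowflake publishes.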

Conclusion

Snowflake Cortex removes the infrastructure barrier to AI. As a data engineer, you now have some of the most powerful NLP tools in the world available as simple SQL functions.

Start small—add a sentiment column to your customer view or try summarizing error logs. The barrier to entry has never been lower.
