“Vector Search” is the buzzword of the AI era. But for many SQL-native data engineers, it can feel like black magic.
We’re used to exact matches (WHERE id = 123) or fuzzy string matches (LIKE '%text%'). Vector search is fundamentally
different: it searches for semantic similarity.
The Concept: Embeddings
Computers don’t understand words; they understand numbers. An Embedding Model transforms a piece of text (sentence, paragraph, or document) into a list of floating-point numbers (a vector).
Example:
- “The cat sat on the mat” -> [0.1, 0.5, -0.3, ...]
- “The feline rested on the rug” -> [0.12, 0.48, -0.29, ...]
In the vector space, these two vectors are “close” to each other as measured by cosine similarity (a score near 1), even though they share very few words.
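To make "close" concrete, here is a minimal sketch that writes the cosine similarity formula out by hand in plain SQL. It uses only the three components shown in the example above, which is an assumption made for illustration: real embeddings have hundreds of dimensions, and the elided values would change the result.

```sql
-- Cosine similarity = dot(a, b) / (|a| * |b|)
-- a = [0.1, 0.5, -0.3], b = [0.12, 0.48, -0.29] (truncated example vectors)
SELECT
    (0.1 * 0.12 + 0.5 * 0.48 + (-0.3) * (-0.29))                   -- dot product
    / (
        SQRT(0.1 * 0.1 + 0.5 * 0.5 + (-0.3) * (-0.3))              -- length of a
      * SQRT(0.12 * 0.12 + 0.48 * 0.48 + (-0.29) * (-0.29))        -- length of b
      ) AS cosine_similarity;
```

For these truncated vectors the score comes out at roughly 0.999. A score of 1 means the vectors point in exactly the same direction, 0 means they are unrelated, and -1 means they point in opposite directions.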
Native Vector Data Type
Snowflake added the VECTOR data type to support this.
CREATE TABLE docs (
  id INT,
  content TEXT,
  embedding VECTOR(FLOAT, 768) -- 768 dimensions is common for standard models
);
You can generate these embeddings using SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', content), which returns a VECTOR(FLOAT, 768) that fits the column above.
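Putting the table and the embedding function together, a manual similarity search might look like the sketch below. It assumes the docs table above already holds rows with content, and the query string is a made-up example. VECTOR_COSINE_SIMILARITY is Snowflake's built-in function for comparing two vectors of the same dimension.

```sql
-- Fill in embeddings for rows that do not have one yet.
UPDATE docs
SET embedding = SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', content)
WHERE embedding IS NULL;

-- Rank documents by similarity to a search phrase.
-- The phrase must be embedded with the same model as the stored vectors.
SELECT
    id,
    content,
    VECTOR_COSINE_SIMILARITY(
        embedding,
        SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', 'feline resting on a rug')
    ) AS score
FROM docs
ORDER BY score DESC
LIMIT 5;
```

Note that this query compares the search phrase against every row, which is fine for small tables but is exactly the part that gets hard at scale.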
Cortex Search Services
While you can manage vectors manually, managing the index for fast retrieval at scale is hard. Enter Cortex Search.
Cortex Search is a managed service. You point it at a table, tell it which column contains the text, and it handles the indexing, embedding updates, and retrieval.
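-- Conceptual service creation
CREATE CORTEX SEARCH SERVICE my_search_svc
  ON description            -- text column to index and search
  ATTRIBUTES products       -- column(s) exposed for filtering; must be returned by the query below
  WAREHOUSE = my_wh         -- warehouse used to build and refresh the index
  TARGET_LAG = '1 hour'     -- how far the index may lag behind the base table
AS SELECT * FROM product_catalog;
Why not just use Elasticsearch?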
Integration and Governance. By keeping the vectors in Snowflake:
- Zero ETL: Data doesn’t leave the platform.
- Security: Row-level security on the base table applies to the search results.
- Simplicity: It’s just SQL.
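As an illustration of the "it's just SQL" point, a service such as my_search_svc can be queried from a worksheet with SNOWFLAKE.CORTEX.SEARCH_PREVIEW. This is a sketch under a few assumptions: the service lives in the current database and schema (otherwise use a fully qualified name), description is the column it indexes, and the search phrase is invented. As far as I understand it, this function is aimed at testing and validating a service, with the Python and REST APIs being the usual route for application traffic.

```sql
-- Ask the search service for the five closest matches to a phrase.
-- SEARCH_PREVIEW returns a JSON string, so parse it and pull out "results".
SELECT PARSE_JSON(
    SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
        'my_search_svc',
        '{
            "query": "waterproof hiking boots",
            "columns": ["description"],
            "limit": 5
        }'
    )
)['results'] AS results;
```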
Conclusion
Vector search enables capabilities like “Find me products similar to this image” or “Find clauses in contracts that
mention liability.” It’s a new primitive in the data engineer’s toolkit, unlocking use cases that LIKE '%...%' could
never dream of.