Streamlit in Snowflake: Advanced Caching
Stop your Streamlit apps from re-running heavy queries. A deep dive into st.cache_data and session state management.
Happy Valentine’s Day! Is there anything more romantic than a dashboard that loads instantly? I didn’t think so.
Streamlit in Snowflake (SiS) has revolutionized how we build internal data apps. But a common complaint is performance. “It’s slow when I change the filter.” This is almost always a caching issue.
Understanding the Execution Model#
Streamlit re-runs the entire Python script from top to bottom every time a user interacts with a widget. If you have a query like session.sql("SELECT * FROM BILLION_ROW_TABLE").collect() at the top of your script, you are re-scanning that table every time someone clicks a button.
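Here is a minimal sketch of that anti-pattern (BILLION_ROW_TABLE stands in for any large table in your account):

```python
import streamlit as st
from snowflake.snowpark.context import get_active_session

session = get_active_session()

# Anti-pattern: this query runs on EVERY rerun of the script,
# i.e. every time any user touches any widget below it.
df = session.sql("SELECT * FROM BILLION_ROW_TABLE").to_pandas()

limit = st.slider("Row limit", 0, 100)  # moving this re-runs the query above
st.dataframe(df.head(limit))
```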
The Solution: st.cache_data#
In 2025, st.cache_data is the standard decorator for data fetching.
```python
import streamlit as st
from snowflake.snowpark.context import get_active_session

session = get_active_session()

@st.cache_data(ttl=3600)  # Cache for 1 hour
def get_sales_data(region):
    # Building the query string is fast
    query = f"SELECT * FROM sales WHERE region = '{region}'"
    # The expensive network/compute call happens once per hour per region
    return session.sql(query).to_pandas()

selected_region = st.selectbox("Region", ["NA", "EMEA", "APAC"])
df = get_sales_data(selected_region)
st.dataframe(df)
```

Key Parameters#
- ttl: Time To Live, in seconds. Set this! The default is to cache forever (until the app restarts or the cache is cleared).
- show_spinner: Toggle the "Running…" spinner, or pass a string to replace the default spinner text.
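For example (the ttl value, spinner text, and lookup table are illustrative):

```python
@st.cache_data(ttl=600, show_spinner="Fetching fresh data from Snowflake...")
def get_lookup_table():
    # Re-uses the session object from the snippet above
    return session.sql("SELECT * FROM lookup").to_pandas()
```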
Caching vs. Session State#
Caching is global: st.cache_data entries survive re-runs and are shared across all user sessions served by the same app process. Session State is scoped to a single user's session.
Use st.session_state to store user inputs or intermediate calculation results that don’t need to be re-computed but
are specific to this user’s workflow.
```python
# Initialize the key once per user session
if 'counter' not in st.session_state:
    st.session_state.counter = 0

# Each click mutates only this user's copy of the state
if st.button('Count'):
    st.session_state.counter += 1
```

Large DataFrames: Mutability Warning#
Remember that st.cache_data stores a serialized copy of the data and hands each caller its own copy. If your function returns a massive DataFrame, Streamlit has to pickle it on the first call and unpickle it on every cache hit, which costs both memory and CPU.
For massive datasets, consider pushing the aggregation down to Snowflake SQL and only caching the results. Don’t pull raw data into the Streamlit app unless necessary.
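A minimal sketch of that push-down pattern (the sales table and its columns are carried over from the example above and remain illustrative):

```python
@st.cache_data(ttl=3600)
def get_sales_by_region():
    # Snowflake does the heavy scan and GROUP BY;
    # we cache only a handful of aggregated rows.
    query = """
        SELECT region, SUM(amount) AS total_sales
        FROM sales
        GROUP BY region
    """
    return session.sql(query).to_pandas()
```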
Conclusion#
A well-cached Streamlit app feels like a native compiled application. A poorly cached one feels like a sluggish web page
from 1999. Use st.cache_data aggressively for all SQL interaction.