Introduction: The Intersection of Two Worlds

Building on the fundamentals we explored previously, this article delves into more advanced considerations that emerge when data architecture and enterprise architecture truly intersect. These aren’t theoretical concepts—they’re the challenges and opportunities that arise when you’re designing solutions at scale within complex organisational contexts.

As I’ve progressed in my own journey, I’ve come to appreciate how the seemingly abstract concepts of enterprise architecture directly impact the practical reality of building data platforms. The questions become more nuanced: How do we create data models that serve the entire enterprise? How do we govern data across organisational boundaries? What integration patterns actually scale? How do we isolate data for compliance whilst enabling collaboration?

Recently, celebrating my son’s first birthday reminded me that life—and architecture—is about growth, evolution, and learning through experience. Just as parenting constantly presents new challenges that textbooks don’t fully prepare you for, architectural work requires adapting principles to real-world complexity. This article shares some of those real-world considerations.

If you’ve grasped the fundamentals of both data and enterprise architecture, you’re ready to explore how these domains interconnect in sophisticated ways. Let’s examine what matters as you advance in architectural thinking.

Data Modelling in Enterprise Context

Data modelling at the enterprise level is fundamentally different from modelling for a single application or team. You’re not just optimising for one use case—you’re creating structures that serve multiple business capabilities, support varied access patterns, and evolve over time without breaking existing consumers.

Canonical Data Models Across Enterprise

A canonical data model provides a shared, standardised representation of core business entities across the enterprise. Rather than every system having its own definition of “Customer” or “Product”, the canonical model defines these concepts once, serving as the lingua franca for integration and data sharing.

In Snowflake, you might implement this as a shared schema within your data warehouse, with one table per canonical entity that integrations and domain models can reference.
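
A minimal sketch of what such a shared schema could look like. The database name `ENTERPRISE_DB`, the schema name `CANONICAL`, and the column lists are assumptions made up for illustration; a real canonical model would come out of agreement between the domains involved.

```sql
-- Hypothetical database and schema holding the canonical entities
CREATE DATABASE IF NOT EXISTS ENTERPRISE_DB;
CREATE SCHEMA IF NOT EXISTS ENTERPRISE_DB.CANONICAL;

-- One agreed definition of "Customer" (columns are illustrative)
CREATE TABLE IF NOT EXISTS ENTERPRISE_DB.CANONICAL.CUSTOMER (
    CUSTOMER_ID    VARCHAR NOT NULL,   -- enterprise-wide identifier
    CUSTOMER_NAME  VARCHAR NOT NULL,
    CUSTOMER_TYPE  VARCHAR,            -- e.g. 'INDIVIDUAL' or 'BUSINESS'
    SOURCE_SYSTEM  VARCHAR,            -- system of record for this row
    UPDATED_AT     TIMESTAMP_NTZ,
    -- Informational only: Snowflake does not enforce primary keys on standard tables
    CONSTRAINT PK_CUSTOMER PRIMARY KEY (CUSTOMER_ID)
)
COMMENT = 'Canonical Customer entity shared across domains';

-- One agreed definition of "Product" (columns are illustrative)
CREATE TABLE IF NOT EXISTS ENTERPRISE_DB.CANONICAL.PRODUCT (
    PRODUCT_ID     VARCHAR NOT NULL,
    PRODUCT_NAME   VARCHAR NOT NULL,
    CATEGORY       VARCHAR,
    UPDATED_AT     TIMESTAMP_NTZ,
    CONSTRAINT PK_PRODUCT PRIMARY KEY (PRODUCT_ID)
)
COMMENT = 'Canonical Product entity shared across domains';
```

Keeping the canonical tables narrow tends to help: the more attributes they carry, the harder it is to get every domain to agree on them.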

The canonical model becomes the reference point for understanding what these entities mean across the organisation. Application-specific models can extend or specialise, but they align with this shared understanding.

Domain-Driven Design for Data

Domain-driven design (DDD) principles, originally from software engineering, apply powerfully to data architecture. Rather than creating one massive enterprise data model, you identify bounded contexts—domains with clear boundaries and ownership.

For example:

  • Customer domain: Everything about who customers are
  • Order domain: Everything about purchase transactions
  • Product domain: Everything about what you sell
  • Fulfilment domain: Everything about delivering orders

Each domain has its own models optimised for its purpose, with clearly defined integration points between domains. This aligns naturally with enterprise architecture’s capability thinking—each domain supports specific business capabilities.

In Snowflake, you might structure this as domain-specific databases:

```sql
-- Domain-specific databases with clear ownership
CREATE DATABASE IF NOT EXISTS CUSTOMER_DOMAIN;
CREATE DATABASE IF NOT EXISTS ORDER_DOMAIN;
CREATE DATABASE IF NOT EXISTS PRODUCT_DOMAIN;

-- Each domain has layers: raw, curated, published
CREATE SCHEMA IF NOT EXISTS CUSTOMER_DOMAIN.RAW;
CREATE SCHEMA IF NOT EXISTS CUSTOMER_DOMAIN.CURATED;
CREATE SCHEMA IF NOT EXISTS CUSTOMER_DOMAIN.PUBLISHED;

-- Published schemas are the contract with other domains.
-- The reader role needs USAGE on the database as well as on the schema.
CREATE ROLE IF NOT EXISTS ORDER_DOMAIN_READER;
GRANT USAGE ON DATABASE CUSTOMER_DOMAIN TO ROLE ORDER_DOMAIN_READER;
GRANT USAGE ON SCHEMA CUSTOMER_DOMAIN.PUBLISHED TO ROLE ORDER_DOMAIN_READER;
```

This structure makes ownership explicit, supports autonomy within domains, and clarifies integration contracts.

Enterprise Data Taxonomies

Taxonomies provide consistent classification and organisation of data across the enterprise. They answer questions like: What categories of data do we have? How do we classify sensitivity? What are our standard business terms?

A mature enterprise data taxonomy includes:

  • Business glossary: Agreed definitions for business terms
  • Data classification scheme: Sensitivity levels (public, internal, confidential, restricted)
  • Subject areas: Logical groupings of related data
  • Lineage categories: How data relates across the enterprise

In Snowflake, tag-based governance enables taxonomy enforcement:

```sql
-- Create enterprise taxonomy using tags
-- (assumes the current database contains a GOVERNANCE schema)
CREATE TAG IF NOT EXISTS GOVERNANCE.DATA_CLASSIFICATION
    ALLOWED_VALUES 'PUBLIC', 'INTERNAL', 'CONFIDENTIAL', 'RESTRICTED';

CREATE TAG IF NOT EXISTS GOVERNANCE.DATA_DOMAIN
    ALLOWED_VALUES 'CUSTOMER', 'PRODUCT', 'FINANCIAL', 'OPERATIONAL';

CREATE TAG IF NOT EXISTS GOVERNANCE.PII
    ALLOWED_VALUES 'NONE', 'STANDARD', 'SENSITIVE';

-- Apply taxonomy to data assets
-- (the TAG keyword appears once; further tags are comma-separated)
ALTER TABLE CUSTOMER_DOMAIN.PUBLISHED.CUSTOMER
    SET TAG GOVERNANCE.DATA_CLASSIFICATION = 'CONFIDENTIAL',
            GOVERNANCE.DATA_DOMAIN = 'CUSTOMER',
            GOVERNANCE.PII = 'SENSITIVE';
```

These taxonomies become the foundation for automated governance policies, access controls, and compliance reporting.

Governance Architecture

Governance at the intersection of data and enterprise architecture is about creating frameworks that balance control with enablement. You need enough governance to ensure quality, security, and compliance, but not so much that you stifle innovation and agility.

Data Governance Within Enterprise Governance Framework

Data governance doesn’t exist in isolation—it’s part of the broader enterprise governance framework that includes IT governance, risk management, and compliance. Understanding this relationship is crucial for creating governance that actually works.

Key principles for effective governance:

Federated accountability: Rather than a central team trying to govern all data, push accountability to domain owners who understand their data best. The central governance function sets standards, provides tooling, and ensures consistency.

Policy-based automation: Manual governance doesn’t scale. Use tag-based policies in Snowflake, automated data quality checks, and programmatic access controls. Governance should be largely invisible to users—it just works.

Proportional controls: Not all data requires the same level of governance. Public reference data needs different controls than customer financial records. Match governance overhead to actual risk.
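
To make policy-based automation concrete, here is a rough sketch of tag-based masking. It reuses the `GOVERNANCE.PII` tag from the taxonomy section; the policy name `MASK_SENSITIVE_STRING` and the role `PII_READER` are hypothetical. As far as I know, masking policies require Enterprise Edition or higher, so treat this as an assumption to check against your own account.

```sql
-- Hypothetical policy: only sessions holding PII_READER see the raw value
CREATE MASKING POLICY IF NOT EXISTS GOVERNANCE.MASK_SENSITIVE_STRING
    AS (VAL STRING) RETURNS STRING ->
    CASE
        WHEN IS_ROLE_IN_SESSION('PII_READER') THEN VAL
        ELSE '***MASKED***'
    END;

-- Attach the policy to the tag rather than to individual columns.
-- String columns that carry the PII tag are then masked by this policy.
ALTER TAG GOVERNANCE.PII SET MASKING POLICY GOVERNANCE.MASK_SENSITIVE_STRING;
```

The appeal of this design is that domain teams only have to tag their data correctly; the central function owns the policy, which matches the federated split described above.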

Metadata Management as Enterprise Capability

Metadata—data about data—becomes critical at enterprise scale. You need to know: What data exists? Where did it come from? What does it mean? Who owns it? How is it used? What’s its quality?

Metadata management is an enterprise capability that serves multiple stakeholders:

  • Data consumers need to discover and understand available data
  • Data engineers need lineage to troubleshoot issues
  • Governance teams need to enforce policies
  • Architects need to understand dependencies

Modern metadata platforms (such as Atlan or Alation) and Snowflake’s built-in metadata features help answer these questions for each of those stakeholders.
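
As a small illustration of the built-in side, the queries below sketch how discovery and dependency questions might be answered from Snowflake's own metadata views. They assume the `CUSTOMER_DOMAIN.PUBLISHED.CUSTOMER` table from earlier exists and that your role can read the `SNOWFLAKE.ACCOUNT_USAGE` schema, whose views lag behind real time.

```sql
-- Discovery: what is published in the customer domain, and who owns it?
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_OWNER, ROW_COUNT, LAST_ALTERED, COMMENT
FROM CUSTOMER_DOMAIN.INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'PUBLISHED'
ORDER BY TABLE_NAME;

-- Dependencies: which objects are built on the published customer table?
SELECT REFERENCING_DATABASE,
       REFERENCING_SCHEMA,
       REFERENCING_OBJECT_NAME,
       REFERENCING_OBJECT_DOMAIN
FROM SNOWFLAKE.ACCOUNT_USAGE.OBJECT_DEPENDENCIES
WHERE REFERENCED_DATABASE = 'CUSTOMER_DOMAIN'
  AND REFERENCED_SCHEMA = 'PUBLISHED'
  AND REFERENCED_OBJECT_NAME = 'CUSTOMER';
```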

Organisational Structure Impact: Conway’s Law

Conway’s Law, paraphrased, says that organisations design systems that mirror their own communication structures. This has profound implications for data architecture.

If your organisation has separate teams for CRM, ERP, and analytics with limited cross-team communication, your data architecture will naturally fragment along these boundaries. Conversely, if you create cross-functional domain teams with clear ownership, your data architecture can be more coherent.

This is why data mesh has gained traction—it explicitly recognises that data architecture should align with organisational structure. Domain-oriented ownership, data as a product, and federated governance are architectural patterns that work with Conway’s Law rather than against it.

Data Mesh as Enterprise Architecture Pattern

Data mesh is fundamentally an enterprise architecture pattern that happens to apply to data. It proposes:

  1. Domain-oriented decentralisation: Organise data by business domains, not technical layers
  2. Data as a product: Treat data like a product with clear ownership, quality standards, and consumer focus
  3. Self-serve data infrastructure: Provide platforms that enable domains to build and operate their own data products
  4. Federated computational governance: Set global standards, execute them locally in each domain

This pattern aligns data architecture with enterprise architecture principles around modularity, clear interfaces, and distributed ownership. In Snowflake, data mesh might look like:

```sql
-- Each domain has its own database with clear product structure
CREATE DATABASE CUSTOMER_DOMAIN_PRODUCTS;

-- Data products have clear interfaces (published schemas)
CREATE SCHEMA CUSTOMER_DOMAIN_PRODUCTS.CUSTOMER_PROFILE_PRODUCT;
CREATE SCHEMA CUSTOMER_DOMAIN_PRODUCTS.CUSTOMER_BEHAVIOUR_PRODUCT;

-- Share data products with other accounts using Snowflake Data Sharing
CREATE SHARE CUSTOMER_PROFILE_SHARE;
GRANT USAGE ON DATABASE CUSTOMER_DOMAIN_PRODUCTS TO SHARE CUSTOMER_PROFILE_SHARE;
GRANT USAGE ON SCHEMA CUSTOMER_DOMAIN_PRODUCTS.CUSTOMER_PROFILE_PRODUCT
    TO SHARE CUSTOMER_PROFILE_SHARE;
-- USAGE alone exposes no data: grant SELECT on the product's tables as well
GRANT SELECT ON ALL TABLES IN SCHEMA CUSTOMER_DOMAIN_PRODUCTS.CUSTOMER_PROFILE_PRODUCT
    TO SHARE CUSTOMER_PROFILE_SHARE;

-- Consumer domains in other accounts read through the share, not direct database access
-- (domains within the same account would use role grants instead)
-- This creates clear interfaces and ownership boundaries
```

Integration Architecture Patterns

How systems and data connect is where enterprise architecture and data architecture most visibly intersect. The integration patterns you choose have profound implications for both domains.

API-First Architecture for Data Products

API-first design treats data products like application APIs—well-defined interfaces with clear contracts, versioning, and lifecycle management. Rather than consumers directly querying your database tables, they access data through documented, stable APIs.

In Snowflake, this might be:

  • Secure views that provide stable interfaces whilst hiding implementation details
  • Stored procedures that encapsulate business logic
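  • The Snowflake SQL API, which exposes Snowflake data to applications over REST (external functions work in the opposite direction, calling remote services from SQL)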
  • Snowpark applications that provide programmable interfaces to data
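
The secure-view option might be sketched as follows. The view name `CUSTOMER_V1`, the internal table `CUSTOMER_MASTER`, and the column names are hypothetical; the `ORDER_DOMAIN_READER` role is the one used in the domain example earlier.

```sql
-- Versioned, published interface over an internal curated table (names are illustrative)
CREATE OR REPLACE SECURE VIEW CUSTOMER_DOMAIN.PUBLISHED.CUSTOMER_V1 AS
SELECT
    CUSTOMER_ID,
    CUSTOMER_NAME,
    CUSTOMER_TYPE
FROM CUSTOMER_DOMAIN.CURATED.CUSTOMER_MASTER;

-- Consumers are granted the view only, never the curated table behind it
GRANT SELECT ON VIEW CUSTOMER_DOMAIN.PUBLISHED.CUSTOMER_V1 TO ROLE ORDER_DOMAIN_READER;
```

Putting a version in the view name is one possible convention: a breaking change ships as `CUSTOMER_V2` while `CUSTOMER_V1` stays available until consumers have moved.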

The view becomes your API contract. Underlying tables can be restructured, data sources can change, but the API remains stable.

Event-Driven Architecture and Data Streaming

Event-driven architecture (EDA) enables real-time data integration and loosely coupled systems. Rather than batch extracts and synchronous API calls, systems publish events that other systems consume asynchronously.

For data architecture, this means:

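  • Streaming data ingestion: Kafka or Kinesis carry events as they happen, and Snowpipe or Snowpipe Streaming loads them into Snowflake continuously
  • Change data capture (CDC): Detect changes in source systems and propagate them in near-real-time; inside Snowflake, Streams record row-level changes on a table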
  • Event-sourced data models: Store the history of changes, not just current state
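
Inside Snowflake, change capture and an event-style history table could be wired together roughly like this. The tables `ORDER_DOMAIN.RAW.ORDERS` and `ORDER_DOMAIN.CURATED.ORDER_EVENTS`, their columns, and the warehouse `WH_ORDER_ETL` are all assumptions for the sake of the example.

```sql
-- Record row-level changes made to the raw orders table
CREATE STREAM IF NOT EXISTS ORDER_DOMAIN.RAW.ORDERS_STREAM
    ON TABLE ORDER_DOMAIN.RAW.ORDERS;

-- Every five minutes, if the stream holds changes, append them to a history table.
-- Reading the stream inside a DML statement advances its offset.
CREATE TASK IF NOT EXISTS ORDER_DOMAIN.RAW.CAPTURE_ORDER_EVENTS
    WAREHOUSE = WH_ORDER_ETL
    SCHEDULE = '5 MINUTE'
    WHEN SYSTEM$STREAM_HAS_DATA('ORDER_DOMAIN.RAW.ORDERS_STREAM')
AS
    INSERT INTO ORDER_DOMAIN.CURATED.ORDER_EVENTS
        (ORDER_ID, ORDER_STATUS, CHANGE_ACTION, IS_UPDATE, CAPTURED_AT)
    SELECT ORDER_ID, ORDER_STATUS, METADATA$ACTION, METADATA$ISUPDATE, CURRENT_TIMESTAMP()
    FROM ORDER_DOMAIN.RAW.ORDERS_STREAM;

-- Tasks are created suspended and must be resumed to start running
ALTER TASK ORDER_DOMAIN.RAW.CAPTURE_ORDER_EVENTS RESUME;
```

Because the history table only ever receives inserts, it keeps every change rather than just the latest state, which is the event-sourced idea from the list above.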

Event-driven patterns align well with modern enterprise architecture’s move towards loosely coupled, scalable systems.

Data Sharing Across Enterprise Boundaries

Snowflake’s Data Sharing capability is a powerful enterprise architecture enabler. It allows secure, governed data sharing across organisational boundaries without copying data or building complex integration.
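
A sketch of both sides of a share, under several assumptions: a secure view named `CUSTOMER_DOMAIN.PUBLISHED.CUSTOMER_SUMMARY` already exists, and `PROVIDER_ACCT` and `PARTNER_ACCT` stand in for real account identifiers. All of these names are hypothetical.

```sql
-- Provider account: publish a secure view through a share
CREATE SHARE CUSTOMER_SUMMARY_SHARE;
GRANT USAGE ON DATABASE CUSTOMER_DOMAIN TO SHARE CUSTOMER_SUMMARY_SHARE;
GRANT USAGE ON SCHEMA CUSTOMER_DOMAIN.PUBLISHED TO SHARE CUSTOMER_SUMMARY_SHARE;
GRANT SELECT ON VIEW CUSTOMER_DOMAIN.PUBLISHED.CUSTOMER_SUMMARY
    TO SHARE CUSTOMER_SUMMARY_SHARE;

-- Provider account: make the share visible to one consumer account
ALTER SHARE CUSTOMER_SUMMARY_SHARE ADD ACCOUNTS = PARTNER_ACCT;

-- Consumer account: mount the share as a read-only database
CREATE DATABASE CUSTOMER_SUMMARY_SHARED
    FROM SHARE PROVIDER_ACCT.CUSTOMER_SUMMARY_SHARE;
```

The consumer queries `CUSTOMER_SUMMARY_SHARED` like any other database, but the data stays in the provider's storage.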

This pattern revolutionises enterprise data architecture by enabling:

  • Zero-copy data access: No data duplication, always current
  • Clear ownership boundaries: Producer controls data, consumers read through secure interface
  • Simplified integration: No complex ETL pipelines between domains
  • Secure collaboration: Share with external partners whilst maintaining control

Multi-Tenancy and Isolation

Enterprise requirements often demand isolating data for different customers, regions, or business units whilst maintaining operational efficiency. Multi-tenancy architecture balances isolation with shared infrastructure.

Enterprise Requirements for Data Isolation

Why isolate data? Common drivers include:

  • Regulatory compliance: GDPR, CCPA, industry-specific regulations requiring data residency or separation
  • Security boundaries: Different sensitivity levels or customer contractual requirements
  • Performance isolation: Ensuring one tenant’s workload doesn’t impact others
  • Cost allocation: Tracking costs by business unit or customer
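
Performance isolation and cost allocation can both be approached with dedicated compute. The sketch below assumes a hypothetical per-tenant warehouse `WH_TENANT_ACME` and a tenant role `TENANT_ACME_ROLE`; the sizing and suspend values are placeholders, not recommendations.

```sql
-- Dedicated compute: this tenant's queries do not queue behind another tenant's
CREATE WAREHOUSE IF NOT EXISTS WH_TENANT_ACME
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60            -- seconds of inactivity before suspending
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = TRUE;

CREATE ROLE IF NOT EXISTS TENANT_ACME_ROLE;
GRANT USAGE ON WAREHOUSE WH_TENANT_ACME TO ROLE TENANT_ACME_ROLE;

-- Credits consumed per warehouse over the last 30 days, as a basis for chargeback
SELECT WAREHOUSE_NAME, SUM(CREDITS_USED) AS CREDITS_LAST_30_DAYS
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE START_TIME >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY WAREHOUSE_NAME
ORDER BY CREDITS_LAST_30_DAYS DESC;
```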

Snowflake Account Strategies for Multi-Tenancy

Snowflake offers multiple patterns for multi-tenancy:

Pattern 1: Shared database, separate schemas

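```sql
-- Each tenant has own schema in shared database
CREATE DATABASE MULTITENANT_APP;
CREATE SCHEMA MULTITENANT_APP.TENANT_ACME;
CREATE SCHEMA MULTITENANT_APP.TENANT_GLOBALCORP;

-- Role-based access ensures tenants only see their data
CREATE ROLE IF NOT EXISTS TENANT_ACME_ROLE;
GRANT USAGE ON DATABASE MULTITENANT_APP TO ROLE TENANT_ACME_ROLE;
GRANT USAGE ON SCHEMA MULTITENANT_APP.TENANT_ACME TO ROLE TENANT_ACME_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA MULTITENANT_APP.TENANT_ACME TO ROLE TENANT_ACME_ROLE;
-- ALL TABLES covers only tables that exist now; cover later ones too
GRANT SELECT ON FUTURE TABLES IN SCHEMA MULTITENANT_APP.TENANT_ACME TO ROLE TENANT_ACME_ROLE;
```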

Pros: Efficient, easy to manage

Cons: Less isolation, potential for access control mistakes

Pattern 2: Separate databases per tenant

```sql
-- Each tenant has dedicated database
CREATE DATABASE TENANT_ACME_DB;
CREATE DATABASE TENANT_GLOBALCORP_DB;

-- Stronger isolation through database-level separation
CREATE ROLE IF NOT EXISTS TENANT_ACME_ROLE;
GRANT USAGE ON DATABASE TENANT_ACME_DB TO ROLE TENANT_ACME_ROLE;
```

Pros: Better isolation, clearer cost attribution

Cons: More overhead to manage

Pattern 3: Separate Snowflake accounts per tenant

For highest isolation, provision separate Snowflake accounts per major tenant. Use Snowflake Organizations to manage multiple accounts centrally.

Pros: Complete isolation, dedicated resources, independent billing

Cons: Highest management overhead, less resource sharing

Cross-Functional Access Patterns

Sometimes you need cross-tenant analytics or shared services. Snowflake enables this through:

  • Secure data sharing between accounts
  • Cross-database joins within an account
  • Aggregated views that provide cross-tenant insights whilst hiding individual tenant data

```sql
-- Create aggregated view for cross-tenant analytics
-- (assumes the current database contains an ANALYTICS schema)
CREATE OR REPLACE SECURE VIEW ANALYTICS.CROSS_TENANT_SUMMARY AS
SELECT
    DATE_TRUNC('month', TRANSACTION_DATE) AS MONTH,
    COUNT(*) AS TRANSACTION_COUNT,
    SUM(AMOUNT) AS TOTAL_AMOUNT
FROM (
    -- Select only the columns needed, so tenant identifiers never enter the view
    SELECT TRANSACTION_DATE, AMOUNT FROM TENANT_ACME_DB.TRANSACTIONS.TRANSACTIONS
    UNION ALL
    SELECT TRANSACTION_DATE, AMOUNT FROM TENANT_GLOBALCORP_DB.TRANSACTIONS.TRANSACTIONS
)
-- Group by month only; grouping by a tenant column would produce per-tenant rows
GROUP BY DATE_TRUNC('month', TRANSACTION_DATE);
```

Technology Standards and Portfolio Management

Enterprise architecture plays a crucial role in technology selection and standardisation. For data architecture, this means navigating the tension between leveraging approved platforms and adopting best-of-breed solutions.

Enterprise Architecture’s Role in Technology Selection

Enterprise architects typically maintain a technology portfolio—an inventory of approved platforms, tools, and patterns. This serves several purposes:

  • Reduce complexity: Fewer platforms mean less integration overhead and lower skills requirements
  • Negotiate better pricing: Enterprise agreements for widely-used platforms
  • Ensure support: Approved technologies have established support and operations
  • Enable reuse: Standardised platforms enable sharing capabilities across teams

For data professionals, this means your technology choices need to align with enterprise standards, or you need to make a compelling case for exceptions.

Build vs Buy Decisions Through Both Lenses

Build vs buy decisions benefit from both data and enterprise perspectives:

Data architecture perspective considers:

  • Does this solution meet our technical requirements (performance, scalability, features)?
  • Can we integrate it with our data sources and consumers?
  • Does it fit our data modelling and processing patterns?

Enterprise architecture perspective considers:

  • Does this align with our strategic platform choices?
  • What’s the total cost of ownership including support, training, and maintenance?
  • How does this affect our architectural complexity?
  • Can we reuse this capability for other use cases?

For example, choosing Snowflake as an enterprise data platform is an enterprise architecture decision with broad implications. It standardises on cloud data warehousing, influences integration patterns, and defines governance capabilities available to all teams.

Platform Thinking: Snowflake as Enterprise Platform

Modern data architecture increasingly adopts platform thinking—rather than building point solutions, you create reusable platforms that enable multiple use cases. Snowflake exemplifies this approach:

  • Multi-workload platform: Supports data warehousing, data lake queries, data engineering, data science, and application development
  • Secure data sharing: Enables enterprise data mesh patterns
  • Governed by default: Built-in features for access control, encryption, and compliance
  • Elastic and scalable: Grows with enterprise needs without architectural overhaul

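Treating Snowflake as an enterprise platform means agreeing shared standards for how every team uses it, rather than leaving each team to invent its own conventions.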
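
What those standards are will differ per organisation. The three below (a role hierarchy, a credit budget, and a warehouse naming convention) are illustrative assumptions, and every object name in the sketch is hypothetical.

```sql
-- Standard 1: functional roles per domain, rolled up to SYSADMIN
CREATE ROLE IF NOT EXISTS CUSTOMER_DOMAIN_ANALYST;
CREATE ROLE IF NOT EXISTS CUSTOMER_DOMAIN_ENGINEER;
GRANT ROLE CUSTOMER_DOMAIN_ANALYST TO ROLE CUSTOMER_DOMAIN_ENGINEER;
GRANT ROLE CUSTOMER_DOMAIN_ENGINEER TO ROLE SYSADMIN;

-- Standard 2: a monthly credit budget per domain
-- (creating resource monitors normally requires ACCOUNTADMIN)
CREATE OR REPLACE RESOURCE MONITOR RM_CUSTOMER_DOMAIN
    WITH CREDIT_QUOTA = 100
    FREQUENCY = MONTHLY
    START_TIMESTAMP = IMMEDIATELY
    TRIGGERS ON 80 PERCENT DO NOTIFY
             ON 100 PERCENT DO SUSPEND;

-- Standard 3: one warehouse per domain and workload, named WH_<DOMAIN>_<WORKLOAD>
CREATE WAREHOUSE IF NOT EXISTS WH_CUSTOMER_ETL
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE;
ALTER WAREHOUSE WH_CUSTOMER_ETL SET RESOURCE_MONITOR = RM_CUSTOMER_DOMAIN;
```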

These standards enable teams to move faster because patterns are established and reusable.

Evolution and Transformation

Architecture isn’t static—it evolves as business needs change, technology advances, and organisations grow. Managing this evolution requires both data and enterprise perspectives.

How Enterprise Architecture Guides Data Transformation

Enterprise architecture provides the roadmap for transformation. Rather than ad-hoc changes, you have a structured approach:

  1. Current state architecture: Document what exists today
  2. Target state architecture: Define where you want to be
  3. Gap analysis: Identify what needs to change
  4. Transition architecture: Define intermediate states and sequencing
  5. Implementation roadmap: Prioritise and execute changes

For data transformation, this might look like:

```mermaid
flowchart LR
    A[Current State:<br/>Fragmented Data Silos] --> B[Transition 1:<br/>Consolidate to Snowflake]
    B --> C[Transition 2:<br/>Implement Domain Structure]
    C --> D[Target State:<br/>Federated Data Mesh]

    style A fill:#ffcccc
    style D fill:#d4edda
    style B fill:#fff4e1
    style C fill:#fff4e1
```

The enterprise architecture discipline helps ensure this transformation aligns with broader organisational change, has executive sponsorship, and doesn’t create new problems whilst solving old ones.

Migration Patterns and Roadmaps

Practical migration patterns for data transformation:

  • Strangler fig pattern: Gradually replace old systems by building new capabilities alongside and slowly migrating consumers
  • Dual-run pattern: Run old and new systems in parallel until confidence is established
  • Big bang with rollback: Migrate everything at once, with a solid rollback plan (high risk, sometimes necessary)
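
For the dual-run pattern, confidence usually comes from reconciling the two outputs. A minimal sketch, assuming hypothetical legacy and new tables with the same `ORDER_ID` and `ORDER_TOTAL` columns:

```sql
-- Rows the legacy pipeline produced that the new pipeline did not
SELECT ORDER_ID, ORDER_TOTAL FROM LEGACY_DB.REPORTING.DAILY_ORDERS
MINUS
SELECT ORDER_ID, ORDER_TOTAL FROM NEW_DB.REPORTING.DAILY_ORDERS;

-- Rows the new pipeline produced that the legacy pipeline did not
SELECT ORDER_ID, ORDER_TOTAL FROM NEW_DB.REPORTING.DAILY_ORDERS
MINUS
SELECT ORDER_ID, ORDER_TOTAL FROM LEGACY_DB.REPORTING.DAILY_ORDERS;
```

When both queries come back empty for an agreed number of runs, that is one reasonable signal for switching consumers over.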

In Snowflake, zero-copy cloning supports safe migration:

```sql
-- Create zero-copy clone of production for testing migration
CREATE DATABASE PROD_MIGRATION_TEST CLONE PROD_DATABASE;

-- Test your migration transformations
-- No risk to production; the clone shares storage with the source,
-- so you pay only for data that changes after cloning

-- Once validated, implement in production
-- Can rollback using Time Travel if needed
```
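
The Time Travel rollback mentioned above might look like this. The table `PROD_DATABASE.SALES.ORDERS` is hypothetical, and the one-hour offset assumes the mistake happened within the table's Time Travel retention period.

```sql
-- Recreate the table as it was one hour ago (offset is in seconds)
CREATE TABLE PROD_DATABASE.SALES.ORDERS_RESTORED
    CLONE PROD_DATABASE.SALES.ORDERS AT (OFFSET => -3600);

-- After checking the restored copy, swap it into place in one statement
ALTER TABLE PROD_DATABASE.SALES.ORDERS
    SWAP WITH PROD_DATABASE.SALES.ORDERS_RESTORED;
```

After the swap, `ORDERS_RESTORED` holds the damaged version, which can be kept for investigation or dropped.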

Managing Technical Debt at Enterprise Scale

Technical debt accumulates in data architecture just like application code—quick fixes, workarounds, and outdated patterns that create friction over time. At enterprise scale, managing this debt requires intentionality.

Strategies include:

  • Regular architectural reviews: Assess alignment with current standards
  • Deprecation policies: Sunset old patterns with clear timelines
  • Modernisation budget: Allocate engineering time specifically for paying down debt
  • Make it visible: Track architectural debt in your backlog alongside features

For data architecture, common debt includes:

  • Legacy ETL patterns that don’t leverage modern ELT
  • Poorly modelled tables that require complex joins
  • Duplicate data pipelines across teams
  • Ungoverned data proliferation
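
One way to make ungoverned proliferation visible is to list tables that carry no classification tag. This sketch assumes the `DATA_CLASSIFICATION` tag from the taxonomy section and read access to `SNOWFLAKE.ACCOUNT_USAGE`. As far as I know, `TAG_REFERENCES` records only tags set directly on an object, so tables that inherit a tag from their schema or database would still show up here.

```sql
-- Live base tables with no DATA_CLASSIFICATION tag set directly on them
SELECT T.TABLE_CATALOG, T.TABLE_SCHEMA, T.TABLE_NAME, T.TABLE_OWNER, T.CREATED
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES AS T
LEFT JOIN SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES AS R
    ON  R.OBJECT_DATABASE = T.TABLE_CATALOG
    AND R.OBJECT_SCHEMA   = T.TABLE_SCHEMA
    AND R.OBJECT_NAME     = T.TABLE_NAME
    AND R.DOMAIN          = 'TABLE'
    AND R.TAG_NAME        = 'DATA_CLASSIFICATION'
WHERE T.DELETED IS NULL              -- ignore dropped tables
  AND T.TABLE_TYPE = 'BASE TABLE'
  AND R.TAG_NAME IS NULL             -- no matching tag reference found
ORDER BY T.CREATED DESC;
```

Feeding the result into the backlog is one way to act on the "make it visible" strategy above.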

Balancing Standardisation with Innovation

The tension between standardisation and innovation is perpetual. Too much standardisation stifles innovation; too little creates chaos. Finding balance requires:

Innovation zones: Designate specific areas where teams can experiment with new approaches, with clear criteria for promoting successful experiments to standards

Exception process: Clear process for justified exceptions to standards, with architectural review

Evolving standards: Standards themselves should evolve as you learn. Successful innovations should inform updated standards.

In practice, this might mean:

  • Snowflake is the standard data platform, but teams can request exceptions for specific use cases (e.g., real-time streaming requiring Kafka)
  • Standard data modelling patterns exist, but domains can adapt them for their specific needs
  • Governance policies are enforced, but the policies themselves are reviewed and updated quarterly

Conclusion: Maturity Through Integrated Perspective

As I reflect on this journey—both in my professional growth and in the humbling experience of parenthood—I’m struck by how much architectural maturity comes from embracing complexity rather than trying to eliminate it. The intersection of data architecture and enterprise architecture is inherently complex, and that’s okay.

What matters is developing the perspective to see these different facets clearly: understanding how canonical data models serve enterprise integration, how governance frameworks balance control with autonomy, how multi-tenancy patterns align with business requirements, and how technology standards enable whilst constraining innovation.

The transition I’m navigating professionally has taught me that expertise isn’t about knowing everything—it’s about knowing how to ask the right questions, where to look for answers, and how to synthesise different perspectives into coherent solutions. The advanced considerations we’ve explored in this article aren’t a checklist to complete; they’re lenses through which to view problems.

My advice as you navigate these complexities? Stay curious. Engage with people outside your immediate domain—talk to application architects about their integration challenges, discuss governance with your compliance team, understand how enterprise architects think about transformation. Every conversation deepens your perspective.

Most importantly, recognise that this is an ongoing journey. I’m learning alongside you, making mistakes, discovering patterns that work, and continually adjusting my mental models. That process of continuous learning and adaptation is what keeps this work engaging and valuable.

Whether you’re designing a data mesh, implementing governance frameworks, or evaluating multi-tenancy patterns, remember: it’s all connected. The magic happens at the intersections, where data architecture meets enterprise architecture and where technical solutions serve real business needs.

Here’s to the journey of continuous growth, both in architecture and in life.

Key Takeaways

  • Canonical data models and domain-driven design create enterprise-wide coherence whilst enabling domain autonomy
  • Governance works best when federated—set standards centrally, execute them in domains
  • Integration patterns (API-first, event-driven, data sharing) directly impact architectural flexibility
  • Multi-tenancy requires balancing isolation requirements with operational efficiency
  • Platform thinking (like Snowflake as enterprise platform) enables reusable capabilities
  • Technology standardisation should enable innovation, not stifle it
  • Transformation requires structured roadmaps that sequence changes appropriately
  • Architectural maturity comes from understanding multiple perspectives and their intersections

Additional Resources

  • Books: “Data Mesh: Delivering Data-Driven Value at Scale” by Zhamak Dehghani, “Enterprise Integration Patterns” by Hohpe & Woolf
  • Snowflake: Architecture Best Practices Guide, Data Governance Framework documentation
  • Frameworks: TOGAF for enterprise architecture, DAMA-DMBOK for data governance
  • Patterns: Martin Fowler’s articles on domain-driven design and integration patterns
  • Communities: Data Mesh Learning Community, Snowflake user groups, Enterprise Architecture forums
