The Medallion Architecture Is Not What You Think It Is

If you've been anywhere near data engineering in the last five years, you've seen the diagram. Three colorful boxes stacked vertically: bronze at the bottom, silver in the middle, gold on top. Arrows flowing upward. Maybe a lakehouse logo in the corner. The blog post explains that bronze is raw, silver is cleaned, gold is ready for analytics. Simple. Elegant. Done.

Except it's not done. It's barely started.

The medallion architecture is the most misunderstood pattern in modern data engineering. Not because it's complicated — it isn't — but because the way it's taught strips away everything that makes it useful. What you're left with is three folder names and a vague sense that your data should flow "upward." That's not an architecture. That's a filing system.

I've reviewed dozens of data platforms that claim to implement medallion. Most of them use the terminology — bronze, silver, gold — without implementing the actual pattern. They have three schemas in their warehouse, three prefixes in their object store, three directories in their transformation project. But when you look at what's in those layers, the boundaries are arbitrary, the contracts are undefined, and the guarantees are nonexistent. The layers are decoration, not engineering.

This matters because medallion, when implemented correctly, solves a real and important problem: it gives you a reliable, debuggable, evolvable data supply chain. When implemented as cargo cult — copied from blog posts without understanding the underlying purpose — it gives you three folders and a false sense of order.

Let me explain what the pattern actually is, the five mistakes almost everyone makes, and when you should ignore medallion entirely.

What medallion actually is

Strip away the marketing and the pretty diagrams. Medallion is a layered data quality contract. Each layer makes a specific guarantee about the data it contains. The guarantee at each layer is strictly stronger than the one below it. That's the entire pattern. Everything else is implementation detail.

Bronze: the raw layer

Bronze is an immutable copy of source data, exactly as it arrived. No transformations. No cleaning. No renaming columns. No casting types. The guarantee is simple: this data is identical to what the source system produced. If the source sent nulls, bronze has nulls. If the source sent duplicate records, bronze has duplicate records. If the source sent a string where you expected an integer, bronze has a string.

This immutability is the entire point. Bronze is your insurance policy. When a transformation turns out to be wrong — when someone discovers that "status = 7" actually means something different than what the original developer assumed — you go back to bronze and re-derive. If bronze is mutable, you've lost the original data, and you've lost the ability to recover from mistakes. That's not a minor inconvenience. It's a fundamental architectural failure.
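To make the append-only idea concrete, here is a minimal sketch of a bronze landing step. It assumes bronze is a plain directory tree and that each batch arrives as raw bytes; the function name land_in_bronze, the file naming scheme, and the .raw extension are hypothetical choices for illustration, not part of any particular platform.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def land_in_bronze(bronze_root: Path, source: str, payload: bytes) -> Path:
    """Write one raw batch to bronze exactly as received."""
    received_at = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    digest = hashlib.sha256(payload).hexdigest()[:16]
    # Hypothetical layout: <root>/<source>/<arrival time>_<content hash>.raw
    target = bronze_root / source / f"{received_at}_{digest}.raw"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Mode "xb" creates the file and raises FileExistsError if it already
    # exists, so an earlier batch can never be overwritten.
    with open(target, "xb") as handle:
        handle.write(payload)  # no parsing, casting, or renaming
    return target
```

The payload is stored byte for byte, so a string where you expected an integer stays a string. Every batch gets its own file, which means a retry from the source lands as a second file instead of replacing the first.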

Silver: the conformed layer

Silver is cleaned, validated, deduplicated, and type-enforced data. The guarantee is: this data is structurally correct and internally consistent. Columns have the right types. Nulls are handled according to policy. Duplicates are resolved. Timestamps are in a consistent timezone. Keys are validated. Schema is enforced.

Silver is where you deal with the mess of the real world. Source systems are chaotic — they send different date formats, they have implicit nulls, they duplicate records during retries, they change schemas without warning. Silver absorbs all of that chaos and produces clean, reliable, structurally consistent data. But — and this is critical — silver does not contain business logic. It doesn't calculate revenue. It doesn't classify customers. It doesn't aggregate transactions. It just makes the data trustworthy at the structural level.
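As a rough illustration of what "structural only" means, the sketch below conforms a handful of raw order records. The column names (order_id, amount, updated_at) and the policy that naive timestamps are already UTC are assumptions made for the example; a real null or timezone policy would come from your own contract.

```python
from datetime import datetime, timezone
from decimal import Decimal


def conform(raw: dict) -> dict:
    """Cast one raw record to silver types. No business logic here."""
    parsed = datetime.fromisoformat(raw["updated_at"])
    if parsed.tzinfo is None:
        # Assumed policy: naive timestamps from this source are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return {
        "order_id": int(raw["order_id"]),
        "amount": Decimal(raw["amount"]),
        "updated_at": parsed.astimezone(timezone.utc),
    }


def to_silver(raw_records: list[dict]) -> list[dict]:
    """Conform every record, then keep the latest version of each key."""
    latest: dict[int, dict] = {}
    for record in map(conform, raw_records):
        current = latest.get(record["order_id"])
        if current is None or record["updated_at"] >= current["updated_at"]:
            latest[record["order_id"]] = record
    return sorted(latest.values(), key=lambda r: r["order_id"])
```

Nothing in this step knows what revenue is or which customers matter. It only fixes types, timezones, and duplicates, which is why any downstream consumer can build on its output.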

Gold: the business-modeled layer

Gold is data shaped for specific business consumers. The guarantee is: this data answers a specific business question correctly. Revenue by region. Customer lifetime value. Churn prediction features. Operational KPIs. Gold tables contain business logic — the joins, calculations, aggregations, and classifications that turn clean data into business answers.

The key insight about gold: there isn't one gold layer. There are many. Different consumers need different models. The finance team needs revenue data aggregated by accounting period with specific recognition rules. The marketing team needs the same underlying data aggregated by campaign with attribution logic. The ML team needs feature tables with different grain and different temporal semantics. These are all gold — and they should all be separate, explicitly designed for their consumer.

Notice what these definitions emphasize: guarantees, not folders. Each layer has a specific contract about what's true of the data inside it. If you can't articulate the guarantee at each boundary, you don't have a medallion architecture. You have three schemas with pretty names.

The five mistakes almost everyone makes

I've audited medallion implementations at companies ranging from 20-person startups to enterprise data teams with fifty engineers. The same five mistakes show up again and again. They're not edge cases — they're the norm.

Mistake 1: Treating layers as folders instead of contracts

This is the most common and most damaging mistake. A team creates three schemas — bronze, silver, gold — and starts putting tables in them based on vibes. Raw-ish data goes in bronze. Somewhat-clean data goes in silver. Final tables go in gold. There's no formal definition of what qualifies a table for a given layer. There's no validation at the boundaries.

The consequence is that you can't trust any layer. A table in "silver" might have duplicates, might have wrong types, might have nulls where it shouldn't. Nobody knows, because nobody defined what silver guarantees. When a dashboard shows a wrong number, you can't narrow down where the problem is, because no layer makes a promise you can test against.

The fix: write down the contract for each layer. Literally. A document that says "a table in silver must satisfy these conditions: no duplicate primary keys, all columns typed according to the schema registry, all timestamps in UTC, null policy enforced." Then test for those conditions at every layer transition.
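One way to keep the written contract and the test from drifting apart is to express the contract as data and check rows against it. The sketch below is a simplified, in-memory version of that idea; SilverContract and violations are hypothetical names, and a real implementation would more likely run inside your transformation tool's test framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SilverContract:
    primary_key: str
    column_types: dict      # column name -> expected Python type
    not_null: frozenset     # columns where nulls are forbidden
    utc_columns: frozenset  # timestamp columns that must be in UTC


def violations(rows: list[dict], contract: SilverContract) -> list[str]:
    """Return a human-readable list of contract violations (empty if clean)."""
    problems: list[str] = []
    seen_keys = set()
    for index, row in enumerate(rows):
        key = row.get(contract.primary_key)
        if key in seen_keys:
            problems.append(f"row {index}: duplicate primary key {key!r}")
        seen_keys.add(key)
        for column, expected_type in contract.column_types.items():
            value = row.get(column)
            if value is None:
                if column in contract.not_null:
                    problems.append(f"row {index}: {column} is null")
            elif not isinstance(value, expected_type):
                problems.append(f"row {index}: {column} is not {expected_type.__name__}")
        for column in contract.utc_columns:
            value = row.get(column)
            # Naive datetimes have no offset and are treated as violations.
            if isinstance(value, datetime) and value.utcoffset() != timedelta(0):
                problems.append(f"row {index}: {column} is not in UTC")
    return problems
```

A pipeline would call violations at the layer transition and refuse to publish the table when the list is non-empty.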

Mistake 2: Putting business logic in silver

Silver should be domain-agnostic. It should clean data in ways that any downstream consumer would want: deduplicate, type-enforce, conform timestamps, validate keys. The moment you put business logic in silver — calculating revenue, classifying customers into segments, applying discount rules — you've coupled every downstream consumer to one team's business definitions.

I once audited a platform where silver contained a customer_segment column derived from a marketing team's classification rules. The ML team needed a different segmentation for their churn model. But they were reading from silver, which already had the marketing segmentation baked in. They couldn't get the raw attributes they needed to build their own classification. The "shared" silver layer had become a bottleneck because it encoded one team's business logic as if it were universal truth.

Business logic belongs in gold. Different gold layers can apply different logic to the same silver data. That's the entire point of having multiple gold outputs.
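A small sketch of the alternative to the customer_segment problem: silver exposes only the underlying attributes, and each gold consumer derives its own classification. The attribute names and the thresholds (1000 in lifetime spend, 90 days of inactivity) are invented for illustration.

```python
# Silver: domain-agnostic attributes only, no segment column.
SILVER_CUSTOMERS = [
    {"customer_id": 1, "lifetime_spend": 1200.0, "days_since_last_order": 10},
    {"customer_id": 2, "lifetime_spend": 80.0, "days_since_last_order": 200},
]


def marketing_segment(customer: dict) -> str:
    """Gold (marketing mart): hypothetical spend-based segmentation."""
    return "vip" if customer["lifetime_spend"] >= 1000 else "standard"


def churn_risk_segment(customer: dict) -> str:
    """Gold (ML features): hypothetical recency-based segmentation."""
    return "at_risk" if customer["days_since_last_order"] > 90 else "active"


gold_marketing = {c["customer_id"]: marketing_segment(c) for c in SILVER_CUSTOMERS}
gold_ml = {c["customer_id"]: churn_risk_segment(c) for c in SILVER_CUSTOMERS}
```

Both teams read the same silver rows, and neither definition blocks the other.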

Mistake 3: Making bronze mutable

This one usually happens for cost reasons. "We're spending too much on storage. Let's clean up bronze and remove data older than 90 days." Or it happens accidentally: a pipeline overwrites bronze tables instead of appending to them.

The moment bronze is mutable, you've lost the core benefit of the medallion pattern. Bronze is your rewind button. When you discover that a transformation was wrong — and you will discover this, it's not a matter of if — you re-derive silver and gold from bronze. If bronze has been modified, you're re-deriving from a corrupted baseline. You might not even know it's corrupted until the numbers don't add up and you've spent three days debugging.

Storage is cheap. Recreating source data you have already deleted is often impossible. Keep bronze immutable. To manage costs, use lifecycle policies that move older data to cold storage tiers, but never delete or modify it.
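The "rewind button" can be sketched as a replay: keep bronze untouched, fix the transformation, and rebuild. The example below reuses the "status = 7" scenario; the two status mappings and the function names are hypothetical.

```python
from typing import Callable, Iterable

# Two versions of a hypothetical status decoding. V1 was the original
# developer's assumption; V2 is the corrected meaning discovered later.
STATUS_V1 = {1: "open", 7: "cancelled"}
STATUS_V2 = {1: "open", 7: "refunded"}


def decode_status(mapping: dict) -> Callable[[dict], dict]:
    """Build a bronze-to-silver transform for a given status mapping."""
    def transform(record: dict) -> dict:
        return {"order_id": record["id"], "status": mapping[record["status"]]}
    return transform


def rebuild_silver(bronze_batches: Iterable[list],
                   transform: Callable[[dict], dict]) -> list:
    """Re-derive silver from every bronze batch without modifying bronze."""
    return [transform(record) for batch in bronze_batches for record in batch]
```

Because bronze still holds the original code 7, switching from STATUS_V1 to STATUS_V2 and rerunning rebuild_silver corrects every historical row. If bronze had been overwritten with the decoded value, there would be nothing left to replay.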

Mistake 4: Having only one gold layer

This mistake comes from taking the three-layer diagram too literally. Bronze is one layer, silver is one layer, therefore gold must be one layer. Wrong. Gold is where you model data for specific business questions, and different questions require different models.

A finance mart aggregates by fiscal period with specific revenue recognition rules. A marketing mart aggregates by campaign with attribution windows. An operational dashboard needs real-time or near-real-time grain. An ML feature store needs point-in-time correctness. Trying to serve all of these from a single gold schema means every table is a compromise — too granular for the executives, too aggregated for the ML team, using the wrong fiscal calendar for finance.

Build separate gold layers (I call them marts) for each major consumer or domain. They all read from the same silver tables, but they model the data differently. This is not duplication — it's specialization.
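Here is a minimal sketch of two marts built from the same silver table. The sample rows are made up, and the finance mart uses calendar months as a stand-in for a real fiscal calendar, which is a simplifying assumption.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

SILVER_ORDERS = [
    {"order_date": date(2024, 1, 30), "campaign": "spring_sale", "amount": Decimal("100.00")},
    {"order_date": date(2024, 2, 2), "campaign": "spring_sale", "amount": Decimal("50.00")},
    {"order_date": date(2024, 2, 10), "campaign": "newsletter", "amount": Decimal("25.00")},
]


def finance_monthly_revenue(orders: list) -> dict:
    """Finance mart: revenue by (year, month)."""
    totals = defaultdict(Decimal)
    for order in orders:
        period = (order["order_date"].year, order["order_date"].month)
        totals[period] += order["amount"]
    return dict(totals)


def marketing_campaign_revenue(orders: list) -> dict:
    """Marketing mart: revenue by campaign."""
    totals = defaultdict(Decimal)
    for order in orders:
        totals[order["campaign"]] += order["amount"]
    return dict(totals)
```

The two functions disagree on grain, not on the data. Both totals still reconcile to the same silver sum, which is a useful cross-mart check.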

Mistake 5: Skipping silver

"We're a small team. We don't have time for a silver layer. We'll just go straight from bronze to gold."

I've heard this a dozen times. The team that skips silver always pays for it within six months. Here's why: without a shared clean layer, every gold model has to handle data cleaning itself. Model A deduplicates orders one way. Model B deduplicates them a different way. When the numbers disagree, nobody can tell whether the problem is in the business logic or the cleaning logic — because they're tangled together in every model.

Silver is the shared foundation. It solves the cleaning problem once, and every gold model builds on that clean foundation. It's also the most natural debugging boundary: when something looks wrong in gold, you check silver first. If silver is clean, the problem is in gold's business logic. If silver is dirty, the problem is upstream. Without silver, you're debugging the entire pipeline every time.

The time you "save" by skipping silver gets spent tenfold on debugging and data quality firefighting. Every single time.

When medallion is the wrong pattern

Medallion is a good pattern. It's not the only pattern. There are legitimate architectures where medallion doesn't fit, and forcing it creates more problems than it solves.

Real-time and streaming workloads

Medallion is fundamentally a batch-oriented pattern. It assumes data flows through layers in discrete runs: extract, then clean, then model. Streaming workloads often need data available within milliseconds or seconds, not minutes. You can adapt medallion concepts to streaming (a "bronze topic" in a message broker, for instance), but the pattern loses most of its benefits when latency is the primary constraint. Streaming has its own patterns (event sourcing, CQRS, materialized views) that serve the latency requirement better.

Very simple pipelines

If you have one source system, one consumer, and a straightforward transformation, three layers are overhead. A two-layer approach (raw + modeled) or even a single well-tested pipeline is fine. Don't create complexity to satisfy a pattern. The pattern exists to manage complexity that's already there.

Data mesh architectures

In a data mesh, each domain team owns its data products end-to-end. They define their own quality contracts, their own schemas, their own interfaces. A centralized medallion architecture with shared bronze and silver layers contradicts the mesh principle of domain ownership. Each domain might use medallion internally, but the mesh itself operates on different principles — federated governance, domain-oriented decentralization, self-serve infrastructure.

Prototyping and exploration

When you're still figuring out what questions to ask and what data you need, formal layers slow you down. Explore freely in a sandbox. Read from source systems directly. Iterate fast. When patterns emerge and you know what the production pipeline should look like, then introduce medallion. The pattern is for production data products, not for discovery.

A better mental model: the data supply chain

The diagram everyone draws — three stacked boxes with arrows — is technically correct but pedagogically useless. It doesn't convey why the layers exist or how they relate. Here's a better mental model: think of medallion as a manufacturing supply chain.

Bronze is raw materials. Ore, timber, crude oil. You stockpile them exactly as they arrive from suppliers. You don't process them until you need to. You never throw them away, because you might discover a new use for them later. The raw materials warehouse is your strategic reserve.

Silver is the factory floor. This is where raw materials get inspected, cleaned, cut to specification, and prepared for assembly. The factory floor doesn't make finished products — it makes standardized parts. These parts are useful to multiple assembly lines. The factory floor is shared infrastructure: every product line benefits from the same quality-controlled components.

Gold is the finished product. Each product is designed for a specific customer with specific requirements. The executive dashboard is a luxury car — polished, aggregated, high-level. The ML feature store is precision tooling — exact, granular, temporally correct. The compliance report is a safety manual — complete, auditable, regulatory-grade. Same raw materials, same factory, different products.

This mental model makes the design decisions obvious. You wouldn't put product assembly logic on the factory floor — that would couple every product to one design. You wouldn't skip quality inspection to save time — you'd ship defective parts to every assembly line. You wouldn't throw away raw materials to save warehouse space — you'd lose the ability to build new products from existing supply.

How I actually implement it

Theory is useful. Implementation is what matters. Here's how I build medallion architectures in practice — the naming conventions, the testing strategy, the directory structure, and the decisions that trip up most teams.

Naming conventions

Naming is one of those things that feels trivial until you have 300 tables and can't find anything. I use a consistent prefix-based scheme:

bronze_{source}_{table} — e.g., bronze_crm_contacts, bronze_payments_transactions
silver_{domain}_{entity} — e.g., silver_customers_contacts, silver_finance_transactions
gold_{mart}_{table} — e.g., gold_marketing_campaign_performance, gold_finance_monthly_revenue

The prefix tells you the layer. The second segment tells you the source (for bronze), domain (for silver), or consuming mart (for gold). The remainder describes the entity. You can find any table in three seconds.
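A naming scheme only helps if it is enforced. The sketch below parses a table name into its three parts and rejects anything that does not fit. It assumes the second segment never contains an underscore, which holds for the examples above but is my assumption, not a stated rule.

```python
import re

# layer _ scope _ entity, where scope is the source (bronze), domain (silver),
# or mart (gold). Assumption: scope has no underscores; entity may.
TABLE_NAME = re.compile(r"(bronze|silver|gold)_([a-z0-9]+)_([a-z0-9_]+)")


def parse_table_name(name: str) -> dict:
    """Split a table name into layer, scope, and entity, or raise ValueError."""
    match = TABLE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"table name does not follow the convention: {name!r}")
    layer, scope, entity = match.groups()
    return {"layer": layer, "scope": scope, "entity": entity}
```

Running this check in CI over every model name is a cheap way to stop a table like stg_orders from slipping in.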

Testing at layer boundaries

The contract at each layer boundary needs to be tested. Not "we hope silver is clean" — actually tested, on every run. This is commonly implemented with a transformation tool's built-in testing framework, but the concept is universal regardless of tooling.

At the bronze-to-silver boundary, I test for: primary key uniqueness, null policy compliance (which columns allow nulls and which don't), type conformance (is the timestamp column actually a timestamp?), and referential integrity where applicable. At the silver-to-gold boundary, I test for: business rule correctness (does the revenue calculation match the finance team's definition?), aggregation completeness (do the parts sum to the whole?), and temporal correctness (are we including the right date ranges?).

These tests are the implementation of the quality contracts. Without them, you just have assertions in a document that nobody checks.
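As one example at the silver-to-gold boundary, the aggregation completeness check ("do the parts sum to the whole?") can be sketched as a reconciliation between the two layers. The function name and the amount column are hypothetical, and a real check may need a tolerance or filters that match the mart's scope.

```python
from decimal import Decimal


def check_aggregation_complete(silver_rows: list, gold_rows: list,
                               measure: str = "amount") -> None:
    """Fail if the gold aggregate does not reconcile to the silver detail."""
    silver_total = sum((row[measure] for row in silver_rows), Decimal("0"))
    gold_total = sum((row[measure] for row in gold_rows), Decimal("0"))
    if silver_total != gold_total:
        raise AssertionError(
            f"{measure} does not reconcile: silver={silver_total}, gold={gold_total}"
        )
```

Using Decimal rather than float keeps the comparison exact, so a mismatch means rows were dropped or double counted and not that rounding drifted.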

Schema enforcement

Bronze can use schema-on-read — the data arrives as-is and you define the schema when you query it. This is important because source schemas change without warning, and you don't want your ingestion pipeline to break every time a source system adds a column.

Silver must use schema-on-write — the schema is enforced when data enters the layer. If a column type changes upstream, the bronze-to-silver transformation should catch it, log it, and either handle it or fail loudly. Silent schema drift in silver is a debugging nightmare.
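"Fail loudly" can be as simple as refusing to write a row that does not match the expected schema. The sketch below is one way to do that for dictionaries in memory; SchemaDriftError, enforce_schema, and the example schema are hypothetical names, and dropping unexpected extra columns is an assumed policy you may want to change to a warning or an error.

```python
class SchemaDriftError(Exception):
    """Raised when an incoming row no longer matches the silver schema."""


# Hypothetical silver schema: column name -> required Python type.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "currency": str}


def enforce_schema(row: dict, expected: dict) -> dict:
    """Return the row restricted to the expected columns, or raise on drift."""
    missing = expected.keys() - row.keys()
    if missing:
        raise SchemaDriftError(f"missing columns: {sorted(missing)}")
    wrong = {
        column: type(row[column]).__name__
        for column, expected_type in expected.items()
        if not isinstance(row[column], expected_type)
    }
    if wrong:
        raise SchemaDriftError(f"unexpected types: {wrong}")
    # Assumed policy: new upstream columns are dropped until someone adds
    # them to the schema on purpose.
    return {column: row[column] for column in expected}
```

Bronze keeps accepting whatever the source sends, so a new upstream column never breaks ingestion. The failure surfaces here, at the boundary where the schema is a promise.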

Gold uses strict schema-on-write with explicit column documentation. Every column in a gold table has a description, a business definition, and an owner. If you can't explain what a column means in business terms, it shouldn't be in gold.

Lineage tracking

Every table should trace back to its inputs. If someone asks "where does the revenue number on the executive dashboard come from?" you should be able to answer: it comes from gold_finance_monthly_revenue, which is built from silver_finance_transactions and silver_finance_exchange_rates, which are built from bronze_payments_transactions and bronze_forex_daily_rates. This is lineage — and it's what makes the entire architecture debuggable.

Most modern transformation frameworks generate lineage automatically from the dependency graph. If yours doesn't, maintain it manually. It's worth the effort. When something breaks at 3 AM and the on-call engineer needs to trace a wrong number back to its source, lineage is the difference between a 20-minute fix and a 4-hour debugging session.
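If you do have to maintain lineage by hand, a dictionary of direct dependencies plus a short traversal goes a long way. The sketch below uses the table names from the example above; which silver table reads from which bronze table is my assumption, since the text only lists them together.

```python
from collections import deque

# Each table maps to the tables it is built from (direct parents only).
LINEAGE = {
    "gold_finance_monthly_revenue": [
        "silver_finance_transactions",
        "silver_finance_exchange_rates",
    ],
    "silver_finance_transactions": ["bronze_payments_transactions"],
    "silver_finance_exchange_rates": ["bronze_forex_daily_rates"],
}


def upstream(table: str, lineage: dict) -> list:
    """Return every ancestor of a table, nearest first (breadth-first)."""
    seen = set()
    ordered = []
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                ordered.append(parent)
                queue.append(parent)
    return ordered
```

Calling upstream("gold_finance_monthly_revenue", LINEAGE) gives the on-call engineer the list of tables to inspect, starting with the closest ones.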

What the directory structure looks like

In a transformation project, I typically structure models like this:

models/
  bronze/
    crm/
    payments/
    marketing/
  silver/
    customers/
    finance/
    products/
  gold/
    marketing_mart/
    finance_mart/
    ml_features/
    operational_mart/

Each directory has its own schema configuration, its own test suite, and its own documentation. The structure mirrors the architecture, which makes navigation intuitive and onboarding fast.

Document your medallion decisions

Here's the thing nobody tells you about medallion architecture: the hardest part isn't building it. It's maintaining it after the person who built it leaves.

Every medallion implementation is full of decisions that aren't visible in the code. Why does the customer deduplication logic in silver use email instead of phone number? Because the CRM source has unreliable phone data — the original engineer discovered this during a three-day debugging session six months ago. Why does the finance mart use a different fiscal calendar than the marketing mart? Because the finance team's reporting period doesn't align with calendar months — they close on the last Friday of each month.

These decisions are critical. They're also invisible. They live in the original engineer's head, in Slack threads that get archived, in meeting notes that nobody re-reads. When that engineer leaves — and they will leave — the next person inherits a system full of choices they don't understand. They'll change the deduplication logic to use phone numbers (it seems more reliable, right?) and break everything. They'll "fix" the fiscal calendar to use standard months and produce wrong financial reports for three months before anyone notices.

The solution is to document your medallion decisions in a structured knowledge base — not as scattered wiki pages or README files, but as a cross-referenced, searchable body of institutional knowledge. For every non-obvious decision in your medallion architecture, record:

What the decision is (e.g., "Customer deduplication uses email as primary key")
Why it was made (e.g., "Phone data from the CRM has 15% duplicate rate due to formatting inconsistencies")
What the alternative was (e.g., "We considered phone-based dedup but found it produced 8% false matches")
When it should be revisited (e.g., "If the CRM migrates to a new platform, re-evaluate phone data quality")
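The four fields above map naturally onto a small structured record. The sketch below shows one possible shape using the deduplication example; DecisionRecord, its field names, and find_decisions are hypothetical, and the storage format (a database, YAML files, a wiki with templates) is up to you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    table: str          # the table the decision applies to
    decision: str       # what the decision is
    rationale: str      # why it was made
    alternative: str    # what else was considered
    revisit_when: str   # the trigger for re-evaluating it


DECISIONS = [
    DecisionRecord(
        table="silver_customers_contacts",
        decision="Customer deduplication uses email as primary key",
        rationale="Phone data from the CRM has 15% duplicate rate due to "
                  "formatting inconsistencies",
        alternative="We considered phone-based dedup but found it produced "
                    "8% false matches",
        revisit_when="If the CRM migrates to a new platform, re-evaluate "
                     "phone data quality",
    ),
]


def find_decisions(records: list, table: str) -> list:
    """Return every recorded decision that applies to the given table."""
    return [record for record in records if record.table == table]
```

Keying records by table name means an engineer about to change the deduplication logic can look up the table first and see why email was chosen.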

I've written about this approach in detail in The Knowledge Base Strategy. The short version: your medallion architecture should have a companion knowledge base that captures the why behind every significant design choice. The architecture is the what. The knowledge base is the why. Both are essential for long-term maintainability.

When the next engineer joins — or when an AI agent needs to understand your system to plan a migration — the knowledge base provides the context that code alone can't. It's the difference between inheriting a system you can maintain and inheriting a system you can only fear.

The bottom line

Medallion architecture is simple. That's its strength and its trap. The simplicity invites cargo-culting — copy the three-layer diagram, create three schemas, call it done. The teams that get real value from medallion are the ones that go beyond the diagram and implement what the pattern actually requires:

Define explicit quality contracts at each layer. Write them down. Test for them on every run. If you can't articulate what a layer guarantees, the layer doesn't exist — it's just a namespace.
Keep silver domain-agnostic. Clean, conform, validate — but don't encode business logic. That belongs in gold, where different consumers can apply their own rules.
Never mutate bronze. It's immutable. It's append-only. It's your ability to recover from every mistake you'll make in silver and gold. Treat it as sacred.
Build multiple gold layers. Different consumers need different models. A shared gold layer is a compromise that serves nobody well.
Don't skip silver. The time you save now gets spent tenfold in debugging and data quality firefighting later.
Don't force the pattern. Streaming, simple pipelines, mesh architectures, and exploratory work all have better patterns. Use medallion where it fits. Ignore it where it doesn't.
Document your decisions. The architecture is the what. The knowledge base is the why. Both are essential for the system to survive its original builder's departure.

Medallion isn't bronze/silver/gold. It's a quality contract that compounds with every layer. Implement the contracts, not just the labels, and you'll have a data supply chain that's reliable, debuggable, and evolvable. Skip the contracts, and you'll have three folders and a false sense of order.

The pattern is simple. Doing it right takes discipline. But the payoff — a data platform where you can trust every number, trace every lineage, and debug every problem — is worth the rigor.
