“We do ETL.” You hear it in almost every data team — and it is often wrong. If every transformation runs as a stored procedure after the raw data has already been loaded, unchanged, into a staging table, then what you have is really ELT. The label hides what is actually happening. The difference between ETL vs. ELT is not decided by the order of the letters, but by where the transformation computes — and whether raw data is loaded first.
TL;DR — what this article shows:
- What E, T and L really are — three logical phases that occur in every data-integration approach. The order of the letters alone decides nothing.
- The two-criteria test — where does the transform compute run, and is raw data loaded first? With it you can classify any pipeline as ETL or ELT.
- In-flight vs. push-down — the actual dividing line, with a T-SQL and a Postgres example.
- This series’ “ETL pattern” classified — architecturally it is persistent-staging ELT (with an honest hybrid caveat at the final load).
- The family — ELT, EtLT, Medallion, Data Vault — and when ETL still remains the right choice.
Prerequisite. A basic understanding of ETL and data integration. This is a conceptual classification article, not a step-by-step tutorial. SQL Server and Postgres serve as example engines; the code snippets are illustrative and make the difference visible — they do not build a runnable schema.
Contents
- ETL vs. ELT: What E, T and L Really Are
- The Test: Two Criteria
- Classified: This Series’ “ETL Pattern” Is ELT
- ELT, EtLT and the Medallion Family
- When ETL, When ELT — the Practical Decision
- FAQ
- Related Articles
ETL vs. ELT: What E, T and L Really Are
Extract, Transform, Load — three letters, three jobs. Pull data from a source, bring it into the desired shape, write it to the target. These three phases are logical and occur in every data-integration approach. An ELT pipeline extracts, transforms and loads, too. What separates ETL and ELT is not the existence of these phases, but their physical arrangement and where they execute.
This is where the first misconception sits: the acronym does not describe the logic, but the execution model. “ETL” means the transformation happens before the load into the target — typically in a dedicated tool engine that processes the data in transit. “ELT” means raw data is loaded first and the transformation runs afterwards in the target or staging engine, usually as SQL.
The second misconception is generational: ETL is seen as “old”, ELT as “modern and cloud”. That is marketing, not an architectural argument. ELT is not an invention of the cloud vendors — a raw load into a staging table followed by SQL transformation is a technique as old as the data warehouse itself. Cloud platforms like Snowflake or BigQuery (managed analytical databases with enormous SQL compute) only made ELT popular, because their engine handles the transformation cheaply and in parallel. The distinction stays architectural, though: it depends on where the compute runs, not on which decade or which platform.
The Test: Two Criteria
Instead of clinging to the acronym, you classify a pipeline with two questions:
- Where does the transform compute run? In a separate pipeline/tool engine that streams the rows through (in-flight) — or as SQL in the target/staging engine once the data is sitting there (push-down)?
- Is raw data loaded first? Does an unchanged raw copy land first in a persistent staging layer (load-raw-then-transform) — or is the row transformed before it is written anywhere (transform-then-load)?
The two answers usually go together: in-flight transformation pairs with transform-then-load (= ETL), push-down with load-raw-then-transform (= ELT). When they diverge — say, light cleansing in the data flow, the rest via SQL after the load — you land in the in-between that shows up later as EtLT.
The two snippets below make the difference visible on the same mini-case: a CSV with customer name, amount and country text should go into a target table, with the amount typed and the country mapped to a key.
In-flight: The ETL Signature
In a graphical ETL tool such as SSIS, the transformation hangs inside the data flow, between source and target. Each row is converted, trimmed and looked up in the tool’s memory — and only the finished row is written to the target:
CSV source
│
├─ Data Conversion : betrag (text -> decimal)
├─ Derived Column : TRIM(name)
├─ Lookup : land_text -> land_id
│ (all transformations run INSIDE the tool data flow, in memory)
▼
OLE DB destination : dwh.kunde <- only here is the row written
Transform before load, compute in the tool engine — that is the classic ETL signature. The data-flow designer is not copy-paste SQL, which is why a diagram stands here instead of a statement.
The in-flight approach has an organizational downside that you see constantly in day-to-day tool work: because the tool creates one data flow per target table, the obvious reflex is to build one package per table — and each package bundles extraction, transformation and loading for that single table. A process then consists of many near-identical packages in a row, each doing the same thing for its table. That is tool-driven thinking, not process-driven.
The set-based counter-design turns the axis: not E+T+L per table, but all tables per phase — first extract everything, then transform and check everything, then load everything. Each phase is a set operation in the DB engine instead of a loop over per-table packages. It is precisely this set-based transform-and-check phase that is the core of the push-down model — and the reason data quality can be checked generically, in one place, instead of scattered across every individual table package.
Push-down: The ELT Signature
Push-down reverses the order: the raw row lands unchanged in a staging table first (columns as text), then a SQL statement transforms it in the engine. In T-SQL:
1: -- 1) Load raw: nothing is transformed in transit
2: INSERT INTO staging.kunde_raw
3: (
4: name
5: ,betrag
6: ,land_text
7: )
8: SELECT
9: name
10: ,betrag
11: ,land_text
12: FROM
13: OPENROWSET(BULK N'kunde.csv', FORMATFILE = N'kunde.fmt') T01(name, betrag, land_text);
14:
15: -- 2) Transform only in the target engine: typing after the load
16: INSERT INTO dwh.kunde
17: (
18: name
19: ,betrag
20: ,land_id
21: )
22: SELECT
23: TRIM(T01.name)
24: ,TRY_CONVERT(decimal(12, 2), T01.betrag) -- invalid becomes NULL, no abort
25: ,T02.land_id
26: FROM
27: staging.kunde_raw T01
28: LEFT JOIN dim.land T02
29: ON
30: T02.land_text = T01.land_text;
The same pattern in Postgres — COPY loads raw into text columns, the transformation follows via SQL:
1: -- 1) Load raw: COPY into text columns, no transformation
2: COPY staging.kunde_raw
3: (
4: name
5: ,betrag
6: ,land_text
7: )
8: FROM '/import/kunde.csv' WITH (FORMAT csv, HEADER true);
9:
10: -- 2) Transform only in the engine: cast + lookup via SQL after the load
11: INSERT INTO dwh.kunde
12: (
13: name
14: ,betrag
15: ,land_id
16: )
17: SELECT
18: TRIM(T01.name)
19: ,T01.betrag::numeric(12, 2) -- secured in production, see DQ series
20: ,T02.land_id
21: FROM
22: staging.kunde_raw T01
23: LEFT JOIN dim.land T02
24: ON
25: T02.land_text = T01.land_text;
Raw first (the L), transformation via SQL in the engine afterwards (the T) — that is the ELT signature. In both snippets the transform compute sits where the data sits; the tool that moved the CSV was only a loader. The typing here is illustratively shortened: in practice the cast is secured (TRY_CONVERT instead of a hard cast, and a checked variant in Postgres) so that an invalid value does not abort the run — which is exactly the subject of the TRY_CONVERT series and the safe type conversion below.
Classified: This Series’ “ETL Pattern” Is ELT
Apply the two-criteria test to the ETL design pattern worked out across this series, and the verdict is unambiguous. The pattern layers the process into work packages with their own database schemas (E0–L2):
- Raw landing in schema E1. Data from text files is first materialized as
nvarchar— no typing during extraction. That is load-raw-then-transform in its purest form. - Typing and data quality in T1. Only afterwards do generic stored procedures convert the values into the target data types and check technical data quality — via SQL, in the engine, after the load; the historized continuation of this data lives in T2.
- Structural transformation in L1. Foreign-key resolution and lookups run as SQL
SELECTwith JOINs over the staged tables; the historization of this stage lives in L2. - SSIS as an EL mover only. The ETL tool moves files and materializes them in E0/E1 — the transformation substance lives in T-SQL.
Both test criteria therefore point to ELT: raw is loaded first, and the transform compute runs as SQL in the engine. More precisely: persistent-staging ELT — every intermediate stage is materialized and kept for audit and restart. And that is exactly why the pattern is robust, restartable and auditable: bad data is isolated at the schema boundaries instead of bringing an in-flight data flow to a halt.
This is not a correction of the cluster’s brand. The existing cluster is rightly called an “ETL process”: the acronym names the three logical phases the process goes through, and those are indisputably present. What is being made precise here is the execution model behind it — and that is ELT. Acronym and execution model are two different levels; both may stand side by side.
Staying honest about the verdict. The pattern is not purely ELT. Its final step — the load into the target system — is ETL-like again depending on the target: into a SQL database it stays push-down (INSERT/UPDATE). But over an API into a non-DB target such as Dynamics 365 (explicitly mentioned in the architecture description), it again needs an ETL tool that translates the rows into API calls. Stated cleanly: a hybrid with an ELT focus — the heavy lifting (typing, DQ, structural transformation) is push-down, the last mile to the non-DB target can be in-flight.
ELT, EtLT and the Medallion Family
Once a pipeline is identified as ELT, it fits into a larger family. The common bracket: load raw first, then transform in several layers (multi-hop, persistent staging).
- Medallion architecture. The Bronze/Silver/Gold build-up common in the cloud world (Databricks, lakehouse) follows the same principle: Bronze = raw landing data, Silver = cleaned/typed, Gold = business-ready. Roughly mapped onto the series pattern: Bronze ≈ E1, Silver ≈ T1/T2, Gold ≈ L1/L2 — with the caveat that “Gold” often additionally means aggregated marts, whereas in the series pattern aggregation is a separate, downstream step. Same pattern, different vocabulary.
- Data Vault. Business key, hash-based delta detection and the Raw/Cleansed/Business layering are closely related — the architecture FAQ already classifies the pattern as a “lighter” Data Vault sibling.
- EtLT. The small “t” stands for a light cleansing already during the load (trimming, encoding, obvious garbage), while the business-level heavy lifting (the big “T”) still happens push-down in the engine. Exactly the case where the two test criteria diverge.
On the tool side, dbt (a SQL-centric transform framework with versioning and tests) is today’s defining ELT tool: it transforms exclusively after the load, in the target engine — push-down by definition. So whoever runs dbt on Snowflake, BigQuery or Postgres is doing ELT, no matter what the project is called internally.
When ETL, When ELT — the Practical Decision
The classification is not an end in itself — it helps with the choice. ELT is the default today when the target or staging is a capable SQL engine and push-down plus auditability are wanted. That holds for every relational data warehouse, for Postgres as much as for SQL Server, and all the more for the cloud analytics platforms whose engine is built precisely for it.
ETL — transformation before the load, in a dedicated engine — remains the right choice in three constellations:
- The target is not a capable SQL engine. If the pipeline writes over an API into a non-DB target (Dynamics 365, a SaaS CRM, a message bus), there is no “engine” to push-down transform into. The transformation has to happen beforehand — the series pattern’s final load is exactly this case.
- Data must be transformed before it lands. Compliance reasons such as PII masking or pseudonymization can require that sensitive raw values never sit persistently in a staging layer at all. Then the (at least masking) transformation has to happen in-flight.
- The transform engine is the integration point. When an ETL tool connects many heterogeneous sources and targets as a central hub and plays to its connector and buffer strengths, it can make sense to bundle the transformation there instead of moving it into a target database.
Where the decision falls concretely between SSIS and pure T-SQL — the micro level of the same question — the article SSIS vs. SQL goes deeper with three decision criteria. Its pragmatic synthesis is, incidentally, a nice piece of evidence for this article: SSIS as an orchestration wrapper, the SQL substance in stored procedures — that is exactly the ELT build, the tool as a mover, the transformation push-down.
One boundary remains: the two-criteria test is deliberately batch-oriented. Continuous and streaming pipelines (change data capture, Kafka, Flink) blur the separation — there the ETL-vs-ELT question becomes its own discussion, which this article does not take up.
FAQ
No. Both contain the same three logical phases (Extract, Transform, Load), but they differ in the execution model: ETL transforms before the load in a separate engine; ELT loads raw first and transforms afterwards via SQL in the target or staging engine. It is an architectural difference, not a renaming.
Yes. If the rows are converted, cleansed and looked up in the tool’s memory and only the finished row is written to the target, that is the classic in-flight signature — transform before load, compute in the tool engine. That is ETL, no matter how modern the tool is.
It depends on how it is built — SSIS is not ETL per se. If the transformation logic sits in data-flow components (Derived Column, Lookup, Data Conversion), it is ETL. But if the package only moves files into a staging table and then calls stored procedures that transform via SQL, it is ELT — same tool, different execution model.
Yes, fully. The two-criteria test is platform-neutral. A COPY into a text staging table followed by SQL transformation is ELT on Postgres — without any cloud. Conversely, an on-prem pipeline with in-flight tool transformation can be classic ETL. ELT is not a cloud feature but a question of where the compute runs.
EtLT is the mixed case: a small “t” for a light cleansing already during the load (trimming, encoding fixes, obvious garbage), while the actual business transformation (the big “T”) still happens push-down in the engine after the load. It describes pipelines where the two test criteria do not fall cleanly together.
Both are variants of ELT with persistent, multi-layered staging. The Medallion layers Bronze/Silver/Gold correspond roughly to the schema layers E1 / T1–T2 / L1–L2 of the series pattern. Data Vault shares the business key, hash-delta detection and the Raw/Cleansed/Business separation, but it is stricter and heavier — the series pattern is the more pragmatic sibling.
Related Articles
ETL design pattern cluster — the worked-through case classified here:
- Data Quality in an ETL Process — the root of the series: catching technical and business errors before they reach the target system.
- Design Pattern // The Architecture of an ETL Process — the E0–L2 layering, the central evidence for this article’s ELT verdict.
- Design Pattern // Logging an ETL Process with T-SQL — the logging layer that makes push-down transformation evaluable.
- Design Pattern // Safe Type Conversion with T-SQL — the
fn_try_convert_*UDFs for the “T” after the load (only available in German for now). - Data Quality // Fundamentals of Type Conversion with T-SQL — CAST, CONVERT, TRY_CAST and TRY_CONVERT compared (only available in German for now).
SSIS-vs-SQL cluster — the same question at the tool micro level:
- SSIS vs. SQL — when SSIS, when pure T-SQL, when both; the closest relative to the ETL-vs-ELT question.
- SSIS vs. SQL: Readability and Maintainability — how much SQL belongs in an SSIS package?
- SSIS vs. SQL: Impersonation — why the SSIS wrapper becomes indispensable for agent jobs.
- SSIS vs. SQL: Source Code Management — why SP diffs are readable and
.dtsxdiffs are not.
The “T” in detail — the TRY_CONVERT series as push-down building blocks: