Design Pattern // The Architecture of an ETL Process — How to Isolate Bad Data Cleanly

A single date string that cannot be parsed, and the entire ETL run aborts. The design pattern for ETL process architecture presented here prevents exactly that: bad data is isolated, not passed along. TL;DR — what this article covers: Prerequisite. Basic familiarity with ETL processes. This is a conceptual article — not a step-by-step tutorial. Root of … Read more

Data Quality in an ETL Process — Catching Technical and Business Errors Before They Reach the Target System

A single value that cannot be converted — a date in the wrong format, a number with the wrong decimal separator — and the entire ETL run aborts. Data quality in an ETL process means catching such errors proactively: identifying, logging and isolating them before they reach the target system. This article is the entry point to … Read more

Design Pattern // Logging an ETL Process with T-SQL — How to Capture Run, Component and Action in Evaluable Log Tables

An ETL process finishes without an exception — but was everything really loaded that should have been? The mere fact that a process did not abort says nothing about whether it actually did what was expected of it. A readable, evaluable log is what turns a gut feeling into a defensible statement. This design pattern … Read more