Checking Data Quality with SQL: A Configurable Framework for Spotting Bad Data Generically
Bad data gives no warning. An age of 200 years, a duplicate customer number, a country code that doesn’t exist: nobody notices any of it in the source system. Only when the ETL run tries to push the rows into the strictly modelled target layer does the load break: on a CHECK constraint, on a UNIQUE index, on a foreign key. Checking …
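To make the three failure modes concrete, here is a minimal sketch of how such problems could be found with plain SQL before the load runs. It is not the framework this article describes; the staging table `stg_customer`, the lookup table `country`, the 0 to 120 age range, and the `RULES` dictionary are all assumptions chosen for illustration. Each rule is just a name and a query that returns the offending rows, which is one simple way to keep checks configurable.

```python
import sqlite3

# Hypothetical staging data; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (code TEXT PRIMARY KEY);
    INSERT INTO country VALUES ('DE'), ('FR');

    CREATE TABLE stg_customer (customer_no INTEGER, age INTEGER, country_code TEXT);
    INSERT INTO stg_customer VALUES
        (1, 34, 'DE'),
        (2, 200, 'FR'),   -- implausible age (would violate a CHECK)
        (3, 51, 'XX'),    -- unknown country code (would violate a foreign key)
        (3, 28, 'DE');    -- duplicate customer number (would violate UNIQUE)
""")

# Each rule maps a name to a query that returns one row per violation.
RULES = {
    "age_out_of_range":
        "SELECT customer_no FROM stg_customer WHERE age NOT BETWEEN 0 AND 120",
    "duplicate_customer_no":
        "SELECT customer_no FROM stg_customer "
        "GROUP BY customer_no HAVING COUNT(*) > 1",
    "unknown_country_code":
        "SELECT s.customer_no FROM stg_customer s "
        "LEFT JOIN country c ON c.code = s.country_code "
        "WHERE c.code IS NULL",
}

def run_checks(connection, rules):
    """Run every rule query and return the number of result rows per rule."""
    return {name: len(connection.execute(sql).fetchall())
            for name, sql in rules.items()}

violations = run_checks(conn, RULES)
for name, count in violations.items():
    print(f"{name}: {count} finding(s)")
```

Because the rules live in data rather than in code, adding a check means adding an entry, not writing a new procedure. Note that the duplicate rule reports one finding per duplicated key, not one per duplicated row.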