Design Pattern // The Architecture of an ETL Process — How to Isolate Bad Data Cleanly

A single date string that cannot be parsed, and the entire ETL run aborts. The design pattern for ETL process architecture presented here prevents exactly that: bad data is isolated, not passed along. TL;DR — what this article covers: Prerequisite. Basic familiarity with ETL processes. This is a conceptual article — not a step-by-step tutorial. Root of … Read more

Data quality in SQL Server // TRY_CONVERT for date, datetime, datetime2 and time done safely

If you’ve ever imported a CSV column with mixed date formats into a datetime column, you know: data quality starts with type conversion. SQL Server leaves you alone with style codes the moment the format strays from the documented ones — TRY_CONVERT handles the documented formats, anything else needs a function of your own. What you’ll take away: Prerequisites: SQL Server 2017+ (for TRY_CONVERT styles 23/126), PostgreSQL 12+ for the … Read more

Data quality in SQL Server // TRY_CONVERT for bit done safely — converting yes/no values

Data quality starts with type conversion — and for bit columns it shows up right at the input value: yes/no information comes out of legacy sources in a wide variety of notations (‘J’, ‘Y’, ‘ON’, ‘1’, ‘x’, ‘-‘, …). SQL Server’s built-in TRY_CONVERT(bit, …) only covers the integer standard plus ‘true’/’false’ — everything else needs a dedicated conversion function. This article describes both: what’s built in, when … Read more

Data quality in SQL Server // TRY_CONVERT for float and real done safely

Data quality with floating-point columns is a discipline of its own — and TRY_CONVERT(float, …) has a few quirks that tend to slip past the import path: an empty string becomes 0 (not NULL), a comma as decimal separator yields NULL, and even after a clean conversion 2 + 3.4 – 3.4 – 2 does not return exactly 0 as a float. This article sorts out the … Read more

Data quality in SQL Server // TRY_CONVERT for money and smallmoney done safely

Anyone who has imported a point-of-sale report with values like ‘1.234,56 €’ from a CSV into a SQL Server database knows the pattern: TRY_CONVERT(money, ‘1,234.56’) yields 1234.5600. Yet TRY_CONVERT(money, ‘1.234,56’) yields NULL. And even when the import runs cleanly: money / 100 * 100 is not necessarily the same as the input value. At a glance — what this article covers: Prerequisite: SQL Server 2017+ (TRIM is used … Read more

Data Quality in an ETL Process — Catching Technical and Business Errors Before They Reach the Target System

A single value that cannot be converted — a date in the wrong format, a number with the wrong decimal separator — and the entire ETL run aborts. Data quality in an ETL process means catching such errors proactively: identifying, logging and isolating them before they reach the target system. This article is the entry point to … Read more

Data quality in SQL Server // TRY_CONVERT for bigint, int, smallint and tinyint done safely

Data quality starts with type conversion — and for integer columns, TRY_CONVERT has a few quirks that tend to slip past the import path: an empty string becomes 0 (not NULL), any decimal or thousands separator yields NULL, and a typed decimal is silently truncated — without rounding. This article sorts out the rules, shows the safe conversion pattern, and adds a … Read more

Data quality in SQL Server // TRY_CONVERT for decimal and numeric done safely

Anyone who has watched a price import turn ‘123.45 €’ into a NULL instead of the expected decimal number knows the drill: TRY_CONVERT(decimal(5, 2), ‘123,45’) returns NULL, because a comma isn’t accepted as a decimal separator. And even with the comma gone, TRY_CONVERT(decimal(5, 2), ‘1234.56’) is also NULL — this time because of one integer digit too many. At a glance: Prerequisite: SQL Server 2017+ (TRIM in … Read more

Design Pattern // Logging an ETL Process with T-SQL — How to Capture Run, Component and Action in Evaluable Log Tables

An ETL process finishes without an exception — but was everything really loaded that should have been? The mere fact that a process did not abort says nothing about whether it actually did what was expected of it. A readable, evaluable log is what turns a gut feeling into a defensible statement. This design pattern … Read more

SSIS vs. SQL: Readability and Maintainability — how much SQL belongs in an SSIS package?

Three ways to model the same ETL task in SSIS. One takes 10 minutes and is straightforward. One takes hours, 40 Data Flow Tasks, and won’t survive the next requirements change. The question “how much SQL belongs in an SSIS package?” decides maintainability, readability, and development speed — not tool loyalty. What you’ll take away: … Read more