Data quality in SQL Server // TRY_CONVERT for date, datetime, datetime2 and time done safely

If you’ve ever imported a CSV column with mixed date formats into a datetime column, you know: data quality starts with type conversion. SQL Server leaves you on your own with style codes the moment the format strays from the documented ones — TRY_CONVERT handles the documented formats; anything else needs a function of your own. What you’ll take away: Prerequisites: SQL Server 2017+ (for TRY_CONVERT styles 23/126), PostgreSQL 12+ for the … Read more
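
To make that boundary concrete, here is a minimal sketch of the built-in behavior (the input values are invented; the expected results in the comments follow the documented style codes):

  -- Documented formats convert; an undocumented or invalid one yields NULL instead of an error.
  SELECT TRY_CONVERT(date,      '2024-02-13', 23);           -- 2024-02-13 (ISO, style 23)
  SELECT TRY_CONVERT(datetime2, '2024-02-13T10:30:00', 126); -- ISO 8601 (style 126)
  SELECT TRY_CONVERT(datetime,  '13.02.2024', 104);          -- German dd.mm.yyyy (style 104)
  SELECT TRY_CONVERT(date,      '2024-02-31', 23);           -- NULL: February 31st does not exist
  SELECT TRY_CONVERT(time,      '25:00:00');                 -- NULL: no such time of day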

Data quality in SQL Server // TRY_CONVERT for bit done safely — converting yes/no values

Data quality starts with type conversion — and for bit columns it shows up right at the input value: yes/no information comes out of legacy sources in a wide variety of notations ('J', 'Y', 'ON', '1', 'x', '-', …). SQL Server’s built-in TRY_CONVERT(bit, …) only covers standard integer values plus 'true'/'false' — everything else needs a dedicated conversion function. This article describes both: what’s built in, when … Read more
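
For a feel of where the built-in coverage ends, a minimal sketch (values invented for illustration):

  -- Built-in: integer values and the strings 'true'/'false' (case-insensitive).
  SELECT TRY_CONVERT(bit, '1');     -- 1
  SELECT TRY_CONVERT(bit, 'true');  -- 1
  SELECT TRY_CONVERT(bit, 'FALSE'); -- 0
  SELECT TRY_CONVERT(bit, 'Y');     -- NULL: legacy notations are not recognized
  SELECT TRY_CONVERT(bit, 'ON');    -- NULL: this is where a dedicated function takes over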

Data quality in SQL Server // TRY_CONVERT for float and real done safely

Data quality with floating-point columns is a discipline of its own — and TRY_CONVERT(float, …) has a few quirks that tend to slip past the import path: an empty string becomes 0 (not NULL), a comma as decimal separator yields NULL, and even after a clean conversion 2 + 3.4 - 3.4 - 2 does not return exactly 0 as a float. This article sorts out the … Read more
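
The three quirks in one minimal sketch (expected results in the comments; the exact remainder of the float expression comes from IEEE 754 double rounding):

  SELECT TRY_CONVERT(float, '');     -- 0, not NULL: the empty string counts as zero
  SELECT TRY_CONVERT(float, '1,5');  -- NULL: comma is not accepted as decimal separator
  SELECT TRY_CONVERT(float, '1.5');  -- 1.5
  -- Even cleanly converted values are binary approximations:
  SELECT CONVERT(float, 2) + CONVERT(float, 3.4) - CONVERT(float, 3.4) - 2;  -- a tiny nonzero remainder (about 4.44E-16), not 0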

TRY_CONVERT // Converting data to money, smallmoney

Anyone who has imported a point-of-sale report with values like '1.234,56 €' from a CSV into a SQL Server database knows the pattern: TRY_CONVERT(money, '1,234.56') yields 1234.5600. Yet TRY_CONVERT(money, '1.234,56') yields NULL. And even when the import runs cleanly: money / 100 * 100 is not necessarily the same as the input value. At a glance — what this article covers: Prerequisite: SQL Server 2017+ (TRIM is used … Read more
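
A minimal sketch of the pattern (the sample values are invented; 123.4567 merely illustrates the rounding effect):

  SELECT TRY_CONVERT(money, '1,234.56');  -- 1234.5600: comma accepted as group separator
  SELECT TRY_CONVERT(money, '1.234,56');  -- NULL: the continental European notation fails
  SELECT TRY_CONVERT(money, '$1234.56');  -- 1234.5600: currency symbols are tolerated
  -- money has four fixed decimal places, so intermediate results lose precision:
  SELECT CONVERT(money, 123.4567) / 100 * 100;  -- no longer 123.4567: the intermediate result was rounded to four decimal places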

Data quality in an ETL process

There is a lot of good literature on the subject of data quality. As a rule, the topic is treated rather theoretically, with impressive examples, and its importance is backed up by theory. However, I have not yet found any concrete instructions on how exactly to implement the checking and handling of bad data in an … Read more

Data quality in SQL Server // TRY_CONVERT for bigint, int, smallint and tinyint done safely

Data quality starts with type conversion — and for integer columns, TRY_CONVERT has a few quirks that tend to slip past the import path: an empty string becomes 0 (not NULL), any decimal or thousands separator yields NULL, and a typed decimal is silently truncated — without rounding. This article sorts out the rules, shows the safe conversion pattern, and adds a … Read more
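
The quirks at a glance, as a minimal sketch (values invented for illustration):

  SELECT TRY_CONVERT(int, '');       -- 0, not NULL: the empty string counts as zero
  SELECT TRY_CONVERT(int, '1,234');  -- NULL: thousands separators are not accepted
  SELECT TRY_CONVERT(int, '12.7');   -- NULL: neither is a decimal point inside a string
  SELECT TRY_CONVERT(int, 12.7);     -- 12: a typed decimal is truncated, not rounded
  SELECT TRY_CONVERT(tinyint, 256);  -- NULL: out of range for tinyint (0 to 255)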

Data quality in SQL Server // TRY_CONVERT for decimal and numeric done safely

Anyone who has watched a price import turn '123.45 €' into a NULL instead of the expected decimal number knows the drill: TRY_CONVERT(decimal(5, 2), '123,45') returns NULL, because a comma isn’t accepted as a decimal separator. And even with the comma gone, TRY_CONVERT(decimal(5, 2), '1234.56') is also NULL — this time because of one integer digit too many. At a glance: Prerequisite: SQL Server 2017+ (TRIM in … Read more
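
A minimal sketch of the failure modes (sample values invented):

  SELECT TRY_CONVERT(decimal(5, 2), '123.45');   -- 123.45
  SELECT TRY_CONVERT(decimal(5, 2), '123,45');   -- NULL: comma is not accepted as decimal separator
  SELECT TRY_CONVERT(decimal(5, 2), '1234.56');  -- NULL: four integer digits exceed precision 5, scale 2
  SELECT TRY_CONVERT(decimal(5, 2), '123.456');  -- 123.46: surplus decimal digits are rounded, not rejected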

Design Pattern // Logging an ETL process with T-SQL

How can you judge whether an ETL process has been successful? The mere fact that a process was not terminated with an exception does not necessarily mean that the process did what was expected of it. Readable and evaluable logging of an ETL process can help here. It enables a reliable assessment of the success of an ETL process.

This article presents a design pattern for logging an ETL process based on stored procedures that insert and update log records in log tables. The aim is to create a readable and evaluable protocol that provides answers to essential questions about the execution of an ETL process:

  • How long does the ETL process take in total?
  • How long does a stored procedure, SSIS package, or other component take to execute?
  • How long does it take to execute a specific SQL statement?
  • How many records were processed by a SQL statement?

Of course, we also want to know whether the process as a whole, a procedure or even a specific SQL statement was executed successfully.

The procedure is actually straightforward: at the beginning of each action, a log record is written, and after the action is completed, it is updated with the status success or failure and, if needed, other helpful information. That’s it! But there is still a little more to it…
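
A minimal sketch of that basic pattern (all object names here, dbo.EtlLog, dbo.EtlLogStart and dbo.EtlLogEnd, are hypothetical stand-ins, not the objects the design pattern in the article actually uses):

  -- Hypothetical log table: one record per action, updated on completion.
  CREATE TABLE dbo.EtlLog (
      LogId        int IDENTITY(1, 1) PRIMARY KEY,
      Action       nvarchar(200)  NOT NULL,
      StartedAt    datetime2      NOT NULL DEFAULT SYSDATETIME(),
      FinishedAt   datetime2      NULL,
      Status       nvarchar(20)   NULL,  -- 'Success' or 'Failure'
      RowsAffected int            NULL,
      ErrorMessage nvarchar(4000) NULL
  );
  GO

  -- Written at the beginning of an action:
  CREATE PROCEDURE dbo.EtlLogStart @Action nvarchar(200), @LogId int OUTPUT
  AS
  BEGIN
      INSERT INTO dbo.EtlLog (Action) VALUES (@Action);
      SET @LogId = SCOPE_IDENTITY();
  END;
  GO

  -- Updated after the action has completed:
  CREATE PROCEDURE dbo.EtlLogEnd
      @LogId int, @Status nvarchar(20),
      @RowsAffected int = NULL, @ErrorMessage nvarchar(4000) = NULL
  AS
  BEGIN
      UPDATE dbo.EtlLog
         SET FinishedAt   = SYSDATETIME(),
             Status       = @Status,
             RowsAffected = @RowsAffected,
             ErrorMessage = @ErrorMessage
       WHERE LogId = @LogId;
  END;
  GO

  -- Usage around a single unit of work:
  DECLARE @LogId int, @Rows int;
  EXEC dbo.EtlLogStart @Action = N'Load DimCustomer', @LogId = @LogId OUTPUT;
  BEGIN TRY
      -- ... the actual work, e.g. an INSERT ... SELECT ...
      SET @Rows = @@ROWCOUNT;
      EXEC dbo.EtlLogEnd @LogId, N'Success', @RowsAffected = @Rows;
  END TRY
  BEGIN CATCH
      EXEC dbo.EtlLogEnd @LogId, N'Failure', @ErrorMessage = ERROR_MESSAGE();
  END CATCH;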

Read more

SSIS vs. SQL – Readability/Maintenance -or- How much SQL do you want to have in your SSIS package?

How this article has developed… SQL Server Integration Services (SSIS) is a very powerful graphical tool set belonging to the Microsoft BI Stack along with SQL Server, SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), etc. It supports a broad variety of data migration, integration and transformation tasks. With that, you will find … Read more

SSIS vs. SQL – Source Code Management

This article in German… Overview SQL Server Integration Services (SSIS) is a very powerful graphical tool set belonging to the Microsoft BI Stack along with SQL Server, SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS), etc. It supports a broad variety of data migration, integration and transformation tasks. With that, you will find … Read more