How can you judge whether an ETL process has been successful? The mere fact that a process was not terminated with an exception does not necessarily mean that the process did what was expected of it. Readable and evaluable logging of an ETL process can help here. It enables a reliable assessment of the success of an ETL process.
This article presents a design pattern for logging an ETL process based on stored procedures that insert and update log records in log tables. The aim is to create a readable and evaluable protocol that provides answers to essential questions about the execution of an ETL process:
- How long does the ETL process take in total?
- How long does a stored procedure, SSIS package, or other component take to execute?
- How long does it take to execute a specific SQL statement?
- How many records were processed by an SQL statement?
Of course, we also want to know whether the process as a whole, a procedure or even a specific SQL statement was executed successfully.
The procedure is actually straightforward: At the beginning of each action, a log data record is written and after the action is completed, it is updated with either the status success or failure and, if necessary, other helpful information. That’s it! But it can still be a little more…
Content
The Result
Let’s start…
Three-tier logging
[LL].[Execution]
[LL].[Component]
[LL].[Trace]
Schema LL
Table descriptions, declaration and data model
[LL].[Execution]
[LL].[Component]
[LL].[Trace]
[LL].[Error]
ETL process execution status
Stored procedures for logging
[LL].[spInsertTrace]
[LL].[spUpdateTrace]
[LL].[spUpdateTraceSuccess]
ExceptionHandling
Example of logging and exception handling in an ETL process
[T2].[spETLProcess]
[T2].[spDoSomething_2_2]
Summary
Related Posts
Download
The result
Even if the creation of a protocol itself is not or should not be a big deal, there is still a lot to write about. But let’s start with the result and put the cart before the horse. The process logging presented here is three-tier and uses the following log tables:
- [LL].[Execution]
- [LL].[Component]
- [LL].[Trace]
In these tables, an ETL process and the associated components and work steps are logged with increasing granularity – read from top to bottom. The ETL process, the components and the individual action are logged with exactly one row.
Of course, the logging of errors is also closely linked to the creation of the logging of the process. The logging approach presented here uses the following table for logging errors:
- [LL].[Error]
The following images show the result of the three-stage logging of a simple, compact but complete ETL process.
In the [LL].[Trace] table, each action is recorded, among other things, with the name of the procedure, the target entity that is processed by the procedure, a short description of what was specifically done and other information, such as the number of affected Records and duration of execution are logged.
The [LL].[Component] table logs the calls to procedures and SSIS packages or – generally speaking – components. Here too, a short description of the component’s task, the target entity and the duration of execution are logged.
At the top level, each execution of the ETL process is logged in the [LL].[Execution] table with exactly one row.
All process logging tables contain two columns [State] and [Success], which store the success or failure of the execution of an action, component, or ETL process.
Let’s start…
After the brief introduction of the result, the following aspects of the procedure need to be clarified:
- Three-tier logging
- Table descrition, declaration and datamodel
- Status of the execution of an ETL process
- Logging stored procedures
- Exception-Handling
- Example for logging and exception handling in an ETL process
Three-tier logging
The procedure involves logging an ETL process in the three tables [LL].[Execution], [LL].[Component] and [LL].[Trace]. Each of the tables is intended for logging specific artifacts. This section introduces the use of each table, its columns, and the code to create it.
[LL].[Execution]
This table records the execution of an ETL process with exactly one record. An ETL process always has an entry function. This can be a stored procedure, an SSIS package, a Talend job, or a SQL Server Agent job. At the beginning of execution of the entry function, a log record is inserted into this table. After all tasks have been successfully processed, this data record is updated with the status success, or in the event of an error with the status error. This table therefore contains an overview of all executions of the ETL process and provides information such as the status and duration of the execution.
[LL].[Component]
In this table, the execution of a component is logged with exactly one row. A component can be a stored procedure, an SSIS package, a Talend job. A component is characterized by the fact that it controls and executes one or more data manipulations or other components. As with the table [LL].[Execution], a log entry is inserted into this table at the beginning of execution and updated with the status success or error upon completion. This table therefore contains a list of the components executed for each execution of the ETL process and provides information such as the status and duration of the execution for each component.
[LL].[Trace]
The name of this table already indicates that the table is intended for a detailed logging of actions of the ETL process. Which actions are logged is a design decision by the developer. However, it is recommended to log at least every INSERT, UPDATE and DELETE statement with its own log entry. In addition to the status fields mentioned above, the table also contains a column that provides the number of affected rows of a data manipulation task, thus giving the developer the opportunity to assess whether the statements did exactly what was expected.
[LL].[Error]
Errors that were detected in an ETL process must of course be logged. A distinction must be made between the logging of exceptions and data errors. The structure of thi
s table is designed for the logging of data errors and contains columns in which each data errors can be logged completely, readably and evaluably. This article focuses on logging an ETL process rather than logging data errors. The logging of the process is closely linked to explicit exception handling. A separate section is dedicated to exception handling and exception logging.
Schema LL
The schema name LL stands for Logging Layer. This schema stores all the log tables and the stored procedures used for logging.
Table descriptions, declaration and data model
The following sections describe the process logging tables and the error logging table. Finally, there is a diagram with the data model for these tables.
[LL].[Execution]
Columns
The table contains the following columns:
Column | Description |
---|---|
Id | The [Id] column is declared as IDENTITY and identifies a log record in this table. |
Process | The logging tables cannot be used by just one ETL process. So that the log data records of different ETL processes can be distinguished from one another, a unique name of the ETL process must be specified in the [Process] column. |
Start End | These two columns indicate the start time and the end of the execution of an ETL process. When the ETL process starts, the current timestamp is saved in the [Start] column. The [End] column is initialized with a NULL at startup. The last command of an ETL process must update the state of the log entry ([State] and [Success] columns). When updated, the current timestamp is saved in the [End] column. A NULL in the [End] column indicates either that the process is still running or that it has aborted with an exception. Times are always stored as UTC time. |
DeltaStart DeltaEnd | These columns determine the time window for a delta load. [DeltaStart] sets the lower time limit and [DeltaEnd] sets the upper one. The time window is automatically determined by the stored procedure [LL].[spInsertExecution], which inserts the log entry into this table. |
User Machine Version | These columns must indicate the context in which the ETL process is executed. The user specified in the [User] column is the user under which the process is executed. This is crucial for authorizations to databases, tables, file shares, etc. The user is determined by automatically using the T-SQL function SUSER_SNAME() when inserting the log entry. The [Machine] column stores the name of the computer on which the ETL process is running. The name is also determined automatically when the log entry is inserted using the T-SQL function HOST_NAME(). The [Version] column takes the version number of the ETL process and may be helpful for troubleshooting after an incorrect deployment of an ETL process. |
State | The [State] column displays the current status of the ETL process. Possible values include processing, warning, success or error. Initially the ETL process is started with [State] = processing. If successful, [State] is updated with warning or success. In case of an error, [State] is updated with error. |
Success | The [Success] column indicates whether the ETL process completed successfully with [State] = warning or success. Initially the ETL process is started with [Success] = 0. If successful, [Success] is updated with 1. |
CreatedOn CreatedBy ModifiedOn ModifiedBy | The [CreatedOn] and [ModifiedOn] columns contain a UTC timestamp. Initially, [CreatedOn] is assigned the current timestamp via a default value and [ModifiedOn] is assigned a NULL. [ModifiedOn] is updated with the current timestamp when the status is updated. The same system applies to the fields [CreatedBy] and [ModifiedBy]. The user under which the process is executed is stored here. |
Declaration
The table is created using the following statement:
CREATE TABLE [LL].[Execution] ( [Id] bigint IDENTITY (1, 1) NOT NULL ,[Process] nvarchar (max) NOT NULL ,[Start] datetime NOT NULL ,[End] datetime NULL ,[DeltaStart] datetime NULL ,[DeltaEnd] datetime NULL ,[User] nvarchar(128) NULL ,[Machine] nvarchar(128) NULL ,[Version] int NULL ,[State] nvarchar(128) NULL ,[Success] bit NULL ,[CreatedOn] datetime CONSTRAINT [DF_LL_Execution_CreatedOn] DEFAULT (GETUTCDATE()) NOT NULL ,[CreatedBy] nvarchar(100) CONSTRAINT [DF_LL_Execution_CreatedBy] DEFAULT (SUSER_SNAME()) NOT NULL ,[ModifiedOn] datetime NULL ,[ModifiedBy] nvarchar(100) NULL ,CONSTRAINT [PK_LL_Execution] PRIMARY KEY CLUSTERED ([Id] ASC) ); GO CREATE TRIGGER [LL].[TR_LL_Execution_Update] ON [LL].[Execution] FOR UPDATE AS BEGIN SET NOCOUNT ON UPDATE [LL].[Execution] SET [ModifiedOn] = GETUTCDATE() ,[ModifiedBy] = SUSER_SNAME() FROM [LL].[Execution] INNER JOIN inserted ON inserted.[ID] = [LL].[Execution].[ID]; END; GO
[LL].[Component]
Columns
The table contains the following columns:
Column | Descritpion |
---|---|
Id | The [Id] column is declared as IDENTITY and identifies a log record in this table. |
ExecutionId | Using this foreign key on the table [LL].[Execution], log entries that are related to a specific execution of the ETL process can be identified. |
Source | The text in this column indicates the type of component that wrote the log record. Values used are, for example, T-SQL or SSIS. |
Component | This column holds the name of the component that wrote the entry. For procedures, this is the procedure name, for other components, for example, the file name of the component. |
Version | If a version number is available for the component, it can and should be logged in this column. |
Entity | The name of the target table or – to put it more generally – the entity that is processed by the component must be specified in this column. The name can be freely chosen. It is recommended to choose the name systematically in order to facilitate a systematic evaluation of the protocol. |
Step | A short description of the component’s task must be provided in the column [Step]. This description should also follow a strict system to enable easy evaluation of the protocol. |
FileId | The [FileId] column is used for the processing of files by the component. It is not important for describing the procedure presented here. |
Description | An additional description of the component’s task can be inserted in this column. |
State | The [State] column displays the current status of the component’s execution. Possible values include processing, warning, success or error. Initially the component is started with [State] equal to processing. If successful, [State] is updated with warning or success. In case of an error, [State] is updated with error. |
Success | The column [Success] indicates whether the component ended successfully with [State] equal to warning or success. Initially the component is started with [Success] equal to 0. If successful, [Success] is updated with 1. |
CreatedOn CreatedBy ModifiedOn ModifiedBy | See table [LL].[Execution]. |
Declaration
The table is created using the following statement:
CREATE TABLE [LL].[Component] ( [Id] bigint IDENTITY (1, 1) NOT NULL ,[ExecutionId] bigint NOT NULL ,[Source] nvarchar (5) NOT NULL ,[Component] nvarchar (128) NOT NULL ,[Version] int NULL ,[Entity] nvarchar (128) NOT NULL ,[Step] nvarchar (MAX) NOT NULL ,[Description] nvarchar (MAX) NULL ,[FileId] bigint NULL ,[State] nvarchar (128) NULL ,[Success] bit NULL ,[CreatedOn] datetime CONSTRAINT [DF_LL_Execution_CreatedOn] DEFAULT (GETUTCDATE()) NOT NULL ,[CreatedBy] nvarchar (100) CONSTRAINT [DF_LL_Execution_CreatedBy] DEFAULT (SUSER_SNAME()) NOT NULL ,[ModifiedOn] datetime NULL ,[ModifiedBy] nvarchar (100) NULL ,CONSTRAINT [PK_LL_Component] PRIMARY KEY CLUSTERED ([Id] ASC), ,CONSTRAINT [FK_LL_Component_ExecutionId] FOREIGN KEY ([ExecutionId]) REFERENCES [LL].[Execution] ([Id]) ); GO CREATE TRIGGER [LL].[TR_LL_Component_Update] ON [LL].[Component] FOR UPDATE AS BEGIN SET NOCOUNT ON UPDATE [LL].[Component] SET [ModifiedOn] = GETUTCDATE() ,[ModifiedBy] = SUSER_SNAME() FROM [LL].[Component] INNER JOIN inserted ON inserted.[ID] = [LL].[Component].[ID]; END; GO
[LL].[Trace]
Columns
The table contains the following columns:
Column | Description |
---|---|
Id | The [Id] column is declared as IDENTITY and identifies a log record in this table. |
ExecutionId | Using this foreign key to table [LL].[Execution], log entries that are related to a specific execution of the ETL process can be identified. |
ComponentId | Using this foreign key to table [LL].[Component], log entries that are related to a specific execution of a component can be identified. |
Source | The text in this column indicates the type of component that wrote the log record. Values used are, for example, T-SQL or SSIS. |
Component | This column holds the name of the component that wrote the entry. For procedures, this is the procedure name, for other components, for example, the file name of the component. |
Task | The stored procedures can also be used for logging in an SSIS package or other type of component. This column should store the name of the task in the component this log entry is related to. |
Entity | The name of the target table or – to put it more generally – the entity that is processed by the component must be specified in this column. The name can be freely chosen. It is recommended to choose the name systematically in order to facilitate a systematic evaluation of the protocol. |
Step | A short description of the component’s task must be provided in the column [Step]. This description should also follow a strict system to enable easy evaluation of the protocol. |
Description | An additional description of the component’s task can be inserted in this column. |
FileId | Column [FileId] is used for the processing of files by the component. It is not important for the description of the procedure presented here. |
Action | In this column, the action carried out must be sepcified with a keyword. Frequently used texts include insert, update, delete, copy, etc. The same applies here: a strict system in the use of keywords makes it easier to evaluate the protocol. |
AffectedRows | If the action carried out inserts, updates or deletes data, the number of affected rows must be saved here. The number of affected rows can be queried, for example, using the T-SQL system function @@ROWCOUNT. |
State Success CreatedOn CreatedBy ModifiedOn ModifiedBy | See table [LL].[Execution]. |
Declaration
The table is created using the following statement:
CREATE TABLE [LL].[Trace] ( [Id] bigint IDENTITY (1, 1) NOT NULL ,[ExecutionId] bigint NOT NULL ,[ComponentId] bigint NOT NULL ,[Source] nvarchar(5) NOT NULL ,[Component] nvarchar(128) NOT NULL ,[Task] nvarchar(128) NULL ,[Entity] nvarchar(128) NULL ,[Step] nvarchar(MAX) NOT NULL ,[Description] nvarchar(MAX) NULL ,[FileId] bigint NULL ,[Action] nvarchar(100) NULL ,[AffectedRows] int NULL ,[State] nvarchar(100) NOT NULL ,[Success] bit NOT NULL ,[CreatedOn] datetime CONSTRAINT [DF_LL_Trace_CreatedOn] DEFAULT (GETUTCDATE()) NOT NULL ,[CreatedBy] nvarchar(100) CONSTRAINT [DF_LL_Trace_CreatedBy] DEFAULT (SUSER_SNAME()) NOT NULL ,[ModifiedOn] datetime NULL ,[ModifiedBy] nvarchar(128) NULL ,CONSTRAINT [PK_LL_Trace] PRIMARY KEY CLUSTERED ([Id] ASC) ,CONSTRAINT [FK_LL_Trace_ComponentId] FOREIGN KEY ([ComponentId]) REFERENCES [LL].[Component] ([Id]) ,CONSTRAINT [FK_LL_Trace_ExecutionId] FOREIGN KEY ([ExecutionId]) REFERENCES [LL].[Execution] ([Id]) ); GO CREATE TRIGGER [LL].[TR_LL_Trace_Update] ON [LL].[Trace] FOR UPDATE AS BEGIN SET NOCOUNT ON UPDATE [LL].[Trace] SET [ModifiedOn] = GETUTCDATE() ,[ModifiedBy] = SUSER_SNAME() FROM [LL].[Trace] INNER JOIN inserted ON inserted.[Id] = [LL].[Trace].[Id]; END; GO
[LL].[Error]
As mentioned above, the structure of this table is designed for logging data errors, which is outside the scope of this article. However, the table is also used for logging exceptions. The description of the columns only takes into account those columns that are required for logging an exception.
Columns
Column | Description |
---|---|
Id | The [Id] column is declared as IDENTITY and identifies a log record in this table. |
ExecutionId | Using this foreign key to table [LL].[Execution], log entries that are related to a specific execution of the ETL process can be identified. |
ComponentId | Using this foreign key to table [LL].[Component], log entries that are related to a specific execution of a component can be identified. However, the foreign key is not implemented. This is to allow logging entries in this table in absence of a log entry in table [LL].[Component]. |
TraceId | Using this foreign key to table [LL].[Trace], log entries that are related to a specific execution of a component can be identified. However, the foreign key is not implemented. This is to allow logging entries in this table in absence of a log entry in table [LL].[Trace]. |
ErrorType | The severity of the error is marked in this column. Possible values in this column are: E = Error, W = Warning and I = Information. |
Source Component Task | Für diese Spalten gelten die gleichen Ausführung wie bei der Tabelle [LL].[Trace]. |
Step SchemaName TableName FileId ID1Value ID1ColumnName ID2Value ID2ColumnName ID3Value ID3ColumnName ErrorValue ErrorColumnName FileName | These columns are not relevant for logging an exception. |
Description | The error message is to be saved in this column. In case of an exception thrown in a stored procedure, the error message can be queried using the T-SQL BuiltIn function ERROR_MESSAGE. |
Number | This column is used to be saved the error number. In case of an exception thrown in a stored procedure, the error number can be queried using the T-SQL BuiltIn function ERROR_NUMBER. |
Line | In case of an exception thrown in a stored procedure, the line wihtin the script or stored procedure where the exception was thrown can be queried using the T-SQL BuiltIn function ERROR_LINE. |
State | The status code of the exception can be queried using the T-SQL BuiltIn function ERROR_STATE and should be written to this column.. |
CreatedOn CreatedBy | The [CreatedOn] column contains a UTC timestamp. Initially, [CreatedOn] is assigned the current timestamp via a default value. The [CreatedBy] column stores the user under which the error was logged. |
Declaration
CREATE TABLE [LL].[Error] ( [Id] bigint IDENTITY (1, 1) NOT NULL ,[ExecutionId] bigint NOT NULL ,[ComponentId] bigint NULL ,[TraceId] bigint NULL ,[ErrorType] char (1) NOT NULL ,[Source] nvarchar(5) NOT NULL ,[Component] nvarchar(128) NOT NULL ,[TaskName] nvarchar(128) NULL ,[Entity] nvarchar(128) NULL ,[Step] nvarchar(max) NULL ,[SchemaName] nvarchar(128) NULL ,[TableName] nvarchar(128) NULL ,[FileId] bigint NULL ,[ID1Value] nvarchar(max) NULL ,[ID1ColumnName] nvarchar(128) NULL ,[ID2Value] nvarchar(max) NULL ,[ID2ColumnName] nvarchar(128) NULL ,[ID3Value] nvarchar(max) NULL ,[ID3ColumnName] nvarchar(128) NULL ,[ErrorValue] nvarchar(max) NULL ,[ErrorColumnName] nvarchar(128) NULL ,[FileName] nvarchar(128) NULL ,[Description] nvarchar(max) NULL ,[Number] int NULL ,[Line] int NULL ,[State] nvarchar(max) NULL ,[CreatedOn] datetime CONSTRAINT [DF_LL_Error_CreatedOn] DEFAULT (GETUTCDATE() ) NOT NULL ,[CreatedBy] nvarchar(100) CONSTRAINT [DF_LL_Error_CreatedBy] DEFAULT (SUSER_SNAME()) NOT NULL ,CONSTRAINT [PK_LL_Error] PRIMARY KEY CLUSTERED ([Id] ASC) ,CONSTRAINT [FK_LL_Error_ExecutionId] FOREIGN KEY ([ExecutionId]) REFERENCES [LL].[Execution] ([Id]) );# GO
Data model
ETL process execution status
All process logging tables have two columns [State] and [Success], which store the current status of the ETL process, the execution of a component or a concrete action – for example an INSERT, UPDTAE or DELETE.
The current status is stated in the [State] column with either processing, warning or error. The success is stated with [Success] = 1. The current status of the process can only be determined by combining both status values. The following combinations of status values are permitted:
State | Success | Description |
---|---|---|
processing | 0 | The process is currently running – or – The process terminated with an exception, without updating this column. |
warning | 0 | The process is currently running and a warning has been logged in the error table [LL].[Error]. or – The process terminated with an exception, without updating this column. |
warning | 1 | The process was executed “successfully”. However, a warning was logged in the error table [LL].[Error]. |
success | 1 | The process was executed successfully. |
error | 0 | The process has thrown an exception that caused the process to be ended. The exception was catched and logged in table [LL].[Error]. |
Other combinations of status values are not permitted. The procedures used for logging will throw an exception if an invalid combination is passed.
Stored procedures for logging
The following procedures (among others) are available for inserting log entries into the above tables:
Table | Stored Procedure |
---|---|
[LL].[Execution] | [LL].[spInsertExecution] [LL].[spUpdateExecution] |
[LL].[Component] | [LL].[spUpdateComponent] [LL].[spUpdateComponent] [LL].[spUpdateComponentSuccess] [LL].[spUpdateComponentWarning] [LL].[spUpdateComponentError] |
[LL].[Trace] | [LL].[spInsertTrace] [LL].[spUpdateTrace] [LL].[spUpdateTraceSuccess] [LL].[spUpdateTraceWarning] [LL].[spUpdateTraceError] |
[LL].[Error] | [LL].[spInsertError] [LL].[spInsertErrorException] |
Essentially, these procedures perform an INSERT or UPDATE on the log tables. The values to be logged are passed to the procedures as parameters. The procedures check the parameters passed and raise an exception in the case of invalid parameters.
The following three code examples show the procedures [LL].[spInsertTrace], [LL].[spUpdateTrace] and [LL].[spUpdateTraceSuccess]. The remaining procedures are developed in the same fassion. The code for all procedures can be downloaded via this link.
[LL].[spInsertTrace]
-- -------------------------------------------------------------------------------- -- Parameters -- @p_executionId AS int -- Execution Id of the current execution -- @p_componentId AS int -- Each procedure call gets a unique id that allows identifying all messages -- that will be written by the procedure call -- @p_traceId AS int OUT -- Returns the Id of the new row in [LL].[Trace] -- @p_source AS nvarchar(5) -- Type of the calling source system (SSIS, T-SQL, ...) -- @p_component AS nvarchar(128) -- Name of the calling SSIS-Package -- @p_task AS nvarchar(128) = NULL -- SSIS-Task name the log entry refers to. -- @p_entity AS nvarchar(128) = NULL -- Entity name the log entry refers to. -- @p_step AS nvarchar(max) -- Description of the task in the calling object that will be logged -- @p_description AS nvarchar(max) = NULL -- Additional description of the task in the calling object that will be -- logged -- @p_fileId AS int = NULL -- Foreign Key to [LL].[FileList].[Id] -- @p_action AS nvarchar(max) = NULL -- Specifiy any action that will be logged by this procedure call like -- Insert, Delete, Update, ... -- @p_affectedRows AS nvarchar(max) = NULL -- Specifiy the number of rows/objects that were inserted, deleted, updated, ... -- @p_state AS nvarchar(100) -- State of the current task (processing, success, error, warning) -- @p_success AS bit -- Specifies whether the calling procedure succeded -- 0 = processing, warning, error -- 1 = success -- -------------------------------------------------------------------------------- CREATE PROCEDURE [LL].[spInsertTrace] @p_executionId AS int ,@p_componentId AS int ,@p_traceId AS int OUT ,@p_source AS nvarchar(5) ,@p_component AS nvarchar(128) ,@p_task AS nvarchar(128) ,@p_entity AS nvarchar(128) ,@p_step AS nvarchar(max) ,@p_description AS nvarchar(max) ,@p_fileId AS bigint = NULL ,@p_action AS nvarchar(100) = NULL ,@p_affectedRows AS int ,@p_state AS nvarchar(100) ,@p_success AS bit AS BEGIN SET NOCOUNT ON; -- -------------------------------------------------------------------------------- -- Declare variables -- -------------------------------------------------------------------------------- DECLARE @component AS nvarchar(128); DECLARE @table AS table([Id] int); DECLARE @message AS nvarchar(max); -- -------------------------------------------------------------------------------- -- Initialize variables -- -------------------------------------------------------------------------------- SET @component = OBJECT_SCHEMA_NAME(@@PROCID) + N'.' + OBJECT_NAME(@@PROCID); -- -------------------------------------------------------------------------------- -- Workload -- -------------------------------------------------------------------------------- BEGIN TRY -- -------------------------------------------------------------------------------- -- Check parameters -- -------------------------------------------------------------------------------- BEGIN -- -------------------------------------------------------------------------------- -- Check @p_executionId -- -------------------------------------------------------------------------------- IF (@p_executionId IS NULL) BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_ExecutionId'' is NULL.', @component; RETURN 0; END; -- -------------------------------------------------------------------------------- -- Check @p_componentId -- -------------------------------------------------------------------------------- IF (@p_componentId IS NULL) BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_ComponentId'' is NULL.', @component; RETURN 0; END; -- -------------------------------------------------------------------------------- -- Check @p_source -- -------------------------------------------------------------------------------- IF ([dbo].[fnIsNullOrEmpty](@p_source, 1) <> 0 ) BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_source'' is either NULL or an empty string.', @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check @p_component -- -------------------------------------------------------------------------------- IF ([dbo].[fnIsNullOrEmpty](@p_component, 1) <> 0 ) BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_component'' is either NULL or an empty string.', @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check @p_step -- -------------------------------------------------------------------------------- IF ([dbo].[fnIsNullOrEmpty](@p_step, 1) <> 0) BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_step'' is either NULL or an empty string.', @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check @p_state -- -------------------------------------------------------------------------------- IF ([dbo].[fnIsNullOrEmpty](@p_state, 1) <> 0) BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_state'' is either NULL or an empty string', @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check @p_success -- -------------------------------------------------------------------------------- IF @p_success IS NULL BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_success'' is either NULL.', @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check combination of @p_success and @p_state -- -------------------------------------------------------------------------------- IF (@p_success = 0 AND @p_state IN ('success')) OR (@p_success = 1 AND @p_state IN ('processing', 'error')) BEGIN SET @message = CONCAT('Invalid state ''', @p_state, ''' for p_success = ''', CAST(@p_success AS nvarchar(100)),'''.'); EXEC [dbo].[spRaiseError] @message, @component; RETURN 1; END; END; -- -------------------------------------------------------------------------------- -- Insert trace into [LL].[Trace] -- -------------------------------------------------------------------------------- INSERT INTO [LL].[Trace] ( [ExecutionId] ,[ComponentId] ,[Source] ,[Component] ,[Task] ,[Entity] ,[Step] ,[Description] ,[FileId] ,[Action] ,[AffectedRows] ,[State] ,[Success] ) OUTPUT Inserted.Id INTO @table VALUES ( @p_executionId ,@p_componentId ,@p_source ,@p_component ,@p_task ,@p_entity ,@p_step ,CASE WHEN @p_description IS NULL OR DATALENGTH(@p_description) = 0 THEN NULL ELSE @p_description END ,@p_fileId ,@p_action ,@p_affectedRows ,@p_state ,@p_success ); SELECT @p_traceId = [Id] FROM @table; RETURN 0; END TRY BEGIN CATCH THROW; END CATCH; END; -- [LL].[spInsertTrace] -- DECLARE @ExecutionId AS int; -- DECLARE @ComponentId AS int; -- DECLARE @traceId AS int; -- DECLARE @source AS nvarchar(5); -- DECLARE @component AS nvarchar(128); -- DECLARE @entity AS nvarchar(128); -- DECLARE @step AS nvarchar(max); -- DECLARE @description AS nvarchar(max); -- DECLARE @fileId AS bigint -- DECLARE @action AS nvarchar(100) = NULL -- DECLARE @affectedRows AS int -- DECLARE @state AS nvarchar(100); -- DECLARE @success AS bit; -- -- SET @ExecutionId = 1; -- SET @ComponentId = 1; -- SET @source = 'T-SQL'; -- SET @component = 'test script'; -- SET @entity = '[LL].[spInsertTrace]'; -- SET @step = '[LL].[spInsertTrace]'; -- SET @description = 'none'; -- SET @fileId = 456; -- SET @action = 'Insert'; -- SET @affectedRows = 55; -- SET @state = 'processing'; -- SET @success = 0; -- -- EXEC [LL].[spInsertTrace] -- @p_executionId = @ExecutionId -- ,@p_componentId = @ComponentId -- ,@p_traceId = @traceId OUTPUT -- ,@p_source = @source -- ,@p_component = @component -- ,@p_entity = @entity -- ,@p_step = @step -- ,@p_description = @description -- ,@p_fileId = @fileId -- ,@p_action = @action -- ,@p_affectedRows = @affectedRows -- ,@p_state = @state -- ,@p_success = @success; -- -- SELECT @traceId; -- SELECT * FROM [LL].[Trace] WHERE [Id] = @traceId;
[LL].[spUpdateTrace]
-- -------------------------------------------------------------------------------- -- Parameters : -- @p_traceId AS int -- ID of the row in [LL].[Trace], that is to be updated. -- @p_description AS nvarchar(max) -- Additional description of the task in the calling object that will be logged -- @p_action AS nvarchar(max) = NULL -- Specifiy any action that will be logged by this procedure call like Insert, Delete, Update, ... -- @p_affectedRows AS nvarchar(max) = NULL -- Specifiy the number of rows/objects that were inserted, deleted, updated, ... -- @p_state AS nvarchar(100) -- State of the current task (processing, success, error, warning) -- @p_success AS bit -- Specifies the success state of the execution -- 0 = @p_state > processing, warning, error -- 1 = @p_state > success, warning -- -------------------------------------------------------------------------------- ALTER PROCEDURE [LL].[spUpdateTrace] @p_traceId AS int ,@p_description AS nvarchar(max) ,@p_action AS nvarchar(100) = NULL ,@p_affectedRows AS int ,@p_state AS nvarchar(100) ,@p_success AS bit AS BEGIN SET NOCOUNT ON; -- -------------------------------------------------------------------------------- -- Declare variables -- -------------------------------------------------------------------------------- DECLARE @component AS nvarchar(128); DECLARE @tempuid AS int; DECLARE @message AS nvarchar(max); -- -------------------------------------------------------------------------------- -- Initialize variables -- -------------------------------------------------------------------------------- SET @component = OBJECT_SCHEMA_NAME(@@PROCID) + N'.' + OBJECT_NAME(@@PROCID); -- -------------------------------------------------------------------------------- -- Workload -- -------------------------------------------------------------------------------- BEGIN TRY -- -------------------------------------------------------------------------------- -- Check parameters -- -------------------------------------------------------------------------------- BEGIN -- -------------------------------------------------------------------------------- -- Check @p_traceId -- -------------------------------------------------------------------------------- IF (@p_traceId IS NULL) BEGIN EXEC [dbo].[spRaiseError] N'The parameter ''p_traceid'' is NULL.', @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check @p_success -- -------------------------------------------------------------------------------- IF (@p_success IS NULL) BEGIN SET @message = 'The parameter ''p_success'' is NULL.'; EXEC [dbo].[spRaiseError] @message, @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check combination of @p_success and @p_state -- -------------------------------------------------------------------------------- IF (@p_success = 0 AND @p_state IN (N'success')) OR (@p_success = 1 AND @p_state IN (N'processing', N'error')) BEGIN SET @message = CONCAT(N'Invalid state ''', @p_state, N''' for p_success = ''', CAST(@p_success AS nvarchar(100)),'''.'); EXEC [dbo].[spRaiseError] @message, @component; RETURN 1; END; -- -------------------------------------------------------------------------------- -- Check whether an trace log does exist in [LL].[Trace] with -- [ID] = @p_traceId -- -------------------------------------------------------------------------------- SELECT @tempuid = [Id] FROM [LL].[Trace] WHERE [Id] = @p_traceId; IF (@tempuid IS NULL) BEGIN SET @message = N'A record with [ID] = ''' + CAST(@p_traceId AS nvarchar(max)) + N''' could not be found.'; EXEC [dbo].[spRaiseError] @message, @component; RETURN 1; END; END; -- -------------------------------------------------------------------------------- -- Update trace log in [LL].[Trace] -- -------------------------------------------------------------------------------- UPDATE [LL].[Trace] SET [Description] = CASE WHEN (@p_description IS NULL OR DATALENGTH(@p_description) = 0) AND ([Description] IS NULL OR DATALENGTH([Description]) = 0) THEN NULL ELSE @p_description END ,[Action] = @p_action ,[AffectedRows] = @p_affectedRows ,[State] = @p_state ,[Success] = @p_success WHERE [Id] = @p_traceId; RETURN 0; END TRY BEGIN CATCH THROW; END CATCH; END; -- [dbo].[spUpdateTrace] -- EXEC [LL].[spUpdateTrace] 1, N'process successfully finished', N'Insert', 123, N'success', 1;
[LL].[spUpdateTraceSuccess]
-- -------------------------------------------------------------------------------- -- Parameters : -- @p_traceId AS int -- ID of the row in [LL].[Trace], that is to be updated. -- @p_description AS nvarchar(max) -- Additional description of the task in the calling object that will be logged -- @p_action AS nvarchar(max) = NULL -- Specifiy any action that will be logged by this procedure call like Insert, Delete, Update, ... -- @p_affectedRows AS nvarchar(max) = NULL -- Specifiy the number of rows/objects that were inserted, deleted, updated, ... -- -------------------------------------------------------------------------------- ALTER PROCEDURE [LL].[spUpdateTraceSuccess] @p_traceId AS int ,@p_description AS nvarchar(max) ,@p_action AS nvarchar(100) = NULL ,@p_affectedRows AS int AS BEGIN SET NOCOUNT ON; EXEC [LL].[spUpdateTrace] @p_traceId ,@p_description ,@p_action ,@p_affectedRows ,'success' ,1; END; -- [LL].[spUpdateTraceSuccess] -- EXEC [LL].[spUpdateTraceSuccess] 1, 'description', 'Insert', 11;
Exception Handling
To ensure that an ETL process terminates in an orderly manner and is not terminated without appropriate logging, explicit exception handling is required. In this case, orderly means that the process catches a thrown exception and updates the related log entries in the tables [LL].[Execution], [LL].[Component] and [LL].[Trace] – if applicable – with [Status] = error and [Success] = 0 and the exception is logged in the table [LL].[Error]. The following diagram shows the basic procedure of exception handling:
Exception handling is required at least at the top level of the execution of an ETL process, which is logged in the [LL].[Execution] table.
The following code example shows an exception handling including logging in the tables [LL].[Execution] and [LL].[Error] according to the diagram above:
-- Execution logging variables DECLARE @executionId AS int; DECLARE @processName AS varchar(max); DECLARE @version AS int; -- Error logging variables DECLARE @componentId AS int; DECLARE @traceId AS int; DECLARE @source AS nvarchar(5); DECLARE @component AS nvarchar(128); DECLARE @task AS nvarchar(128); DECLARE @entity AS nvarchar(128); DECLARE @step AS nvarchar(128); DECLARE @state AS nvarchar(128); DECLARE @success AS bit; -- Exception variables DECLARE @error_message AS nvarchar(max); DECLARE @error_number AS int; DECLARE @error_line AS int; DECLARE @error_state AS nvarchar(max); BEGIN TRY -- Initialize execution logging variables SET @processName = 'Name of ETL-Process'; SET @version = 123; -- Initialize error variables SET @componentId = NULL; SET @traceId = NULL; SET @source = 'sql'; SET @component = 'Procedure Name'; SET @task = NULL; SET @entity = 'Any Entity'; SET @step = 'Do something'; EXEC [LL].[spInsertExecution] @executionId OUTPUT, @processName, @version; -- Do something RAISERROR('Any Exception', 15, 1); SET @state = 'success'; SET @success = 1; EXEC [LL].[spUpdateExecution] @executionId, @state, @success; END TRY BEGIN CATCH SET @error_message = ERROR_MESSAGE(); SET @error_number = ERROR_NUMBER(); SET @error_line = ERROR_LINE(); SET @error_state = ERROR_STATE(); SET @state = 'error'; SET @success = 0; IF @executionId IS NOT NULL BEGIN EXEC [LL].[spInsertErrorException] @executionId, @componentId, @traceId, @source, @component, @task, @entity, @step, @error_number, @error_message, @error_line, @error_state; EXEC [LL].[spUpdateExecution] @executionId, @state, @success; END; THROW; END CATCH
Example of logging and exception handling in an ETL process
After these explanations, the following diagram shows the exception handling of a simple, compact but complete ETL process:
This example shows logging an ETL process developed exclusively with and consisting of 5 stored procedures. The entry procedure [T2].[spETLProcess] calls the two procedures [T2].[spDoSomething_1] and [T2].[spDoSomething_2]. The first simulates and logs an INSERT, UPDATE and DELETE statement, while the second procedure executes two additional procedures [T2].[spDoSomething_1_1] and [T2].[spDoSomething_1_2]. These procedures also simulate and log an INSERT, UPDATE and DELETE statement. During the execution of the DELETE statement in the second procedure, an exception is thrown, which leads to an orderly termination of the ETL process with [State] = error and [Success] = 0.
The example code for creating a database with the required stored procedures for logging and the ETL process procedures mentioned in the diagram from the T2 schema can be downloaded from this link.
Showing all procedures would lead to repetition. Therefore, only the entry procedure [T2].[spETLProcess] and the procedure [T2].[spDoSomething_2_2] are listed here.
[T2].[spETLProcess]
CREATE PROCEDURE [T2].[spETLProcess] AS BEGIN SET NOCOUNT ON; -- -------------------------------------------------------------------------------- -- Declare variables -- -------------------------------------------------------------------------------- -- Error Variables DECLARE @error_message AS nvarchar(max); DECLARE @error_number AS int; DECLARE @error_line AS int; DECLARE @error_state AS nvarchar(max); -- Logging Variables DECLARE @component AS nvarchar(128); DECLARE @task AS nvarchar(128); DECLARE @schema AS nvarchar(128); DECLARE @table AS nvarchar(128); DECLARE @source AS nvarchar(5); DECLARE @step AS nvarchar(max); DECLARE @entity AS nvarchar(max); DECLARE @message AS nvarchar(max); DECLARE @traceId AS int; DECLARE @componentId AS int; DECLARE @executionId AS int; DECLARE @description AS nvarchar(max); DECLARE @affectedrows AS int; DECLARE @action AS varchar(100); DECLARE @state AS varchar(100); DECLARE @success AS int; -- -------------------------------------------------------------------------------- -- Initialize variables -- -------------------------------------------------------------------------------- -- Logging SET @message = NULL; SET @description = NULL; SET @affectedrows = 0; SET @component = OBJECT_SCHEMA_NAME(@@PROCID) + N'.' + OBJECT_NAME(@@PROCID); SET @source = N'T-SQL'; SET @entity = N'[].[ ]'; -- -------------------------------------------------------------------------------- -- Workload -- -------------------------------------------------------------------------------- BEGIN TRY -- -------------------------------------------------------------------------------- -- Check parameters -- -------------------------------------------------------------------------------- EXEC [LL].[spInsertExecution] @executionId OUTPUT, N'ETL process', 123; -- -------------------------------------------------------------------------------- -- Insert component log -- -------------------------------------------------------------------------------- SET @step = N'Orchestrate ETL process'; SET @description = ''; EXEC [LL].[spInsertComponent] @executionId, @componentId OUTPUT, @source, @component, NULL, @entity, @step, @description; -- -------------------------------------------------------------------------------- -- Execute Procedure [T2].[spDoSomething_1] -- -------------------------------------------------------------------------------- SET @task = NULL; SET @step = N'Execute [T2].[spDoSomething_1]'; SET @action = 'execute'; SET @description = NULL; SET @state = 'processing'; SET @success = 0; EXEC [LL].[spInsertTrace] @executionId, @componentId, @traceId OUTPUT, @source, @component, @task, @entity, @step, @description, NULL, @action, NULL, @state, @success; EXEC [T2].[spDoSomething_1] @executionId; EXEC [LL].[spUpdateTraceSuccess] @traceId, @description, @action, @@ROWCOUNT; -- -------------------------------------------------------------------------------- -- Execute Procedure [T2].[spDoSomething_2] -- -------------------------------------------------------------------------------- SET @task = NULL; SET @step = N'Execute [T2].[spDoSomething_2]'; SET @action = 'execute'; SET @description = NULL; SET @state = 'processing'; SET @success = 0; EXEC [LL].[spInsertTrace] @executionId, @componentId, @traceId OUTPUT, @source, @component, @task, @entity, @step, @description, NULL, @action, NULL, @state, @success; EXEC [T2].[spDoSomething_2] @executionId; EXEC [LL].[spUpdateTraceSuccess] @traceId, @description, @action, @@ROWCOUNT; -- -------------------------------------------------------------------------------- -- Execute another procedure -- -------------------------------------------------------------------------------- -- ... -- -------------------------------------------------------------------------------- -- Update component log -- -------------------------------------------------------------------------------- EXEC [LL].[spUpdateComponentSuccess] @componentId, @description; -- -------------------------------------------------------------------------------- -- Update execution log -- -------------------------------------------------------------------------------- SET @state = 'success'; SET @success = 1; EXEC [LL].[spUpdateExecution] @executionId, @state, @success; END TRY BEGIN CATCH SET @error_message = ERROR_MESSAGE(); SET @error_number = ERROR_NUMBER(); SET @error_line = ERROR_LINE(); SET @error_state = ERROR_STATE(); IF @executionId IS NOT NULL BEGIN EXEC [LL].[spInsertErrorException] @executionId, @componentId, @traceId, @source, @component, NULL, NULL, @step, @error_number, @error_message, @error_line, @error_state; IF @traceId IS NOT NULL EXEC [LL].[spUpdateTraceError] @traceId, @description; IF @componentId IS NOT NULL EXEC [LL].[spUpdateComponentError] @componentId, @description; SET @state = 'error'; SET @success = 0; EXEC [LL].[spUpdateExecution] @executionId, @state, @success; END; END CATCH; END;
[T2].[spDoSomething_2_2]
CREATE PROCEDURE [T2].[spDoSomething_2_2] ( @p_executionId AS int ) AS BEGIN SET NOCOUNT ON; -- -------------------------------------------------------------------------------- -- Declare variables -- -------------------------------------------------------------------------------- -- Error Variables DECLARE @error_message AS nvarchar(max); DECLARE @error_number AS int; DECLARE @error_line AS int; DECLARE @error_state AS nvarchar(max); -- Logging Variables DECLARE @component AS nvarchar(128); DECLARE @task AS nvarchar(128); DECLARE @schema AS nvarchar(128); DECLARE @table AS nvarchar(128); DECLARE @source AS nvarchar(5); DECLARE @step AS nvarchar(max); DECLARE @entity AS nvarchar(max); DECLARE @message AS nvarchar(max); DECLARE @traceId AS int; DECLARE @componentId AS int; DECLARE @description AS nvarchar(max); DECLARE @affectedrows AS int; DECLARE @action AS varchar(100); DECLARE @state AS varchar(100); DECLARE @success AS int; -- -------------------------------------------------------------------------------- -- Initialize variables -- -------------------------------------------------------------------------------- -- Logging SET @message = NULL; SET @description = NULL; SET @affectedrows = 0; SET @component = OBJECT_SCHEMA_NAME(@@PROCID) + N'.' + OBJECT_NAME(@@PROCID); SET @source = N'T-SQL'; SET @entity = N'[].[ ]'; -- -------------------------------------------------------------------------------- -- Workload -- -------------------------------------------------------------------------------- BEGIN TRY -- -------------------------------------------------------------------------------- -- Check parameters -- -------------------------------------------------------------------------------- BEGIN IF (@p_executionId IS NULL) BEGIN SET @message = N'The parameter ''p_executionId'' is NULL.'; EXEC [dbo].[spRaiseError] @message, @component; RETURN -1; END; END; -- -------------------------------------------------------------------------------- -- Insert component log -- -------------------------------------------------------------------------------- SET @step = N'Do something'; SET @description = ''; EXEC [LL].[spInsertComponent] @p_executionId, @componentId OUTPUT, @source, @component, NULL, @entity, @step, @description; -- -------------------------------------------------------------------------------- -- Insert data -- -------------------------------------------------------------------------------- SET @task = NULL; SET @step = N'Insert data'; SET @action = 'insert'; SET @description = NULL; SET @state = 'processing'; SET @success = 0; EXEC [LL].[spInsertTrace] @p_executionId, @componentId, @traceId OUTPUT, @source, @component, @task, @entity, @step, @description, NULL, @action, NULL, @state, @success; -- Insert here a SQL Statement that inserts data into a table EXEC [LL].[spUpdateTraceSuccess] @traceId, @description, @action, @@ROWCOUNT; -- -------------------------------------------------------------------------------- -- Update data -- -------------------------------------------------------------------------------- SET @task = NULL; SET @step = N'Update data'; SET @action = 'update'; SET @description = NULL; SET @state = 'processing'; SET @success = 0; EXEC [LL].[spInsertTrace] @p_executionId, @componentId, @traceId OUTPUT, @source, @component, @task, @entity, @step, @description, NULL, @action, NULL, @state, @success; -- Insert here a SQL Statement that updates data a table EXEC [LL].[spUpdateTraceSuccess] @traceId, @description, @action, @@ROWCOUNT; -- -------------------------------------------------------------------------------- -- Delete data -- -------------------------------------------------------------------------------- SET @task = NULL; SET @step = N'Delete data'; SET @action = 'delete'; SET @description = NULL; SET @state = 'processing'; SET @success = 0; EXEC [LL].[spInsertTrace] @p_executionId, @componentId, @traceId OUTPUT, @source, @component, @task, @entity, @step, @description, NULL, @action, NULL, @state, @success; -- Insert here a SQL Statement that deletes data from a table RAISERROR('Exception in [T2].[spDoSomething_2_2]', 15, 1); EXEC [LL].[spUpdateTraceSuccess] @traceId, @description, @action, @@ROWCOUNT; -- -------------------------------------------------------------------------------- -- Update component log -- -------------------------------------------------------------------------------- EXEC [LL].[spUpdateComponentSuccess] @componentId, @description; END TRY BEGIN CATCH SET @error_message = ERROR_MESSAGE(); SET @error_number = ERROR_NUMBER(); SET @error_line = ERROR_LINE(); SET @error_state = ERROR_STATE(); -- Write in Logging IF @p_executionId IS NOT NULL BEGIN EXEC [LL].[spInsertErrorException] @p_executionId, @componentId, @traceId, @source, @component, NULL, NULL, @step, @error_number, @error_message, @error_line, @error_state; IF @traceId IS NOT NULL EXEC [LL].[spUpdateTraceError] @traceId, @description; IF @componentId IS NOT NULL EXEC [LL].[spUpdateComponentError] @componentId, @description; END; THROW; END CATCH; END;
The following script contains the commands for executing the ETL process. In order to obtain a compact process log, the date in the log tables is deleted beforehand. The final SELECT statement produces the result as shown at the beginning.
-- -------------------------------------------------------------------------------- -- 01: Delete data from logging tables -- -------------------------------------------------------------------------------- TRUNCATE TABLE [LL].[Error] TRUNCATE TABLE [LL].[Trace] DELETE FROM [LL].[Component] DELETE FROM [LL].[Execution] DBCC CHECKIDENT (N'[LL].[Component]', RESEED, 0); DBCC CHECKIDENT (N'[LL].[Execution]', RESEED, 0); -- -------------------------------------------------------------------------------- -- 02: Execute ETL Process -- -------------------------------------------------------------------------------- EXEC [T2].[spETLProcess]; -- -------------------------------------------------------------------------------- -- 03: Query logging tables -- -------------------------------------------------------------------------------- SELECT * FROM [LL].[Execution]; SELECT * FROM [LL].[Component]; SELECT * FROM [LL].[Trace];
Summary
ETL processes are data-driven processes. If data is delivered that is not expected, then there is a high probability that either not all data or even incorrect data will be written to the target system. What is only mentioned casually in this article is that at the lowest level of logging, the number of affected data records is or can be logged. Knowing the number of records expected and processed is a good indication of the success or failure of an ETL process.
This approach therefore supports the development of robust ETL processes, which also ensure an assessment of data quality.
What is a must in software development is often neglected when developing ETL processes: explicit exception handling. The exception handling presented here initially ensures that a process is terminated in an orderly manner in the event of an error. But when is there an error? And does an error really always have to lead to the process being aborted? In conjunction with knowledge of the expected and actual number of records processed, real error handling can be implemented.
The approach presented here is, so to speak, an invitation to delve more into the data and the expected result. The protocol and in particular the knowledge of the data support the developer during the development of the ETL process in being able to assess the correctness of the development.
The procedures presented merely provide a toolbox for logging. When used correctly, it produces a readable and interpretable log that allows assessing the correctness of the execution of the ETL process and the data processed.