Friday, 7 June 2024

Dimension and fact data load optimization using SSIS: speeding up the data load with control flow and data flow settings

In this post, we look at the initial and incremental data loads for the following:

·         Item journal credit
·         Item journal debit
·         Dim category
·         Dim warehouse

This solution was built to achieve the following:
·         Optimize the speed of the data load
·         Optimize the error logging and audit logging facilities within SSIS

The SQL code for the item journal debit and item journal credit fact loads was converted into an SSIS package.
Indexes were dropped before the data load and recreated after the data load.
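The drop-load-recreate pattern can be sketched outside SSIS as well. The following is a minimal illustration in Python with an in-memory SQLite database, not the package from this post: the table name `fact_item_journal`, the index name `ix_fact_item` and the columns are hypothetical, and SQLite merely stands in for the real target database.

```python
import sqlite3

def load_fact(conn, rows):
    """Drop the index, bulk insert the rows, then rebuild the index once."""
    cur = conn.cursor()
    # Before the load: remove the index so inserts do not maintain it row by row.
    cur.execute("DROP INDEX IF EXISTS ix_fact_item")
    cur.executemany(
        "INSERT INTO fact_item_journal (item_key, amount) VALUES (?, ?)", rows
    )
    # After the load: build the index in a single pass over the loaded data.
    cur.execute("CREATE INDEX ix_fact_item ON fact_item_journal (item_key)")
    conn.commit()

# Hypothetical fact table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_item_journal (item_key INTEGER, amount REAL)")
conn.execute("CREATE INDEX ix_fact_item ON fact_item_journal (item_key)")

load_fact(conn, [(1, 10.0), (2, -4.5), (1, 3.25)])
row_count = conn.execute("SELECT COUNT(*) FROM fact_item_journal").fetchone()[0]
index_names = [r[1] for r in conn.execute("PRAGMA index_list('fact_item_journal')")]
```

In an SSIS package the same three steps would typically be two Execute SQL tasks placed around the data flow task; that mapping is an assumption about the layout, since the original screenshots are what defined it.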





For the fact load, a Lookup transform was used.




The Lookup transform was used to pick up the minimum surrogate key in case a lookup returned more than one match.
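The post does not show how the minimum key was selected, so the following is only a sketch of the idea in plain Python: build the lookup so that each business key maps to its smallest surrogate key, then resolve the fact rows against it. The function names, the sample warehouse codes and the separate list of unmatched rows are all assumptions made for illustration.

```python
def build_lookup(dim_rows):
    """Map each business key to its smallest surrogate key."""
    lookup = {}
    for surrogate_key, business_key in dim_rows:
        current = lookup.get(business_key)
        if current is None or surrogate_key < current:
            lookup[business_key] = surrogate_key
    return lookup

def resolve_keys(fact_rows, lookup):
    """Replace business keys with surrogate keys; collect rows with no match."""
    matched, unmatched = [], []
    for business_key, amount in fact_rows:
        if business_key in lookup:
            matched.append((lookup[business_key], amount))
        else:
            unmatched.append((business_key, amount))
    return matched, unmatched

# Hypothetical dimension rows: "WH-A" appears twice with different surrogate keys.
dim_warehouse = [(7, "WH-A"), (3, "WH-A"), (5, "WH-B")]
facts = [("WH-A", 100.0), ("WH-B", 40.0), ("WH-Z", 1.0)]

lookup = build_lookup(dim_warehouse)
matched, unmatched = resolve_keys(facts, lookup)
```

In SQL terms this corresponds to a reference query that groups by the business key and takes `MIN` of the surrogate key, so that the lookup never sees more than one candidate per key.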

The control flow, container and data flow settings, and the reasons for them, are described below.

Control flow settings:

Checkpoints were enabled with the following settings:
·         The checkpoint file is used if it exists
·         The job restarts from the point of failure if an error occurs
·         The parent fails if there is an error at the data flow level
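The restart behaviour that checkpoints provide can be illustrated with a small sketch. This is not how SSIS implements checkpoints internally; it is a simplified Python model, with a hypothetical JSON checkpoint file and hypothetical task names, showing how completed tasks are recorded and skipped on the next run.

```python
import json
import os
import tempfile

def run_with_checkpoint(tasks, checkpoint_path):
    """Run (name, callable) tasks in order, skipping tasks already recorded as done."""
    done = []
    if os.path.exists(checkpoint_path):      # use the checkpoint file if it exists
        with open(checkpoint_path) as fh:
            done = json.load(fh)
    for name, task in tasks:
        if name in done:
            continue                         # completed in an earlier run
        task()                               # an exception stops the run here
        done.append(name)
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)              # record progress after each task
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)           # clean finish: no restart point needed
    return done

calls = []
state = {"fail_once": True}

def drop_indexes():
    calls.append("drop_indexes")

def load_fact():
    if state["fail_once"]:
        state["fail_once"] = False
        raise RuntimeError("simulated data flow failure")
    calls.append("load_fact")

def create_indexes():
    calls.append("create_indexes")

tasks = [
    ("drop_indexes", drop_indexes),
    ("load_fact", load_fact),
    ("create_indexes", create_indexes),
]
path = os.path.join(tempfile.mkdtemp(), "package.chk")

try:
    run_with_checkpoint(tasks, path)         # first run fails inside load_fact
except RuntimeError:
    pass
run_with_checkpoint(tasks, path)             # second run resumes at load_fact
```

Across the two runs each task executes successfully exactly once: the second run skips `drop_indexes` because the checkpoint file already lists it.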


Delay validation was not enabled because we are not using temporary tables.


Control flow settings:
For the rollback to work, we set the following at the transaction level:
·         TransactionOption: Required
·         IsolationLevel: ReadUncommitted. This is acceptable because no other update or delete transactions run against the tables from which the data is being loaded.
·         Logging mode is enabled at the control flow level.
·         DelayValidation is False, since we are not using any temporary tables.
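The all-or-nothing effect of running a load inside a transaction can be sketched as follows. This is an illustration in Python with SQLite rather than the SSIS transaction mechanism itself; the table `dim_category` and its columns are hypothetical, and a duplicate primary key is used to simulate a failure in the middle of a batch.

```python
import sqlite3

def load_in_transaction(conn, rows):
    """Insert all rows or none: any failure rolls the whole batch back."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO dim_category (category_key, name) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

ok_first = load_in_transaction(conn, [(1, "Raw"), (2, "Finished")])
# The second batch contains a duplicate key, so the whole batch is rolled back,
# including the valid row (3, "Spare") that was inserted before the failure.
ok_second = load_in_transaction(conn, [(3, "Spare"), (1, "Duplicate")])
count = conn.execute("SELECT COUNT(*) FROM dim_category").fetchone()[0]
```

After both calls the table still holds only the two rows from the first batch, which is the behaviour the Required transaction option is meant to give the package.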

  
The data flow settings are as follows:

·         IsolationLevel: ReadUncommitted
·         TransactionOption: Supported

The isolation level was chosen to match the ReadUncommitted level set at the control flow level. TransactionOption Supported means that the transaction started at the container level continues to run at the data flow level.
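How the Required and Supported options combine can be modelled in a few lines. This is a simplified sketch of the documented semantics, not SSIS code: the function name and the string used to represent a transaction are made up for the illustration.

```python
def resolve_transaction(option, parent_transaction):
    """Return the transaction a task runs in (or None) for a TransactionOption."""
    if option == "Required":
        # Join the parent's transaction, or start a new one if there is none.
        return parent_transaction or "new-transaction"
    if option == "Supported":
        # Join the parent's transaction if one exists; never start one.
        return parent_transaction
    if option == "NotSupported":
        # Never take part in a transaction.
        return None
    raise ValueError(f"unknown TransactionOption: {option}")

# Container set to Required with no parent transaction: it starts one.
container_tx = resolve_transaction("Required", None)
# Data flow set to Supported inside that container: it joins the same transaction.
data_flow_tx = resolve_transaction("Supported", container_tx)
# A Supported task with no transactional parent runs without a transaction.
standalone_tx = resolve_transaction("Supported", None)
```

The point of the combination used here is that the data flow does not open a transaction of its own, so a failure in it rolls back the work of the whole container.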


The FailParentOnFailure property has been set to True to enable checkpoint usage and better auditing and error logging.






For the logging mode, the data flow uses the container settings.




With these settings, the set of tasks achieved the following results:
·         Optimized running of the procedures and tasks within SSIS
·         Optimal error handling and logging
·         Restarting the jobs from the point of error using breakpoints and checkpoints
·         Transaction rollback enabled where required


Prepared by:
Ishwar Singh
Lead ETL Developer
Website: www.mnnbi.com
Email address: apoorv@mnnbi.com
Phone: +91-8802466356
