Showing posts with label SSIS incremental data load. Show all posts
Showing posts with label SSIS incremental data load. Show all posts

Sunday, 25 November 2018

Conversion of Nil rated GST report from SQL to SSIS




The necessity of converting the sql code to SSIS code arose from the following requirements.
1.      Design scalable module across various ERPs. For the same the ETL team, segregated  various modules using parent and child packages as shown below.




                A  hierarchy of packages was built in within the SSIS dataflow.




·        
The need to design an error proof reporting system across large number of extraction systems. Say an end client has customers on Tally, SAP, Navision etc. then the end client can use the Turbodata GST module to run the GST reports from all the customers simultaneously. 

  •           The need to rerun the jobs when error occurs.
    Say, we are running the particular set of stored procedures. 


The above set of procedures shall fail if any of the intermediary procedures fail.

In case of large scale deployment of GST consolidated solutions, the above is a bottleneck. to get around the same, the ETL team used the SSIS code to achieve the following.
  •       The entire set of the sql code  was embedded within containers. Each container execution was sequenced to be dependent upon the prior container(completion and not success). a sample as shown under:



Implications for the business: the entire process shall run smoothly over large data loads. The process of error handling shall be easier. The resources can check for the errors after the completion of the load during the night(in morning).

By passing the failure problem during sql execution.
For the same, the ETL team used the failure tracking processes at the dataflow level for the both the data insertion and data processing.
An example is attached herewith:
·         Error handling during data processing.
On a particular condition, attached herewith, the code was stopping multiple times.



The ETL team at the transformation level had 3 options:


  •         Ignore failure
  •        Redirect the row
  •         Fail Component.

The ETL team decided to redirect the rows on each failure into the audit table. This audit table was available to the end client managers for analysis at a later date. However this feature ensures that in case an error happened during the night then the code would not stop across multiple end users.

Similarly during the data insertion and data updates, errors could occur. such rows were redirected at the ETL level.





·         Saving on the execution times: for the same the ETL team adopted the 5 pronged strategy.

o   Identifying the processes that could be run in parallel. for example in sql code, the fact load for item sales order, item journal fact and others was carried out sequentially. The ETL team decided to execute the same in parallel.



o   Minimizing the number of columns to be processed: the extraction module had minimum number of columns to be processed and minimum rows to be processed.

o   Capturing the error prone data before processing the same as Nil rated data.
o   dropping of indexes before data load in a dimension table and recreating the indexes after the data load.



For more information on how to do one of the following:
  •          Design optimum GST reports
  •          Catch error prone data before processing
  •          Design scalable datawarehouses for SAP, Navision, tally on premises and on cloud

Contact:
Apoorv Chaturvedi
Phone: +91-8802466356
website: www.mnnbi.com

SSIS module and blog prepared by Aprajita Kumari:


Thursday, 18 October 2018

Load of the datawarehouse without the lookup transform-SSIS





               Load of the datawarehouse without the use of lookup tables

This particular module has been developed for those customers where the lookup tables are very large and the join transformations shall help process the information faster than lookup. In this particular case the ETL team has replaced the lookup with join transforms.
Note: This method cannot be applied for type 2 entities
This method currently handled insert, updates and deletes.
The solution shall be developed over 3 data flows as given under:





In the supply load dataflow the ETL team has developed the methodology for inserts, updates and deletes as follows:





Method for handling inserts and updates
·         Join the source and the target table using the merge transform based on the composite keys. The end client could pick up the surrogate keys based on the same.






In case of the insert, the identity key column should be checked for the identity row insert.




For updates the table data is stored in a copy of the target table. Thereafter the update sql is run for the same using the sql transform.



Handling deletes:
In the case of deletes, the ETL team looks for the composite key details in the target table and the given details should not be in the source table. In case the given condition is applicable then the row in the target table is marked as ‘D’(delete flag)


Contact details:
For resource requirements:resources@mnnbi.com
For training and consulting services: apoorv@mnnbi.com
Blogspot details: https://mndatasolutionsindia.blogspot.com/

Written by Gauri(gauri@mnnbi.com)

https://mndatasolutionsindia.blogspot.com/p/gauri.html



Data insertion-Tally(approaches)

 Problem statement: Many softwares look to insert data into Tally from their application.This blog looks at issues and approaches for the sa...