Showing posts with label GST ERROR HANDLING. Show all posts
Showing posts with label GST ERROR HANDLING. Show all posts

Thursday, 16 May 2019

Automation of GST filing for a large cinema chain-more than 150 locations



Problem statement:
The end client has more than 156 cinema locations for which the GST data needs to be consolidated for monthly GST filing. The problems that the end client desires to solve are as follows:
·         Automation of the GST loads process.
·         Error handling and data auditing

Tasks achieved
Full load of all 156 locations:

The ETL team tried single load and 2 parallel load approach for implementation of the same.
The performance parameters of the ETL process was gauged by the following parameters:
·         Time of load
·         Error logging and status logging
·         Types of errors
·         Validation issues

Process flow: The end client has a set of stored procedures running at remote locations. The procedures are of 2(two) types: 2 parameter and 3 parameters. The ETL load accommodated both the types of stored procedures into the final solution.
The ETL team has done an initial load for 2 parameter(only the date from and date to are given as input parameters) and 3 parameter locations(only the cinema_operator,date from and date to are given as input parameters).

Attached is the entire process flow at the process task level:



In the initial load package, the initial load of 2 parameters and 3 parameters happens in parallel.
Hereafter there could be some locations that are not loaded (of 2 parameters and 3 parameters). These locations are loaded in the incremental set up.
Condition for the incremental set up is derived from the variable value:



@[user::count]>0 condition is given for add missing value in data for  the process is running  when count Value is greater then 0.

The given variable @usercount is derived from the execution of the sql transform as follows:



Mapping of the result output to the variable:



Package explanation of initial load :



3 parameters and 2 parameters are loaded in parallel in initial load package.




Explanation of two parameters load




Fetch of the maximum row_id from audit_table. The table name audit_table has all the locations with 2(two) parameter load.



Set the parameters for loop increments.
The variable @inc is set to 1.
@max variable has been set to the maximum number of rows in the audit table.





For each location a dynamic connection string is used for connecting to the remote location and extracting data from the remote location.
Fetching of connection string for dynamic location: the audit table has the connection parameters or all locations. These connection parameters are extracted one by one using the execute sql command.






Sqlcommand: =
" select Cinema_strCode code, Accessed_user id, Channel_Address ip, convert(varchar(30), DECRYPTBYPASSPHRASE('8', password)) pass, SUBSTRING(tableSynonym, 0, CHARINDEX('.', tableSynonym, 1))dbname from tblCinema_Master where Cinema_strCode in (select code from audit_table where  row_id="+(dt_wstr,4) @[User::inc]+" );"

The data is set to result set



Dynamic Location is set to the static location by setting the expression connection string








Con1:(setting up the connection string)

"Data Source="+ @[User::ip1]  +";User ID="+ @[User::id1] +";Initial Catalog="+ @[User::dbname1] +";Provider=SQLNCLI11.1"+";Password="+ @[User::pass1] +";"



Now the Sql Execute Task connection string assigned dynamic connection.


Fetching the data from remote location using dynamic connection string:


Updated Intermediate_Test






Validation rules: the end client asked the ETL team to apply validation rules to check the quality of the data. the End client had input percentages of 18,12,11,14,0 and 5. Any transactions not in the given percentage ranges were classified as having wrong data.

Rule Apply first Check (18,12,11,14,0,5)


Checked Rule Second : this rule entailed that any entry having 0% GST should not have total value less than -2 or more than 2.




Update the Intermediate location with covered date and location.




Executed the exjoinexceptiontable for apply both rules.







Truncated the intermediate tables





truncate table tblGstIntermediate;
truncate table [dbo].[AuditTable1];
truncate table [dbo].[auditTable2];



 Incremental load of two parameters; this module indicates the approach for 2 parameter incremental load.
The incremental load for 2 parameters was required since there could be some locations that have been missed out of the initial load due to the connection errors. In such a scenario the incremental load for 2 parameter locations needs to be done. 

ALTER procedure [dbo].[sp_missing_2_parameter]
as
begin
truncate table  missing_2_parameters;
--drop index idx_cinema_str on tblGst_Test11;

with cte1 as(select distinct cinema_strcode from [dbo].[tblGst_Test11])
select * into #cte1 from cte1;
create index #idx_cte1 on #cte1(cinema_strcode);


with
cte2
as( select * from [dbo].[audit_table]

where code not in (select distinct cinema_strcode from #cte1))

insert  into missing_2_parameters
select *  from cte2;

--create index idx_cinema_str on tblGst_Test11(cinema_strcode);
end;


We will find the missing location from missing_2_parameters, and we

Start loading the data from missing_2_parameters table.




Executed the sp_missing_2_parameter for load the missing location
exec sp_missing_2_parameter;
The find out the maximum missing_id  from missing_2_parameters.

select max( missing_id)  from missing_2_parameters



Setting the value of for loop parameters


For loop will iterate till the max_inc meet.
Note :
Then we have taken sequence container

Inside taken Execute SQL task (Taking Dynamic connection string one by one)


The process flow hereafter is the same as that was done for the initial 2 parameter.

The ETL team replicated the same process for 3 parameter load.

Note: this blog covers only the serial execution. The process of parallel execution(running multiple dataflows simultaneously for 2 parameter data load was work in progress).


APPENDIX
Activity
1.)    Logging : the logging mode was enabled for package data load.

1.       Set the error log for the package.

Steps:-



Set the Provider type and configure the type of data base where to store.

Then set the event to store the error points






In case of any GST automation requirements, please contact the following:
Name: Apoorv Chaturvedi
email: support@mndatasolutions.com;support@turbodatatool.com
Phone; +91-8802466356
Website: https://mn-business-intelligence-india.business.site/









Sunday, 25 November 2018

Conversion of Nil rated GST report from SQL to SSIS




The necessity of converting the sql code to SSIS code arose from the following requirements.
1.      Design scalable module across various ERPs. For the same the ETL team, segregated  various modules using parent and child packages as shown below.




                A  hierarchy of packages was built in within the SSIS dataflow.




·        
The need to design an error proof reporting system across large number of extraction systems. Say an end client has customers on Tally, SAP, Navision etc. then the end client can use the Turbodata GST module to run the GST reports from all the customers simultaneously. 

  •           The need to rerun the jobs when error occurs.
    Say, we are running the particular set of stored procedures. 


The above set of procedures shall fail if any of the intermediary procedures fail.

In case of large scale deployment of GST consolidated solutions, the above is a bottleneck. to get around the same, the ETL team used the SSIS code to achieve the following.
  •       The entire set of the sql code  was embedded within containers. Each container execution was sequenced to be dependent upon the prior container(completion and not success). a sample as shown under:



Implications for the business: the entire process shall run smoothly over large data loads. The process of error handling shall be easier. The resources can check for the errors after the completion of the load during the night(in morning).

By passing the failure problem during sql execution.
For the same, the ETL team used the failure tracking processes at the dataflow level for the both the data insertion and data processing.
An example is attached herewith:
·         Error handling during data processing.
On a particular condition, attached herewith, the code was stopping multiple times.



The ETL team at the transformation level had 3 options:


  •         Ignore failure
  •        Redirect the row
  •         Fail Component.

The ETL team decided to redirect the rows on each failure into the audit table. This audit table was available to the end client managers for analysis at a later date. However this feature ensures that in case an error happened during the night then the code would not stop across multiple end users.

Similarly during the data insertion and data updates, errors could occur. such rows were redirected at the ETL level.





·         Saving on the execution times: for the same the ETL team adopted the 5 pronged strategy.

o   Identifying the processes that could be run in parallel. for example in sql code, the fact load for item sales order, item journal fact and others was carried out sequentially. The ETL team decided to execute the same in parallel.



o   Minimizing the number of columns to be processed: the extraction module had minimum number of columns to be processed and minimum rows to be processed.

o   Capturing the error prone data before processing the same as Nil rated data.
o   dropping of indexes before data load in a dimension table and recreating the indexes after the data load.



For more information on how to do one of the following:
  •          Design optimum GST reports
  •          Catch error prone data before processing
  •          Design scalable datawarehouses for SAP, Navision, tally on premises and on cloud

Contact:
Apoorv Chaturvedi
Phone: +91-8802466356
website: www.mnnbi.com

SSIS module and blog prepared by Aprajita Kumari:


Data insertion-Tally(approaches)

 Problem statement: Many softwares look to insert data into Tally from their application.This blog looks at issues and approaches for the sa...