Thursday, 20 December 2018

Usage of PowerBI over Turbodata for best in class reporting solutions




For free Power BI dashboards demonstrating the features below, contact the following:

  • Apoorv Chaturvedi
  • email: apoorv@mnnbi.com


The ETL team intended to develop a reporting solution over Turbodata with the following tenets:
·         Scalable: the reporting solution should be usable across multiple end clients. Thus it should have the following features:
o   Role management
o   Migration properties
o   Connectivity across multiple databases such as SAP, Navision, etc.
o   Optimum cost: Power BI is free below 1 GB of data.
o   Handling of reporting errors such as fan traps and chasm traps

Implementation of Turbodata reporting in the form of dashboards, a semantic layer, and the relevant queries.
Design the semantic layer






Import the database and its tables.

Step 2: The Import option loads the data onto the local machine, whereas DirectQuery runs each query at the database level.


We loaded the dimension tables, fact tables, and aggregate tables.

Manage the relationships between the tables.

Developing Reports

New Report

Export data to Excel
Drill down and Drill through
  • Set up a hierarchy
  • Drill down on the graph by hierarchy
  • Drill down for data points


Open chart: inventory turnover

Drill up and Drill down
Custom measures: sample
Power BI Join Implementations
Steps:
 i) Go to Edit Queries
 ii) Click Merge Queries
 iii) Select the source and destination tables
 iv) Select the join type (left, right, full)
 v) Select the common column
We applied the condition and clicked OK. SQL equivalents of these join types are sketched after the inner join output below.
Query Applied Data:
Right Outer Join in Power BI Desktop
      Steps: i) Go to Edit Queries   ii) Click Merge Queries   iii) Select the source and destination tables
      iv) Select Right Outer Join   v) Select the common column


Full Outer Join Table
Inner Join in Power BI
Output of inner-join
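
The Merge Queries options above correspond to the standard SQL join types. As a rough illustration only, here is a minimal SQL sketch, assuming hypothetical FactSales and DimItem tables joined on ItemNo:

-- Left outer join: keep all fact rows
SELECT f.ItemNo, f.Quantity, d.ItemCategory
FROM FactSales f
LEFT OUTER JOIN DimItem d ON d.ItemNo = f.ItemNo;

-- Right outer join: keep all dimension rows
SELECT f.ItemNo, f.Quantity, d.ItemCategory
FROM FactSales f
RIGHT OUTER JOIN DimItem d ON d.ItemNo = f.ItemNo;

-- Full outer join: keep rows from both sides
SELECT f.ItemNo, f.Quantity, d.ItemCategory
FROM FactSales f
FULL OUTER JOIN DimItem d ON d.ItemNo = f.ItemNo;

-- Inner join: keep only matching rows
SELECT f.ItemNo, f.Quantity, d.ItemCategory
FROM FactSales f
INNER JOIN DimItem d ON d.ItemNo = f.ItemNo;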
Why parameters?
·         Queries to be executed in SQL using external inputs
·         Report migration
·         On-the-fly ABC classification

Apply the parameter inside Power BI

  1. Create the parameter
  2. Create the queries
  3. In Power BI, right-click on the table and click on the Advanced Editor.
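
As an illustration of how an external input can drive the SQL, below is a minimal sketch of an on-the-fly ABC classification query. The table FactItemSales, its columns, and the cutoff values are hypothetical; in practice the cutoffs would come from the Power BI parameters created above.

-- @ACutoff and @BCutoff would be supplied from the Power BI parameters
DECLARE @ACutoff DECIMAL(5,2) = 0.80;   -- items covering the top 80% of sales value = class A
DECLARE @BCutoff DECIMAL(5,2) = 0.95;   -- next 15% = class B, remainder = class C

WITH ItemSales AS (
    SELECT ItemNo, SUM(SalesAmount) AS SalesAmount
    FROM FactItemSales
    GROUP BY ItemNo
),
Ranked AS (
    SELECT ItemNo, SalesAmount,
           SUM(SalesAmount) OVER (ORDER BY SalesAmount DESC ROWS UNBOUNDED PRECEDING) * 1.0
           / SUM(SalesAmount) OVER () AS CumulativeShare
    FROM ItemSales
)
SELECT ItemNo, SalesAmount,
       CASE WHEN CumulativeShare <= @ACutoff THEN 'A'
            WHEN CumulativeShare <= @BCutoff THEN 'B'
            ELSE 'C' END AS AbcClass
FROM Ranked;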

We have solved the problems of the chasm trap and the fan trap by implementing the following (a SQL sketch of the union-all approach follows the list):
i) Union all
ii) Table alias names
iii) Decisions on contexts
iv) Aggregate awareness
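
As a minimal sketch of the union-all approach for a fan trap, assume hypothetical FactSales and FactReceipts tables that would multiply the measures if joined to the same dimension in a single query. Each fact is aggregated separately and the results are stacked:

-- Aggregate each fact on its own, then stack the results with UNION ALL
SELECT ItemNo, SUM(SalesAmount) AS Amount, 'Sales' AS MeasureName
FROM FactSales
GROUP BY ItemNo
UNION ALL
SELECT ItemNo, SUM(ReceiptAmount) AS Amount, 'Receipts' AS MeasureName
FROM FactReceipts
GROUP BY ItemNo;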


Role Management in Power BI
Using Manage Roles, we can apply row-level security: specific users can be given access permissions to specific data, and conditional data access can be enforced.
Data representation on the dashboard on the basis of role-based authentication




Name: Aprajita Kumari
Email: aprajita@mnnbi.com
Phone: +91-9717077793

Alternate contact details: apoorv@mnnbi.com
Phone: +91-8802466356


Website: www.mnnbi.com



Monday, 17 December 2018

Dimension and fact data load optimization using SSIS: speed up the data load using control flow and data flow settings

In this particular blog, we are looking at the initial and incremental data loads for the following:
·         Item journal credit
·         Item journal debit
·         Dim category
·         Dim warehouse

The given set of packages was created to enable the following:
·         Optimize the speed of the data load
·         Optimize the error logging and audit logging facilities within SSIS

The SQL code for the item journal debit and item journal credit fact loads was converted into SSIS as shown below.
Indexes were dropped before the data load and recreated after the data load, as shown below.
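
A minimal sketch of the drop/recreate pattern around the load, with hypothetical table and index names:

-- Before the load: drop the non-clustered index so inserts are not slowed down
DROP INDEX IX_FactItemJournal_ItemNo ON dbo.FactItemJournal;

-- ... the SSIS data flow loads dbo.FactItemJournal here ...

-- After the load: recreate the index so reporting queries stay fast
CREATE NONCLUSTERED INDEX IX_FactItemJournal_ItemNo
    ON dbo.FactItemJournal (ItemNo, PostingDate);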





For the fact load, the Lookup transform was used as shown below:




The Lookup transform was configured to pick up the minimum surrogate key in case multiple matches were returned for a business key.
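
The same de-duplication can be expressed in the lookup query itself; a minimal sketch, assuming a hypothetical DimItem table with business key ItemNo and surrogate key ItemKey:

-- Lookup query: one (minimum) surrogate key per business key
SELECT ItemNo, MIN(ItemKey) AS ItemKey
FROM dbo.DimItem
GROUP BY ItemNo;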

Attached are the control flow, container, and data flow settings, and the reasons for them.

Control flow settings:

Checkpoints were enabled as shown below:
  • The checkpoint file is used if it exists
  • The job restarts from the checkpoint if an error happens
  • The parent fails in case there is an error at the data flow level


      Delay validation was not enabled because we are not using temporary tables.



Control flow settings:
In order for the rollback to function, we have set the following at the transaction level.
·         Transaction option: Required
·         Isolation level: Read Uncommitted. This is because the tables from which the data is being loaded do not have any other update or delete transactions happening on them.
·         Logging mode has been enabled at the control flow level.
·         Delay validation is false since we are not using any temporary tables.

  
The data flow settings are as follows:

·         Isolation level: Read Uncommitted
·         Transaction option: Supported
·         This has been done because the isolation level at the control flow is Read Uncommitted.
·         The transaction option Supported indicates that the transaction started at the container level continues to run at the data flow level.


The ‘FailParentOnFailure’ property has been set to true to enable checkpoint usage and better auditing and error logging.






For the logging mode, the data flow uses the container (parent) settings.




With the given settings, the set of tasks was able to achieve the following results:
·         Optimized the running of the procedures and tasks within SSIS
·         Optimum error handling and logging activities
·         Restarting the jobs from the error points using breakpoints and checkpoints
·         Transaction rollback was enabled where required


Prepared by:
Ishwar Singh
Lead ETL Developer
Website: www.mnnbi.com
Email address: apoorv@mnnbi.com
Phone: +91-8802466356







Thursday, 6 December 2018

Crashing the Load Times using Turbodata


Turbodata-SQL Problem Statement
·        Ever-increasing data volumes at the end clients
·        Serial execution of the queries: if one query fails, the entire process fails; the process has errors and manual resources are required to restart it.
·        Error logging: because temporary tables are used, no intermediate error data is available.
·        Nightly process in SQL; client interface in C#/.NET.
·        The product had to be made scalable: a large number of end users, along with audit logs, error logs, and data loading. The product has to support a large number of different kinds of reports (GST, inventory, and ledger) for the end clients.
·        A scalable module with the required security had to be developed.

Turbo Data Solution
·        Crash the execution time: migrate the loads from SQL to SSIS.
·        Handle errors separately.
·        Bypass the bottleneck processes: store historical data in the data warehouse during the fiscal day.
·        Extensive error mapping and logging systems.

Change in the end client systems:





Business Benefits
·         Meet the SLA for the management
·         Reduce execution times: parallel execution
·         Remove error-prone data
·         Keep error data in staging tables
·         Restart the process from the point where the error occurred: checkpoints and breakpoints


Change in Turbodata-SQL to SSIS

The ETL team adopted the following methodology for converting the sql code to SSIS code:
·         Conversion of CTEs into containers
·         All the joins converted into SSIS transforms
·         Redirect the error rows for mappings and table inserts
·         Drop the indexes before the data load and recreate them after the data load.



Turbodata Conversion Experience



To link one Execute Package Task to another, there are three precedence constraint options (Success, Failure, Completion). The precedence constraints were marked as ‘Completion’ and not ‘Success’. The reason for this is that the process should keep running even if there is an error.


Improved Meta Data Handling


Error history: in the initial Turbodata product, we were not able to track the details of errors at the transform level due to the usage of temporary tables. ‘Validate external metadata’ has to be set to false for temporary tables, whereas it should be set to true; hence the ETL team uses permanent staging tables instead of temporary tables.

Better Error Handling



The ETL team added breakpoints at the container level.
Note: a number of actions are possible when a breakpoint is hit, such as sending an email or inserting details into transaction logs.


Error Handling: Dataflow


Some transforms can generate errors, and for each of them we can redirect the error rows, ignore the failure, or fail the component. Error rows were redirected to ensure a smooth data flow run during execution and a quicker execution time.

Error Handling: Control Flow Level
*Note: For the container (at the control flow level), the TransactionOption should be set to Required, so that if a child component fails, the whole transaction is rolled back.
For the child components of the container, the TransactionOption should be set to Supported, so that if one of them fails, all transactions of the same container are rolled back.
Error Handling: Control Flow Level


FailParentOnFailure: True (applied when checkpoints are enabled). This indicates whether the parent of an executable fails when the child executable fails.

 Rerunning the job from the error point:

*The ETL team used checkpoints to rerun the jobs from the points where errors had occurred.

*This reduces the process rerun time.

Optimize the Job Run


For job optimization, the Repeatable isolation level was applied on the development side and Read Uncommitted on the production side.
RunOptimizedMode: True (applied on the Data Flow Task).
Unused columns are not put in the ETL tool buffer (lower load on the system, quicker query execution).


Log settings: logging mode is enabled at the container level with ‘Use parent setting’ at the data flow level, so that execution details are captured at the lowest level possible.

Know the number of rows loaded in the SQL load: better logging features.
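
A minimal sketch of capturing the row count on the SQL side, assuming hypothetical fact, staging, and audit tables:

DECLARE @Rows INT;

INSERT INTO dbo.FactItemJournal (ItemKey, PostingDate, Quantity)
SELECT ItemKey, PostingDate, Quantity
FROM stg.ItemJournal;

SET @Rows = @@ROWCOUNT;   -- rows loaded by the statement above

INSERT INTO dbo.AuditLoadLog (TableName, RowsLoaded, LoadedAt)
VALUES ('FactItemJournal', @Rows, GETDATE());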

Benefits for Turbodata
Key Result Areas for End Client

Steps for implementation-End Client



Prepared by:

Aprajita Kumari
Contact Details:
Website: www.mnnbi.com
Phone: +91-8802466356
