Showing posts with label spreadsheet consolidation. Show all posts
Showing posts with label spreadsheet consolidation. Show all posts

Tuesday, 21 May 2019

Documentation for Dashboards and Data Modeling for Bottling Company with sales in Africa



The end client desires to upgrade the sales reporting for its resources from the current MS Excel spreadsheets to an online system.

The requirements of the end client were as follows:
1 .)    Consolidate the data from  multiple spreadsheets for analysis over period in time.
2.)    Provide visibility for sales and expenses by route and  truck .
Step 1: the ETL  team mapped  the spreadsheets onto a database(initially MS Access) for easy data entry at the client main location for after trip and before trip details. Attached are the details for the same (appendix a)
Design of the audit report:
As a first step the ETL team replicated the audit report for the end client in Powerbi. The purpose of replicating the audit report was as follows:
1.)    Check whether the numbers are correct as per the data entry done for the end client
2.)    Consolidate all the audit numbers for the end client in one interactive framework.
The details of the replication of the audit report are in appendix b.

Development of analytics and dashboards: the ETL team  identified four(4) key kpis for analytic purposes.
·         Sales: date, truck and route
·         Returned sales-returned crates, returned full bottles, returned empty bottles: date, truck and route
·         Samples: date, truck and route
·         Expenses: date, truck, route and expense type.
The behavior of expense is as follows:
·         Aggregate by date dimension and expense type
·         Semi aggregate by route and truck master
·         Non aggregate by item dimension

The behavior of sales is as follows:
·         Aggregate by date dimension
·         Aggregate by truck and route master
·         Non aggregate by expense type
·         Aggregate by item entities.

In case of expenses analysis the analytics to be developed were to be grouped by expense type(that does not fall in direct hierarchy with route and truck hierarchies). Hence the calculated measures need to be fixed at each level of analysis by expense type( for route truck and overall analysis).
Accordingly the etl team developed the following kpis for the expense analytics:
Expense(fixed overall):
All expense: CALCULATE(SUM(after_trip_allocation[expense]),ALL(after_trip_allocation),after_trip_allocation[type_of_expense_id]>0)
NOTE: The type of expense id was stipulated to be more than 0 since without the expense id expense aggregate would not make sense.

Expense across route id and expense nature:the given measures fixes the total expense
EXPENSE_ACROSS_ROUTE_EXPENSETYPE = CALCULATE(SUM(after_trip_allocation[expense]),ALLEXCEPT(after_trip_allocation,after_trip_allocation[route_id],'type of expense'[expense_nature]), after_trip_allocation[type_of_expense_id]>0)

Percentage calculation for route:
PERCENT_ROUTE = (after_trip_allocation[EXPENSE_ACROSS_ROUTE_EXPENSETYPE]/after_trip_allocation[expense_excl_nature])*100

Expense across truck id and expense nature:
EXPENSE_ACROSS_TRUCK_EXPENSE_NATURE = CALCULATE(SUM(after_trip_allocation[expense]),ALLEXCEPT(after_trip_allocation,Truck_master[Truck_ID],'type of expense'[expense_nature]),after_trip_allocation[type_of_expense_id]>0)


Percent expense along the truck route is defined as follows:
PERCENT_TRUCK = (after_trip_allocation[EXPENSE_ACROSS_TRUCK_EXPENSE_NATURE]/after_trip_allocation[expense_excl_nature])*100

Attached is the dashboard for the expense analysis:



Calculation  of sold bottles:
Sold bottles is the difference of quantity loaded,return bottles and sample bottles.
Attached is the snapshot for the sold bottles
sold_bottles =
SUM('Before_trip_allocation'[quantity_loaded]) - SUM('after_trip_allocation_item'[Returned_bottles])-CALCULATE(sum(after_trip_allocation_item[returned_full_bottles]),after_trip_allocation_item[sample_flag]="'sample'")

Sold bottles are calculated on the basis of:
·         Vehicle no.
·         Route description
·         Date dimension
·         Item dimension(brand,volume and holding type)
Attached is the dashboard of the sold bottles:



Vehicle no and the route description  is the slicer .we can analyzed the sold bottles for the particular  vehicle and route.


Calculation  of sample bottles
Sample bottles is extracted from the sample flag.
Attached is the snapshot for the sold bottles
Sample = CALCULATE(sum(after_trip_allocation_item[returned_full_bottles]),after_trip_allocation_item[sample_flag]="'sample'")

Sample bottles are calculated on the basis of:
·         Vehicle no.
·         Route description
·         Date dimension
·         Item dimension(brand,volume and  holding type)
Attached is the dashboard of the sample bottles:



Calculation  of return bottles:
Return bottles is extracted from the sample flag.
Attached is the snapshot for the sold bottles
Returned_bottles = CALCULATE(sum(after_trip_allocation_item[returned_full_bottles]),after_trip_allocation_item[sample_flag]="'returned'")

Return bottles are calculated on the basis of:
·         Vehicle no.
·         Route description
·         Date dimension
·         Item dimension(brand,volume and holding type)
Attached is the dashboard of the return bottles:



Appendix A
LIQUOR COMPANY  MOBILITY APP

The given implementation has been done to automate the field sales collection data for Liquor Company.

As a first step, the ETL team has deployed the MS Access database on the client end for data capture of before and after truck load. That is the ETL team has devised a system wherein it captures the data when the truck is loaded by various SKUs. Also the end client inputs the data after the truck has completed its journey.
The benefit to the end client is that the ms excel spreadsheets shall be replaced by database and consolidated reporting shall be available to the end client.
Access  Implementation
The end client desired  to have a set of reports using PowerBI  and database is created in MS Access  using the following features:
·         Calculate the crates loaded on the basis of driver name,areaname,sales person.
·         Calculate the Daily routine  sales and commission  report on the basis of brand ,volume and holding types.
·         Calculate the total bottles sold after the trip.
Database model for Access interface is as follows:( the data model was developed by reverse engineering the client spreadsheets)




Attached is the explanation of the database model:

Truck  Master: The truck  master entails all the master data for the truck.It includes the vehicle no.,driver no.,helper,sales person of the truck.

Route  Master: The route master entails all the master data for the route.The route master  has one to many relation with the truck  master.That is a truck can use the various route to travel or will have the same route on multiple dates. .It includes all the information of route i.e. when the  journey started ,what was the meter  reading and how many crates are loaded.

Before trip allocation: Before trip allocation entails all the information of the item before the journey  was started i.e. how much  quantity was loaded of which brand,volume and holding type.which type of load was it,either it is for “sales” ,”promotion”or ”sample”.

After trip allocation:it entails all the information of after trip of the truck on the given route.i.e how any full bottles,emptybottles,emptycrates,breakage bottles are returned,how much fuel was consumed and the expense details .

After trip allocation item: it entails all the information of after trip.i.e how any full bottles,emptybottles,emptycrates,breakage bottles are returned,which types of material  was sold “sample”,”promotion” or “sold”.

The database is created in MS Access.There are two buttons before trip and after trip.
Attached is the snapshot :

Before trip allocation contains all the details of the truck ,when the journey was started and how much quantity was loaded of which brand ,what is the volume and the holding types.





After trip allocation:the after trip form contain the  after trip allocation detail and expense detail on the basis of vehicle no. and area name.
Attached is the snapshot:

After trip allocation form contains the duration of journey and the description of the sample flag i.e  either the item returned was “return” or “sample” on the basis of brand, volume and holding types.




Expense detail: This form contains the expense  details of the truck.



Appendix B
Attached is the explanation for the dashboard developed for replicating the audit report.





The above portion has been mapped in PowerBI as follows

Area name and Vehicle no: These 2 have been assumed to be filter parameter as follows:







Vehicle no. and area name have been used as slicers for the given dashboards.



DAX function:The sample and returned bottles are calculate by  using the dax functionality.
Attached is the snapshot:





This is the aggregate function.we use the DAX functionality for the calculation of the returned bottles


For the same the etl team generated an aggregate for ‘after_trip_allocation’
Attached is the snapshot




Cross filter sample:



In case interested in converting spreadsheets to data model, please contact the following:

Name: Apoorv Chaturvedi
Phone: +91-8802466356
email: apoorv@mnnbi.com

Blog prepared by:

Tuesday, 1 May 2018

Consolidation of MS Excel spreadsheets




Name: Apoorv Chaturvedi
Phone: +91-8802466356
website: www.mnnbi.com

Most of the businesses use MS Excel spreadsheets for data capture, data reporting. A common set of problems that arise during spreadsheet consolidation are as follows:
  •       Data consolidation takes times
  •       Manual process.
  •        Error prone: auditing of data is an issue.
  •        Inflexibility of  analysis: the end clients do not get graphs and dashboards
  •        The process is very time consuming

Turbodata helps resolve the problem by taking the following approach.
Turbodata has readymade datawarehouses associated with inventory, ledger where the on fly calculations are matching with a standard accounting package such as Tally ERP. The product has been designed using the philosophy of Goldratt, Deming and Toyota Production system. There are readymade extractors using C#,.Net for extracting data from ERPs such as Tally, SAP, Navision and even custom spreadsheets.
How does Turbodata help solve the problem?
  •        The input spreadsheets could be imported in an xml format into a shared folder. The condition is that the input spreadsheet formats should not change for the end client.
  •        The xml inputs are then imported to the database from wherein the reporting takes place.

Why use Turbodata?

  •       Turbodata has a readymade datawarehouse for ledger and inventory with built in on fly calculations.  Turbodata inventory and Turbodata ledger have readymade data with perpetual valuations to re generate any type of MS Excel report.
  •        The data extraction shall take place in such a way that minimal RAM is used. This shall help the end client in the following ways:
  •         Multiple end users can update the spreadsheet data while data extraction is happening without any problems
  •           The source machine shall not be stuck.
  •        The solution shall be plug and play and very easy to use.
  •        The data audit step of matching the output with the source table entries shall be part of the solution
  •        The end client can get dashboards using a Business Intelligence tool of its choice. If the consolidated data is below 1 GB then Microsoft PowerBI can be used for getting best in class dashboards
  •        Data entry and data validation shall be designed as part of the solution.
  •     Lower cost of the solution through data compression and sql reduction.
  •        Reduced manual data entry time and quick turnaround times for development of the final reports.


Next steps:
The end client could send the following to the above email address:
·       Sample output spreadsheet to be automated
·       Sample input spreadsheets to be consolidated/The details of the source systems to be consolidated.


How Turbodata helped lower the costs of developing a datawarehouse and helped the end clients do predictive analytics with quickness and ease-applicable for retail sales and inventory(Website: www.mnnbi.com)

Purpose of the development of the product: The Turbodata team intends to reduce the costs of the analytic solutions by creating a single platform for ETL, Reporting, Reporting development and predictive analytics. The team also intends to provide the best in class analytics on the same machine on which the ERP is running or with the addition of minimum hardware requirements for the end client. This has been done to develop scalable systems that can be deployed over a large number of customers(with limited budgets) with ease(deployment, delivery and usage) and convenience(maintenance).
The end goal is to increase derisking and predictability for the end clients at lower costs.




Methodology for achieving the required ends for the end client:
·         Turbodata adopted the Inmon methodology for datawarehouse development so that multiple data sources could be added onto the same datawarehouse. That is the change from one data source to another was done with ease. More details on the attached web page link: http://mnnbi.com/mnnbi_tallydataconsolidation.html


o   The benefits of the normalization of data were as follows:
§  The incremental data load took minimum time and had minimum impact on the source system. The ETL team was able to commit the incremental data load to a maximum of 2GB RAM from multiple source systems. The source systems did not hang with the incremental data load working.
§  Error handling was done with ease at staging layer.
§  Massive data compression took place due to reduced data redundancies.
§  The business logic was coded between staging and the ODS layers thereby reducing the length of the final sql code.
The attached video shows a more detailed description of the benefits listed above:
The joins were reduced in the data mart layer(over which a reporting layer was built).

The ETL team was able to develop extremely complex reports using the datawarehouse as in the attached sample:

Due to the data compression for most projects the ETL team are able to bring the data within 1 GB. Hence the desktop version of Microsoft Power BI could be used free of cost for the end client.

Reducing the cost of predictive analytics solutions
 Most of the end clients use high end predictive tools over the datawarehouse/ over the direct data extract from various source databases. With large datasets predictive analytics using in memory solutions entails high usage of RAM. The ETL team has gone around this issue in the following manner:
o   A seamless environment was created for ETL, reporting and thereafter predictive analytics on SQL/C# and .Net. The reasons for the same are attached herewith:
§  Maintenance becomes easier since there is a single platform for all aspects.
§  The cost comes down since the resources to be used for ETL can also be used for predictive analytics
§  Error handling becomes very easy since errors can be captured before in the


Hypothesis testing
Based on the hypothesis testing, the ETL team developed ARIMA analysis and Market Basket analysis in SQL using seamless integrated set of stored procedures. That is the ARIMA analysis flowed from the datawarehouse A,B,C categorization. The ETL team thus reduced the requirement for high end R and Python developers to code over the datawarehouse thereby presenting a seamless solution to the end client on a 8GB RAM machine.

Benefits to the end client:
·         The end client gets immediate and confirmed peace of mind and satisfaction through immediate deployment of predictive and forecasting analytics modules.
·         No additional hardware/software requirements need to be taken
·         The costs are way lower for the end client.
·         Large scale deployment is possible with the given set of solutions.
Please check the attached video for the same:
A more detailed video is attached herewith:


Example of predictive analytics with Turbodata: Example of predictive analytics-Turbodata


Epilogue
The ETL team has been inspired by the following management books:
·         ‘Profit Beyond Measure’ by Thomas Johnson and Anders Brohms
·         McKinsey Mind by Ethan Rasiel and Paul Friga.
·         Blue Ocean Strategy by W. Chan Kim and Renee Mauborgne

·         Better India, Better World by Narayana Murthy


Capturing Data Entry/Data Audit Errors


A number of times the end client types in wrong data into the source ERP system thereby resulting in wrong outputs and results. Junk inputs imply junk outputs.  The ETL team would recommend an auditable output from Turbodata to be used as part of the reporting purposes.  Wrong data inputs can impact the end client in one or more of the following ways:
  •        Wrong tax filing specifically in online scenario.
  •         Wrong business picture
  •         Wrong predictive analytics.
As per the Toyota ProductionSystem, bad inputs should not be processed further as it adds to the final costs.
The ETL team(my firm) has found the following errors with regards to the data entry inputs specifically with Tally ERP 9.0.  

·         Stock input has been in one godown but stock outward movement has been from other godowns:





·         Missing purchase or sales order entries resulting in negative stocks at given points in time. One cannot have negative stock balances at any point in time.



Other data input errors that we have commonly seen are as follows:

  •       Duplicate payment entries
  •      Duplicate sales entries
  •        Receipt note entries but no purchase invoice entries
  •         Payments not having the required bill reference numbers.
How to resolve the errors:
·         In an object oriented program it is difficult to catch the errors on a real time basis. The ETL team recommends using the relational databases for catching the errors. The real time extraction module for Turbodata should be used for the same.
·         Transferring the data onto the third normal database is recommended. This helps catch data duplicity based on the composite keys.
For example if an end client has made the same amount payment for a given voucher on a given fiscal date, then the same should come as part of the discrepancy report. It is possible that the end client could be correct. There is also a possibility that the payment entries have been made by 2 different resources. Further handling of the given situation is as follows:
·         If the end client desires to catch the following error then the username by which the data entries have been done shall not be added to the composite key. In such a scenario there is a discrepancy between the Turbodata ledger balance output and the Tally report. The end client to approve the discrepant entry before the data is input into the system for auditing purposes.
Using perpetual valuations for ledger and inventory instead of periodic valuations. For example if an end client relies on periodic valuations for ledger balances then a duplicate payment entry then the periodic balances at the end of the fiscal month are difficult to catch. For example if an end client has a duplicate entry of Rs. 100k(One hundred thousand  only) over a balance of say Rs. 15000k(One fifty million only).
However using the perpetual system it is easy to catch the data entry errors.

Matching the consolidated trial balances and closing stock balances at the database level with the on fly calculations at the software level.

A small story for the end user: as Yuval Harari is Sapians says that mankind is primarily driven by myths. Hence many a managers are driven by myths regarding software or the consulting companies having the right audit numbers(with the managers inputting junk numbers).
A small story from one of my favourite books(Raag Darbari by Srilal Shukla) could best illustrate the point.
The protagonist Ranganath had gone from the city to visit his relative, an aunt’s husband , in the village. During the course of the village fair, it was suggested that the group goes and sees the village temple for the local goddess. At the temple Ranganath found that the statue instead of been of a goddess was of a soldier( for a goddess he was looking for two lumps  in front and two lumps in the back). The priest asked for donations for the goddess. To this request Ranganath refused saying that the statue was not of a goddess but of a man. There was an ensuing scuffle between the villagers and Ranganath. Ranganath was eventually rescued by his cousin. On going out and meeting other people, the cousin mentioned the following:
"My cousin has come from the city and is very well read. That is why he talks like a fool."
The author has always associated himself with Ranganath.




Deployment of Turbodata for Retail company based out of Western India


Source system: multiple installation of Tally ERP 9.1.
Problem : The end client desired to have a custom installation of Turbodata based on the unique requirements of its business. The product shall be used for designing a custom web interface for customer interaction. The key tenets of the solution that differed from the standard deployment of turbodata were as follows:
·         Standard price list instead of weighted average or FIFO pricelist.
·         Closing stock value was to be calculated at a godown and item level.
·         The solution was to work across multiple locations seamlessly with maximum RAM usage been 1 GB for both initial and incremental data loads.
·         Custom masters extraction for item, stock group, category .
·         GST classification to be extracted for the end client.
Time duration of the project: 2 weeks.
Approach of the ETL team:
·         Choosing the appropriate set of attributes to be loaded based on the modular approach. That is the required fields to be loaded for ledger and inventory were chosen.
·         Custom extraction for the tables: The process of normalization helped in the same since the attribute is to be loaded only once.
·         Testing of initial and incremental data loads in terms of time and load on the system. The incremental data load process helped at reducing the time of data load.
·         Data cleansing: special characters were removed from the item names. Also separation of the numeric values from the character fields
·         Data consolidation: multiple types of voucher types were loaded onto the datawarehouse.

Project has been done successfully. Hereafter the end client shall go for a MVC interface over the datawarehouse for reporting and customer interaction purposes.


Facebook post: https://www.facebook.com/Consolidation-in-MS-Excel-2007-524999194181956/




Data insertion-Tally(approaches)

 Problem statement: Many softwares look to insert data into Tally from their application.This blog looks at issues and approaches for the sa...