Consolidation of MS Excel spreadsheets

Name: Apoorv Chaturvedi
Phone: +91-8802466356

Most of the businesses use MS Excel spreadsheets for data capture, data reporting. A common set of problems that arise during spreadsheet consolidation are as follows:
  •       Data consolidation takes times
  •       Manual process.
  •        Error prone: auditing of data is an issue.
  •        Inflexibility of  analysis: the end clients do not get graphs and dashboards
  •        The process is very time consuming

Turbodata helps resolve the problem by taking the following approach.
Turbodata has readymade datawarehouses associated with inventory, ledger where the on fly calculations are matching with a standard accounting package such as Tally ERP. The product has been designed using the philosophy of Goldratt, Deming and Toyota Production system. There are readymade extractors using C#,.Net for extracting data from ERPs such as Tally, SAP, Navision and even custom spreadsheets.
How does Turbodata help solve the problem?
  •        The input spreadsheets could be imported in an xml format into a shared folder. The condition is that the input spreadsheet formats should not change for the end client.
  •        The xml inputs are then imported to the database from wherein the reporting takes place.

Why use Turbodata?

  •       Turbodata has a readymade datawarehouse for ledger and inventory with built in on fly calculations.  Turbodata inventory and Turbodata ledger have readymade data with perpetual valuations to re generate any type of MS Excel report.
  •        The data extraction shall take place in such a way that minimal RAM is used. This shall help the end client in the following ways:
  •         Multiple end users can update the spreadsheet data while data extraction is happening without any problems
  •           The source machine shall not be stuck.
  •        The solution shall be plug and play and very easy to use.
  •        The data audit step of matching the output with the source table entries shall be part of the solution
  •        The end client can get dashboards using a Business Intelligence tool of its choice. If the consolidated data is below 1 GB then Microsoft PowerBI can be used for getting best in class dashboards
  •        Data entry and data validation shall be designed as part of the solution.
  •     Lower cost of the solution through data compression and sql reduction.
  •        Reduced manual data entry time and quick turnaround times for development of the final reports.

Next steps:
The end client could send the following to the above email address:
·       Sample output spreadsheet to be automated
·       Sample input spreadsheets to be consolidated/The details of the source systems to be consolidated.

How Turbodata helped lower the costs of developing a datawarehouse and helped the end clients do predictive analytics with quickness and ease-applicable for retail sales and inventory(Website:

Purpose of the development of the product: The Turbodata team intends to reduce the costs of the analytic solutions by creating a single platform for ETL, Reporting, Reporting development and predictive analytics. The team also intends to provide the best in class analytics on the same machine on which the ERP is running or with the addition of minimum hardware requirements for the end client. This has been done to develop scalable systems that can be deployed over a large number of customers(with limited budgets) with ease(deployment, delivery and usage) and convenience(maintenance).
The end goal is to increase derisking and predictability for the end clients at lower costs.

Methodology for achieving the required ends for the end client:
·         Turbodata adopted the Inmon methodology for datawarehouse development so that multiple data sources could be added onto the same datawarehouse. That is the change from one data source to another was done with ease. More details on the attached web page link:

o   The benefits of the normalization of data were as follows:
§  The incremental data load took minimum time and had minimum impact on the source system. The ETL team was able to commit the incremental data load to a maximum of 2GB RAM from multiple source systems. The source systems did not hang with the incremental data load working.
§  Error handling was done with ease at staging layer.
§  Massive data compression took place due to reduced data redundancies.
§  The business logic was coded between staging and the ODS layers thereby reducing the length of the final sql code.
The attached video shows a more detailed description of the benefits listed above:
The joins were reduced in the data mart layer(over which a reporting layer was built).

The ETL team was able to develop extremely complex reports using the datawarehouse as in the attached sample:

Due to the data compression for most projects the ETL team are able to bring the data within 1 GB. Hence the desktop version of Microsoft Power BI could be used free of cost for the end client.

Reducing the cost of predictive analytics solutions
 Most of the end clients use high end predictive tools over the datawarehouse/ over the direct data extract from various source databases. With large datasets predictive analytics using in memory solutions entails high usage of RAM. The ETL team has gone around this issue in the following manner:
o   A seamless environment was created for ETL, reporting and thereafter predictive analytics on SQL/C# and .Net. The reasons for the same are attached herewith:
§  Maintenance becomes easier since there is a single platform for all aspects.
§  The cost comes down since the resources to be used for ETL can also be used for predictive analytics
§  Error handling becomes very easy since errors can be captured before in the

Hypothesis testing
Based on the hypothesis testing, the ETL team developed ARIMA analysis and Market Basket analysis in SQL using seamless integrated set of stored procedures. That is the ARIMA analysis flowed from the datawarehouse A,B,C categorization. The ETL team thus reduced the requirement for high end R and Python developers to code over the datawarehouse thereby presenting a seamless solution to the end client on a 8GB RAM machine.

Benefits to the end client:
·         The end client gets immediate and confirmed peace of mind and satisfaction through immediate deployment of predictive and forecasting analytics modules.
·         No additional hardware/software requirements need to be taken
·         The costs are way lower for the end client.
·         Large scale deployment is possible with the given set of solutions.
Please check the attached video for the same:
A more detailed video is attached herewith:

Example of predictive analytics with Turbodata: Example of predictive analytics-Turbodata

The ETL team has been inspired by the following management books:
·         ‘Profit Beyond Measure’ by Thomas Johnson and Anders Brohms
·         McKinsey Mind by Ethan Rasiel and Paul Friga.
·         Blue Ocean Strategy by W. Chan Kim and Renee Mauborgne

·         Better India, Better World by Narayana Murthy

Capturing Data Entry/Data Audit Errors

A number of times the end client types in wrong data into the source ERP system thereby resulting in wrong outputs and results. Junk inputs imply junk outputs.  The ETL team would recommend an auditable output from Turbodata to be used as part of the reporting purposes.  Wrong data inputs can impact the end client in one or more of the following ways:
  •        Wrong tax filing specifically in online scenario.
  •         Wrong business picture
  •         Wrong predictive analytics.
As per the Toyota ProductionSystem, bad inputs should not be processed further as it adds to the final costs.
The ETL team(my firm) has found the following errors with regards to the data entry inputs specifically with Tally ERP 9.0.  

·         Stock input has been in one godown but stock outward movement has been from other godowns:

·         Missing purchase or sales order entries resulting in negative stocks at given points in time. One cannot have negative stock balances at any point in time.

Other data input errors that we have commonly seen are as follows:

  •       Duplicate payment entries
  •      Duplicate sales entries
  •        Receipt note entries but no purchase invoice entries
  •         Payments not having the required bill reference numbers.
How to resolve the errors:
·         In an object oriented program it is difficult to catch the errors on a real time basis. The ETL team recommends using the relational databases for catching the errors. The real time extraction module for Turbodata should be used for the same.
·         Transferring the data onto the third normal database is recommended. This helps catch data duplicity based on the composite keys.
For example if an end client has made the same amount payment for a given voucher on a given fiscal date, then the same should come as part of the discrepancy report. It is possible that the end client could be correct. There is also a possibility that the payment entries have been made by 2 different resources. Further handling of the given situation is as follows:
·         If the end client desires to catch the following error then the username by which the data entries have been done shall not be added to the composite key. In such a scenario there is a discrepancy between the Turbodata ledger balance output and the Tally report. The end client to approve the discrepant entry before the data is input into the system for auditing purposes.
Using perpetual valuations for ledger and inventory instead of periodic valuations. For example if an end client relies on periodic valuations for ledger balances then a duplicate payment entry then the periodic balances at the end of the fiscal month are difficult to catch. For example if an end client has a duplicate entry of Rs. 100k(One hundred thousand  only) over a balance of say Rs. 15000k(One fifty million only).
However using the perpetual system it is easy to catch the data entry errors.

Matching the consolidated trial balances and closing stock balances at the database level with the on fly calculations at the software level.

A small story for the end user: as Yuval Harari is Sapians says that mankind is primarily driven by myths. Hence many a managers are driven by myths regarding software or the consulting companies having the right audit numbers(with the managers inputting junk numbers).
A small story from one of my favourite books(Raag Darbari by Srilal Shukla) could best illustrate the point.
The protagonist Ranganath had gone from the city to visit his relative, an aunt’s husband , in the village. During the course of the village fair, it was suggested that the group goes and sees the village temple for the local goddess. At the temple Ranganath found that the statue instead of been of a goddess was of a soldier( for a goddess he was looking for two lumps  in front and two lumps in the back). The priest asked for donations for the goddess. To this request Ranganath refused saying that the statue was not of a goddess but of a man. There was an ensuing scuffle between the villagers and Ranganath. Ranganath was eventually rescued by his cousin. On going out and meeting other people, the cousin mentioned the following:
"My cousin has come from the city and is very well read. That is why he talks like a fool."
The author has always associated himself with Ranganath.

Deployment of Turbodata for Retail company based out of Western India

Source system: multiple installation of Tally ERP 9.1.
Problem : The end client desired to have a custom installation of Turbodata based on the unique requirements of its business. The product shall be used for designing a custom web interface for customer interaction. The key tenets of the solution that differed from the standard deployment of turbodata were as follows:
·         Standard price list instead of weighted average or FIFO pricelist.
·         Closing stock value was to be calculated at a godown and item level.
·         The solution was to work across multiple locations seamlessly with maximum RAM usage been 1 GB for both initial and incremental data loads.
·         Custom masters extraction for item, stock group, category .
·         GST classification to be extracted for the end client.
Time duration of the project: 2 weeks.
Approach of the ETL team:
·         Choosing the appropriate set of attributes to be loaded based on the modular approach. That is the required fields to be loaded for ledger and inventory were chosen.
·         Custom extraction for the tables: The process of normalization helped in the same since the attribute is to be loaded only once.
·         Testing of initial and incremental data loads in terms of time and load on the system. The incremental data load process helped at reducing the time of data load.
·         Data cleansing: special characters were removed from the item names. Also separation of the numeric values from the character fields
·         Data consolidation: multiple types of voucher types were loaded onto the datawarehouse.

Project has been done successfully. Hereafter the end client shall go for a MVC interface over the datawarehouse for reporting and customer interaction purposes.

Facebook post:

Popular posts from this blog

Inventory optimization through Turbodata

Capturing Data Entry errors for audit purposes

Predictive analytics and forecasting at lower costs through data compression and sql reduction