Many personnel believe that datawarehouse is simply a copy of the trasactional system. However there are some key differences between the transaction system and datawarehouse.
a.) Datawarehouse can be used for consolidation of data across multiple data sources. Datawarehouse can have two layers, one an ODS[operational data store] and the second is star schema layer[denormalization layer to minimize the joins]
Why is the normalization layer required?
a.) Normalization layer helps consolidate the data from multiple data sources in a single database layer.
b.) Used for incremental data loads. One does not need to do a full load after normalization layer. Only inserts, updates and deletes will do.
c.) The on fly calculations of the software are done in the normalized layer.
Thus the ODS layer is different from the transaction layer.
Why not simply use the transaction system code?
a.) The transaction system copy will have only the base data
b.) The on fly calculations of the software will not be there in the copy of the transaction system. The business logic would need to be worked out.
The purpose of star schema.
The purpose of the star schema is to reduce the joins. Lower the number of joins the faster shall be the speed of query processing.
What about snowflake schema? snowflake schema is normalization layer. It is done when the dimension tables are very large.
Where do we start?
Understand what is a datawarehouse and see if it fits into your scheme of things.
Name: Apoorv Chaturvedi
email: support@mndatasolutions.com;support@turbodatatool.com
Phone:+91-8802466356
No comments:
Post a Comment