What is a Data Warehouse?

A data warehouse is a collection of data designed to support management decision making. Its primary goals are to provide access to an organization's data, keep that data consistent, allow data to be separated and combined, include a set of tools to query, analyze, and present information, publish user data, and drive business re-engineering.

A data warehouse essentially combines information from several sources into one comprehensive database.

For example, in the business world, a data warehouse might incorporate customer information from a company's point-of-sale systems (the cash registers), its website, its mailing lists, and its comment cards. Alternatively, it might incorporate all the information about employees, including time cards, demographic data, and salary information.

By combining all of this information in one place, a company can analyze its customers in a more holistic  way, ensuring that it has considered all the information available.

Data warehousing also makes data mining possible. Data mining is the task of looking for patterns in the data that could lead to higher sales and profits.

The collection of data used by a data warehouse may be characterized as subject-oriented, integrated, time-variant, and non-volatile.

  1. Subject Oriented 
  2. Integrated
  3. Time Variant
  4. Non-Volatile 
  5. Multiple Sources  
  6. Diverse Sources 
  7. Diverse Formats  

1. Subject Oriented 

Data is arranged and optimized to answer questions from diverse functional areas.

Data is organized and summarized by topic, such as sales, marketing, finance, or distribution.
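
As a rough illustration of organizing and summarizing data by subject, the sketch below totals a handful of made-up transactions per subject area. The record layout, the field names `subject` and `amount`, and the function `summarize_by_subject` are assumptions chosen for this example, not part of any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical transaction records; field names are assumptions for illustration.
transactions = [
    {"subject": "Sales", "amount": 1200.0},
    {"subject": "Marketing", "amount": 300.0},
    {"subject": "Sales", "amount": 800.0},
    {"subject": "Finance", "amount": 150.0},
]


def summarize_by_subject(rows):
    """Return the total amount per subject area."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["subject"]] += row["amount"]
    return dict(totals)


summary = summarize_by_subject(transactions)
for subject, total in summary.items():
    print(subject, total)
```

A real warehouse would typically do this kind of grouping in its query layer, but the idea is the same: the subject, not the originating application, is the organizing key.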

2. Integrated

The data warehouse is a centralized, consolidated database that integrates data derived from the entire organization.
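
One way to picture integration is to merge customer records from several operational sources into a single record per customer. The sketch below is only a minimal illustration: the source names, the `customer_id` key, the sample values, and the "first value seen wins" rule are all assumptions, and real integration usually involves much more cleaning and conflict resolution.

```python
def integrate(sources):
    """Merge per-source records into one consolidated record per customer id."""
    merged = {}
    for source_name, records in sources.items():
        for record in records:
            customer = merged.setdefault(record["customer_id"], {"sources": []})
            customer["sources"].append(source_name)
            for key, value in record.items():
                # Keep the first value seen for each field (an assumed rule).
                customer.setdefault(key, value)
    return merged


# Hypothetical extracts from three operational systems.
point_of_sale = [{"customer_id": 1, "last_purchase": "2024-01-05"}]
website = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "b@example.com"},
]
mailing_list = [{"customer_id": 2, "postal_code": "90210"}]

warehouse = integrate(
    {
        "point_of_sale": point_of_sale,
        "website": website,
        "mailing_list": mailing_list,
    }
)
print(warehouse[1])
```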

3. Time Variant 

The data warehouse represents the flow of data through time. It can also contain projected data generated by statistical models. Data is uploaded periodically, and time-dependent data is then recomputed.
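
The periodic load followed by recomputation can be sketched as below. Each load appends rows stamped with a load date, and a time-dependent figure (here a running total) is recomputed from all loads so far. The table layout, the product name, and the functions `load_snapshot` and `running_totals` are hypothetical and used for illustration only.

```python
from datetime import date

# Hypothetical warehouse table: one row per (load date, product).
snapshots = []


def load_snapshot(as_of, units_by_product):
    """Append a dated snapshot; earlier snapshots are left untouched."""
    for product, units in units_by_product.items():
        snapshots.append({"as_of": as_of, "product": product, "units": units})


def running_totals(product):
    """Recompute cumulative units for a product across all load dates."""
    total = 0
    totals = {}
    for row in sorted(snapshots, key=lambda r: r["as_of"]):
        if row["product"] == product:
            total += row["units"]
            totals[row["as_of"]] = total
    return totals


load_snapshot(date(2024, 1, 31), {"widget": 100})
load_snapshot(date(2024, 2, 29), {"widget": 130})
widget_totals = running_totals("widget")
print(widget_totals)
```

Because every row carries its load date, the same data can answer both "what was sold in February" and "what had been sold by the end of February".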

4. Non-Volatile 

Once data is entered, it is never removed. The warehouse represents the company's entire history, and near-term history is continually added to it.

It is always growing and must support terabyte-scale databases and multiprocessor systems. It is a read-only database used for data analysis and query processing.
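
The non-volatile, read-only behaviour can be sketched as a store that exposes only append and query operations. The class `AppendOnlyStore` and its methods are hypothetical names for this illustration. A production warehouse would enforce the same rule through its loading process and access permissions rather than a Python class.

```python
class AppendOnlyStore:
    """Minimal sketch of a non-volatile store: rows are added, never changed."""

    def __init__(self):
        self._rows = []

    def append(self, row):
        # Store a copy so later changes by the caller do not alter history.
        self._rows.append(dict(row))

    def query(self, predicate):
        # Return copies so analysts cannot modify stored rows through results.
        return [dict(row) for row in self._rows if predicate(row)]

    def __len__(self):
        return len(self._rows)


store = AppendOnlyStore()
store.append({"year": 2023, "revenue": 500})
store.append({"year": 2024, "revenue": 650})

recent = store.query(lambda row: row["year"] >= 2024)
recent[0]["revenue"] = 0  # modifies only the returned copy
print(store.query(lambda row: row["year"] == 2024))
```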

Data Warehouse Architecture 

The main benefits of implementing a data warehouse include cost-effective decision making, better business intelligence, enhanced customer service, business re-engineering, and information system re-engineering.