Tuesday, April 1, 2008

What Are Data Warehouses? Looking Beyond The Obvious Definition

During my research on relational databases and database privacy issues, I kept getting search results for data warehouses and business intelligence. When I searched for the term, some results gave me more technical information than I could digest. I eventually used other resources to get a basic but official definition, but before that I imagined that this was an SAT exam and chose to rely on context clues in the search result abstracts and my basic eyeballing of the words. Normally, I would call upon skills I acquired from taking two years of Latin in high school to decompose the words, but this seemed simple enough. So, I gathered that a data warehouse is basically an information inventory (data) that is physically stored in a virtual warehouse. Although this inventory of data may be physically stored on computers and their hard drives, these physical storage systems are mobile, unlike a physical warehouse of more tangible inventory. Relocating an entire warehouse of data is more feasible, via data transmission technologies, than moving the tangible goods traditionally stored in warehouses. It sounds simple enough, although somewhat contradictory.

Now, the following is the official definition that I got from Wikipedia, which seemed reasonable enough to prevent an instant headache.

Data Warehouses:
"A 'data warehouse' is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. [1]

This classic definition of the data warehouse focuses on data storage. However, the means to retrieve and analyze data, to extract, transform and load data, and to manage dictionary data are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. An expanded definition for data warehousing includes tools for business intelligence, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata (Wikipedia.com 02-Apr-08)."


After reading a simplified but official definition, I realized the limitations of my own definition of data warehouses. Those limitations had to do with the purpose and implementation of data warehouses. I found an article called "The Case for Data Warehousing" (Greenfield 1) that helped me understand what data warehouses are and why they are implemented.

What I gained from this article was that data warehouses can be massive stores of data, including data about data, or metadata: data definitions kept in a database that represent a model for how the data is to be used. The warehouse is kept separate from the operational systems so that data retrieval is faster. Basically, when companies decide to implement data warehouses, they do so in hopes of improving the integrity, accuracy, and consistency of their data and minimizing the time required for processing database transactions. Overall, it is about database optimization.
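
The "extract, transform, and load" idea from the Wikipedia definition can be sketched in a few lines of Python. This is only a toy illustration under assumptions of my own, not anything taken from the article: the table names (`orders`, `sales_by_day`), the columns, and the sample rows are all made up, and two in-memory SQLite databases stand in for an operational system and a separate warehouse.

```python
import sqlite3


def build_operational_db():
    """Create a toy operational database holding a few hypothetical orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, order_date TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (order_date, amount_cents) VALUES (?, ?)",
        [("2008-04-01", 1250), ("2008-04-01", 750), ("2008-04-02", 3000)],
    )
    conn.commit()
    return conn


def extract(source):
    """Extract: read the raw rows out of the operational system."""
    return source.execute(
        "SELECT order_date, amount_cents FROM orders"
    ).fetchall()


def transform(rows):
    """Transform: summarize individual orders into one total per day."""
    totals = {}
    for order_date, amount_cents in rows:
        totals[order_date] = totals.get(order_date, 0) + amount_cents
    return sorted(totals.items())


def load(warehouse, daily_totals):
    """Load: write the summarized rows into the separate warehouse."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_day ("
        "order_date TEXT PRIMARY KEY, total_cents INTEGER)"
    )
    warehouse.executemany(
        "INSERT OR REPLACE INTO sales_by_day VALUES (?, ?)", daily_totals
    )
    warehouse.commit()


def run_etl():
    """Run the three steps and return what a report would read back."""
    source = build_operational_db()
    warehouse = sqlite3.connect(":memory:")  # kept apart from the source
    load(warehouse, transform(extract(source)))
    return warehouse.execute(
        "SELECT order_date, total_cents FROM sales_by_day ORDER BY order_date"
    ).fetchall()


if __name__ == "__main__":
    for day, total in run_etl():
        print(day, total)
```

The point of the sketch is the separation: reporting reads from `sales_by_day` in the warehouse, so it never has to touch the `orders` table that the day-to-day system is busy updating.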

Most companies that decide to implement a data warehouse do so to optimize the overall performance of their business by optimizing the management of data critical to that business, or what is mostly considered business intelligence. I gathered from the article that a data warehouse can have serious implications for a business if it is not executed successfully, and that rings especially true when I think about data modeling. I recall just how important that task was when I had to perform it for information systems analysis and design. Each piece of data coming in and going out has to be processed, stored, and managed properly, or the whole system of processes becomes inefficient.

Still, I remain limited in my overall knowledge of the subject. Therefore, I will simply say that I get the gist of what data warehouses are and their significance to business. I think that I may also have some good examples of data warehouses, but I have no idea how well they are being optimized. Perhaps I can research two very popular ones (e.g., the Social Security Administration and the credit bureaus) to see what I can discover about how the data is being managed to create business intelligence.



Referencing Article:
http://www.dwinfocenter.org/casefor.html
