Thursday, January 15, 2015

What is a Data Warehouse?

A Data Warehouse is, generally, a large repository of structured historical data. This definition was carefully constructed because there are two prevailing and competing views of data warehouses, and I wanted the initial definition to cover both of them. To understand the distinction, consider the following anecdote.

Several years ago, when I was teaching a course on data warehousing, a student in our program who was a data warehousing practitioner at the time came into my office and said,

"I hear you are teaching a course on data warehousing."

"Yes," I replied, "are you interested in taking it?"

"Well, I wanted to know," she continued, "are you an Inmonite or a Kimballite?"

Her question, coming straight from a practitioner, captures the difference.

Bill Inmon offers a view of data warehousing as a large repository of historical data derived from source transaction processing systems. This historical data can be analyzed and studied in support of important business decisions.

Ralph Kimball, on the other hand, sees the data warehouse as a collection of historical data designed and collected to model measurable business processes.

Most people involved in data warehousing adhere to either the Inmon view or the Kimball view. Many do so without realizing it.

In the next few posts, I will elaborate on the differences.
