META DATA: THE KEY TO DATA WAREHOUSE DESIGN
(ENSE623 PROJECT PROPOSAL)
ABSTRACT
"The data warehouse concept sprang from the growing competitive need to quickly analyze business information."
Typical relational databases, which were designed for on-line transactional processing (OLTP), do not meet the
requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed
differently than traditional relational databases. As the value of data warehouses and their associated OLAP
capabilities has increased, comprehensive meta data management has proven to be a vital element in the success or
failure of data warehouses. Without accurate meta data, a data warehouse rapidly becomes unmanageable and
ineffective.
Meta data is literally "data about data". It describes the kind of information in the warehouse, where it is
stored, how it relates to other information, where it comes from, and how it is related to the business. This
project intends to address the standardization of meta data across various products, and the application of a
systems engineering approach to that process in order to facilitate data warehouse design.
PROPOSAL
- What will the system do?
A data warehouse consolidates current and historical data from disparate operational systems (i.e., transactional
databases) into one system where data is cleansed and restructured to support data analysis. The meta data system
we intend to create will provide a framework to organize and design the data warehouse. It will be used initially
to help define the data warehouse requirements, and it will then be used iteratively during the life of the data
warehouse to update and integrate new dimensions of the warehouse.
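As an illustration only, one meta data record in such a framework might be sketched in Python as follows. The class
names, field names, and methods (MetaDataEntry, MetaDataCatalog, register, lookup, from_source) are assumptions made
for this sketch and are not part of the proposed system; the fields simply mirror the kinds of information meta data
is said to describe: what the data is, where it is stored, where it comes from, and how it relates to other data and
to the business.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetaDataEntry:
    """One hypothetical catalog record describing a warehouse data element."""
    name: str              # what the element is called in the warehouse
    description: str       # the kind of information it holds
    location: str          # where it is stored, e.g. "table.column"
    source: str            # operational system the data comes from
    business_meaning: str  # how the element relates to the business
    related: List[str] = field(default_factory=list)  # names of related elements


class MetaDataCatalog:
    """A minimal in-memory catalog (an assumption, not the proposed design)."""

    def __init__(self) -> None:
        self._entries: Dict[str, MetaDataEntry] = {}

    def register(self, entry: MetaDataEntry) -> None:
        # Re-registering a name overwrites the old record, which models
        # the iterative updates made over the life of the warehouse.
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> MetaDataEntry:
        return self._entries[name]

    def from_source(self, source: str) -> List[str]:
        # Answer "which warehouse elements come from this operational system?"
        return sorted(e.name for e in self._entries.values() if e.source == source)
```

An analyst could then register one entry per warehouse element and ask the catalog which elements originate in a
given operational system before relying on them in an analysis.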
- What are the systems issues and challenges in the project?
In order to be effective, the user of the data warehouse must have access to meta data that is accurate and up to
date. Without a good source of meta data to operate from, the job of the analyst is much more difficult and the work
required for analysis is compounded significantly.
Understanding the role of the enterprise data model relative to the data warehouse is critical to a successful
design and implementation of a data warehouse and its meta data management. Some specific challenges which involve
the physical design tradeoffs of a data warehouse include:
- Granularity of data - refers to the level of detail held in the unit of data.
- Partitioning of data - refers to the breakup of data into separate physical units that can be handled
independently.
- Performance issues
- Data structures inside the warehouse
- Migration
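The first two tradeoffs in the list above can be made concrete with a small Python sketch. The sample rows and the
function names (summarize_by_month, partition_by_year) are invented for illustration and do not come from the
project. Summarizing daily rows into monthly totals shows a coarser granularity, and splitting rows by year shows
partitioning into units that can be handled independently.

```python
from collections import defaultdict
from datetime import date

# Invented sample data: (transaction date, sale amount) at daily granularity.
sales = [
    (date(1997, 1, 5), 120.0),
    (date(1997, 1, 20), 80.0),
    (date(1997, 2, 3), 50.0),
    (date(1998, 1, 9), 200.0),
]


def summarize_by_month(rows):
    """Coarser granularity: one total per (year, month) instead of one row per sale."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)


def partition_by_year(rows):
    """Partitioning: split rows into separate units keyed by year."""
    parts = defaultdict(list)
    for day, amount in rows:
        parts[day.year].append((day, amount))
    return dict(parts)
```

The monthly summary takes less space and is faster to scan, but the individual sales can no longer be recovered from
it; the yearly partitions keep full detail while allowing each year to be loaded, indexed, or archived on its own.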
- Why is the system needed?
For years, data has accumulated in the corporate world as a by-product of transaction processing and other uses of
the computer. For the most part, that data has served only the immediate set of requirements for which it was
initially intended. The data warehouse presents an alternative in data processing that represents the integrated
information requirements of the enterprise. It is designed to support analytical processing for the entire business
organization. Analytical processing looks across parts of the enterprise, or the enterprise as a whole, and
identifies trends and patterns that are otherwise not apparent. These trends and patterns are vital to the vision of
management in directing the organization.
Meta data itself is arguably the most critical element in effective data management. Effective tools must be used
to make use of the meta data generated by the various systems.
- Who will use it?
The users of both the data warehouse and its associated meta data are at all levels of an organization: managers,
DSS (decision support system) analysts, programmer analysts, developers and planners.
- What other related work has been developed?
Many consulting firms are attempting to create automated meta data systems which will document meta data while a
data warehouse is being created. However, it remains to be seen whether or not this process can be completely
automated. An organization called the Metadata Coalition has been organized to standardize meta data and improve
the interchange between different systems as more and more data warehouses are created.
- Related Links:
AUTHORS:
Patrick Burton
Stephanie Green