|
|
Data Warehousing >
Tools >
Metadata Tool Selection
Buy vs. Build
Only in the rarest of cases does it make sense to build a metadata tool
from scratch. This is because doing so requires resources that are intimately
familiar with the operational, technical, and business aspects of the data
warehouse system, and such resources are difficult to come by. Even when such
resources are available, there are often other tasks that can provide more
value to the organization than to build a metadata tool from scratch.
In fact, the question is often whether any type of metadata tool is needed at all.
Although metadata plays an extremely important role in a successful data warehousing
implementation, this does not always mean that a tool is needed to keep all the
"data about data." It is possible to, say, keey such information in the
repository of other tools used, in a text documentation, or even in a presentation
or a spreadsheet.
Having said the above, though, it is author's believe that having a solid
metadata foundation is one of the keys to the success of a data warehousing
project. Therefore, even if a metadata tool is not selected at the beginning of
the project, it is essential to have a metadata strategy; that is, how metadata in
the data warehousing system will be stored.
Metadata Tool Functionalities
This is the most difficult tool to choose, because there is clearly no standard. In fact, it might be
better to call this a selection of the metadata strategy. Traditionally, people have put the data
modeling information into a tool such as ERWin and Oracle Designer, but it is difficult to extract
information out of such data modeling tools. For example, one of the goals for your metadata selection
is to provide information to the end users. Clearly this is a difficult task with a data modeling tool.
So typically what is likely to happen is that additional efforts are spent to create a layer of
metadata that is aimed at the end users. While this allows the end users to gain the required insight
into what the data and reports they are looking at means, it is clearly inefficient because all that
information already resides somewhere in the data warehouse system, whether it be the ETL tool, the
data modeling tool, the OLAP tool, or the reporting tool.
There are efforts among data warehousing tool vendors to unify on a metadata model. In June of 2000,
the OMG released a metadata standard called
CWM (Common Warehouse Metamodel), and some of the vendors
such as Oracle have claimed to have implemented it. This standard incorporates the latest technology
such as XML, UML, and SOAP, and, if accepted widely, is truly the best thing that can happen to the
data warehousing industry. As of right now, though, the author has not really seen that many tools
leveraging this standard, so clearly it has not quite caught on yet.
So what does this mean about your metadata efforts? In the absence of everything else, I would
recommend that whatever tool you choose for your metadata support supports XML, and that whatever other
tool that needs to leverage the metadata also supports XML. Then it is a matter of defining your DTD
across your data warehousing system. At the same time, there is no need to worry about criteria that
typically is important for the other tools such as performance and support for parallelism because the
size of the metadata is typically small relative to the size of the data warehouse.
Copyright © 2012 1keydata.com All Rights Reserved Privacy Policy
|