One thing which has struck me for some time is the degree of overlap between the master data management (MDM) and data quality markets. Certainly every MDM project has a DQ component, although the converse may not be true. Many of the MDM platform vendors have some form of data quality component, often achieved by an OEM or partnership with one or more data quality vendors. A recent example of this is Kalido's OEM of the Netrics matching engine. Such vendors have started at the "top" of the functionality stack and are gradually filling out some of the lower levels.
It is unusual, though, to see a data quality vendor moving up the stack into the MDM market. Global IDs is an interesting company which, although founded in 2001, has taken some time to become very visible in the market. The New York-based company has concentrated on building up a wide-ranging set of technology, initially in data analysis and profiling, but gradually extending this up the stack. It has a small but impressive customer list, and its products are used by companies with very large and complex data management problems, such as D&B, Merck, McGraw Hill and Nomura. The highly parallel architecture of the product, based on "agents", is designed for high-volume operation, where performance problems can be addressed efficiently by adding hardware.
In a recent demonstration I was particularly impressed with the depth of functionality in their data analysis tools. They work with companies by identifying all of the data entities in the application landscape, which usually number in the hundreds of thousands or more. This is done through a series of "scanners" for various data sources. Advanced profiling is then done based on this analysis, and goes much further into relationship discovery than basic statistical profiling. They then work to identify the semantic meaning associated with the data, allowing it to be grouped into logical buckets, and the tool can construct a tentative logical model from this initial phase. The vendor provides further tools to allow rules to be developed which can aid in the de-duplication and integration of this data.
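To illustrate the difference between basic statistical profiling and relationship discovery, here is a minimal sketch in Python. It is not Global IDs' implementation, which I have not seen at code level; the function names `profile_column` and `candidate_relationships`, and the sample tables, are my own hypothetical choices. The sketch assumes a relationship can be suggested by a simple inclusion test: every non-null value in one column also appears in a unique column of another table.

```python
from itertools import permutations


def profile_column(values):
    """Basic statistical profile of a single column (hypothetical helper)."""
    non_null = [v for v in values if v is not None]
    distinct = set(non_null)
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(distinct),
        # A column is a key candidate if every non-null value occurs once.
        "is_unique": len(non_null) > 0 and len(distinct) == len(non_null),
    }


def candidate_relationships(tables):
    """Suggest (child, parent) column pairs that look like foreign keys.

    tables maps table name -> {column name -> list of values}.
    Assumption: a pair is a candidate when the parent column is unique and
    the child's non-null values are a subset of the parent's values.
    """
    columns = {
        (table, column): values
        for table, cols in tables.items()
        for column, values in cols.items()
    }
    value_sets = {
        key: {v for v in values if v is not None}
        for key, values in columns.items()
    }
    profiles = {key: profile_column(values) for key, values in columns.items()}

    found = []
    for child, parent in permutations(columns, 2):
        if child[0] == parent[0]:
            continue  # only look across tables
        if not profiles[parent]["is_unique"]:
            continue  # the parent side must look like a key
        if value_sets[child] and value_sets[child] <= value_sets[parent]:
            found.append((child, parent))
    return found


# Made-up sample data, purely for illustration.
tables = {
    "customers": {"id": [1, 2, 3, 4], "name": ["Ann", "Bob", "Cy", "Di"]},
    "orders": {
        "order_id": [10, 11, 12],
        "customer_id": [1, 1, 3],
        "amount": [5.0, 7.5, 5.0],
    },
}

for child, parent in candidate_relationships(tables):
    print(f"{child[0]}.{child[1]} -> {parent[0]}.{parent[1]}")
```

A pairwise comparison like this would not scale to hundreds of thousands of entities without sampling, hashing or parallel execution, which is presumably where an agent-based architecture earns its keep.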
The toolset is ambitious, covering the full gamut from data movement right up to a master data repository and data governance. In some cases they sensibly OEM best-of-breed solutions, such as the Netrics matching engine, and the product can co-exist with existing infrastructure such as Informatica. Although the company is known mainly for its advanced data analysis and profiling technology, the toolset does extend right through the spectrum of MDM, though some parts are clearly more mature than others. The company should do more to publicise what seems like an interesting and ambitious piece of technology.
It is encouraging to see more entrants to the master data management space, and this one is evidence of the increasing convergence between MDM and data quality. Although these are currently treated as two distinct markets, more links are appearing as organisations realise that data quality is an inherent and critical part of any MDM project.