IT-Analysis.com
Reference data, Master data and conundrums
By: Philip Howard, Research Director - Data Management, Bloor Research
Published: 31st August 2006
Copyright Bloor Research © 2006

This blog is about conundrums. Or should that be conundra? I guess it depends on the derivation: if it's from Latin then it should be conundra but otherwise not. Unfortunately, my dictionary says that the origin has been lost, which doesn't help. It does, however, say that the original meaning of the word was a crotchet monger. What's a crotchet monger? Presumably somebody who sells crotchets. But what's a crotchet? An angry old man? A musical note? Further research suggests that a crotchet was a whimsical device. And talking of whimsical devices, did you see the story about the CEO of the Irish IT company that claims to have invented a perpetual motion machine? There's a joke there somewhere but not a conundrum.

And talking about conundrums, here's one: I have been re-reading (for reasons we need not go into) some sixties science fiction and have found something odd. The authors include Clifford Simak, Poul Anderson, Michael Moorcock (“The Black Corridor” not the fantasy Elric stuff) and Robert Heinlein so we're not talking about lightweights here. Between them they predict faster-than-light travel, portable atomic devices, wall-sized TVs, microwave ovens, voice-controlled devices, various computer-based automation and sundry other gadgets. But they have one thing in common: they all failed (at least in the books I have re-read to date) to predict the use of the screen as an output device for a computer—instead, they all refer to paper output only. Indeed, a couple of them still hint at punched tape.

So, the conundrum is this: why the blind spot? It shouldn't have been too difficult to marry the concept of the TV with the computer. I recall a conversation in the late seventies or early eighties about the possibility of downloading music via the phone onto a computer, which seems roughly comparable. I think it would be a useful area to analyse: which potential inventions can we envisage easily, and which do we find difficult? If we knew that, we could spend time looking at the latter and perhaps get better at it.

Anyway, enough of that. Here's another conundrum or, at least, an issue: what is the difference between reference data and master data, and does it matter? I think I can assume that you know what master data looks like, so here are some examples of reference data. One classic is the definition of “profit”. I once ran across a company that, when it looked into the issue, found that it had 14 different definitions of profit in use across the organisation. Some of these were valid (profit before tax and after, and definitions dependent upon local laws) but many were not. For obvious reasons, any organisation needs to be absolutely clear about how it defines such things as profit. So, you create a reference definition that can be accessed by any corporate application, as required. Another, and rather different, example of reference data might be the dimensions of a loading bay. The point is that these dimensions define the loading bay in much the same way that “sales less costs” defines (simplistically) profit.

This, then, is the difference between reference data and master data: reference data is definitional whereas master data is descriptive. Reference data changes rarely, if at all, and is typically (but not always) non-numeric. Master data, on the other hand, is usually changeable and often expressed as a value.
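For those who think better in code, here is a minimal sketch of that distinction. It is only an illustration: the class names, the customer record and its values are all invented for the purpose, and a frozen dataclass is just one way of expressing "rarely changes".

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class ReferenceDefinition:
    """Definitional: says what a term means and rarely, if ever, changes."""
    term: str
    definition: str


@dataclass
class CustomerRecord:
    """Descriptive: a hypothetical master data record with changeable values."""
    customer_id: int
    name: str
    credit_limit: float


# The simplistic definition of profit used in the text.
PROFIT = ReferenceDefinition("profit", "sales less costs")

# Invented master data, purely for illustration.
customer = CustomerRecord(1, "Example Ltd", 1000.0)

# Master data is expected to change over time.
customer.credit_limit = 2500.0

# Reference data is not: a frozen dataclass rejects reassignment.
try:
    PROFIT.definition = "something else"
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

print(customer.credit_limit, mutation_blocked)
```

The frozen class is a design choice rather than a requirement; the point is only that the definition is looked up, not updated, by the applications that use it.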

Finally, as to whether the difference matters: that depends. Clearly, you could treat reference data as a specialised subset of master data (or, actually, the reverse) but you can also treat them separately. In particular, because reference data hardly ever changes you can simply store it in a file, which can be accessed by relevant applications via a web service. If you use a database table then you could replicate the reference data around the organisation. In either case, the deployment of reference data is not as complex as that of master data (you don't need synchronisation capabilities, for example), particularly if you prefer a hub-based approach to the latter, and it is therefore cheaper and simpler.
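The file-based option can be sketched in a few lines. This is an assumption-laden illustration rather than a recommended design: the file name, its JSON layout and the `lookup` function are all hypothetical, and the web service itself is left out, with `lookup` standing in for whatever an endpoint would return.

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path
from types import MappingProxyType

# Hypothetical location; a real deployment would use a shared, agreed path.
REFERENCE_FILE = Path(tempfile.gettempdir()) / "reference_data_example.json"


def write_example_file(path: Path) -> None:
    """Create an illustrative reference file so the sketch runs on its own."""
    path.write_text(json.dumps({"profit": "sales less costs"}), encoding="utf-8")


@lru_cache(maxsize=1)
def load_reference_data(path: Path = REFERENCE_FILE) -> MappingProxyType:
    """Read the file once and expose it as a read-only mapping."""
    data = json.loads(path.read_text(encoding="utf-8"))
    return MappingProxyType(data)


def lookup(term: str) -> str:
    """Stand-in for what a web service endpoint might return for a term."""
    return load_reference_data()[term]


write_example_file(REFERENCE_FILE)
print(lookup("profit"))
```

Because the data hardly ever changes, reading it once and caching it is enough here; nothing in the sketch needs the synchronisation machinery that master data would.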

Reader Comments

8th May 2007: 'J. B. Ryan' said:

Your description of the definition of Profit sounds like classic Metadata not reference data. I interpret reference data as simply code description where the code itself is an attribute that is associated with a master file key value. For example a customer with an id of 12345 is associated with a business type id of 2120. The customer id and business type association are master file components. The descriptions of the customer id 12345 as WalMart and the business type id 2020 as Mass Merchandiser is reference data.

Published by: IT Analysis Communications Ltd.