At its recent user conference, Netezza unveiled the Netezza Developer Network (NDN). What is it all about? Well, frankly, I have been waiting to write this article for a year, ever since Netezza described (confidentially) its development roadmap.
The NDN builds on the introduction of support for user-defined functions (UDFs) and user-defined aggregates (UDAs). I am not going to talk about them in detail because no NDN products have actually been released yet (it shouldn't be too long), but hopefully I can explain why they are important.
UDFs are not, of course, a new concept: various vendors support them. However, whereas elsewhere they are typically features of the database, in the case of Netezza UDFs and UDAs are embedded within the SPU (snippet processing unit) that takes streamed data off disk. I'd better explain what that means.
Typically, your BI application sits on an application server in front of the data warehouse. When you initiate a query, the warehouse extracts the relevant data and passes it back to the application on that server, where you can then process the data to do clever things with it. With a UDF you don't do that: you process the data on the SPU, so that all the application server has to do is organise the visualisation and presentation of the data. A good analogy is the trend in data mining over the last few years to embed scoring within the database. One of the main advantages of this is that it is much faster to score against algorithms within the database than to extract the information and then score it. What Netezza is doing is, in effect, going a step further: scoring the data as it is streamed into the appliance, before it even hits the database. So UDFs and UDAs should provide a further step change in performance: in practice, according to a number of the companies doing it, an order-of-magnitude improvement.
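To make the contrast concrete, here is a minimal sketch in Python. It is not Netezza's API (no NDN products had been released at the time of writing); the `score` function, the row layout and the idea of modelling each SPU's data as a plain list are all assumptions made purely for illustration. It simply compares how many rows have to leave the warehouse when scoring happens on the application server versus next to the data.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]

def score(row: Row) -> float:
    """Hypothetical scoring function standing in for a UDF."""
    return 0.5 * row["spend"] + 2.0 * row["visits"]

def extract_then_score(slices: List[List[Row]], threshold: float) -> Tuple[List[Row], int]:
    """Conventional path: every row is shipped to the application server first."""
    shipped = [row for data_slice in slices for row in data_slice]
    hits = [row for row in shipped if score(row) > threshold]
    return hits, len(shipped)  # rows moved = all rows

def score_on_slices(slices: List[List[Row]], udf: Callable[[Row], float],
                    threshold: float) -> Tuple[List[Row], int]:
    """UDF-style path: each slice (standing in for an SPU) scores its own rows
    as they stream past, and only qualifying rows are sent onwards."""
    hits: List[Row] = []
    for data_slice in slices:
        hits.extend(row for row in data_slice if udf(row) > threshold)
    return hits, len(hits)  # rows moved = matching rows only

# Two hypothetical slices of customer data, one per simulated SPU.
slices = [
    [{"id": 1, "spend": 100.0, "visits": 1}, {"id": 2, "spend": 10.0, "visits": 2}],
    [{"id": 3, "spend": 300.0, "visits": 5}, {"id": 4, "spend": 20.0, "visits": 0}],
]
```

Both paths return the same qualifying rows; the difference is only in how much data has to travel before the application server sees it, which is where the claimed performance gain would come from.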
However, it is not just the performance gain that is significant. This initiative means that developers (who can get an SPU box with 4 nodes in it for development purposes) are embedding analytic software into the Netezza Data Warehouse Appliance so that it becomes, in effect, an application appliance. These could be generic appliances for, say, data mining, or they could be so-called Edge Appliances for things like real-time re-pricing or RFID analysis (other examples shown at the conference included geospatial analysis, XML-based analysis, video recognition technology and more). In either case, this potentially adds a significant additional route to market for Netezza through its various partners within the NDN.
On this note, I should add one caveat: despite their names, UDFs and UDAs are not really for users. While some of the members of the NDN are commercial (or government) customers of Netezza, they are very experienced users with strong development teams. Other members of the NDN are companies like SPSS and SAS (which is embedding functions in such a way that there is no change to existing SAS programs).
Finally, the interesting question is whether any of Netezza's competitors can match this sort of capability. Certainly, some of them are already talking about Edge Appliances. However, their architectures are such that they have nothing equivalent to an SPU. You could embed functionality at the node level, but this does not get you much further forward: the node is typically an Intel processor running a version or subset of the database, which pretty much leaves you where you started. Moreover, Netezza automatically parallelises queries across SPUs and it will do the same thing with UDFs, which is something you would have to design yourself in other environments. Put simply, no other vendor has the hardware component (SPU) for you to embed a UDF or UDA in, so it's difficult to see them getting the sort of performance benefits I have alluded to.
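To illustrate what parallelising a user-defined aggregate involves, here is a rough Python sketch of the usual partial-aggregate pattern: each slice of data builds its own intermediate state, and the states are merged at the end. The names (`MeanState`, `accumulate`, `merge`, `finalize`, `run_uda`) are hypothetical and are not taken from any Netezza interface; the slices are processed sequentially here, standing in for SPUs working in parallel.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Optional

@dataclass
class MeanState:
    """Intermediate state for a hypothetical 'mean' aggregate."""
    total: float = 0.0
    count: int = 0

def accumulate(state: MeanState, value: float) -> MeanState:
    """Fold one value into a slice-local state."""
    return MeanState(state.total + value, state.count + 1)

def merge(a: MeanState, b: MeanState) -> MeanState:
    """Combine the partial states produced by two slices."""
    return MeanState(a.total + b.total, a.count + b.count)

def finalize(state: MeanState) -> Optional[float]:
    """Turn the merged state into the final answer (None if there were no rows)."""
    return state.total / state.count if state.count else None

def run_uda(slices: List[List[float]]) -> Optional[float]:
    # Each slice aggregates independently; this is the step that could run in parallel.
    partials = [reduce(accumulate, data_slice, MeanState()) for data_slice in slices]
    # Only the small partial states are brought together and merged.
    return finalize(reduce(merge, partials, MeanState()))
```

The design work lies in making sure the aggregate can be split into independent partial states and merged correctly; the argument above is that Netezza handles the distribution across SPUs for you, whereas elsewhere you would have to arrange it yourself.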
The bottom line is that while everybody can think of clever ways of improving their performance on an ongoing basis, UDFs and UDAs will not only give Netezza a new channel to market but also a competitive advantage that its rivals are going to find hard to match.