The data warehouse appliance market has become a crowded space, with around a dozen vendors of all sizes now offering specialist databases designed for high-speed analytic processing of large data warehouses. Teradata laid the groundwork for this market (although there were precursors, such as Red Brick and Britton Lee), but the last few years have seen an explosion of interest. Many of the newer vendors are stronger on technology than they are on customer references, so it is interesting to see the progress of Vertica. Having had just one customer in 2006, it now has 86 companies using its technology, 30 of which signed up in the first half of 2009 alone.
Vertica combines columnar storage (like Sybase IQ) with a highly parallel MPP architecture. Its largest customers have been, as is often the case with appliances, in financial services and telecoms (JP Morgan and Verizon are customers), but it has recently seen an upsurge in clients in marketing analytics, with companies like Experian and Unica. The largest implementations of Vertica today are over 200 TB in size, with several heading over 500 TB by year end (one on a 19-node cluster).
The latest version of Vertica's technology is 3.5, in early release right now and due for general availability later in the year. The most interesting new feature (called "Flexstore") concerns improved storage management. In particular, there is now the ability to group logically related columns together. In financial services, for example, a bid price, ask price and date are usually all accessed together, so it can make sense to treat these as a logical unit, thereby reducing I/O. The Vertica technology now detects the speed of the disks available to it within its catalog, and attempts to place the most frequently searched data on the fastest disks, though a human being can also prioritise columns. Another smart idea is to take advantage of the backup data used for high availability in case of hardware failure. Rather than being a pure copy of the original, the copy or copies can be sorted in different sequences, and the optimiser can take advantage of this for certain queries.
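The idea of keeping redundant copies in different sort orders can be illustrated with a toy sketch in Python. This is not Vertica's implementation or API. The table, the column names and the helper functions (build_copy, choose_copy, query_equals) are all hypothetical, and the "optimiser" here is nothing more than a rule that prefers a copy sorted on the column being filtered.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy table: (symbol, trade_date, bid, ask).
COLUMNS = ("symbol", "trade_date", "bid", "ask")
ROWS = [
    ("IBM", "2009-06-01", 105.1, 105.3),
    ("AAPL", "2009-06-02", 139.2, 139.4),
    ("IBM", "2009-06-03", 106.0, 106.2),
    ("MSFT", "2009-06-01", 21.3, 21.4),
    ("AAPL", "2009-06-01", 138.9, 139.0),
]


def build_copy(rows, sort_column):
    """Return one redundant copy of the rows, sorted on sort_column."""
    idx = COLUMNS.index(sort_column)
    ordered = sorted(rows, key=lambda r: r[idx])
    return {
        "sort_column": sort_column,
        "rows": ordered,
        "keys": [r[idx] for r in ordered],  # sorted keys for binary search
    }


# Two copies of the same data: both serve as backups for each other,
# but each is laid out in a different sequence.
COPIES = [build_copy(ROWS, "symbol"), build_copy(ROWS, "trade_date")]


def choose_copy(copies, filter_column):
    """Toy 'optimiser': pick a copy sorted on the filter column, if any."""
    for copy in copies:
        if copy["sort_column"] == filter_column:
            return copy
    return None


def query_equals(copies, filter_column, value):
    """Return (sort column of the copy used, matching rows)."""
    copy = choose_copy(copies, filter_column)
    if copy is not None:
        # Sorted copy available: binary search instead of a full scan.
        lo = bisect_left(copy["keys"], value)
        hi = bisect_right(copy["keys"], value)
        return copy["sort_column"], copy["rows"][lo:hi]
    # No suitable copy: fall back to scanning every row of the first copy.
    idx = COLUMNS.index(filter_column)
    return None, [r for r in copies[0]["rows"] if r[idx] == value]


if __name__ == "__main__":
    print(query_equals(COPIES, "symbol", "IBM"))
    print(query_equals(COPIES, "trade_date", "2009-06-01"))
    print(query_equals(COPIES, "bid", 21.3))
```

A filter on symbol is answered from the copy sorted by symbol, a filter on trade_date from the other copy, and a filter on bid falls back to a full scan because neither copy is sorted on it. A real optimiser would weigh costs across many more factors. The sketch only shows why differently ordered copies can pay for themselves beyond high availability.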
There are assorted other goodies in the new release, but the new storage management shows the company's increasing maturity in its R&D as customers start to put it in new and ever more challenging situations. The rapid commercial progress of the company is impressive, and in some ways the current economic storm may offer a silver lining to appliance vendors, as companies more actively seek out cost savings from their large data warehouses.