IT-Analysis.com

Analysis

BI for hybrid (big) data
By: Philip Howard, Research Director - Data Management, Bloor Research
Published: 16th June 2011
Copyright Bloor Research © 2011

Most companies exploring the use of big data for business intelligence purposes do not simply want to analyse unstructured data; they also want to combine the results of that analysis with relevant structured data. They want their analytics to span all sorts of data, which we may refer to as hybrid data.

Unfortunately, NoSQL data stores are not really suitable for storing (and therefore analysing) structured data while conventional data warehouses are not very good at storing (and therefore analysing) unstructured data. As a result, the architecture that is emerging from a data warehousing perspective is that you store unstructured data in something like Hadoop, do basic analysis work on that data and create summary information that can be passed to the formal data warehouse, where that information can be further analysed. You can either do this through direct integration between the different environments or by means of a federated query environment (such as Composite Software’s) that supports Hadoop.
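
As a rough illustration of that pipeline, the sketch below runs a MapReduce-style pass over free-text comments, reduces them to per-customer summary counts, and loads the summary into a SQL store where it is joined with structured data. Everything in it is an assumption made for illustration: the sample comments, the keyword list, the table names, and the use of plain Python and an in-memory SQLite database as stand-ins for Hadoop and a data warehouse.

```python
import sqlite3
from collections import Counter

# Hypothetical unstructured input: (customer_id, free-text comment).
COMMENTS = [
    (1, "Delivery was late and the box was damaged"),
    (2, "Great service, very happy"),
    (1, "Still waiting for a refund, late again"),
]

# Hypothetical keyword list standing in for real text analytics.
COMPLAINT_TERMS = {"late", "damaged", "refund"}


def map_phase(customer_id, text):
    """Emit (customer_id, 1) for every complaint term found in the text."""
    for word in text.lower().replace(",", " ").split():
        if word in COMPLAINT_TERMS:
            yield customer_id, 1


def reduce_phase(pairs):
    """Sum the emitted counts per customer."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def summarise(comments):
    """Run the map and reduce phases over all comments."""
    pairs = (pair for cid, text in comments for pair in map_phase(cid, text))
    return reduce_phase(pairs)


def load_and_query(summary):
    """Load the summary into a SQL store and join it with structured data."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, revenue REAL)")
    db.execute("CREATE TABLE complaint_summary (customer_id INTEGER, complaints INTEGER)")
    # Hypothetical structured data that would already live in the warehouse.
    db.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                   [(1, "Acme", 1200.0), (2, "Globex", 800.0)])
    db.executemany("INSERT INTO complaint_summary VALUES (?, ?)", summary.items())
    rows = db.execute(
        "SELECT c.name, c.revenue, COALESCE(s.complaints, 0) "
        "FROM customers c LEFT JOIN complaint_summary s ON s.customer_id = c.id "
        "ORDER BY c.id"
    ).fetchall()
    db.close()
    return rows
```

Calling `load_and_query(summarise(COMMENTS))` returns one row per customer with revenue alongside the complaint count derived from the text. Note that the structured and unstructured analyses still run in two separate steps with two separate tools.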

For large organisations this approach makes sense, but smaller companies with smaller budgets may have an issue with such a potentially expensive solution. One alternative is to store all the information in a warehouse such as that provided by Aster Data (Teradata) or Greenplum (EMC), both of which support native MapReduce capabilities. However, there are potential scalability issues if you try to do this. The real problem is that conventional BI tools do not support the analysis of both structured and unstructured data within the same query, which is what you would really like to do. Instead, you have to use MapReduce on the one hand and some SQL-based tool on the other.

However, that does not mean that suitable hybrid tools do not exist. In particular, Endeca Latitude and Connexica (previously ArdentiaSearch) CXAIR both support query capabilities that span structured and unstructured data. The two products have different implementations but the same basic philosophy, which is to extract structure from unstructured data and then combine that with directly structured data by means of indexes (search-based indexes, not database-style indexes). Both products are very easy to use (both companies place special emphasis on ease of use for end users) and both focus on allowing users to explore the data rather than just report on it.
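
To make the idea of a search-based index spanning both kinds of data more concrete, here is a minimal sketch of an inverted index in which structured fields and words extracted from free text are indexed side by side, so that a single query can mix the two. It is not a description of how Latitude or CXAIR are implemented; the records, the field names, and the `field:value` / `text:word` term convention are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: structured fields plus a free-text note.
RECORDS = {
    101: {"region": "north", "product": "printer",
          "note": "Customer reports paper jam after firmware update"},
    102: {"region": "south", "product": "printer",
          "note": "Happy with print quality"},
    103: {"region": "north", "product": "scanner",
          "note": "Firmware update failed twice"},
}


def build_index(records):
    """Map each term to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        # Structured fields are indexed as "field:value" terms.
        for field in ("region", "product"):
            index[f"{field}:{rec[field]}"].add(rec_id)
        # Free text is split into words and indexed as "text:word" terms.
        for word in rec["note"].lower().split():
            index[f"text:{word}"].add(rec_id)
    return index


def search(index, *terms):
    """Return the ids of records matching every term, in sorted order."""
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


index = build_index(RECORDS)
north_firmware = search(index, "region:north", "text:firmware")
printer_firmware = search(index, "product:printer", "text:firmware")
```

Here `north_firmware` combines a structured filter (region) with a term found only in the notes, which is the kind of single query that the MapReduce-plus-SQL split does not allow.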

However, they are rather different when it comes to their approach to the market. Specifically, Latitude is aimed at companies that want to develop analytic applications to support exploration of hybrid data, while CXAIR (which stands for ConneXica Ad-hoc Interactive Reporting) is aimed more at the traditional BI market, although the product is also being OEM’d by a number of third parties that have embedded the tool in their own products (in place of, for example, Crystal Reports). I expect to be writing more about Latitude and CXAIR in the future but, to go back to my initial point, it seems there is no one-size-fits-all solution to the problem of how to provide BI that spans hybrid data.

There is clearly a choice of warehousing architectures and, no doubt, the leading BI vendors will bolt on unstructured capabilities that will compete with the built-for-purpose technologies from Endeca and Connexica. Quite how all this plays out remains to be seen, but if you are interested in BI for hybrid data right now you should check out Latitude and CXAIR.

Published by: IT Analysis Communications Ltd.