What Made Big Data Possible?
By: Robin Bloor, Co-Founder, The Bloor Group
Published: 28th March 2013
Copyright The Bloor Group © 2013

We would have been processing Big Data long ago had it been possible; many factors contributed to finally making it so. I'll list the technical ones in the order in which they come to mind:

  1. Scale-out database technology: Scale-out architectures have been around for quite a while, so there's more to this than just the appearance of column-store databases, but that's part of it. The point is that the old relational database architecture, which depended on row-oriented storage of data, didn't scale out well. When the new wave of scale-out analytic databases (Netezza, Vertica, etc.) appeared, it became possible to scale out to large data volumes and still deliver performance; larger volumes of data could be accessed quickly. (A sketch of the row-versus-column difference follows this list.)
  2. Hadoop: First, Hadoop was Open Source and free; second, it had scale-out parallelism built in. Naturally, companies began to experiment with it. With the advent of UNIX we had lost the ready availability of key-value stores, which were always useful; with Hadoop, which is at heart a key-value store, that capability returned. (The word-count sketch after this list shows the shape of its key-value model.) Hadoop also became an ecosystem: not particularly fast at most of what it does, but very versatile for anything that smacks of Big Data.
  3. The Cloud: It took a while for cloud providers to think of Big Data as an opportunity, but they cottoned on. The point is that you can assemble a grid of servers much more quickly in the cloud than by buying physical servers, and it may well be cheaper too. There is still the problem of getting large volumes of data into the cloud, but it's not an insuperable barrier. The cloud enables Big Data applications, though not the biggest.
  4. Hardware: The fall in hardware costs (per unit of power) continues apace, which means smaller grids of servers for any given volume of Big Data. It's not just multicore processors; it's also the fall in memory costs, more configurable memory, flash storage and hybrid flash storage (see here for an example), plus software that enables the use of grids. See also Big Data and In-Memory: Are They Related? for a discussion of the relevance of in-memory technology to Big Data. (Some illustrative arithmetic on the disk-versus-memory gap follows this list.)
  5. Data analytics evolution: Most Big Data applications are analytics applications. There has been an explosion of software in this area, including the advent of Mahout and KNIME as Open Source capabilities and the rapid growth in the use of the R language. All of these work with Hadoop.
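
To make the row-versus-column point concrete, here is a minimal sketch in Python. It is illustrative only, nothing like any vendor's implementation: the point is that an aggregate over one column reads only that column's values, while a row store must fetch every field of every record.

    # Row-oriented layout: each record is stored (and read) as a whole.
    rows = [
        {"order_id": i, "customer": "c%d" % (i % 100), "amount": float(i % 50)}
        for i in range(100000)
    ]

    # Column-oriented layout: each field is stored as its own contiguous list.
    columns = {
        "order_id": [r["order_id"] for r in rows],
        "customer": [r["customer"] for r in rows],
        "amount":   [r["amount"] for r in rows],
    }

    # "SELECT SUM(amount)" against each layout.
    row_total = sum(r["amount"] for r in rows)  # touches all three fields per row
    col_total = sum(columns["amount"])          # touches only the one column needed

    assert row_total == col_total

Real column stores add per-column compression and other tricks on top, but the data-locality argument above is the heart of it.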
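
Hadoop's key-value character is easiest to see in its MapReduce model: map emits (key, value) pairs, the framework groups them by key, and reduce folds each group. The single-process Python imitation of the classic word count below shows only the shape of the computation; Hadoop's contribution was to run these steps in parallel across a cluster of machines.

    from collections import defaultdict

    def map_phase(document):
        # Emit a (word, 1) pair for every word in the input.
        for word in document.split():
            yield (word.lower(), 1)

    def shuffle(pairs):
        # Group values by key, as the framework does between map and reduce.
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        return groups

    def reduce_phase(groups):
        # Fold each key's list of values into a single count.
        return {key: sum(values) for key, values in groups.items()}

    docs = ["big data needs big clusters", "big clusters need big data"]
    pairs = (pair for doc in docs for pair in map_phase(doc))
    print(reduce_phase(shuffle(pairs)))
    # {'big': 4, 'data': 2, 'needs': 1, 'clusters': 2, 'need': 1}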
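
On the memory point, some back-of-envelope arithmetic shows why cheaper RAM changes what is feasible. The figures below are illustrative, roughly 2013-era orders of magnitude rather than benchmarks:

    bytes_to_scan = 1e12   # one terabyte

    disk_rate = 100e6      # ~100 MB/s sequential read from a single disk
    ram_rate = 10e9        # ~10 GB/s from memory, conservatively

    print("1 TB from one disk: ~%.1f hours" % (bytes_to_scan / disk_rate / 3600))
    print("1 TB from memory:   ~%.0f seconds" % (bytes_to_scan / ram_rate))
    # 1 TB from one disk: ~2.8 hours
    # 1 TB from memory:   ~100 seconds

Spreading the scan across a grid divides both figures by the node count, which is exactly the scale-out argument of the first item.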

Those are probably the major factors, and together they have conspired to create a vibrant technology ecosystem; many products have piled in on top of them, amplifying the noise around Big Data and its capabilities.

Oh, and there's usually business advantage to be gained in mining those terabytes and petabytes, but there was always business advantage in data mining.
