You may have heard people say, “In the last two years, we have generated more data than in all the years before.” That holds globally and for almost every corporation as well, but what does it mean for your company?

In 2010 we passed a milestone in relevant stored data: the zettabyte mark, which is equal to a trillion gigabytes. Our interconnected, location-aware world is driving this extreme growth in data. And it is only the beginning.

By 2020, mobile devices, social media, banking, entertainment, sensors and more will generate more than 1 billion terabytes of data. Think of 1.5 billion personal computers full of relevant data, or 619 billion iPads. That is roughly equivalent to each person on the planet having 100 iPads of data.

Walmart alone had 1.5 billion RFID sensors in 2005, and by 2012 that had grown to 30 billion. The New York Stock Exchange produces a terabyte of data each trading day, and Twitter generates 12 terabytes per day.

Perhaps governments will need to sort through this much live data to track Snowden’s current location, cyber attacks from Anonymous and potential terrorist attacks. But realistically, the average enterprise has only a few petabytes of relevant data to manage and analyze. Even a meager petabyte of data is still a huge challenge.

Mike Gualtieri of Forrester states that "Big Data is the frontier of a firm's ability to store, process, and access (SPA) all of the data it needs to operate, make decisions, reduce risks, and serve customers."

For most enterprises that means analyzing the following sources of data:

  • Application Data (ERP, CRM, HR, Custom) – Generally falls into the traditional BI category, with trending information on structured data. This could be in-house or in the cloud.
  • Service Desk Data – Understanding your customer experience
  • WWW Data – Consumer behavior on your website and how users come to your site, where they prioritize their time, how long they stay, etc.
  • Log Data – Separate log data for the WWW and your internal IT Infrastructure.
  • Web Services – This could be anything and everything that matters
  • Location Data – Where your consumers are located
  • Social Data – Generally used for sentiment analysis, marketing campaign analysis, target marketing
  • Sensor Data – If you have RFID sensors, meters, or any other device that collects relevant data, this data can tell any number of stories. The future of retail, homes, outdoor activity, transportation, supply chain, healthcare, and much more will involve a lot of sensor data.
  • API Data – If you expose software as a service and allow third parties to connect to your systems, this data helps you understand how your platform is used.

If you want to know how much Big Data is relevant to you, start by mapping out your data sources, and then calculate how much data you generate from these sources on average (daily, weekly, or monthly). Ask yourself: how much of the data is relevant? Do you need the last 6 months or the last 5 years to answer the questions you want answered?
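To make that exercise concrete, here is a minimal sketch of the arithmetic in Python. Everything in it is an assumption for illustration: the `DataSource` fields, the source names, and all of the volumes, shares, and retention windows are placeholder values, not measurements. Swap in your own numbers once you have mapped your sources.

```python
from dataclasses import dataclass

GB_PER_TB = 1000  # decimal units: 1 TB = 1,000 GB


@dataclass
class DataSource:
    name: str
    gb_per_day: float      # average daily volume (hypothetical figure)
    relevant_share: float  # fraction of the data worth keeping, 0.0 to 1.0
    retention_days: int    # how far back your questions need to reach


def relevant_tb(source: DataSource) -> float:
    """Relevant data held for one source over its retention window, in TB."""
    return (source.gb_per_day * source.relevant_share
            * source.retention_days / GB_PER_TB)


def total_relevant_tb(sources: list[DataSource]) -> float:
    """Sum of relevant data across all mapped sources, in TB."""
    return sum(relevant_tb(s) for s in sources)


# Placeholder numbers only; replace them with your own averages.
sources = [
    DataSource("Application data", gb_per_day=20, relevant_share=1.0,
               retention_days=5 * 365),   # keep 5 years
    DataSource("Log data", gb_per_day=200, relevant_share=0.25,
               retention_days=180),       # keep about 6 months
    DataSource("Social data", gb_per_day=5, relevant_share=0.5,
               retention_days=180),
]

for s in sources:
    print(f"{s.name}: {relevant_tb(s):.2f} TB")
print(f"Total: {total_relevant_tb(sources):.2f} TB")
```

The retention window usually dominates the result: in this made-up example, the smallest daily producer kept for 5 years outweighs a log stream ten times its size kept for 6 months.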

Let me know: how much data is relevant to you?