Wednesday, December 17, 2014

What is Big Data?

Let's start with a question that a lot of people are wondering about. What is Big Data? First, I want to say that Big Data is a Big Deal. While technology has fueled the engines of transformation for the past few decades, data will fuel the engines of transformation for the next few. But don't worry. I am not going to go off on a philosophical rant. Let's get right down to brass tacks.

There is a lot of hype surrounding Big Data and a lot of misuse of the term. There are many definitions of Big Data, none of which is particularly satisfying. But I will make use of two of my favorites.

Many people define Big Data in terms of three V's: volume, variety and velocity. This means that Big Data is a huge amount (volume) of complicated data (variety) coming at you very fast (velocity). I have read papers and seen presentations where more V's are added. For example, veracity and value are popular as well, and both of these V's raise important issues. But they are not central to the essence of Big Data.

Another definition that I like is that data is big when it cannot be processed using traditional relational database technology. Relational databases require information to be highly structured (i.e., anti-variety), and the transaction models used to update the database limit the number of transactions (or updates) per second (i.e., anti-volume and anti-velocity).
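To make the "anti-variety" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` table, the `load_records` helper and the sample records are all hypothetical and chosen only for illustration. The idea is that a fixed relational schema accepts only records that match it, so messy or irregular records bounce off.

```python
import sqlite3


def load_records(records):
    """Try to insert each dict into a fixed-schema table.

    Returns (accepted, rejected). The table layout is a made-up example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    accepted, rejected = [], []
    for record in records:
        # Column names come straight from the record's keys. That is fine for
        # this trusted demo data, but never do this with untrusted input.
        columns = ", ".join(record)
        placeholders = ", ".join("?" for _ in record)
        sql = f"INSERT INTO customers ({columns}) VALUES ({placeholders})"
        try:
            conn.execute(sql, tuple(record.values()))
            accepted.append(record)
        except sqlite3.Error:
            # Unknown columns or missing required fields violate the schema.
            rejected.append(record)
    conn.close()
    return accepted, rejected


records = [
    {"id": 1, "name": "Ada"},                               # fits the schema
    {"id": 2, "name": "Grace", "tweet": "free-form text"},  # extra field
    {"id": 3},                                              # missing name
]
accepted, rejected = load_records(records)
print(len(accepted), "accepted,", len(rejected), "rejected")
```

Only the first record fits the table as defined. The other two would need either a schema change or a cleanup step before loading, which is exactly the kind of work that becomes impractical when the data is large, varied and arriving quickly.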

It is probably best to think of Big Data as a large volume of raw material from which data products can be made. These data products, in turn, can be used to make decisions which create value for a company (another V). These decisions can be large strategic decisions or small individual decisions. A problem with Big Data is that it is often unclear what the data refers to. This is the veracity problem (yes, I snuck another V in there). Until the data is tamed (i.e., we know what it refers to), it is difficult to use it in decisions.


Note that Big Data is largely defined by the amount of it. If there were a gigantic improvement in processing power, say from parallel or quantum computers, which led to computers tens of thousands of times faster, there would no longer be such a thing as Big Data. It would just be data. Unlike relational databases, which contain a particular kind of data (categorical), Big Data is largely defined by the amount and messiness of it, both of which lead to processing constraints.

Should you be concerned with Big Data? As I said in the first paragraph, Big Data is a Big Deal. However, there is a lot of very valuable data that does not rise to the level of Big Data. If you are not yet doing everything you can with your Not So Big Data (terabytes and less), it makes more sense to focus on that first. Once you are getting all the value you can from that, it would be appropriate to start taking on Big Data.

