Let's start with a question that a lot of people are wondering about. What is Big Data? First, I want to say that Big Data is a Big Deal. While technology has fueled the engines of transformation for the past few decades, data will fuel the engines of transformation for the next few. But don't worry. I am not going to go off on a philosophical rant. Let's get right down to brass tacks.
There is a lot of hype surrounding Big Data and a lot of misuse of the term. There are many definitions of Big Data, none of which is particularly satisfying. But I will make use of two of my favorites.
Many people define Big Data in terms of three V's: volume, variety and velocity. This means that Big Data is a huge amount (volume) of complicated data (variety) coming at you very fast (velocity). I have read papers and seen presentations where more V's are added. For example, veracity and value are popular as well. Both of these V's raise important issues, but they are not central to the essence of Big Data.
Another definition that I like is that data is big when it cannot be processed using traditional relational database technology. Relational databases require information to be highly structured (i.e., anti-variety), and the transaction models used to update the database limit the number of transactions (or updates) per second (i.e., anti-volume and anti-velocity).
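To make the "highly structured" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, the column names and the sample records are all made up for illustration; they are assumptions, not anything from a real system. The idea is simply that a record which fits the predefined columns goes in, while a record carrying an extra, unplanned field is rejected.

```python
import sqlite3

# Hypothetical fixed schema: every customer row has exactly an id and a name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def insert_record(connection, record):
    """Try to insert a dict into the customers table; report whether it fit."""
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    sql = f"INSERT INTO customers ({columns}) VALUES ({placeholders})"
    try:
        connection.execute(sql, tuple(record.values()))
        return True
    except sqlite3.OperationalError:
        # Raised when the record names a column the table does not have.
        return False


# A record that matches the schema (illustrative data).
structured = {"id": 1, "name": "Ada"}
# A "messier" record with a field nobody planned for (illustrative data).
messy = {"id": 2, "name": "Bob", "clickstream": "home>search>cart"}

ok_structured = insert_record(conn, structured)
ok_messy = insert_record(conn, messy)
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

print(ok_structured, ok_messy, row_count)
```

This only illustrates the variety side of the argument. The limits on transactions per second depend on the particular database and hardware, so the sketch does not attempt to show them.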
It is probably best to think of Big Data as a large volume of raw material from which data products can be made. These data products, in turn, can be used to make decisions which create value for a company (another V). These decisions can be large strategic decisions or small individual decisions. A problem with Big Data is that it is often unclear what the data refers to. This is the veracity problem (yes, I snuck another V in there). Until the data is tamed (i.e., we know what it refers to), it is difficult to use it in decisions.
Note that Big Data is largely defined by the amount of it. If there were a gigantic improvement in processing power, say from parallel or quantum computers, which led to computers tens of thousands of times faster, there would no longer be such a thing as Big Data. It would just be data. Unlike relational databases, which contain a particular kind of data (categorical), Big Data is largely defined by the amount and messiness of it, both of which lead to processing constraints.
Should you be concerned with Big Data? As I said in the first paragraph, Big Data is a Big Deal. However, there is a lot of very valuable data that does not rise to the level of Big Data. If you are not yet doing everything you can with your Not So Big Data (Terabytes and less), it makes more sense to focus on that first. Once you are getting all the value you can from that, it would be appropriate to start taking on Big Data.
Wednesday, December 17, 2014
Monday, December 15, 2014
Making Sense of Information Technology
When I first started in Information Systems, more years ago than I care to admit, there were only a few technologies we had to worry about. There were operating systems, teleprocessing monitors, databases, applications and programming languages. Everybody knew how to program and everybody specialized in one of the other four. It was still daunting, but nothing like it is today.
Since then we have had to adjust to personal computers, networks, artificial intelligence, web technologies, social interaction technologies, mobile devices, and more new programming and scripting languages than I even want to think about. But, as if that were not enough to worry about, we now have analytics and big data to contend with. And on the horizon we have virtual worlds, video games, drones, and a resurgence of artificial intelligence. A bit further off we have complexity theory and agent-based modeling threatening to change a game that has already changed so many times that it can hardly even be considered the same game. This list, by the way, is by no means comprehensive. I am doing this off the top of my head. So I apologize if I have left out your pet emerging technology.
How does one keep up with all this stuff? How does one know what to be concerned about and what to ignore? I routinely hear people confusing Big Data with Analytics or Relational Databases with Data Warehousing. Most people know that Facebook is a Social Interaction Technology, but what about YouTube and Wikipedia? And what is the difference between a wiki and Wikipedia? While we are at it, what is the difference between a wiki, a blog and a forum? What is the difference between a web server and a web service? If your business had $10,000 to play around with an emerging information technology, which one would it be? What about $100,000 or a million?
My biggest challenge since those salad days of mainframes has been to keep up with emerging technologies. And, in the process, I have learned a few things and picked up a few tricks. I routinely explain things like this in my classes. So, I thought I would create a blog to reach a wider audience. This is not my first blog. In fact I have many. But I love to write and I love to figure things out. When I can figure things out and write about them, that is as good as it gets.
I should warn you, up front, about my erratic blogging habits, based on the other blogs that I have created. I write when and where I feel like it because I do my best work that way. Often, I will post a flurry of pieces to a blog and then ignore it for a while as I use other outlets for my writing. Eventually, I will come back and write some more. My goal with this blog will be to post something of interest every week or two on average. So, if this looks interesting, please bookmark it or follow it. I also have a Twitter account, @DrJohnArtz, which you can follow. The only thing I post to the Twitter account is a notice when there are new postings to a blog that has been fallow for a while. So, I won't fill your inbox with tweets about what I had for breakfast.