Having attended several conferences and trade fairs this year, I came to realize that many people have a rather arbitrary understanding of what Big Data means. The term has become a buzzword, and lots of people use it in their pitches when they explain their business models.
Some talk about Big Data as all the different customer data they collect. Others use the term when talking about streaming real-time data into their platforms. In less sophisticated settings, Big Data is reduced to a specific number such as 100 terabytes or 1-2 petabytes. So what exactly is Big Data?
Data is creating new jobs and changing our economy. A study by the McKinsey Global Institute predicts that by next year the U.S. alone will face a shortage of nearly 200,000 skilled workers in this sector. Hence, the need to understand Big Data is increasing.
This article shares a more scientific definition and understanding of Big Data. In general, we could say that the term Big Data is evolutionary. With the constant increase in computational capabilities, the amount of data that can be handled keeps changing as well.
Historically, a one-terabyte data warehouse used to count as big data. Nowadays, however, we have data warehouses that are able to store petabytes of data, and analytic tools that can handle huge amounts of it.
In general, Big Data has always described some overwhelming amount of data, something that cannot be handled by common information technologies. During my studies I came across a definition I am more fond of.
The 3-5 Vs of Big Data
The 5 Vs are more accurate in describing this overwhelming amount of data. They characterize Big Data in a way that makes it easier to understand when data becomes big.
Variety, Volume and Velocity
The main three are variety, volume, and velocity. Variety describes the different forms this data can take. Unstructured and semi-structured data have become as strategically important as traditional structured data.
Volume refers to the sheer amount of data, which arrives from many different sources. All of these streams need to be captured and stored over longer periods.
Finally, velocity describes the speed at which this data comes in. If we think about machine data, for example sensor data from cars, we get new inputs every millisecond.
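To get a feel for how velocity turns into volume, here is a rough back-of-the-envelope sketch in Python. The numbers are purely my own assumptions for illustration (16 bytes per reading, 100 sensors per car); they do not come from the Watson tutorial or from any real vehicle.

```python
# Rough estimate: how much data does a sensor produce per day
# if it emits one reading every millisecond?
# All figures below are illustrative assumptions, not measurements.

MS_PER_DAY = 24 * 60 * 60 * 1000  # milliseconds in one day


def daily_bytes(readings_per_ms, bytes_per_reading, sensors=1):
    """Bytes produced per day by `sensors` identical sensors."""
    return int(readings_per_ms * MS_PER_DAY * bytes_per_reading * sensors)


def human_readable(num_bytes):
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    for unit in ("B", "KB", "MB", "GB", "TB", "PB"):
        if num_bytes < 1000 or unit == "PB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1000


if __name__ == "__main__":
    # Assumed: one reading per millisecond, 16 bytes per reading.
    one_sensor = daily_bytes(readings_per_ms=1, bytes_per_reading=16)
    # Assumed: a car carrying 100 such sensors.
    one_car = daily_bytes(readings_per_ms=1, bytes_per_reading=16, sensors=100)
    print("One sensor per day:", human_readable(one_sensor))
    print("One car per day:   ", human_readable(one_car))
```

Under these assumptions a single sensor produces about 1.4 GB per day and a car with 100 sensors about 138.2 GB per day, so a fleet of a few thousand cars reaches hundreds of terabytes daily. Change the assumed values and the picture shifts, which is exactly why a fixed number is a poor definition of Big Data.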
The three Vs are explained by Hugh Watson in his “Tutorial: Big Data Analytics: Concepts, Technologies, and Applications.”
The two additional Vs
While the previous three Vs describe the basic characteristics of this data, the following two Vs explain the importance of filtering and handling the data appropriately.
Even if we want to capture everything we can, we still need to ensure that we can trust the data source. Hence, the Veracity of the data needs to be ensured.
Additionally, we need to be able to extract strategic information from this data. Ultimately we need to provide Value.
My series of posts is about making you think a little deeper about everyday concepts. I look forward to having you follow along and to reading what you throw at me.
Peace!
Twitter: @tkronsbein
Instagram: @tizian_kronsbein
Website: www.tiziankronsbein.com
References:
Watson, Hugh J. (2014) "Tutorial: Big Data Analytics: Concepts, Technologies, and Applications," Communications of the Association for Information Systems: Vol. 34, Article 65.
Available at: http://aisel.aisnet.org/cais/vol34/iss1/65
Manyika, J., M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A.H. Byers (2011) "Big Data: The Next Frontier for Innovation, Competition, and Productivity," McKinsey Global Institute, May.
Available at: http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation
So it is more about the usefulness of data than its size.
Hey @cron
Absolutely. I mean, everyone can collect data. However, retrieving information and building or improving products and services based on it is the holy grail!
Good post, bro
Cheers buddy
great and informative post
thanks for sharing and keep up the hard work
Thank you @hauntedbrain
I will. I hope you follow my journey. I will check out your profile as well
Hi, thanks, trying to get my head around some of this... do you see the jobs created in big data as separate from analytics, or do they overlap?
The spectrum is huge. I think there are several intersections. If you go for analytics of traffic data or anything real-time based, it is likely you will have to deal with big data.
However, it really comes down to what you want to do and what happens under the hood. You can do analytics without getting in touch with it. Rather, it would be part of the steps that come before you start your analytics.
Ok, Ty
You are welcome. If you have more questions or if I can help you with anything related to that topic please let me know.