Nowadays everyone is talking about social media, Twitter, the digitalization of society including the Internet of Things (IoT) and, of course, Big Data! It seems to be a trending topic within IT: everyone is talking about it, yet it still has to arrive at most customer sites. In this blog I would like to explain which aspects or qualities of data distinguish regular data from data processed by Big Data solutions. This is some basic knowledge that I gained during the Big Data Science program by Arcitura.*
But first let’s set a clear definition of Big Data:
According to the Big Data Science program by Arcitura*, “Big Data is a field dedicated to the analysis, processing and storage of large collections of data that frequently originate from disparate sources.”
And now on to the data...
The characteristics that distinguish regular data from data processed by Big Data solutions are commonly known as the “Five V’s”.
Volume refers to the amount of data that is processed by Big Data solutions.
Velocity refers to the speed at which data arrives and potentially accumulates within a short time. From an enterprise's point of view, velocity translates into the amount of time it takes for data to be processed once it enters the enterprise’s perimeter.
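To make the enterprise view of velocity a bit more concrete, here is a minimal sketch in Python. It simply measures the time between the moment a record enters the perimeter and the moment it has been processed. The function name `processing_latency` and the two timestamps are my own illustrative assumptions; they do not come from the Arcitura material.

```python
from datetime import datetime, timedelta


def processing_latency(arrived_at: datetime, processed_at: datetime) -> timedelta:
    """Time between a record entering the perimeter and being processed.

    Hypothetical helper for illustration only.
    """
    if processed_at < arrived_at:
        raise ValueError("a record cannot be processed before it arrives")
    return processed_at - arrived_at


# Made-up timestamps: the record arrives at 12:00:00 and is processed at 12:00:45.
arrived = datetime(2024, 1, 1, 12, 0, 0)
processed = datetime(2024, 1, 1, 12, 0, 45)

latency = processing_latency(arrived, processed)
print(latency.total_seconds())  # 45.0
```

In a real solution these timestamps would come from the ingestion and processing layers, and you would look at the latency over many records rather than a single one.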
Variety refers to the multiple types and formats of data processed by the Big Data solution.
These large collections of data can be human- or machine-generated, can be structured or unstructured, and can be of any type.
Veracity refers to the quality of the data: is the data meaningful? Within Big Data, data is either noise or signal. Noise is data that has no value to the business, whereas signal is data that bears value and can lead to meaningful information.
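As a small illustration of the noise versus signal idea, the Python sketch below computes which share of a dataset counts as signal. What counts as signal always depends on the business; the sensor readings and the `is_valid_reading` rule here are made-up assumptions, used only to show the principle.

```python
def signal_ratio(records, is_signal):
    """Return the fraction of records that the given rule classifies as signal."""
    records = list(records)
    if not records:
        return 0.0
    signal_count = sum(1 for record in records if is_signal(record))
    return signal_count / len(records)


# Hypothetical temperature readings: None and -999.0 stand in for noise.
readings = [21.5, None, 22.1, -999.0, 21.9]


def is_valid_reading(value):
    # Assumed business rule: a reading is signal if it is present and plausible.
    return value is not None and -50.0 <= value <= 60.0


ratio = signal_ratio(readings, is_valid_reading)
print(f"{ratio:.0%} of the readings are signal")  # 60% of the readings are signal
```

The lower this share, the more effort goes into filtering before the data can lead to meaningful information.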
Value refers to the usefulness of data for an enterprise. It is directly related to veracity: the more meaningful the data, the higher its potential value for the business. Value also depends on the amount of time it takes to process the data. The longer it takes to extract information from the data, the less value that information might have for the business, since it might simply arrive too late. Having the correct data even slightly too late can decrease its value significantly.
So this was my first post of, hopefully, many more on Big Data solutions. I hope you enjoyed it and that it made you a little bit wiser.
*The Big Data Science Certified Professional (BDSCP) program from the Arcitura™ Big Data Science School is dedicated to excellence in the fields of Big Data science, analysis, analytics, business intelligence, and technology architecture, as well as design, development, and governance.