You might have heard of big data, but like artificial intelligence or the cloud, it is a term most of us only vaguely understand. Yet perhaps this is rightfully so. The sheer amount of data processed daily on the Internet is beyond comprehension. Forbes reported that every minute, users watch 4.15 million YouTube videos, send 456,000 tweets on Twitter, post 46,740 photos on Instagram, and upload 510,000 comments and 293,000 statuses on Facebook!
One of the key applications of big data is driving AI and machine learning (ML) automation for both consumer and enterprise needs. Automation software is key to the digital transformation of many companies, and it has proved to be a game-changer for those looking to leverage data analytics to drive positive change within the business.
It is on this note that we must acknowledge the importance of big data to all professionals, not just IT employees. To understand big data, we can start with the 3Vs that frame this concept: Volume, Velocity, and Variety.
As illustrated above, we collect a massive amount of data every day. And it’s not just personal data on the Internet. Factories today use sensors and Industrial Internet of Things (IIoT) devices to track industrial processes and analyse real-time data from machines.
The pandemic has also triggered intensive data collection in the form of contact tracing. Offices, shopping malls, F&B outlets, entertainment venues, and other physical entry points are all part of the contact tracing network in our bid to curb the spread of Covid-19.
It’s safe to say that Volume is a defining characteristic of Big Data.
Velocity is the measure of how fast data is produced and processed. For instance, you might want to monitor your enterprise IT systems for potential cyberattacks. The constant flow of data generated by your business must be monitored, processed, and analysed in real time to prevent an attack.
With today’s technological advancements, we can process and analyse huge volumes of data in real time.
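For readers curious about what real-time monitoring can look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular security product works: the class name `SlidingWindowMonitor`, the ten-second window, the threshold of three events, and the sample timestamps are all hypothetical. The idea is to count events (say, failed logins) as they arrive and raise a flag when too many land within a short window.

```python
from collections import deque


class SlidingWindowMonitor:
    """Counts events seen in the last `window_seconds` and flags bursts.

    Hypothetical helper for illustration only.
    """

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.timestamps = deque()

    def record(self, timestamp):
        """Add one event; return True if the window now exceeds the threshold."""
        self.timestamps.append(timestamp)
        # Drop events that have fallen out of the time window.
        cutoff = timestamp - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps) > self.threshold


# Made-up feed: timestamps (in seconds) of failed login attempts.
monitor = SlidingWindowMonitor(window_seconds=10, threshold=3)
events = [0, 20, 40, 41, 42, 43, 44]
alerts = [t for t in events if monitor.record(t)]
print(alerts)  # timestamps at which a burst was flagged
```

Each event is checked the moment it arrives rather than in a batch at the end of the day, which is the essence of velocity: the value of the data depends on how quickly it can be acted on.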
If you think that data refers to some form of code or numbers on an Excel sheet, think again. Data produced on the Internet includes photos, audio recordings, email messages, videos, sensor data, and more. Each type of data is distinct from the others and varies from one application to the next. Such data is largely unstructured, which means that it does not fit neatly into a spreadsheet or the fields of a database.
The diversity of data, or Variety, is therefore another defining characteristic of big data.