References:

O'Reilly Media. (2012, January 19). Volume, Velocity, Variety: What You Need to Know About Big Data. Forbes. https://www.forbes.com/sites/oreillymedia/2012/01/19/volume-velocity-variety-what-you-need-to-know-about-big-data/


Understanding the 3 Vs of Big Data - Volume, Velocity and Variety. (2017, September 8). WHISHWORKS. https://www.whishworks.com/blog/data-analytics/understanding-the-3-vs-of-big-data-volume-velocity-and-variety/


Foote, K. D. (2017, December 15). A Brief History of Big Data. DATAVERSITY. https://www.dataversity.net/brief-history-big-data/


What Is Big Data? | Oracle India. (n.d.). Retrieved October 14, 2021, from https://www.oracle.com/in/big-data/what-is-big-data/#history

What is Big Data?


Big Data refers to extremely large data sets that can be analyzed computationally to reveal patterns and trends, especially those relating to human behavior and interactions. These data sets are so vast that traditional data-processing software can no longer manage them.


History


In 1663, John Graunt carried out a statistical study of the bubonic plague, an epidemic that had been devastating Europe, and is credited as the first person to use statistical data analysis. During the early 1800s, the field of statistics expanded to cover the collection and analysis of data. In 1880, however, the U.S. Census Bureau ran into a problem: it estimated that the data collected during that year's census would take eight years to process. Fortunately, Herman Hollerith, a man working for the bureau, created the Hollerith Tabulating Machine in 1881. His invention used punch cards to tabulate data, drastically reducing the labor needed to process and handle it.


The Austrian-German engineer Fritz Pfleumer invented a method of creating magnetic tape as a replacement for wire in recording technology. In 1928, after a series of experiments, he settled on a very thin paper, striped with iron oxide and coated with lacquer, as the material for his invention.


During World War II, in 1943, the British were desperate to crack the Nazi codes. As a result, they invented Colossus, the first data-processing machine, which scanned for patterns in messages intercepted from the Germans. Two years later, a paper was published on the Electronic Discrete Variable Automatic Computer (EDVAC), the first documented discussion of program storage and one of the foundations of computer architecture today. These events are believed to have paved the way for the creation of the United States' National Security Agency (NSA), formed in 1952 under President Truman and assigned to decrypt messages intercepted during the Cold War. By that time, computers were already able to process data and operate automatically.

    

By 2005, people had started to realize that the data generated by Facebook, YouTube, and other online services was too big for computers to manage productively. It was also around this time that Hadoop and NoSQL began to gain popularity.

    

The development of open-source frameworks such as Hadoop made a huge contribution to the growth of big data: they make big data much easier to work with and cheaper to store. To this day, the volume of big data keeps skyrocketing, and users still generate a staggering amount of it.
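To give a feel for why frameworks like Hadoop help, here is a toy sketch of the map-reduce idea Hadoop popularized: each "node" counts words in its own chunk of data (map), and the partial results are merged into one answer (reduce). The sample chunks are invented for the example; a real cluster would split terabytes across many machines.

```python
from collections import Counter
from functools import reduce

# Pretend each string is a chunk of text stored on a different machine.
chunks = [
    "big data is big",
    "data keeps growing",
    "big data everywhere",
]

# Map step: each chunk is counted independently (this is what can run in parallel).
partial_counts = [Counter(chunk.split()) for chunk in chunks]

# Reduce step: merge the partial counts into one total.
total = reduce(lambda a, b: a + b, partial_counts)

print(total["big"])   # → 3
print(total["data"])  # → 3
```

The point is not the word count itself but the shape of the computation: because the map step touches each chunk independently, the work scales out across machines instead of overwhelming one.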


The Three V’s


Volume


Data has been piling up like crazy these past years, in amounts so big that traditional methods of managing and storing data can't keep up. Take social media as an example. On Facebook alone, 2 billion users upload images, videos, and other media daily. If we add YouTube's 1 billion, Instagram's 700 million, and Twitter's 350 million users to the mix, the amount of data balloons into a number that one might have a hard time comprehending. Imagine the amount of data these sites have to handle (Understanding the 3 Vs of Big Data - Volume, Velocity and Variety - WHISHWORKS, 2017). More importantly, think about how many cat videos were generated! Still, there are never enough cat videos, if you ask me.


Velocity


Velocity is the rate at which data is received and acted upon, and today that rate is very high. Why else, if not because we live in the Digital Age? Shopping, for example, can now be done from a mobile screen at home. One tap and ta-da: you just got yourself an air fryer and a Dyson Airwrap. Millions of pieces of consumer data enter the system with every second that passes. In the same way that shopping has been made easier, online retailers can retrieve your previous shopping data in a single click. From your every purchase, they can derive various kinds of information and tempt you into buying more, at high speed, too (O'Reilly Media, 2012).
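One way to picture velocity is as events per second. The sketch below keeps a sliding one-second window over event timestamps; everything here (the class name, the made-up timestamps) is invented for illustration, not part of any real retailer's system.

```python
from collections import deque

class RateMeter:
    """Counts how many events arrived within the last `window` seconds."""

    def __init__(self, window=1.0):
        self.window = window
        self.times = deque()

    def record(self, t):
        self.times.append(t)
        # Drop events that have fallen out of the sliding window.
        while self.times and t - self.times[0] > self.window:
            self.times.popleft()

    def rate(self):
        return len(self.times)

# Made-up timestamps, in seconds.
meter = RateMeter()
for t in [0.1, 0.2, 0.5, 0.9, 1.4]:
    meter.record(t)

print(meter.rate())  # → 3 events within the last second of the latest event
```

A real stream processor does essentially this at enormous scale, which is why "acting on data at high speed" is an engineering problem and not just a slogan.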


Variety


Data does not come gift-wrapped; it is a collection of dollar-store purchases, gift wrap, and ribbons thrown into a haphazard pile. In short, it is raw and diverse (O'Reilly Media, 2012). Social media users, for one, generate a wide variety of data, ranging from Twitter posts to content uploaded to YouTube. At first, this data is messy and needs several rounds of processing before it can finally be integrated into an application (O'Reilly Media, 2012). Once translated into information the system can understand, it serves as a basis for recommendations that match the user's previous activity.
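That "processing before integration" step can be sketched very simply: records from different sources arrive in different shapes and must be normalized into one common schema before an application can use them. The field names and records below are invented for the example.

```python
# Raw records from two hypothetical sources, each with its own shape.
raw_records = [
    {"source": "twitter", "text": "loving this album", "user": "ana"},
    {"source": "youtube", "title": "cat video #9", "channel": "ben"},
]

def normalize(record):
    """Map each source's fields onto a shared (user, content) schema."""
    if record["source"] == "twitter":
        return {"user": record["user"], "content": record["text"]}
    if record["source"] == "youtube":
        return {"user": record["channel"], "content": record["title"]}
    raise ValueError(f"unknown source: {record['source']}")

clean = [normalize(r) for r in raw_records]
print(clean[0])  # → {'user': 'ana', 'content': 'loving this album'}
```

Real pipelines handle far messier input (missing fields, free text, images), but the principle is the same: variety is tamed by translating everything into a form the system understands.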


Uses of Big Data


Genomic Research


Genomic research is the study of the structure, function, and other features of genomes and DNA. The field took off after Frederick Sanger developed his DNA-sequencing method in 1977, which opened up new possibilities in genetics. The data gathered from experiments and investigations in this kind of research is highly important, which is why the need for proper data management and storage is justified. Information derived from this data can be used to find cures for currently incurable diseases. It has also paved the way toward understanding the history of the world, from plants that were once part of the ecosystem to ancient animals that once roamed and reigned over the land.
   

Healthcare Billing Analytics


During the ongoing reign of COVID-19, hospitals have been full of patients around the clock. Because of this, they have been generating an enormous amount of health-related data. This data can be used to trace a patient's possible viral contacts, and it feeds the national counts of deaths and recoveries reported during the pandemic. Data gathered from various illnesses can also spark new insights into emerging viruses and bacteria; even a patient's reactions to a certain medicine can be used to develop treatments that counter those very reactions. Information, especially during times like these, is vital for survival. With the information that hospitals provide, we can fight this virus and end its reign.


Summary


In the digital age we currently live in, data is generated left and right, in huge volumes and at very high speed. This is what we call big data: extremely large data sets that come in a wide variety of forms. The data comes from users and is translated into information that assists those same users in their upcoming activities. In some fields this data is so important that it needs to be stored and managed properly and productively. Big data has been founded, honed, and improved over the years; the technology of the past couldn't keep up with it, but fortunately, the technology we use today can manage it productively. If we can use big data in a way that is advantageous to us, the information it holds could entirely change the future of humankind. We might cure any disease that arises, and we might improve our technologies to the point that even flying cars are attainable.

In The BIG-inning: Big Data Through The Years

© 2021 by Group 1. All rights reserved.
