What is Big Data?
What is big data? How is it different from conventional data? Big data is a new buzzword in today’s technology field. Everyone is using it, but do they really know what it is and what makes it different from everyday data? According to the Treehouse Tech Group (2021), the term big data does not actually refer to the size of the data so much as to how it is handled. While traditional data is based on a centralized database architecture, big data uses a distributed architecture. Big data is more scalable than traditional data because its computation is also distributed among several computers in a network. We have been storing and processing data for decades; however, the rate at which we generate data has accelerated greatly in recent years (think of cell phone photos, videos, emails, media content, etc.). We have all increased our daily data generation, and we are almost afraid to delete this data. This means it has to be stored and accessed for later use.
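To make the idea of distributed computation more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (`partition`, `count_words`, `distributed_word_count`) and the sample records are made up for this post, and threads on one machine stand in for the separate computers a real big data system would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into n_parts chunks, one per hypothetical node."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_words(chunk):
    """Each worker counts the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(records, n_workers=3):
    """Count words per chunk in parallel, then merge the partial results."""
    chunks = partition(records, n_workers)
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Made-up sample data for illustration.
records = [
    "big data is distributed",
    "traditional data is centralized",
    "big data scales out",
]
totals = distributed_word_count(records)
```

No single worker ever sees the whole data set, yet the merged result matches what one machine would compute over all the records. Adding workers is what lets this kind of design scale as the data grows.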
The term big data can refer to a large, complex data set, along with the methods we use to process this data (Pure Storage, n.d.). Big data has four main characteristics, known as “the four V’s”:
· Volume – Big data is not defined by size alone, but big data sets are often very high in volume.
· Variety – Big data sets typically contain structured, semi-structured, and unstructured data.
· Velocity – Big data is generated quickly and is often processed in real time.
· Veracity – Big data is not inherently better than traditional data, so its accuracy is extremely important. Anomalies, biases, and noise can greatly impact the overall quality of big data.
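Variety and veracity are the easiest of the four V’s to show in code. The Python sketch below is a rough illustration, not a standard method: the sample sensor readings, the field name `temp_c`, the helper functions, and the deviation threshold of 10 are all assumptions made up for this example. It loads one structured (CSV), one semi-structured (JSON), and one unstructured (free text) input into a common shape, then flags readings that sit far from the median as possible anomalies.

```python
import csv
import io
import json
import re
import statistics

# Made-up sample inputs, one per kind of data.
structured = "id,temp_c\n1,21.5\n2,22.0\n"                  # CSV: fixed columns
semi_structured = '{"id": 3, "reading": {"temp_c": 21.8}}'  # JSON: nested fields
unstructured = "Sensor 4 reported roughly 95.0 degrees, probably a fault."


def load_readings():
    """Normalize all three inputs into a list of {'id', 'temp_c'} dicts."""
    readings = [
        {"id": int(row["id"]), "temp_c": float(row["temp_c"])}
        for row in csv.DictReader(io.StringIO(structured))
    ]
    doc = json.loads(semi_structured)
    readings.append({"id": doc["id"], "temp_c": doc["reading"]["temp_c"]})
    # Free text has no schema, so this crude pattern is specific to the sample.
    match = re.search(r"Sensor (\d+) reported roughly ([\d.]+)", unstructured)
    if match:
        readings.append(
            {"id": int(match.group(1)), "temp_c": float(match.group(2))}
        )
    return readings


def flag_anomalies(readings, max_deviation=10.0):
    """Return ids of readings further than max_deviation from the median."""
    median = statistics.median(r["temp_c"] for r in readings)
    return [r["id"] for r in readings if abs(r["temp_c"] - median) > max_deviation]


readings = load_readings()
suspect_ids = flag_anomalies(readings)
```

Real pipelines use far more careful parsing and quality checks, but the shape is the same: bring mixed inputs into one form, then test the values before trusting them.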
Many companies believe that they have to collect their own data, but that simply is not true: there are many datasets online that are available for public download (Marr, 2022). Five globally interesting datasets available for download are:
1. Data.gov – The US government pledged to make all government data freely available online. This site includes interesting information on anything from crime to climate change: http://data.gov
2. US Census Bureau – This dataset includes information on the lives of US citizens, including geographic data, population data, and education: http://www.census.gov/data.html
3. Socrata – This platform also explores government-related data and has some built-in visualizations: https://www.tylertech.com/products/data-insights
4. The European Union Open Data Portal – This portal also provides government data, but from European Union institutions: http://open-data.europa.eu/en/data/
5. Data.gov.uk – This site provides data from the UK government, including the British National Bibliography, which contains metadata on all UK books and publications since 1950: http://data.gov.uk/
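For readers who would rather search one of these sources from code than from a browser, here is a small Python sketch. It rests on assumptions that are not from the sources above: that the Data.gov catalog exposes a CKAN-style `package_search` endpoint at the URL in `BASE`, and that its JSON response lists datasets under `result` and `results`. The helper names are hypothetical, so check the site's current API documentation before relying on any of it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint (CKAN-style search); verify against Data.gov's own docs.
BASE = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Build a search URL for the given keywords and result count."""
    return f"{BASE}?{urlencode({'q': query, 'rows': rows})}"


def dataset_titles(payload):
    """Pull dataset titles out of a CKAN-style search response."""
    return [dataset["title"] for dataset in payload["result"]["results"]]


def search(query, rows=5):
    """Fetch matching dataset titles (makes a network request)."""
    with urlopen(build_search_url(query, rows), timeout=30) as response:
        return dataset_titles(json.load(response))
```

Calling `search("climate change")` would return a short list of dataset titles if the endpoint behaves as assumed. The other portals in the list offer their own download pages and APIs, each with a different layout.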
References
Marr, B. (2022, October 12). Big data: 33 brilliant and free data sources anyone can use. Forbes. Retrieved January 18, 2023, from https://www.forbes.com/sites/bernardmarr/2016/02/12/big-data-35-brilliant-and-free-data-sources-for-2016/?sh=28483c2cb54d
Pure Storage. (n.d.). Big data vs. traditional data. The Beginners Guide to Big Data. Retrieved January 18, 2023, from https://www.purestorage.com/knowledge/big-data/big-data-vs-traditional-data.html
Treehouse Tech Group. (2021, May 20). Big data vs. traditional data: What's the difference? Retrieved January 18, 2023, from https://treehousetechgroup.com/big-data-vs-traditional-data-whats-the-difference/#:~:text=While%20traditional%20data%20is%20based,better%20performance%20and%20cost%20benefits.