Big Data and Cloud Computing
When discussing the relationship between big data and cloud computing, it helps
to define four terms: big data, cloud computing, the “cloud”, and data centers.
Big data refers to very large data sets produced by many different programs;
the term covers any data that is too large to process on a single computer
system (Sharma, n.d.). Cloud computing refers to any processing performed in
the cloud, which can include big data analytics (BDA). Essentially, big data is
the collected data itself, whereas cloud computing takes in that data remotely
and performs whatever operations are specified on it. The “cloud” is a set of
high-powered servers that can be owned by a single provider or by multiple
providers (Sharma, n.d.). A data center is a physical facility that companies
use to house their critical applications and data. Its design is based on a
network of computing and storage resources that enable the delivery of shared
applications and data (Cisco, 2022).
It is important to understand that data centers support the cloud, and cloud
computing supports big data analytics. In 2018, the research firm Gartner
declared that the data center is dead and predicted that 80% of enterprises
would shut down their traditional data centers by 2025. This overstates the
case, because data centers are still very much alive and essential to the
success of corporations. At the 2019 Data Center World conference, keynote
speaker Bill Kleyman, Executive Vice President of Digital Solutions at Switch,
stated that the cloud is not replacing the data center; it is complementing and
enhancing it (Miller, 2019).
“The State of the Data Center Report” backs up this assessment by noting that
enterprise IT is not going away; it is becoming more cloud-like, with the
adoption of containers, orchestration, and the OpenStack platform (Miller,
2019). Kleyman added that the cloud is becoming the new dominant data center
model. Cloud technologies are extending into the enterprise in the form of
private clouds and hybrid clouds that span both public and private cloud
platforms. Similarly, with the introduction and adoption of edge services, both
data and data services are moved closer to the user to limit the latency
experienced with current applications. Given these trends, data center networks
are expected to expand both in the number of facilities and in square footage
(Miller, 2019).
The possibilities are endless when big data and cloud computing are merged.
With big data alone, there would be vast amounts of data with great potential
but no practical way to process and analyze it. Analyzing that data on
individual computers would be unrealistic, largely because of the time the
analysis would take and the computing resources needed to support it. This is
where cloud computing helps: it provides state-of-the-art infrastructure for
the analysis, along with a pay-as-you-go option that puts this capability
within reach of more corporations, including those with smaller budgets
(Sharma, n.d.).
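The idea of spreading one analysis across many machines can be sketched in a
few lines. This example is not taken from the cited sources. It assumes a
hypothetical list of numeric records, and it uses local threads as stand-ins
for cloud worker nodes. The function names are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Split a list of records into consecutive chunks of at most chunk_size."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def summarize_chunk(chunk):
    """Per-worker step: return the (count, total) of one chunk."""
    return len(chunk), sum(chunk)


def distributed_mean(records, chunk_size=1000, workers=4):
    """Compute the mean by combining partial results from several workers.

    Threads stand in for remote cloud nodes here (an assumption made for
    illustration); a real deployment would send each chunk to another machine.
    """
    if not records:
        raise ValueError("records must not be empty")
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize_chunk, chunks))
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return total / count
```

Each worker only ever sees its own chunk, and only the small partial results
are combined at the end. That is what lets this kind of analysis grow with the
number of machines available.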
Cloud computing applications were largely instrumental in fueling the adoption
of big data, because big data is often collected by cloud-based applications.
Organizations collect big data because it can be turned into valuable
information in a matter of seconds. As a result, cloud computing services exist
largely in response to big data and big data analytics (Sharma, n.d.).
According to Malhotra (2018), big data involves collecting, storing, and
processing huge amounts of structured, semi-structured, and unstructured data.
It is commonly described by five aspects, known as the 5 V’s:
1. Volume – the amount of data
2. Variety – the different types of data
3. Velocity – the rate at which data flows through the system
4. Value – the value of the information contained within the data
5. Veracity – the trustworthiness and accuracy of the data
Cloud computing is made affordable by its pay-as-you-go model, which offers
three primary services:
· Infrastructure as a Service (IaaS) – the entire infrastructure and all
maintenance-related activities are provided by the service provider
· Platform as a Service (PaaS) – resources such as object storage, runtime,
queuing, and databases are offered by the service provider, but configuration
and implementation are left to the customer
· Software as a Service (SaaS) – the most widely used service; a software
delivery model that is licensed on a subscription basis and centrally hosted
When combined, big data and cloud computing can be categorized based on the
service type (Malhotra, 2018):
· IaaS in a Public Cloud – Big data services gain access to virtually unlimited
storage and computing power. This is a very cost-effective solution for
organizations, because the cloud provider is responsible for all costs
associated with managing the underlying hardware.
· PaaS in a Private Cloud – Big data technologies are offered as part of this
service model, which eliminates the complexity of managing individual software
and hardware elements, something that quickly becomes difficult when dealing
with terabytes of data.
· SaaS in a Hybrid Cloud – This model provides an essential platform for
conducting data analysis.
Some of the features that cloud computing offers are scalability, elasticity,
resource pooling, self-service, low costs, and fault tolerance (DataFlair,
2021). Scalability is provided through distributed computing, and elasticity
allows customers to pay only for the resources they are using. In cloud
computing, elasticity refers to the degree to which a system can autonomously
adapt to workload changes, so that at any time the amount of provisioned
resources matches the current demand on the system (DataFlair, 2021).
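The elasticity described above, where provisioned resources track demand, can
be illustrated with a minimal sketch. The capacity figure, the instance bounds,
and the function names are assumptions chosen for illustration. They do not
describe any particular provider's autoscaling service.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance,
                     min_instances=1, max_instances=20):
    """Return how many instances cover the current load, within fixed bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up so that the provisioned capacity is never below the demand.
    required = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, required))


def scaling_plan(load_samples, capacity_per_instance):
    """Map a series of load measurements to instance counts."""
    return [instances_needed(load, capacity_per_instance) for load in load_samples]
```

With a hypothetical capacity of 100 requests per second per instance,
`scaling_plan([50, 450, 1200, 90], 100)` yields `[1, 5, 12, 1]`. The instance
count rises with the spike and falls back afterwards, so the customer is not
paying for idle capacity.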
Following a multi-tenant model, resource pooling allows multiple organizations
to share the same resources. Self-service provides customers with an
easy-to-use interface where they can choose which services or resources they
want to use. Cloud computing charges customers only for the services they use.
This cost model spares the average user from purchasing expensive
infrastructure to support the computing resources they require. Finally, fault
tolerance provides the ability to recover when part of the cloud system fails
to respond (DataFlair, 2021).
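One common way to obtain the fault tolerance mentioned above is failover: if
one node does not respond, the request is retried on a redundant one. The
sketch below assumes this approach, which the cited source does not specify.
The replicas are hypothetical in-process functions, not real network calls.

```python
def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful response.

    `replicas` is a list of callables standing in for redundant servers
    (hypothetical; a real system would make network calls).
    """
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            errors.append(exc)  # remember the failure and try the next replica
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def failing_replica(request):
    """Simulates a node that does not respond."""
    raise ConnectionError("node unavailable")


def healthy_replica(request):
    """Simulates a node that answers normally."""
    return f"processed {request}"
```

Calling `call_with_failover([failing_replica, healthy_replica], "job-1")`
returns `"processed job-1"`. The caller never sees the first node's failure
unless every replica is down.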
Because of its many cost-saving benefits, cloud computing is currently one of
the most widely used methods of supporting big data analytics (BDA). In the
past, BDA was available only to corporations with a budget large enough to
purchase the data center infrastructure, software, and computing power needed
to support it. It also required many data scientists to mine and analyze the
data in the hope of finding patterns that yield valuable, informative insights.
Incorporating BDA into the cloud computing model opened the door for smaller
corporations and regular users to take part in big data analysis. In essence,
cloud computing opened BDA up to the rest of the world. The pay-as-you-go cost
model, the IaaS, PaaS, and SaaS service models, and the public, private, and
hybrid deployment models allow both customers and organizations to build a
cloud-based BDA solution that fits their needs and their budget.
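The budget argument can be made concrete with simple arithmetic. The prices in
the usage note are invented for illustration and are not quoted from any
provider. The sketch compares pay-as-you-go spending with a one-time
infrastructure purchase.

```python
def pay_as_you_go_cost(hours_used, hourly_rate):
    """Total cost when billed only for the hours actually used."""
    return hours_used * hourly_rate


def break_even_hours(upfront_cost, hourly_rate):
    """Hours of usage at which pay-as-you-go spending equals the upfront purchase."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return upfront_cost / hourly_rate
```

Under these assumed figures, a team that needs 200 hours of analysis at 2.50
per hour pays 500 in total. Against a hypothetical 50,000 infrastructure
purchase, it would take 20,000 hours of usage before owning the hardware
became the cheaper option.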
References
Cisco. (2022, December 26). What is a data center? Retrieved February 19, 2023,
from https://www.cisco.com/c/en/us/solutions/data-center-virtualization/what-is-a-data-center.html
DataFlair Team. (2021, May 28). Big Data and Cloud Computing - A Comprehensive
Guide. DataFlair. Retrieved February 19, 2023, from
https://data-flair.training/blogs/big-data-and-cloud-computing-comprehensive-guide/
Malhotra, A. (2018, July 21). Big Data and cloud computing – A perfect
combination. Whizlabs Blog. Retrieved February 19, 2023, from
https://www.whizlabs.com/blog/big-data-and-cloud-computing/
Miller, R. (2019, March 21). AFCOM: No, the cloud isn’t replacing the Data
Center. Data Center Frontier. Retrieved February 19, 2023, from
https://www.datacenterfrontier.com/cloud/article/11429692/afcom-no-the-cloud-isn8217t-replacing-the-data-center
Sharma, G. (n.d.). Big Data & Cloud Computing. IEEE Computer Society. Retrieved
February 19, 2023, from
https://www.computer.org/publications/tech-news/trends/big-data-and-cloud-computing
Thorn Tech Staff. (2022, July 21). Big Data in the Cloud. Thorn Technologies.
Retrieved February 19, 2023, from https://thorntech.com/big-data-in-the-cloud/