Big Data and Cloud Computing

 

When discussing the relationship between big data and cloud computing, it is important to understand a few terms: big data, cloud computing, the “cloud”, and data centers. Big data refers to very large data sets produced by many different programs; the term covers any collection of data that is too large to process on a single computer system (Sharma, n.d.). Cloud computing refers to any processing performed in the cloud, including big data analytics (BDA). Essentially, big data is the large data sets that are collected, whereas cloud computing takes in that data remotely and performs the specified operations on it. The “cloud” is a set of high-powered servers that can be owned by one or several providers (Sharma, n.d.). A data center is a physical facility that companies use to house their critical applications and data. Its design is based on a network of computing and storage resources that enable the delivery of shared applications and data (Cisco, 2022).

It is important to understand that data centers support the cloud, and cloud computing supports big data analytics. The research firm Gartner has declared that the data center is dead and predicted that 85% of corporate data centers will be closed by 2025. That claim is overstated, because data centers are still very much alive and essential to the success of corporations. At the 2019 Data Center World conference, keynote speaker Bill Kleyman, Executive Vice President of Digital Solutions at Switch, stated that the cloud is not replacing the data center; it is complementing and enhancing it (Miller, 2019).

“The State of the Data Center Report” backs up this assessment, noting that enterprise IT is not going away; it is becoming more cloud-like through the adoption of containers, orchestration, and the OpenStack platform (Miller, 2019). Kleyman also stated that the cloud is becoming the new dominant data center model. Cloud technologies are extending into the enterprise in the form of private clouds and hybrid clouds that span both public and private cloud platforms. Similarly, with the introduction and adoption of edge services, data and data services are moving closer to the user to limit the latency experienced with current applications. Given these trends, data center networks are expected to expand in both the number of facilities and total square footage (Miller, 2019).

The possibilities are endless when big data and cloud computing are merged! With big data alone, there would be vast amounts of data with great potential but no practical way to process and analyze it. Analyzing the data on ordinary in-house computers would be unrealistic, largely because of the time the analysis would take and the computing resources needed to support it. This is where cloud computing helps: it provides the state-of-the-art infrastructure that supports the analysis, along with a pay-as-you-go option that puts this capability within reach of more corporations, including those with smaller budgets (Sharma, n.d.).
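The idea of spreading an analysis over many machines, rather than running it on a single computer, can be illustrated with a small sketch. The example below is only an illustration under stated assumptions: the function names are hypothetical, a local thread pool stands in for a group of cloud servers, and a simple word count stands in for a real analysis.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_chunks):
    """Partition the records into roughly equal chunks, one per worker."""
    size = max(1, -(-len(records) // num_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_words(chunk):
    """Map step: each worker counts the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(records, num_workers=4):
    """Run the map step on every chunk, then merge the partial results."""
    chunks = split_into_chunks(records, num_workers)
    # The thread pool is a local stand-in (an assumption) for cloud servers.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # reduce step: add the partial counts together
    return total


if __name__ == "__main__":
    logs = [
        "big data in the cloud",
        "cloud computing supports big data",
        "data centers support the cloud",
    ]
    print(distributed_word_count(logs, num_workers=2).most_common(3))
```

In a real cloud deployment, the chunks would live on separate machines and a managed service would handle the scheduling; the split, process, and merge pattern is the part that carries over.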

Cloud computing applications were instrumental in fueling the adoption of big data, because big data is often collected by cloud-based applications. Organizations collect big data because, with the right tools, it can be distilled into valuable information in a matter of seconds. Because of this, cloud computing services exist largely in response to big data and big data analytics (Sharma, n.d.).

According to Malhotra (2018), big data involves collecting, storing, and processing huge amounts of structured, semi-structured, and unstructured data. Its five defining aspects, often referred to as the 5 V’s, are:

1.     Volume – the amount of data

2.     Variety – the different types of data

3.     Velocity – the data flow rate in the system

4.     Value – the value of the information based within the data

5.     Veracity – the accuracy and trustworthiness of the data

Cloud computing stays affordable through its pay-as-you-go model, which is offered as three primary services:

·       Infrastructure as a Service (IAAS) – the entire infrastructure and all maintenance-related activities are provided by the service provider

·       Platform as a Service (PAAS) – resources such as object storage, runtime, queuing, databases, etc. are offered by the service provider but the configuration and implementation are dependent on the customer

·       Software as a Service (SAAS) – the most widely used service; a software delivery model that is centrally hosted and licensed on a subscription basis

When combined, big data and cloud computing can be categorized based on their different service types (Malhotra, 2018):

·       IAAS in a Public Cloud – Big data services offer access to virtually unlimited storage and computing power. This is a very cost-effective solution for organizations, because the cloud provider is responsible for all costs associated with managing the underlying hardware.

·       PAAS in a Private Cloud – Big data technologies are offered as part of this service model, which eliminates the complexity of managing individual software or hardware elements, something that can quickly become difficult when dealing with terabytes of data.

·       SAAS in a Hybrid Cloud – This model provides an essential platform for conducting data analysis.

Some of the features that cloud computing offers are scalability, elasticity, resource pooling, self-service, low costs, and fault tolerance (DataFlair Team, 2021). Scalability is provided through distributed computing, while elasticity allows customers to pay only for the resources they are using. In cloud computing, elasticity refers to the degree to which a system can autonomously adapt to workload changes, so that at any time the amount of provisioned resources matches the current demand on the system (DataFlair Team, 2021).
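Elasticity, as described above, can be sketched as a rule that recomputes the number of servers whenever the workload changes. This is a simplified illustration and not how any particular provider implements autoscaling; the function names, the capacity figure, and the server limits are all assumptions made for the example.

```python
import math


def servers_needed(current_load, capacity_per_server, min_servers=1, max_servers=20):
    """Return the smallest server count that covers the load, within limits."""
    required = math.ceil(current_load / capacity_per_server)
    return max(min_servers, min(max_servers, required))


def simulate_elastic_scaling(hourly_load, capacity_per_server):
    """Recompute the server count for each hour so supply tracks demand."""
    return [servers_needed(load, capacity_per_server) for load in hourly_load]


if __name__ == "__main__":
    # Hypothetical requests per hour; each server is assumed to handle 100.
    hourly_load = [50, 400, 1200, 300, 0]
    elastic = simulate_elastic_scaling(hourly_load, capacity_per_server=100)
    fixed = max(elastic) * len(hourly_load)  # provisioning for the peak all day
    print("servers per hour:", elastic)
    print("elastic server-hours:", sum(elastic), "fixed server-hours:", fixed)
```

With these made-up numbers, the elastic schedule uses 21 server-hours, whereas provisioning for the peak throughout would use 60, which is the saving that paying only for resources in use is meant to capture.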

Following a multi-tenant model, resource pooling allows multiple organizations to share the same resources. Self-service provides customers with an easy-to-use interface where they can choose which services or resources they want to use. Cloud computing charges customers only for the services they use. This cost model spares the average user from purchasing expensive infrastructure to support the computing resources they require. Finally, fault tolerance provides the ability to recover when part of the cloud system fails to respond (DataFlair Team, 2021).
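One common way to recover when part of a system fails to respond is to keep several copies of a service and fall back to the next copy. The sketch below assumes that approach purely for illustration; the replica functions and the exception name are hypothetical and do not represent a specific cloud provider's API.

```python
class ReplicaUnavailable(Exception):
    """Raised by a replica that cannot serve the request."""


def read_with_failover(replicas, key):
    """Try each replica in order and return the first successful answer."""
    errors = []
    for replica in replicas:
        try:
            return replica(key)
        except ReplicaUnavailable as exc:
            errors.append(str(exc))  # remember the failure and keep going
    raise ReplicaUnavailable("all replicas failed: " + "; ".join(errors))


def failed_replica(key):
    """Hypothetical replica that is currently not responding."""
    raise ReplicaUnavailable("replica-a is down")


def healthy_replica(key):
    """Hypothetical replica that answers normally."""
    return f"value-for-{key}"


if __name__ == "__main__":
    print(read_with_failover([failed_replica, healthy_replica], "sales"))
```

The caller still receives an answer even though the first replica is down, and an error is raised only when every copy has failed.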

Because of its many cost-saving benefits, cloud computing is currently one of the most widely used methods of supporting big data analytics (BDA). In the past, BDA was available only to corporations with a budget large enough to purchase the data center infrastructure, software, and computing power needed to support it. It also required many data scientists to mine and analyze the data in the hope of finding patterns that would yield valuable, informative insights.
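The difference between buying infrastructure up front and paying only for what is used comes down to simple arithmetic. The sketch below is a rough comparison in which every price and usage figure is invented for illustration; real cloud pricing is more detailed, and the function names are hypothetical.

```python
def on_premises_cost(upfront_hardware, monthly_upkeep, months):
    """Total cost of owning infrastructure: purchase price plus upkeep."""
    return upfront_hardware + monthly_upkeep * months


def pay_as_you_go_cost(hourly_rate, hours_used_per_month, months):
    """Total cost when billed only for the hours actually used."""
    return hourly_rate * hours_used_per_month * months


def break_even_months(upfront_hardware, monthly_upkeep, hourly_rate, hours_used_per_month):
    """Months until cumulative cloud spending equals the cost of owning.

    Returns None when the monthly cloud bill never exceeds the upkeep,
    meaning the cloud option stays cheaper indefinitely.
    """
    monthly_cloud = hourly_rate * hours_used_per_month
    if monthly_cloud <= monthly_upkeep:
        return None
    return upfront_hardware / (monthly_cloud - monthly_upkeep)


if __name__ == "__main__":
    # Hypothetical small company: light usage, one year of analysis work.
    print("own:", on_premises_cost(100_000, 1_000, months=12))
    print("cloud:", pay_as_you_go_cost(2.0, 100, months=12))
    print("break-even (heavy use):", break_even_months(100_000, 1_000, 5.0, 600))
```

For light or occasional use, the pay-as-you-go total stays far below the purchase price, which is why the model suits smaller budgets; sustained heavy use eventually reaches a break-even point.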

Bringing BDA into the cloud computing model opened the doors for smaller corporations and regular users to take part in big data analysis. In essence, cloud computing opened BDA up to the rest of the world. With the pay-as-you-go cost model, the IAAS, PAAS, and SAAS service models, and public, private, and hybrid cloud deployments, both customers and organizations can build a cloud-based BDA model that fits their needs and their budget.

 

References

DataFlair Team. (2021, May 28). Big Data and Cloud Computing - A Comprehensive Guide. DataFlair. Retrieved February 19, 2023, from https://data-flair.training/blogs/big-data-and-cloud-computing-comprehensive-guide/

Malhotra, A. (2018, July 21). Big Data and cloud computing – A perfect combination. Whizlabs Blog. Retrieved February 19, 2023, from https://www.whizlabs.com/blog/big-data-and-cloud-computing/

Miller, R. (2019, March 21). AFCOM: No, the cloud isn’t replacing the Data Center. Data Center Frontier. Retrieved February 19, 2023, from https://www.datacenterfrontier.com/cloud/article/11429692/afcom-no-the-cloud-isn8217t-replacing-the-data-center

Sharma, G. (n.d.). Big Data & Cloud Computing. IEEE Computer Society. Retrieved February 19, 2023, from https://www.computer.org/publications/tech-news/trends/big-data-and-cloud-computing

Thorn Tech Staff. (2022, July 21). Big Data in the Cloud. Thorn Technologies. Retrieved February 19, 2023, from https://thorntech.com/big-data-in-the-cloud/

Cisco. (2022, December 26). What is a data center? Retrieved February 19, 2023, from https://www.cisco.com/c/en/us/solutions/data-center-virtualization/what-is-a-data-center.html

 

