Back in 2014, NASA launched a satellite called OCO-2 (Orbiting Carbon Observatory 2) to gain more insight into the Earth’s carbon uptake. Two years later, they had petabytes of gathered data that needed to be processed. Using on-premises data centers, this processing task would have taken over 3 months and cost about $200,000. NASA, however, decided to go with AWS cloud services for this job, which cut the cost to $7,000 and got the whole thing done in less than a week.
This is one of the most famous examples of using the cloud computing practice known as Cloud Bursting.
It perfectly exemplifies just how powerful this high-volume data processing solution really is, in terms of both cost-effectiveness and productive capacity. Luckily, NASA is not the only organization that has the privilege to utilize cloud bursting. You can do it, too.
What is Cloud Bursting?
Cloud bursting is a way of utilizing hybrid cloud infrastructures to more quickly and efficiently process high data volumes and spikes in traffic by spreading the computing load. The hybrid cloud architecture model enables companies to use both private cloud infrastructure and public cloud computing resources – like AWS, Microsoft Azure, or Google Cloud Platform.
Essentially, cloud bursting involves an app configuration that enables the private cloud to seamlessly and temporarily burst into the public cloud in order to use additional computing power and resources.
Hybrid cloud environments and deployment practices can simply involve unrelated apps being statically hosted across different data centers. Cloud bursting, by contrast, refers to a dynamic deployment model that enables you to leverage the power and the elasticity of public cloud services whenever processing demand overwhelms the capacity of your private cloud infrastructure.
The companies that use cloud bursting still use the private cloud as their primary deployment but apply public cloud resources to accommodate traffic spikes. Once the traffic spike is over and the load goes back to normal levels, the public cloud is no longer used.
When to Use Cloud Bursting?
Both consumer websites and software development environments can face substantial traffic spikes that could lead to crashes. For example, when developing software, performing pre-release testing for complex apps often involves spinning up a staging environment that can drain huge amounts of capacity. This decreases the performance of other business applications, sometimes for weeks.
Numerous industries have evolved rapidly and now accumulate and use vast amounts of data they gather from logs, IoT ecosystems, social media platforms, online transactions, etc. This practice offers huge and actionable potential for improving one’s business operations, but it doesn’t come without a caveat. The trade-off here is the aforementioned drain on computing resources during high-volume data analysis and processing.
The solution for this is cloud bursting as it allows companies to mitigate this negative impact on business-critical operations.
It is a cost-effective way for businesses to leverage the advantages of the public cloud without fully migrating their computing architecture to this cloud model, which is quite useful for those who want to avoid vendor lock-in, for example. Private clouds lack the much-needed scalability that enables storage and compute resources to be expanded according to current computing demands, which is something public cloud models excel at.
For What Purposes Do Companies Typically Use Cloud Bursting?
Software Development & Analytics
DevOps teams frequently use multiple VMs for testing purposes, but these machines are active only for short timeframes, which makes them a good fit for temporary public cloud capacity. Cloud bursting is also used for CI/CD-based processes, as they run many short one-off jobs whenever new commits are pushed.
Marketing Campaigns
Powerful and well-thought-out marketing campaigns are capable of generating big traffic spikes that often require additional cloud resources. This is where cloud bursting comes into play to help the systems withstand these influxes of traffic.
Big Data Queries and Modeling Tasks
Businesses that operate within the Big Data industry often perform one-off queries and/or generate substantial models capable of exceeding private cloud capacities. Processes like high-fidelity 3D rendering, AI and ML model training, and autonomous vehicle simulation are good candidates for cloud bursting.
Seasonal Traffic Peaks
Organizations that experience spikes in traffic during specific seasons of the year tend to require additional computational power during peak periods. These include e-commerce platforms during the holiday rush, financial processing at the end of business quarters, etc.
Cloud Bursting Advantages
Both large enterprises and small companies can benefit from using cloud bursting. The practice provides 3 main benefits:
- Cost Efficiency – the public cloud is used only for occasional peaks in demand, which keeps costs at a minimum; once the public cloud resources are no longer needed, they get decommissioned so you only pay for what you use while you’re using it. Additionally, public cloud providers offer multiple pricing and performance tiers so companies can choose optimal performance levels according to their needs.
- Flexibility and Scalability – cloud bursting enables companies to quickly adjust to changes in capacity needs and leverage the scalability of public cloud services, thus freeing up their private cloud resources.
- Business Continuity – apps can be seamlessly burst over into the public cloud without interrupting the users.
The Challenges of Cloud Bursting
- Compatibility – one of the main hurdles of cloud bursting is making sure that your apps are compatible with the public cloud architecture so that they can scale seamlessly across the new environment. Compatibility is what allows the load to be balanced and permissions to be managed the right way.
- Networking – some businesses have trouble creating redundant connections between public and private clouds, as these connections must offer low latency and high bandwidth.
- Vendor Lock-in – the risk of vendor lock-in may be an issue for some companies, as certain public cloud providers offer cloud bursting services as part of the package.
- Security and Data Protection – security layers and backups can be challenging to manage when they come from multiple sources.
How to Implement Cloud Bursting
Typically, there are three ways to implement cloud bursting:
- Distributed Load-Balancing
- Manual Bursting
- Automated Bursting
Distributed Load-Balancing
This approach provisions cloud resources (storage, compute instances, monitoring) and then deploys data center workloads to the cloud services that have been provisioned. Load monitoring applied to local workloads provides the data needed for traffic redirection. The company needs to set load thresholds and distribute traffic accordingly.
When traffic exceeds the set threshold, the same workload environment is activated in the public cloud and the excess traffic is moved from the local workload to the cloud. Similarly, when traffic drops below the set threshold, it gets redirected back to the local data center, while the public cloud resources for that particular load get decommissioned.
As this method deploys workloads both locally and in the cloud, the traffic is shared with the cloud when needed. This means that a standby limited-capacity cloud-based deployment needs to be set up and scaled up according to traffic requirements, potentially leading to overheads during periods when the cloud workload is inactive.
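The threshold logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BurstRouter` class, its `threshold_rps` field, and the idea of measuring load in requests per second are hypothetical names and units chosen for the example, and a real setup would delegate the actual routing to a load balancer.

```python
from dataclasses import dataclass


@dataclass
class BurstRouter:
    """Hypothetical router that splits traffic at a fixed load threshold."""

    threshold_rps: float       # assumed load threshold, in requests per second
    cloud_active: bool = False  # whether the public cloud deployment is in use

    def route(self, current_rps: float) -> dict:
        """Return how much of the current load goes to each environment."""
        if current_rps > self.threshold_rps:
            # Above the threshold: activate the cloud workload and send
            # only the overflow there; the local data center stays full.
            self.cloud_active = True
            local = self.threshold_rps
            cloud = current_rps - self.threshold_rps
        else:
            # At or below the threshold: everything is served locally and
            # the cloud resources can be decommissioned.
            self.cloud_active = False
            local, cloud = current_rps, 0.0
        return {"local": local, "cloud": cloud}


if __name__ == "__main__":
    router = BurstRouter(threshold_rps=1000)
    for load in (800, 1500, 900):
        print(load, router.route(load), "cloud active:", router.cloud_active)
```

A production setup would probably add a separate, lower threshold for scaling back down, so that load hovering around the limit does not repeatedly start and stop the cloud deployment.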
Manual Bursting
This method lets you manually provision/de-provision public cloud resources and is typically used to create large, temporary cloud deployments that get destroyed when they are no longer needed in order to minimize costs. This technique is beneficial for testing and proof-of-concept cloud bursting projects. The downside is an increased risk of human error, including potential notification delays and deployment oversights.
Automated Bursting
This approach includes setting up policies that determine the way cloud bursting is managed and then letting the software conduct the task. Automated bursting platforms are capable of automatically scaling and removing cloud resources according to traffic needs and typically use APIs to facilitate dynamic interactions with cloud infrastructure and resources. Automated bursting minimizes the risk of human error.
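As a rough illustration of what such a policy might look like, the Python sketch below computes how many public cloud instances a given load calls for. Everything in it is an assumption made for the example: `BurstPolicy`, its fields, and the `reconcile` function are hypothetical, and the calls to a real provider API that would actually start or stop instances are deliberately left out.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class BurstPolicy:
    """Hypothetical bursting policy; all values are assumptions."""

    private_capacity: int       # load the private cloud can absorb on its own
    per_instance_capacity: int  # load a single public cloud instance can absorb
    max_instances: int          # hard cap on instances, to bound spending


def desired_cloud_instances(policy: BurstPolicy, load: int) -> int:
    """Number of public cloud instances needed to absorb the overflow."""
    overflow = max(0, load - policy.private_capacity)
    needed = math.ceil(overflow / policy.per_instance_capacity)
    return min(needed, policy.max_instances)


def reconcile(policy: BurstPolicy, load: int, running: int) -> int:
    """Signed change to apply: positive = provision, negative = decommission."""
    return desired_cloud_instances(policy, load) - running


if __name__ == "__main__":
    policy = BurstPolicy(private_capacity=1000,
                         per_instance_capacity=250,
                         max_instances=10)
    print(reconcile(policy, load=1600, running=1))  # scale up
    print(reconcile(policy, load=500, running=3))   # scale back down
```

An automated platform would run a check like `reconcile` on a schedule or on monitoring events and translate the result into API calls against the cloud provider.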
Banks are perhaps the best example of cloud bursting used the right way. Banks process high-volume data sets that require short, frequent and powerful computing bursts (to meet compliance reporting requirements). It is not efficient to invest in on-premises systems that only intermittently process large data sets, because the servers remain idle most of the time. Similarly, it is not cost-efficient to run public cloud stacks non-stop, as that would significantly bump up operating expenses.
This is why cloud bursting is a perfect solution as it optimizes storage capacity and infrastructure costs, while it also provides data scientists with numerous handy insights.
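The trade-off between idle on-premises servers, an always-on public cloud, and bursting can be made concrete with a back-of-the-envelope calculation. The Python sketch below is a simplified model, not a pricing tool: the function names are hypothetical, every number in the example is a made-up placeholder rather than real pricing, and costs such as data transfer and storage are ignored.

```python
def on_prem_cost(peak_servers: int, monthly_cost_per_server: float,
                 months: int) -> float:
    """Cost of owning enough servers to cover the peak all year round."""
    return peak_servers * monthly_cost_per_server * months


def always_on_cloud_cost(peak_instances: int, hourly_rate: float,
                         months: int, hours_per_month: int = 730) -> float:
    """Cost of running peak-sized public cloud capacity non-stop."""
    return peak_instances * hourly_rate * hours_per_month * months


def bursting_cost(base_servers: int, monthly_cost_per_server: float,
                  burst_instances: int, hourly_rate: float,
                  burst_hours_per_month: int, months: int) -> float:
    """Cost of a smaller private base plus cloud instances used only in bursts."""
    base = base_servers * monthly_cost_per_server * months
    burst = burst_instances * hourly_rate * burst_hours_per_month * months
    return base + burst


if __name__ == "__main__":
    # Placeholder figures, chosen only to show how the comparison works.
    print("on-premises:", on_prem_cost(10, 300, 12))
    print("always-on cloud:", always_on_cloud_cost(10, 1, 12))
    print("bursting:", bursting_cost(4, 300, 6, 1, 40, 12))
```

With real figures plugged in, the comparison shows how strongly the result depends on the number of burst hours per month: the fewer and shorter the peaks, the more bursting pays off.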
Here’s a use case example of an effective cloud bursting method deployed by a certain bank:
- A data integration platform is used to automatically move and load data to AWS storage.
- The data is processed in batches.
- A script is deployed that uses a web server to convert data from one structure to another, using Hadoop on Amazon EMR (Elastic MapReduce) or the Amazon Redshift data warehouse.
- When the data is loaded and ready in the temporary cloud, data analysts use it to come up with and test different risk and interest-rate return scenarios.
Is Cloud Bursting for Me?
The use of cloud bursting provides numerous benefits like cost reduction, improved operational efficiency, better performance and increased overall productivity. However, not all companies can reap all these advantages.
That said, here’s what cloud bursting is ideal for:
- Apps that are mainly used for reading storage data (like content delivery systems).
- Database apps that optimize performance by performing sharding – a method of splitting and storing a single logical dataset in multiple databases.
- Apps for big data analyses (as these quickly process large data volumes).
- Artificial Intelligence and Machine Learning models that use large-scale infrastructures for model training.
- Data streams that fluctuate on a daily basis (but some pre-processing is required to mitigate numerous data migrations).
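To illustrate the sharding idea mentioned in the list above, here is a minimal Python sketch of hash-based sharding, which is one common way to split a logical dataset. It is an assumption-laden toy: `shard_for` and `ShardedStore` are hypothetical names, and plain in-memory dictionaries stand in for the separate databases a real system would use.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store; each dict stands in for a separate database."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    store.put("customer:42", {"name": "Ada"})
    print(store.get("customer:42"))
    print("stored on shard", shard_for("customer:42", 4))
```

Because each key always maps to the same shard, reads and writes for different keys can be served by different databases, some of which could live in the public cloud during a burst. Note that with this simple modulo scheme, changing the number of shards remaps most keys.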
And here’s what cloud bursting is not an optimal solution for:
- Apps that use low-latency write operations.
- Science-based apps that use simulations with high node-to-node traffic that cannot be sustained by cloud bursting methods.