Sitting on (at least) two cloud chairs at once

AWS Hybrid Cloud Migration Case Study

How SuperAdmins assisted the client on a journey from the necessity of a hybrid setup to a full cloud migration.

Case Study Overview:

Our Company: SuperAdmins (acting, for the purposes of this case study, as a hybrid cloud management company)
Client Industry: Real Estate Image Processing
Goal: Provide a seamless transition to AWS
Challenges: Run much of the code on AWS while the data, API, bastion, and monitoring still lived on-prem
Services Provided: AWS Cloud architecting, networking, automation
Campaign Duration: ~ 30 hours
Results: Gained service elasticity (managed, horizontally scalable services) and prepared the ground for complete automation

Problem Presented

The client contacted us with a request to move to the public cloud. Their infrastructure was spread across two DCs (data centers), and they needed a hybrid cloud management company to make an additional PoP (point of presence) in the cloud possible. The plan was to eventually move their production, as well as every other environment, into the cloud, but for the time being the hybrid model would have to do.

 

Our goal was to make everything work from three locations and prepare for a seamless public cloud transition that would eventually take place.

 

We were given just 3 days to come up with a plan, find ways to connect the three clouds, spin up the PoC environment, move the data from on-prem NFS storage to the ‘ethereal’ EFS and keep the two in sync, and, lastly, monitor it all.

 

To put it all together, we had to run the show across three cloud providers, set up a lot of VPNs, and somehow make everything ready for a seamless transition when the launch date came. That date was unknown to us until the very end, and when a hybrid cloud management company doesn’t have a launch date, it has to make every day count.

Moving to AWS – All Hands on Deck

As with most migrations, we were lucky enough to be able to build the house first, buy the furniture, and only later plan how to make the new environment look exactly like the old place – but better! In this case, however, the old place had a big old library that was used around the clock and, for all intents and purposes, would continue to be used. We needed it in the new environment without anyone noticing it had ever moved: in the cloud, up to date, and readily available.

 

EFS to the rescue!

While AWS offers many storage options, they can all roughly be classified into three categories:

  • Object-based storage solutions
  • Block-based storage solutions
  • File-based storage solutions

While all of us are programmed to reach for the lowest-hanging fruit first, object-based storage like S3 was not the right solution here. It does provide low-latency access, but only in certain scenarios, and it cannot be mounted by the OS as a file system, which is something we needed for this client. The same goes for EBS: as block storage attached exclusively to EC2 instances, it cannot be mounted at multiple locations at the same time, and its performance is not great for a shared environment.

 

This leaves us with EFS – AWS’s solution for low and consistent latency file-based storage.

AWS lists its use cases as: “Web serving and content management, enterprise applications, media and entertainment, home directories, database backups, developer tools, container storage, big data analytics”. The first part fits our scenario perfectly.
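
For context, mounting an EFS file system from a Linux EC2 instance is essentially a single NFSv4 mount. The sketch below uses placeholder values for the file-system ID, region, and mount point rather than anything from this project:

    # Hedged example: mount an EFS file system on an EC2 instance over NFSv4.
    # The file-system ID, region and mount point are placeholders.
    sudo mkdir -p /mnt/efs
    sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
         fs-0123456789abcdef0.efs.eu-west-1.amazonaws.com:/ /mnt/efs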

 

Time to move the data. AWS boasts about the elasticity of the service, advertising the option to mount EFS on on-premises servers, which may sound better than it really is. At your disposal you have the AWS DataSync service, which spins up a specialized EC2 instance that uses an in-house transfer protocol to move your data quickly, and Direct Connect, which is exactly what it says: a dedicated connection between your VPC and your data centre.

 

Since both options offered by AWS involved additional costs, we agreed with the client to discard those and get crafty instead.

 

Our setup featured four EC2 instances, one of which had an Elastic IP address, EFS mounted on all of them, and finally one RDS instance. All of that lived comfortably in a brand-new VPC with private and public subnets for better security, in line with the Well-Architected Framework.
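
To make the shape of that environment concrete, here is an illustrative AWS CLI sketch of the same building blocks; every ID, CIDR range, and credential is a placeholder, not a value from the actual setup:

    # Illustrative only: a rough outline of the building blocks, not the
    # exact commands or values used in this project.
    aws ec2 create-vpc --cidr-block 10.0.0.0/16                        # new VPC
    aws ec2 create-subnet --vpc-id vpc-0abc --cidr-block 10.0.1.0/24   # public subnet
    aws ec2 create-subnet --vpc-id vpc-0abc --cidr-block 10.0.2.0/24   # private subnet
    aws efs create-file-system --performance-mode generalPurpose       # shared EFS
    aws ec2 allocate-address --domain vpc                              # Elastic IP for the "entry" instance
    aws ec2 associate-address --instance-id i-0def --allocation-id eipalloc-0123
    aws rds create-db-instance --db-instance-identifier app-db \
        --engine mysql --db-instance-class db.t3.medium \
        --allocated-storage 50 --master-username admin \
        --master-user-password 'placeholder'                           # the single RDS instance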

Cloud-Connected

Since time was of the essence, copying the data from the NFS share to the EFS share had to be done as quickly as possible. Our initial solution was simply to use SSHFS to connect from the on-prem server to the instance with the Elastic IP, mount the (already mounted) EFS share locally, and run a plain local rsync between the two directories.
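
A rough sketch of that first hop, as run on the on-prem server; the hostnames, paths, user, and key file are all placeholders:

    # On the on-prem server: expose the EC2 instance's EFS mount locally
    # over SSHFS (all names and paths are placeholders).
    sudo apt-get install -y sshfs
    mkdir -p /mnt/efs-remote
    sshfs -o IdentityFile=/home/appuser/.ssh/id_rsa,reconnect \
          appuser@ec2-eip.example.com:/mnt/efs /mnt/efs-remote
    # From here, the copy is a plain local rsync between the two directories.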

There were two more obstacles to overcome. The data was purposely owned by a specific user, which was a member of a specific group. Both were recreated on AWS; however, copying the data over SSHFS would let us connect as any user except the one we needed. The owner of the files was never supposed to be able to log in, and root SSH access is disabled by default on EC2 instances. Using any other user would have meant copying the files as that user and running chown and chmod afterwards. With some 40 million files, that was not an option, and every subsequent synchronization would again leave files with the wrong ownership and permissions.
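
To illustrate the fix-up we were trying to avoid: after copying as the wrong user, every synchronization would have to be followed by something like the commands below (user, group, paths, and modes are placeholders), which simply does not scale to roughly 40 million files:

    # The post-copy fix-up we wanted to avoid: re-owning and re-permissioning
    # the whole tree after every synchronization. User, group, paths and
    # modes are placeholders.
    sudo chown -R appuser:appgroup /mnt/efs
    sudo find /mnt/efs -type d -exec chmod 750 {} +
    sudo find /mnt/efs -type f -exec chmod 640 {} +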

 

The solution was to perform the operation as the owning user account and use an SSH keypair so we could automate it later on. This preserved all the attributes and kept every new synchronization correct. With the new settings in place, the data was on EFS in a mere 3 hours, with all the ownership and permissions intact.
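
A minimal sketch of the copy itself, assuming the SSHFS mount from earlier and run as the owning user; the directories and exact flags are assumptions rather than the precise command we used:

    # Run as the owning user so files land on EFS with the right owner.
    # -a preserves permissions, timestamps and group, -H keeps hard links,
    # --delete makes repeat runs an exact mirror. Directories are placeholders.
    rsync -aH --delete /mnt/nfs-share/ /mnt/efs-remote/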

 

The second obstacle was to keep the data in sync until the time came to switch the environments.

 

The good old cron service made sure repeated synchronizations ran on schedule and that the data stayed in sync and ready. Because we were using the keypair, the sync operation never required any interaction.
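
The schedule can be as simple as a single crontab entry for the owning user; the interval, paths, and log file below are placeholders:

    # Hedged sketch of the unattended job (installed with `crontab -e` as the
    # owning user). The keypair-backed SSHFS mount means no password prompts,
    # so the sync reruns on schedule with no interaction.
    0 * * * * rsync -aH --delete /mnt/nfs-share/ /mnt/efs-remote/ >> /home/appuser/efs-sync.log 2>&1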

 

By now, we had the current production in one DC, the future production environment in AWS, and the bastion with the monitoring server in a third DC. The setup was not ours to question, so we went with it. Everything had to be securely interconnected for us to start monitoring, and for our client’s team to be able to start working on the new environment.

 

It was time for some VPN exercise.

 

We opted for IPsec in this case, since it is the default site-to-site solution on AWS. One of the DCs supports it natively, and the other got an Ubuntu VM running StrongSwan to act as a gateway between AWS and all the VMs in the subnet that needed the route.

The StrongSwan VM connected the client’s old production environment with AWS through redundant tunnels, both active and always up, with a bash script provided by AWS switching between them if one failed. Having set that up and tested the failover, we turned our attention to the other DC. This one was a VMware Cloud implementation of sorts that unfortunately did not support redundant connections, so we set up a single tunnel, added the static routing – and voilà – our bastion host could reach the instances in the private subnet and, more importantly, so could the monitoring server.
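
For a sense of what the gateway side looks like, here is a minimal StrongSwan sketch of one of the two tunnels, loosely modelled on the generic configuration AWS generates for a Site-to-Site VPN; every address, subnet, and the pre-shared key are placeholders:

    # On the Ubuntu gateway VM (placeholder addresses throughout).
    sudo apt-get install -y strongswan

    # /etc/ipsec.conf -- one of the two redundant tunnels; the second is an
    # analogous "conn aws-tunnel-2" block pointing at the other AWS endpoint.
    conn aws-tunnel-1
        auto=start
        type=tunnel
        authby=secret
        keyexchange=ikev1
        ike=aes128-sha1-modp1024
        esp=aes128-sha1
        left=%defaultroute
        leftid=203.0.113.10          # on-prem public IP
        leftsubnet=192.168.10.0/24   # on-prem subnet routed to AWS
        right=198.51.100.20          # AWS tunnel 1 outside IP
        rightsubnet=10.0.0.0/16      # VPC CIDR

    # /etc/ipsec.secrets
    203.0.113.10 198.51.100.20 : PSK "placeholder-pre-shared-key"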

 

IPsec served one more purpose, and a very important one at that. The client’s API servers were still in the old DC, with no intention of moving anytime soon. This made the redundant nature of the StrongSwan IPsec configuration a big plus, since it let us sit on at least two cloud chairs for a while. The API servers are a crucial part of the system, and they always have to be reachable from the AWS part of the infrastructure.

Case Study Results: A Brave New World

Let us recap. We’re on AWS now. Officially in the cloud. Many of the services we once managed ourselves are now managed by AWS. The VPN is made possible by Amazon’s implementation of IPsec, which was very easy to set up. The database was quickly moved to the managed RDS MySQL 8 service. The VMs were replaced by a consumption-based EC2 model, which opens up tremendous opportunities for both vertical and horizontal scaling (something we’ll be using with this client soon). Monitoring and a bastion host now sit in a separate DC with access to both the old and the new infrastructure, keeping them independent in case of failures. This setup also let us define dependencies across DCs while monitoring hosts and services.

 

Finally, our infrastructure roughly looks something like this:

A migration scenario like this one can be a real challenge. It’s a merry-go-round of changes, connections, dependencies, and costs. You take one thing out, introduce another, and gradually move into the cloud, either completely or by migrating the crucial services while leaving certain bits independent and located elsewhere, as we did with the monitoring and bastion servers.

 

This case was very interesting and presented quite a few challenges for us. To name a few: we took it upon ourselves to do the VPN and networking work instead of calling in our dedicated networking team, and AWS rewarded us with a simple, easy-to-use service that was a breeze to set up. Next, we saw first-hand how EFS behaves, how easy it is to get started with, and how it performs in terms of connectivity, latency, and security. And lastly, we got the opportunity to go with a “by-the-book” architecture following all the best practices, including separating parts of the service into different subnets and leaving only what’s needed in the public ones.

 

If, for some reason, migrating to AWS in full is not yet an option, hybrid is the way to go.

After all, each journey starts with the first step. Even the cloud one.