Every once in a while, it just so happens that you are about to wrap up a DevOps project: Everything is automated and works like a charm. All of a sudden, the client approaches you, asking if you could scale up the instance to one larger after all, and maybe add another security group.
“Surely that little change won’t make us re-provision, right? I mean Mike’s up to his elbows in the staging environment already…!”
Sure, it can be done, do not fret, loyal client. We got you.
How it works
So, we’re using Terraform in order to provision infrastructures far and wide. Terraform is a cool, relatively simple and very sophisticated tool. It works by using configuration and state files. In state file, it keeps everything it does. The state file is ideally the faithful reflection of what’s “up there in the cloud”. The configuration file is what will eventually be created when you run terraform apply.
If you go and manually switch the instance to m5.xlarge, change that GP2 EBS to provisioned IOPS storage and add a new security group for Mike, you’ll end up having one thing in the cloud, another in the state file, while your configuration file will be outdated by now. In other words, Terraform still thinks the infrastructure looks like it did this morning. Now, if a week from now, the client goes and tears down the infrastructure in order to spin up a new one with fresh code from GIT, can you guess what’ll happen? You are guessing right.
Terraform will happily destroy something that wasn’t reflected locally. EBS has the same resource ID but is different from the EC2 instance, too. Security group? What security group, Terraform asks? Oh, btw, Terraform is reporting it’s not able to destroy that VPC. Has no idea why. Bye. Well, it’s because there’s a rogue security group there, and it didn’t get deleted once the client decided to tear down everything. It is now locking up the entire process.
What can we do?!
Terraform has two switches called “refresh” and “import”. What refresh does is, it connects to your environment and tries to find differences between resources it provisioned before, and those same resources now. Although this will happen automatically once you run either plan, apply or destroy – what if someone changed something and never reported it? Running a refresh command will take into account all the changes in the resources that were provisioned before, and the old values will be copied over in terraform.state.backup. One diff command, and now you’re the hero of the day, thinking ahead and everything!
Let’s take a look at refresh. As I already stated, it’ll pick up changes. Changed EC2 and EBS, and it will reflect all of that in the state file.
# terraform refresh
As we can see, it went over every resource and refreshed its state file. ec2 instance is now t3.micro and not t2.small anymore, EBS would also have changed.
We’re left with that rogue security group, which will have to be imported. New resources will have to be imported first. Refresh can’t refresh something that never was there in the first place. For that to take place you’ll need to make configuration for it.
Since you need that new security group that is already in place, let’s go and make the resource in our configuration file.
Good. Now we have that new security group in configuration and on AWS we have that security group ID. Let’s merge those and sort the state file so everyone’s up to date.
# terraform import aws_security_group.sg-mike sg-00d1aabf669d21610
The only thing that’s left is to make sure the configuration file(s) have the changes. Remember that diff you did? Go and change the EC2 instance to be t3.small in the configuration file, as well as EBS.
If you don’t, everything will work just fine. It’s just, your newly provisioned infrastructure will again have the old values even though you refreshed them before. Remember, it only updates the state file.
PROTIP: For all the necessary details you can always consult your state file! It has everything you need, just the format is different. But you’re on your way to becoming Terraform provisioning master and you add 2 and 2 like nobody’s business and convert that pure JSON to that slightly weird Hashicorp’s version of it, and you’re on your way to updating configurations!
What have we learned?
Boy oh boy, the client couldn’t be happier, everything works, his infrastructure works, he can re-provision to his heart’s content, and you came out of it as crafty and knowledgeable. Good for you!
But remember, the whole point of this blog post was to show you what CAN be done, not establish a new routine. Importing manual changes into your workflow 10 times a day really defeats the purpose of automation, doesn’t it? Explain to your client that this is the last resort option and should be planned better in the future. Think of the resources in front, don’t be afraid to disturb Mike a bit, he’ll live. Good organization comes in time and with experience.
If it’s all a bit too much, no worries, we’re here to help you out!