What Terraform Doesn't Do
Background
I love Terraform. It’s not without its faults, but I believe it, or something like it, is a fundamental addition to your DevOps stack. However, in a lot of conversations I’ve had about Terraform, I’ve noticed a pattern: people assume that Terraform can somehow maintain resources whose state is transient by nature.
What Terraform Does Do
Before we define a resource that might be transient, let’s start with a resource that doesn’t have a transient property. Take a load balancer, for example. When we define this resource in Terraform and provide some basic inputs to the provider, we expect Terraform to deploy our load balancer as we’ve described it. Now let’s pretend someone logged in through the GUI and changed one of those inputs. When we re-apply Terraform, it detects the drift and reconciles the load balancer back to what we described in code. Easy enough.
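To make the reconcile step concrete, here is a minimal sketch of the idea in Python. It is not how Terraform is implemented; the `plan` and `apply` functions and the load balancer attributes (`idle_timeout`, `internal`) are hypothetical names chosen for illustration.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Return the attributes whose actual value differs from the declared one."""
    return {
        key: {"from": actual.get(key), "to": value}
        for key, value in desired.items()
        if actual.get(key) != value
    }


def apply(desired: dict, actual: dict) -> dict:
    """Return a copy of the actual state with every drifted attribute reverted."""
    reconciled = dict(actual)
    for key, change in plan(desired, actual).items():
        reconciled[key] = change["to"]
    return reconciled


# Hypothetical load balancer attributes, for illustration only.
desired = {"idle_timeout": 60, "internal": False}
actual = {"idle_timeout": 300, "internal": False}  # changed by hand in the GUI

print(plan(desired, actual))   # {'idle_timeout': {'from': 300, 'to': 60}}
print(apply(desired, actual))  # {'idle_timeout': 60, 'internal': False}
```

Notice that the only question this model can ask is "does the resource match what was declared?" That works well for a load balancer, and it is exactly what gets awkward in the next section.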
What Terraform Might Do
Now, let’s talk about something a little more transient. Take an EC2 instance, for example. This is a very common thing to deploy with Terraform, but there is a key thing we need to know about this particular instance before we can be comfortable deploying it with Terraform. What if we want the instance to maintain some degree of ephemerality? In this case, Terraform doesn’t quite know what we want. Do we want the instance gone, or do we want it there? There is no option in between.
Here, we start to get into an interesting situation. We could reason that we’ll just write a little tool, script, or job that has Terraform do some kind of blue/green deployment. This is actually a pretty common pattern, and to be clear, introducing tooling to solve this problem isn’t a bad idea by any stretch. It does highlight, though, that Terraform on its own couldn’t solve for a state somewhere between present and absent.
What Terraform Doesn’t Do
As you can imagine, resources in our infrastructure can be in a state that is neither present nor absent, but rather a transient one. As reasoned above, we could introduce some extra tooling to make Terraform do what we want. However, a better approach here is to leave Terraform out of the conversation and use the right tool for the job. There are many better options for handling transient resources. In our EC2 example, we have the entire AWS SDK and a huge array of services at our disposal to craft something that handles our specific use case.
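As one example of what "something that handles our specific use case" might look like, here is a rough sketch of a time-to-live check for ephemeral instances. Everything in it is an assumption made for illustration: the `Instance` type, the `expired` function, the instance IDs, and the idea that ephemerality means a fixed TTL. A real job would fetch the instance list through the AWS SDK and pass the expired IDs to a terminate call, which is left out here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Instance:
    # Hypothetical record; a real job would build these from AWS SDK responses.
    instance_id: str
    launched_at: datetime
    ttl: timedelta


def expired(instances: list[Instance], now: datetime) -> list[str]:
    """Return the IDs of instances that have outlived their TTL."""
    return [i.instance_id for i in instances if now - i.launched_at >= i.ttl]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fleet = [
    Instance("i-aaa", now - timedelta(hours=5), timedelta(hours=4)),
    Instance("i-bbb", now - timedelta(hours=1), timedelta(hours=4)),
]

print(expired(fleet, now))  # ['i-aaa']
```

The point is that "this instance should exist for about four hours" is a rule about time, and a small scheduled job can express it directly, whereas a declaration of present or absent cannot.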
Conclusion
Take a look at what you’re doing today with Terraform. Bucket out the resources that are only ever present or absent, and put those into Terraform. For the remaining resources, consider what investment it would take to build a system that manages them outside of Terraform, because that’s not really what it does.