diff --git a/content/tf-project-structure.rst b/content/tf-project-structure.rst
new file mode 100644
index 0000000000000000000000000000000000000000..58d924c731372d5f3be8b6bb1374d1fffedb7c28
--- /dev/null
+++ b/content/tf-project-structure.rst
@@ -0,0 +1,135 @@
+Terraform project structure
+###########################
+
+:date: 2021-09-14
+:summary: My preferred Terraform project structure.
+
+
+Recently I've been using `Terragrunt <https://terragrunt.gruntwork.io/>`_ and I
+have thoughts on what it offers and whether it is useful. My usage has been in
+an existing project that follows the Gruntwork guidelines closely and has a paid
+subscription to the Gruntwork library. These opinions are my own and they're the
+result of managing small and medium infrastructure with Terraform for the last
+few years, both as a single developer and as part of a small team.
+
+The main point of Terragrunt, as I understand it, is to keep you from repeating
+yourself in code. I am not a fan of copying and pasting big blocks of code nor
+of having to change the same value in a few different places. So keeping code
+DRY is a worthwhile endeavor.
+
+Keeping modules DRY
+-------------------
+
+Terragrunt works by using modules. I like Terraform modules. Even the Terraform
+documentation suggests that you don't have a single top-level module for your
+entire infrastructure. A single module makes development more difficult, with
+more merge conflicts. It makes deploying for testing purposes more difficult
+because Terraform will keep trying to delete resources that aren't in your code
+(because someone else working in a different branch has made changes for some
+other reason). You can work around that by specifying the target you're
+interested in, but that's error-prone and can be tiresome after a while.
+
+In a previous project I worked on we had a module for roughly each service. We
+had quite a lot of code that was copied from one module to another (for example,
+when creating a new RDS instance we also created the subnet group, the security
+group for the client, etc.). Over time we saw clearly what code was shared
+between the different modules, so we created a :code:`library` directory and
+started adding sub-modules there. After a while we had a nice library of
+reusable sub-modules and things were nice.
+
+Because we waited a bit before creating a new sub-module, they were pretty
+stable. When we did have a change to a sub-module that we wanted to deploy
+across the entire infrastructure, we would open a branch, work on all the needed
+changes there, test them in one of the testing environments and open a PR that
+had all of the changes.
+
+This process fit us nicely. The PR had the entire picture and we could really
+see if the change improved anything (adding an output to a module and fetching
+it in a different module is clear when you see it being used). On occasion we
+did have conflicting changes and had to use targeted :code:`plan` and
+:code:`apply`, but as far as I can remember no more than once a quarter.
+
+Terragrunt recommends setting up 2 repositories, one for sub-modules and one for
+actually deployed modules. Then you create :code:`terragrunt.hcl` files that
+list the sub-modules needed with the Git ref used. This allows you to use the
+RDS database sub-module from today but the auto-scaling group from last year. I
+see little point in this.
+
+The change process goes as follows: 1 PR for the sub-modules repository and (at
+least) 1 for the live repository. Now I hear that the recommendation has
+changed. The new recommendation is that each sub-module will be in a separate
+repository. So more PRs for each change (that one change of adding an output and
+using it became less obvious and requires more work, I wouldn't call it a win).
+I wonder if there's any place that has 2 repositories, 1 for the code and 1 for
+the tests, where you change the code and, once it's merged, go to the tests repo
+and update the tests there to use the new code to see if they pass?
+
+Another outcome of this way of working that I keep seeing is that, because
+changes are not applied (or planned) before they are merged to the sub-module,
+errors and issues are only found later, which triggers more PRs.
+
+Environments, remote states and workspaces, oh my
+-------------------------------------------------
+
+Another way that Terragrunt keeps your code DRY is by generating the Terraform
+backend configuration, because Terraform can't use variables in there. So you
+save fewer than 7 lines or so. Cool. Also, you won't accidentally (because you
+copied that code from another module) use the same location for 2 modules and
+have them delete each other's resources. It has happened to me more than once,
+but you see it clearly when you first run :code:`terraform plan` so it's very
+easy to catch.
+
+Now, the folks at Gruntwork suggest you create a directory for each
+environment. From what I can see, that means you copy your
+:code:`terragrunt.hcl` file to each directory and you modify it slightly (I
+think you can see where I'm going with this). If your project has a different
+module for each environment, this is a win, no doubt about it. I've seen
+projects like that and they're really a pain to manage.
+
+Before I ever heard about Terragrunt, I had this exact problem. I solved it
+using Terraform workspaces and a simple convention. Each environment would have
+its own workspace (let's say that the default workspace is the sandbox but
+that's up to you). Each module would have a :code:`tfvars` file for each
+environment. The workflow for deploying to the :code:`dev` environment would
+look like this:
+
+.. code:: shell
+
+    terraform workspace select dev
+    terraform plan -var-file=dev.tfvars -out=tfplan
+    # Review the changes.
+    terraform apply tfplan
+
+To make life a little easier I added the following snippet to each module:
+
+.. code:: terraform
+
+    locals {
+      # path.module is "." in a root module, so resolve it before taking the name.
+      module = basename(abspath(path.module))
+      env    = terraform.workspace == "default" ? "sandbox" : terraform.workspace
+    }
+
+Yes, this is copied code and, along with the backend configuration, it comes to
+over 10 lines of code that are mostly duplicated. However, when I compare it to
+the :code:`terragrunt.hcl` files, this is peanuts. I checked a few modules in
+the codebase I'm working on and we have :code:`terragrunt.hcl` files that are
+100s of lines long and share all but a few lines.
+
+I found that this convention is easy to document, easy for new developers to
+pick up and uses existing tools, so you keep your existing knowledge and get all
+of the benefits of not adding another tool to your workflow.
+
+Conclusions
+-----------
+
+This post is a critique of the Gruntwork recommended setup and workflow and, if
+you read it all, you can see that I think there are better and easier ways. You
+can compare Terragrunt to a badly managed Terraform project and find that it
+helps you. I didn't plan on reviewing Terragrunt until I used it. Terragrunt
+makes life less enjoyable. It has a convoluted workflow locally (with those
+bloody git clones), it makes debugging issues difficult and the upside is just
+not there. I would recommend that anyone who thinks about adopting Terragrunt
+first read the `workspaces documentation
+<https://www.terraform.io/docs/cli/workspaces/index.html>`_ and think hard about
+the code review, testing and development workflows.