GitHub Automation with Terraform Pt 2

Making a GitHub repo module in the style of AVM

In the first post in the series I wrote about creating a Terraform module for GitHub repositories and their child resources. The goal is to write this module in the style of Azure Verified Modules (AVM), in the open, to share my thinking and approach.

In this post I’ll explore submodules, which in Terraform parlance are simply modules nested inside other modules.

Reminder: This is a work in progress!

A recap

A GitHub repository has a number of components besides the repository itself, for example branches and branch protection policies, variables and secrets, rulesets, etc.

Last time, I proposed something like this:

Making secrets

We’ll take the example of the Repository (the primary resource created by the module), and the “Secrets and Variables” submodule.

If you’re thinking you can spot an error in my approach vs the AVM specification, hold that thought. I said this would be as-I-go, and surely, there is a turn in the road ahead.

But before that…

Why submodules?

The reasons for submodules are not well covered in the AVM spec, as far as I can see.

I’ll do my best to explain.

There are situations where the main resource (the Repository in this case) is pre-created and provided to you. It could be for reasons of permissions, or role separation within your company, or because of existing automations. Submodules can be called independently of the parent, allowing these components to be created and attached to a pre-existing resource.
At the same time, the parent module re-uses the submodules, which supports the more typical scenario where you create the Repository and its sub-components at the same time.
Grouping the submodules under one parent avoids a proliferation of very small modules that simply wrap a resource (an anti-pattern HashiCorp advises against).
It allows the logic for each subcomponent to be more self-contained, as opposed to having all the logic in the root of the module.

There are some disadvantages, which I’ll get to later. For now, we keep smelling the roses.

What does a submodule look like?

AVM places submodules locally in a “modules” directory. Each one is a standard Terraform module, i.e. it typically has a main.tf, variables.tf and outputs.tf. The AVM spec mandates that submodules must specify their versions too, typically via terraform.tf, just as the parent must.
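As a rough sketch of what that version pinning could look like for the secret submodule (the file path and the version constraints are my assumptions for illustration, not copied from the repo):

```hcl
# modules/secret/terraform.tf (illustrative; constraints are assumptions)
terraform {
  # optional() with defaults in object types needs Terraform 1.3 or later
  required_version = ">= 1.3"

  required_providers {
    github = {
      source  = "integrations/github"
      version = "~> 6.0" # assumed constraint, pick what suits your setup
    }
  }
}
```

Declaring this in the submodule as well as the parent means the submodule still resolves the right provider when it is called on its own.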

Here it is, in the current solution:

How do you call it?

Let’s start with the example where you create the repository and the secret, then we’ll get to the “existing parent” scenario afterwards.

 
module "github_repository" {
  # Terraform's GitHub shorthand; a bare https:// URL is treated as a generic HTTP source
  source = "github.com/kewalaka/terraform-github-avm-res-githubrepository"
  name   = random_pet.repo_name.id

  organization_name    = "kewalaka-org"
  visibility           = "public"
  vulnerability_alerts = false
  archive_on_destroy   = false
  secrets = {
    "deploy_secret" = {
      name      = "DEPLOY_SECRET"
      plaintext = "super_secret"
    }
  }
  variables = {
    "deploy_variable" = {
      name  = "DEPLOY_VARIABLE"
      value = "not_a_secret"
    }
  }
}

In the above, you can see the secrets and variables are simply parameters to the module. From the caller’s perspective, the submodule implementation is hidden when using the module this way.

Oh no! There can be only one

All was going well. I had my ‘secrets and variables’ submodule accepting a map, which I carefully unpacked using locals, creating each of the secrets and variables inside the submodule.

I was quite pleased with the result (it always helps when the example works too!).

But, as I pondered existing work, I realised I had drifted from the specification.

AVM modules, and submodules, should only make one resource.

If a caller wishes to make more, it is up to them to iterate over the module.

Now, you might be thinking “this can’t be right, because a repository can plainly have lots of secrets and variables!”, and you’re absolutely right.

In the case of a submodule, the caller is the module, so the root module is where the for_each logic should be.

To make this more concrete, and hopefully make sense, here’s the example call within the root for the secrets:

 
module "secret" {
  source   = "./modules/secret"
  for_each = var.secrets
  repository = {
    id = github_repository.this.id
  }
  name                 = each.value.name
  plaintext_value      = each.value.plaintext_value
  encrypted_value      = each.value.encrypted_value
  environment          = each.value.environment
  is_codespaces_secret = each.value.is_codespaces_secret
  is_dependabot_secret = each.value.is_dependabot_secret
}

The root module accepts multiple secrets and passes them individually to the submodule. This might seem a little reductive and excessive, but it simplifies the logic of the submodule if it knows it only has to create one thing, and I think the end result is quite elegant.
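To give a feel for the receiving side, here is a minimal sketch of a submodule that only ever creates one secret. It is a simplification under my own assumptions: it handles plain repository-level Actions secrets only and leaves out the environment, Dependabot and Codespaces variants, so it is not the module's actual code.

```hcl
# modules/secret/variables.tf (simplified sketch)
variable "repository" {
  type = object({
    id = string
  })
  description = "The repository that will own the secret."
  nullable    = false
}

variable "name" {
  type        = string
  description = "The name of the secret."
}

variable "plaintext_value" {
  type        = string
  default     = null
  sensitive   = true
  description = "The plaintext value of the secret."
}

variable "encrypted_value" {
  type        = string
  default     = null
  sensitive   = true
  description = "The encrypted value of the secret."
}

# modules/secret/main.tf (simplified sketch)
# One call to the submodule makes exactly one secret; no for_each in here.
resource "github_actions_secret" "this" {
  repository      = var.repository.id
  secret_name     = var.name
  plaintext_value = var.plaintext_value
  encrypted_value = var.encrypted_value
}
```

All the iteration lives in the caller, which is what keeps the submodule this small.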

Calling a submodule directly

If the module is published to the Terraform Registry, a submodule can be called like this:

 
module "avm_res_githubrepository_secret" {
  source          = "Azure/avm-res-githubrepository/github//modules/secret"
  version         = "x.y.z"
  for_each        = local.secrets
  repository      = { id = github_repository.this.id }
  name            = each.value.name
  plaintext_value = each.value.plaintext_value
  encrypted_value = each.value.encrypted_value
  environment     = each.value.environment
}

Plainly, the ideal mechanism is to call the repository module if you can, as otherwise you are calling each submodule individually to make things. Remember, the submodule pattern is for scenarios where the parent has already been created, or perhaps where you only need to add a few sub-components.
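For the “existing parent” scenario, a sketch might look like the following. The repository name, the `deploy_secret` variable and the use of a data source to look up the repository are assumptions for illustration; the module source and version placeholder are the same as above.

```hcl
# Look up a repository that was created elsewhere (placeholder name).
data "github_repository" "existing" {
  full_name = "kewalaka-org/existing-repo"
}

# Hypothetical input holding the secret value.
variable "deploy_secret" {
  type      = string
  sensitive = true
}

# Attach a single secret to the pre-existing repository via the submodule.
module "deploy_secret" {
  source  = "Azure/avm-res-githubrepository/github//modules/secret"
  version = "x.y.z"

  repository      = { id = data.github_repository.existing.name }
  name            = "DEPLOY_SECRET"
  plaintext_value = var.deploy_secret
}
```

Nothing in this configuration manages the repository itself, which is the point of calling the submodule directly.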

Have I “over-submoduled” the things?

This occurred to me, and I don’t have an answer yet. The consistency is appealing, but is the scope of the individual submodules too narrow and unrealistic? I’m not sure (feedback welcome in the comments!)

The disadvantages with the submodule approach

As I mentioned earlier, using submodules is a little repetitive. In the root module you typically have a variable block that accepts a map, since a GitHub repository can have multiple secrets.

 
variable "secrets" {
  type = map(object({
    name                 = string
    plaintext_value      = optional(string)
    encrypted_value      = optional(string)
    environment          = optional(string)
    is_dependabot_secret = optional(bool, false)
    is_codespaces_secret = optional(bool, false)
  }))
  description = <<DESCRIPTION

Map of github action secrets to be created.

- `name` - The name of the secret.
- `plaintext_value` - The plaintext value of the secret.
- `encrypted_value` - The encrypted value of the secret.
- `environment` - The environment to create the secret in. If not set, the secret will be created at the repository level.
- `is_dependabot_secret` - If set to true, the secret will be created at the repository level and will be used by dependabot.
- `is_codespaces_secret` - If set to true, the secret will be created at the repository level and will be used by codespaces.

DESCRIPTION

  default  = {}
  nullable = false
}

Within the submodule, you then repeat the same definitions, this time as individual variables.

If you didn’t need to create the repository and its subcomponents independently of each other, you could do away with the submodules and save yourself some additional lines of code.
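For comparison, a flat version without a submodule could be as short as this. It is only a sketch: it assumes the `secrets` variable and the `github_repository.this` resource from the root module, and it covers plain repository-level Actions secrets only.

```hcl
# Root main.tf without a secret submodule (illustrative sketch)
resource "github_actions_secret" "this" {
  # Iterate directly over the map accepted by the root module.
  for_each = var.secrets

  repository      = github_repository.this.name
  secret_name     = each.value.name
  plaintext_value = each.value.plaintext_value
  encrypted_value = each.value.encrypted_value
}
```

The trade-off is that nobody can then add a secret to a repository they did not create with this module.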

Updates to the code

The end state is looking a little closer to done now - I’ve added submodules for these features:

branches (possibly excessive!)
branch protection
environments
secrets
variables

I’ve only dealt with examples for the simpler defaults, but the framework is in place to add more.

I’ve also done a little bit of tidying around the docs and removed much of the dead code (it now passes lint checks!)

You can see the latest changes here:

https://github.com/kewalaka/terraform-github-avm-res-githubrepository

What’s next?

We’ll look at what linting means in the context of an AVM module.

Edit! Added Bonus

Someone asked me about the style being used to pass in the repository ID, i.e. in the case of the secret, like this:

 
module "secret" {
  source   = "./modules/secret"
  for_each = var.secrets
  # wrapping the id in a block
  repository = {
    id = github_repository.this.id
    # etc
  }
}

Rather than:

 
module "secret" {
  source        = "./modules/secret"
  for_each      = var.secrets
  # .. vs pass the ID directly
  repository_id = github_repository.this.id
  # etc
}

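The reason for this is that the ID of the repository is not known until after it is made (i.e. it is “known after apply”). If a module takes a bare ID and uses a null comparison on it to decide whether to create something (for example in a count expression), Terraform cannot evaluate that comparison at plan time. Wrapping the ID in an object avoids the problem: the object itself is known to be non-null during the plan, even though the id inside it is not yet known.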

For a more detailed explanation and examples, check out the AVM spec: TFNFR11 - Null Comparison Toggle.
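As a rough illustration of the pattern that spec item describes, as I understand it, here is a hypothetical module that creates a repository only when the caller does not pass one in. The names and the placeholder repository are mine, not taken from the repo.

```hcl
# Hypothetical module input: an object wrapper rather than a bare ID string.
variable "repository" {
  type = object({
    id = string
  })
  default     = null
  description = "An existing repository to use. If null, the module creates one."
}

# The toggle compares the object to null. Whether the caller passed an object
# is known at plan time, even when the id inside it is "known after apply".
resource "github_repository" "this" {
  count = var.repository == null ? 1 : 0

  name = "example-repo" # placeholder name
}

locals {
  # Use the caller's repository when supplied, otherwise the one created above.
  repository_id = var.repository != null ? var.repository.id : one(github_repository.this[*].id)
}
```

With a plain string variable, the same count expression would depend on a value that is unknown during the plan whenever the caller passes an ID from a resource created in the same run.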

Edit: in part 3, we do a little housekeeping and explore linting.

This post is licensed under CC BY 4.0 by the author.