How It's Made

Terraforming RDS — Part 3

See Part 1 for an overview of RDS and Terraform, and Part 2 to get the basics of using Terraform with RDS and modules. This post covers parameter groups.

A parameter group is just a list of parameters and values, which you can see in the AWS console:

If you were administering your own PostgreSQL instance, you would set these values in various ways:

  • In the server configuration file, postgresql.conf
  • On the command line when starting the server
  • In the database directly, using SQL to set values

With AWS RDS, you don’t have access to the configuration file or the server startup command, so AWS provides the “parameter group” resource to configure your RDS instance on startup.

Notice the “Apply type” column in the screenshot above. If the value in this column is “dynamic” then the value can be set or updated while the server is running. If it is “static” then the server must be restarted for the parameter to take effect. Since a parameter group is a separate resource from the RDS instance, you can update a static parameter value in the parameter group without restarting the server; AWS will store the change to be applied later.

apply_method

Dynamic and static parameters are handled in Terraform using the apply_method attribute when defining each parameter.

  • For static use apply_method = "pending-reboot"
  • For dynamic use apply_method = "immediate"

Given that each parameter is either static or dynamic and will be applied according to its type, why do we need to specify these? The AWS provider leaves contextual validation to the AWS API; it will only warn you about syntax errors. The API call to add parameters, modify-db-parameter-group, requires that the ApplyMethod value be provided. If you did not pass this to the aws_db_parameter_group resource, the AWS provider would have to maintain a list of all possible parameters and their types, which would become a big maintenance problem.

If you change a dynamic parameter (one with apply_method = "immediate") in the parameter group, the new value is applied to the database as soon as you apply the change to the parameter group:

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  parameter {
    apply_method = "immediate"
    name         = "autovacuum_naptime"
    value        = "30"
  }

  parameter {
    apply_method = "pending-reboot"
    name         = "autovacuum_max_workers"
    value        = "15"
  }
}

terraform plan

# aws_db_parameter_group.muffy-pg will be updated in-place
~ resource "aws_db_parameter_group" "muffy-pg" {
  + parameter {
    + apply_method = "immediate"
    + name         = "autovacuum_naptime"
    + value        = "30"
  }

  - parameter {
    - apply_method = "immediate" -> null
    - name         = "autovacuum_naptime" -> null
    - value        = "15" -> null
  }

  parameter {
    apply_method = "pending-reboot"
    name         = "autovacuum_max_workers"
    value        = "15"
  }
}

An aside on plans

Notice that the plan diffs for even this simple change can be a little hard to read, because Terraform removes the old parameter and adds a new parameter rather than simply updating the value. The changes are not grouped in any particular way, so with even a medium-sized parameter group a deletion may not be adjacent to the addition with the new value.

It turns out there is also a bug in how parameters are updated, which gave us a few sleepless nights. It’s described at length in the GitHub issue, but tl;dr: parameters to be added are added, then parameters to be removed are removed, meaning that you can end up nulling out a parameter you were trying to update.

Okay, back to our example.

terraform apply

Apply the change and check the database configuration in the AWS console. AWS applies the change automatically, and soon your DB will have the new value.

Now let’s change a static parameter.

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  parameter {
    apply_method = "pending-reboot"
    name         = "autovacuum_max_workers"
    value        = "5"
  }
}

terraform plan; terraform apply

Plan and apply, then check the configuration in the AWS console. You will see that the change has not been applied, and the parameter group is marked as “pending-reboot”. You will have to reboot the database for the changes to take effect. After the reboot the parameter group will be “in-sync” again.

It’s easy enough to understand how the apply_method value works in these cases, but if you specify the wrong apply_method for a parameter, you get some unexpected results. Let’s start by specifying immediate for a static parameter.

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  parameter {
    apply_method = "immediate"
    name         = "autovacuum_naptime"
    value        = "15"
  }

  parameter {
    apply_method = "immediate"
    name         = "autovacuum_max_workers"
    value        = "5"
  }
}

atlantis plan

# aws_db_parameter_group.muffy-pg will be updated in-place
~ resource "aws_db_parameter_group" "muffy-pg" {
  + parameter {
    + apply_method = "immediate"
    + name         = "autovacuum_max_workers"
    + value        = "5"
  }

  - parameter {
    - apply_method = "immediate" -> null
    - name         = "autovacuum_max_workers" -> null
    - value        = "15" -> null
  }
}

There’s no indication in the plan that this is not the right apply_method. You don’t find out anything is wrong until you try to apply.

atlantis apply

Acquiring state lock. This may take a few moments...
aws_db_parameter_group.muffy-pg: Modifying... [id=terraform-20200115155214596300000001]

Error: Error modifying DB Parameter Group: InvalidParameterCombination: cannot use immediate apply method for static parameter
 status code: 400, request id: 936bf177-669f-4792-93b0-eac0f65ce004

  on main.tf line 16, in resource "aws_db_parameter_group" "muffy-pg":
  16: resource "aws_db_parameter_group" "muffy-pg" {

Releasing state lock. This may take a few moments...

AWS returns an error and won’t let you change the value.

How about the other way around?

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  parameter {
    apply_method = "pending-reboot"
    name         = "autovacuum_naptime"
    value        = "30"
  }
}

Plan and apply the changes:

# aws_db_parameter_group.muffy-pg will be updated in-place
~ resource "aws_db_parameter_group" "muffy-pg" {
  + parameter {
    + apply_method = "immediate"
    + name         = "autovacuum_naptime"
    + value        = "30"
  }

  - parameter {
    - apply_method = "immediate" -> null
    - name         = "autovacuum_naptime" -> null
    - value        = "15" -> null
  }
}

% terraform apply
Acquiring state lock. This may take a few moments...
aws_db_parameter_group.muffy-pg: Modifying... [id=terraform-20200115155214596300000001]
aws_db_parameter_group.muffy-pg: Modifications complete after 4s [id=terraform-20200115155214596300000001]

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
Releasing state lock. This may take a few moments...

There is no objection from AWS. If you look in the AWS console, you will see that the parameter value is being applied right away, even though that is not what you specified in the config.

Unfortunately, since the apply_method attribute is part of the parameter block, you will now see a diff every time you plan, because Terraform will note that what you have specified in your HCL does not match what is in AWS.

So, if apply_method is set in a way that does not match the parameter type, the AWS provider will not do what you expect.

Our Terraform team got a Slack message recently, with a section of a plan that looked odd:

        + parameter {
          + apply_method = "immediate"
          + name         = "checkpoint_timeout"
          + value        = "75"
        }
        [...]
        parameter {
            apply_method = "immediate"
            name         = "checkpoint_timeout"
            value        = "900"
        }

What’s going on here? Is it really trying to add a parameter that is already there? Taking a look at the HCL for the parameter group, sure enough, the parameter had been added twice. The AWS provider happily compared the values and decided we must know what we were doing: it left the existing value alone, since it had not changed, and tried to add the new one, even though it was clearly intended as an update.

Once again it is left to AWS to decide what to do with contradictory input; the provider makes as few judgements as possible about the content of your config.

Since parameter groups are separate resources in AWS, they are defined separately in your Terraform as well, but parameter group changes are tied very closely to database changes in AWS. You show this dependency in your HCL by using the output of the aws_db_parameter_group resource as the input to the aws_db_instance resource:

resource "aws_db_parameter_group" "muffy-pg" {
  name   = "muffy-pg"
  family = "postgres9.6"
}

resource "aws_db_instance" "muffy-test-good" {
  parameter_group_name = aws_db_parameter_group.muffy-pg.name
  [...]
}

It is also valid HCL to specify the parameter group by name, but in this case Terraform would not be able to deduce that there is a dependency between these resources:

resource "aws_db_parameter_group" "muffy-pg" {
  name   = "muffy-pg"
  family = "postgres9.6"
}

resource "aws_db_instance" "muffy-test-bad" {
  parameter_group_name = "muffy-pg"
  [...]
}

This dependency can cause a problem when you are making a major change to the parameter group, such as changing the version of Postgres. In this case, Terraform will want to replace the parameter group, which it does by deleting the existing resource and then creating a new one.

terraform plan

# aws_db_parameter_group.muffy-pg must be replaced
-/+ resource "aws_db_parameter_group" "muffy-pg" {
  ~ family      = "postgres9.6" -> "postgres11" # forces replacement
    [...]
}

terraform apply

aws_db_parameter_group.muffy-pg: Destroying... [id=terraform-20200115031710299600000001]
[...]
aws_db_parameter_group.muffy-pg: Still destroying... [id=terraform-20200115031710299600000001, 2m50s elapsed]

Error: Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group terraform-20200115031710299600000001, so the group cannot be deleted
 status code: 400, request id: 0e99a7be-4b2d-43d7-ac96-5b18af81c307

The parameter group resource is separate from the RDS instance, but it is attached to the instance, so AWS considers it to be in use and will not allow you to delete it. The Terraform AWS provider doesn’t check this, so you don’t find out until Terraform tries to apply the changes.

If you want to make a change like this, you need to create a new parameter group and attach it to the database instance. Then you can remove the old parameter group.
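
One common way to get Terraform to follow this order by itself is to give the parameter group a generated name and mark it create_before_destroy, so the replacement group exists and is attached before the old one is deleted. The following is a minimal sketch and not the configuration we ran: the name_prefix value and the elided instance arguments are assumptions, and a real major version change would also need a matching engine_version change on the instance.

```hcl
# Sketch only: name_prefix and create_before_destroy are assumptions,
# not part of the configuration shown earlier in this post.
resource "aws_db_parameter_group" "muffy-pg" {
  # A prefix gives each replacement a distinct name in AWS, so the old
  # and new groups can exist at the same time.
  name_prefix = "muffy-pg-"
  family      = "postgres11"

  lifecycle {
    # Create the replacement group before destroying the old one.
    create_before_destroy = true
  }
}

resource "aws_db_instance" "muffy-test-good" {
  # Referencing the resource keeps the dependency visible to Terraform,
  # so the instance is pointed at the new group before the old group
  # is deleted.
  parameter_group_name = aws_db_parameter_group.muffy-pg.name
  # ... remaining instance arguments elided ...
}
```

Whether the old group can be deleted immediately may still depend on the state of the instance in AWS, so treat this as a starting point rather than a guarantee.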

Modules are a great feature of Terraform, but they are a difficult fit with parameter groups. Parameters in the HCL for parameter groups are blocks rather than attributes:

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  parameter {
    apply_method = "pending-reboot"
    name         = "autovacuum_naptime"
    value        = var.autovacuum_naptime
  }
}

We could have exhaustively enumerated every possible parameter in the module inputs, but we don’t want to set most of those values, so we added only inputs for values we changed commonly. However, we then needed to allow for other values that someone might want to change.

Blocks can’t be passed as variable values, but it turns out that a group of blocks turns into a list of maps, so we were able to handle this by creating a list out of all the parameters created with variables and using concat to merge it with the other parameters:

locals {
  standard_params = [
    {
      apply_method = "pending-reboot"
      name         = "autovacuum_naptime"
      value        = var.autovacuum_naptime
    },
    [...]
  ]
}

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  parameters = concat(local.standard_params, var.extra_params)
}
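
In Terraform 0.12 and later, nested blocks such as parameter are normally generated from a list with a dynamic block rather than assigned like an attribute. Here is a minimal sketch of that form. The variable declarations, their defaults, and the object shape of extra_params are assumptions for illustration and are not taken from our module.

```hcl
variable "autovacuum_naptime" {
  type    = string
  default = "30" # illustrative default, not a recommendation
}

variable "extra_params" {
  # Assumed shape: one object per parameter block.
  type = list(object({
    apply_method = string
    name         = string
    value        = string
  }))
  default = []
}

locals {
  standard_params = [
    {
      apply_method = "pending-reboot"
      name         = "autovacuum_naptime"
      value        = var.autovacuum_naptime
    },
  ]
}

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  # Emit one parameter block per element of the combined list.
  dynamic "parameter" {
    for_each = concat(local.standard_params, var.extra_params)
    content {
      apply_method = parameter.value.apply_method
      name         = parameter.value.name
      value        = parameter.value.value
    }
  }
}
```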

This works, but it is very confusing to the user. We ended up with situations like this:

module "my-pg" {
  source = "github.com/tf-mods/rds-parameter-group?ref=v1.0.0"

  autovacuum_naptime = 10
  [...]

  extra_params = [
    {
      apply_method = "pending-reboot"
      name         = "autovacuum_naptime"
      value        = 20
    }
  ]
}

Which value for autovacuum_naptime is the intended value?
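
One way to make the answer explicit is to key both lists by parameter name and merge them, so that an entry in extra_params deliberately overrides the standard one. This is a sketch of that idea, not what we shipped. It assumes the same standard_params and extra_params shapes as the module code above, with a hard-coded value standing in for the module input.

```hcl
variable "extra_params" {
  # Assumed shape: one object per parameter block.
  type = list(object({
    apply_method = string
    name         = string
    value        = string
  }))
  default = []
}

locals {
  standard_params = [
    {
      apply_method = "pending-reboot"
      name         = "autovacuum_naptime"
      value        = "10" # stands in for var.autovacuum_naptime
    },
  ]

  # merge() gives precedence to later arguments, so extra_params wins
  # whenever both lists name the same parameter.
  merged_params = merge(
    { for p in local.standard_params : p.name => p },
    { for p in var.extra_params : p.name => p },
  )
}

resource "aws_db_parameter_group" "muffy-pg" {
  family = "postgres11"

  # One parameter block per distinct parameter name.
  dynamic "parameter" {
    for_each = local.merged_params
    content {
      apply_method = parameter.value.apply_method
      name         = parameter.value.name
      value        = parameter.value.value
    }
  }
}
```

With this shape, a name that appears twice within a single list should fail at plan time with a duplicate key error instead of producing two parameter blocks. The precedence rule still has to be documented for the module's users.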

Maybe later

It is certainly possible to make a useful parameter group module, but in the end we decided to forgo using a module because it provided relatively little value while making the interface much more confusing for the user.

If we revisit creating a parameter group module, I will recommend enumerating all the parameters we would ever allow to be set in the variables. We would do this if we determine that the majority of parameters can be computed from a small number of inputs and we want to standardize these computations. However, this would mean we would need one module per major version of PostgreSQL, as the available parameters can change significantly across major versions.

Come back again and I’ll tell you about that time an abstraction bit us really hard!

Muffy Barkocy

Author
