Infrastructure Provisioning Tools: Pulumi vs Terraform

Infrastructure as Code (IaC) is an essential requirement for DevOps and SRE teams. Over the last few years, Terraform has quickly become the de facto infrastructure provisioning tool. Pulumi is a relatively new arrival on the scene and is starting to gain some traction. Both Terraform and Pulumi offer a way of provisioning and managing immutable infrastructure across multiple cloud vendors.

We've been using Terraform extensively to provision cloud environments over the last 4 years. More recently, we've been using Pulumi to provision our infrastructure. This has given us great insight into the positives and negatives of each, so here's a showdown of Pulumi vs. Terraform based on our experience, looking at:

  • Their main philosophical difference
  • Functionality
  • State management
  • Structuring large projects
  • Handling inconsistent or corrupt state
  • Secret management
  • Testing
  • Documentation

Programming language

The major difference between Terraform and Pulumi is the programming language. Terraform uses its own declarative language, the HashiCorp Configuration Language (HCL). Pulumi's unique selling point is that it supports languages you already know: JavaScript, TypeScript, Python, Go and .NET.

This means you get all the benefits of a general-purpose programming language: conditionals, loops, functions, classes and so on. With version 0.12, Terraform added support for some programming constructs, but it's still fairly limited.

Where it gets a little complicated is that although you are using imperative languages, Pulumi itself is declarative. Pulumi executes the code to build up the desired state of the infrastructure and then creates/updates any resources as required. The following examples show the differences.

Terraform (HCL) example:

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}

resource "aws_instance" "webservers" {
  count         = 3
  instance_type = "t3.micro"
  ami           = data.aws_ami.ubuntu.id

  tags = {
    Name = "web-${count.index}"
  }
}

output "public_ip" {
  value = aws_instance.webservers[*].public_ip
}

Pulumi example with TypeScript:

import * as aws from '@pulumi/aws';

const ubuntuAmi = aws.ec2.getAmi({
  filters: [
    {
      name: 'name',
      values: ['ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*'],
    },
    {
      name: 'virtualization-type',
      values: ['hvm'],
    },
  ],
  owners: ['099720109477'], // Canonical
  mostRecent: true,
});

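const webServers: aws.ec2.Instance[] = [];
for (let i = 0; i < 3; i++) {
  // Each resource needs a unique logical name within the stack.
  webServers.push(new aws.ec2.Instance(`webserver-${i}`, {
    instanceType: 't3.micro',
    ami: ubuntuAmi.then(ami => ami.id), // getAmi returns a promise
    tags: {
      Name: `web-${i}`,
    },
  }));
}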

export const publicIps = webServers.map(s => s.publicIp);
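To make the "imperative code, declarative engine" idea more concrete, here is a toy sketch in plain TypeScript. It is not how Pulumi's engine is actually implemented: `Resource`, `declare` and `plan` are hypothetical names made up for illustration. The program uses an ordinary loop to build up the desired state, and a separate step diffs that against what already exists to decide what to create, update or remove.

```typescript
// Toy model of a declarative engine driven by imperative code.
// All names here are hypothetical; this is not Pulumi's API.
type Resource = { name: string; props: Record<string, string> };
type Plan = { create: string[]; update: string[]; remove: string[] };

// Running the "program" only records the desired resources; nothing is provisioned yet.
function declare(count: number): Resource[] {
  const desired: Resource[] = [];
  for (let i = 0; i < count; i++) {
    desired.push({ name: `web-${i}`, props: { instanceType: 't3.micro' } });
  }
  return desired;
}

// The engine compares desired state with current state and plans the changes.
// Properties are compared via JSON, which is good enough for this toy model.
function plan(current: Resource[], desired: Resource[]): Plan {
  const result: Plan = { create: [], update: [], remove: [] };
  for (const want of desired) {
    const have = current.filter(r => r.name === want.name)[0];
    if (have === undefined) {
      result.create.push(want.name);
    } else if (JSON.stringify(have.props) !== JSON.stringify(want.props)) {
      result.update.push(want.name);
    }
  }
  for (const have of current) {
    if (!desired.some(r => r.name === have.name)) {
      result.remove.push(have.name);
    }
  }
  return result;
}

// Existing infrastructure: web-0 has an outdated size, web-9 is no longer declared.
const current: Resource[] = [
  { name: 'web-0', props: { instanceType: 't2.micro' } },
  { name: 'web-9', props: { instanceType: 't3.micro' } },
];
const changes = plan(current, declare(3));
```

Running the same program twice against an up-to-date state would produce an empty plan, which is the property that makes the overall model declarative even though the code itself is imperative.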

My choice: Pulumi. Given that I come from a software development background, though, I may be biased.

Functionality

Not only does Terraform have a large number of officially supported providers, but it also has a large ecosystem of community providers.

Pulumi's support for providers has been growing and it now has a fairly impressive list. From the start they focussed on making it easy to switch from Terraform by providing a Terraform Bridge, which allows you to piggyback off the Terraform ecosystem by integrating existing Terraform provider plugins. Pulumi have also introduced functionality to consume Terraform state into your Pulumi program to enable you to gradually migrate your existing IaC.

Where Pulumi really excels is with its extensions that abstract over the cloud resources and enable cloud native applications to be developed in a single codebase. This simple yet important feature removes the divide between application code and infrastructure code and makes it easier to understand an application's functionality.

For AWS they have three different libraries you can use:

  • aws – This is the base AWS library that allows you to create AWS resources directly.
  • awsx – This is a higher-level library that has abstractions over AWS resources, creating any convenient dependent resources automatically.
  • cloud – This is currently in preview. It abstracts common resources over multiple cloud vendors, such as APIs, buckets and tasks.

For example a simple API on AWS can be expressed as:

import * as awsx from '@pulumi/awsx';

new awsx.apigateway.API('basicApi', {
  routes: [{
    path: '/',
    method: 'GET',
    eventHandler: async () => {
      return {
        statusCode: 200,
        body: JSON.stringify({ message: 'Hello world!' }),
      };
    },
  }],
});

Or if you want to get a Slack notification every time an object is uploaded to an S3 bucket:

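import * as aws from '@pulumi/aws';
import * as pulumi from '@pulumi/pulumi';
import * as slack from '@slack/web-api';

// Assumes slackToken and slackChannel are set in the stack's config.
const stackConfig = new pulumi.Config();
const config = {
  slackToken: stackConfig.require('slackToken'),
  slackChannel: stackConfig.require('slackChannel'),
};

const bucket = new aws.s3.Bucket('uploads');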

bucket.onObjectCreated('notifySlack', async (event) => {
  const client = new slack.WebClient(config.slackToken);
  for (const rec of event.Records) {
    await client.chat.postMessage({
      channel: config.slackChannel,
      text: `File uploaded to uploads: ${rec.s3.object.key}`,
    });
  }
});

This will take care of creating the bucket, the required IAM roles and permissions and a Lambda function. This level of integration between code and infrastructure makes it really useful for things like handling webhooks, running scheduled cleanup tasks, writing data pipelines, etc.

My choice: Pulumi — Allows you to get up and running quickly with best practices, while also allowing you the same control and functionality offered by Terraform.

State management

By default Pulumi uses its own hosted service for storing and syncing state. They offer a free individual community version, and paid plans for teams and enterprises. They also offer options for managing the state with other services, currently supporting local files, AWS S3, Google Cloud Storage (GCS) and Azure Blob Storage.

By default Terraform uses a local file to store the state, but it also offers a larger choice of backend options, including S3, GCS, Postgres, etcd and Consul. In addition, HashiCorp recently started offering a free version of Terraform Cloud for small teams.

Both Pulumi and Terraform offer enterprise options for self-hosting their cloud platforms.

My choice: Terraform — Overall there are more state options, and the free Terraform Cloud account allows up to 5 users vs Pulumi's single team member.

Structuring large projects

Terraform projects can be split across multiple files and modules, which allows you to create reusable components. Terraform also has the concept of workspaces, which allow you to use the same Terraform configuration for different environments (development, staging, production).

There are two methods for gluing projects together:

  • remote_state: allows you to read outputs from other remote states

data "terraform_remote_state" "base" {
  backend = "remote"

  config = {
    organization = "circuitops"
    workspaces = {
      name = "prod"
    }
  }
}

resource "aws_instance" "my_instance" {
  subnet_id = data.terraform_remote_state.base.outputs.subnet_id
  # ...
}

  • Data sources: allow you to reference existing cloud resources

data "aws_subnet_ids" "base" {
  vpc_id = "vpc-1234"
}

resource "aws_instance" "my_instance" {
  # ids is a set, so convert it to a list before indexing
  subnet_id = tolist(data.aws_subnet_ids.base.ids)[0]
  # ...
}

Pulumi supports structuring your infrastructure as either a monolithic project or micro-projects, with different stacks acting as different environments (development, staging, production).

The one advantage Pulumi has over Terraform is that you can make use of the existing features of programming languages, like functions and classes, to structure your code in a more reusable way.
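As a rough illustration of that kind of reuse, the sketch below uses plain TypeScript functions to derive per-environment settings from a single definition. The environment names come from the text above; the `WebTierConfig` shape, the `webTier` and `instanceNames` helpers and the instance sizes are hypothetical. In a real Pulumi program the returned values would be passed to resource constructors rather than inspected directly.

```typescript
type Environment = 'development' | 'staging' | 'production';

// Hypothetical settings for a group of web servers.
interface WebTierConfig {
  instanceType: string;
  instanceCount: number;
  tags: Record<string, string>;
}

// One function captures the sizing rules for every environment.
function webTier(env: Environment, service: string): WebTierConfig {
  const isProd = env === 'production';
  return {
    instanceType: isProd ? 't3.large' : 't3.micro',
    instanceCount: isProd ? 3 : 1,
    tags: { Service: service, Environment: env },
  };
}

// Build the logical names a loop would hand to the provider, one per instance.
function instanceNames(env: Environment, service: string): string[] {
  const count = webTier(env, service).instanceCount;
  const names: string[] = [];
  for (let i = 0; i < count; i++) {
    names.push(`${service}-${env}-${i}`);
  }
  return names;
}

const prodNames = instanceNames('production', 'web');
const devConfig = webTier('development', 'web');
```

Because the rules live in one function, adding a new environment or changing the production sizing is a single edit rather than a change repeated across several configurations.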

Pulumi offers equivalent functionality for gluing projects together.

You can either reference outputs from other stacks using a StackReference:

const baseStack = new pulumi.StackReference(`example/base/prod`);

new aws.ec2.Instance('myInstance', {
  subnetId: baseStack.getOutput('subnetId'),
  // ...
});

or use .get to reference an existing cloud resource directly:

const subnet = aws.ec2.Subnet.get('baseSubnet', 'subnet-1234');

new aws.ec2.Instance('myInstance', {
  subnetId: subnet.id,
  // ...
});

This functionality is still a bit rough around the edges: for example, there's currently no way to get the subnet IDs for an existing VPC.

There's also no way to deserialize the stack references back into Pulumi resources — for most cases this is easy to do by referencing the id of the resource and calling .get, but it's more complicated for any resources defined using the higher-level Pulumi extensions that map to multiple cloud resources.

My choice: Terraform — Currently offers a more polished solution.

Handling inconsistent or corrupt state

It seems inevitable that with either of these tools at some stage you'll need to handle a corrupt or inconsistent state — whether that's due to a crash while updating, a bug or drift caused by an ill-considered manual change.

Terraform has various CLI commands for dealing with this:

  • refresh can handle state drift by reconciling known state with the real infrastructure state
  • state {rm,mv} can be used to manually modify the state file
  • import can be used to find an existing cloud resource and import it into your state
  • taint/untaint can mark individual resources as requiring recreation.

Pulumi also has some CLI commands for these cases:

  • refresh is similar to Terraform's refresh
  • state delete deletes the resource from the state file

However, Pulumi has no equivalent of taint/untaint, and for any failed updates you may need to manually edit the state file to remove any pending operations using their stack import/export tool.
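That manual fix usually amounts to exporting the stack, dropping the pending operations and importing the result. Here is a minimal sketch of the middle step in TypeScript. It assumes the exported JSON keeps pending operations in a `deployment.pending_operations` array, so check that against your own export before relying on it. The `StackExport` type and the `clearPendingOperations` helper are hypothetical names, not part of Pulumi.

```typescript
// Assumed shape of an exported stack; only the fields used here are typed.
interface StackExport {
  version: number;
  deployment: {
    resources?: unknown[];
    pending_operations?: unknown[];
    [key: string]: unknown;
  };
}

// Returns a copy of the export without pending operations, leaving the input untouched.
function clearPendingOperations(exported: StackExport): StackExport {
  const deployment = { ...exported.deployment };
  delete deployment.pending_operations;
  return { ...exported, deployment };
}

// Made-up example of an export left behind by an interrupted update.
const exported: StackExport = JSON.parse(`{
  "version": 3,
  "deployment": {
    "resources": [{ "urn": "example-urn", "type": "aws:ec2/instance:Instance" }],
    "pending_operations": [{ "type": "creating", "resource": { "urn": "example-urn" } }]
  }
}`);
const cleaned = clearPendingOperations(exported);
```

In practice you would write the export to a file with `pulumi stack export --file state.json`, apply a change like this, and load it back with `pulumi stack import --file state.json`. Keeping the untouched original around is a cheap safeguard if the edited state turns out to be wrong.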

My choice: Terraform — Unlike Pulumi, with Terraform you should never need to manually edit the state file.

Secret management

With Pulumi you can store config values as secrets, encrypted with a key stored in an existing secret management provider such as AWS KMS or HashiCorp Vault.

pulumi stack init my-stack \
  --secrets-provider="awskms://1234abcd-1234-acde-1234-1234acde1234?region=us-east-1"
pulumi config set --secret password 'p@$$word123'

This uses the key provided by AWS KMS to encrypt the secret, and only stores the encrypted value in the config YAML and state files. When the Pulumi program runs, the value will be decrypted again using the same provider.
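The general idea of encrypting on write and decrypting at run time can be sketched with Node's built-in crypto module. This is not Pulumi's actual encryption scheme: the locally generated `providerKey` merely stands in for a key managed by a secrets provider, and `encryptSecret` and `decryptSecret` are hypothetical helpers. The point is only that the value written to disk is ciphertext, and that the same key is needed to read it back.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

// Stand-in for the key managed by a secrets provider; generated locally only for this sketch.
const providerKey = randomBytes(32);

// Encrypt a config value; the returned string is what would be written to disk.
function encryptSecret(key: Buffer, plaintext: string): string {
  const iv = randomBytes(12); // fresh nonce for every value
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  // Store nonce, auth tag and ciphertext together as one base64 string.
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString('base64');
}

// Decrypt a stored value at run time using the same key.
function decryptSecret(key: Buffer, stored: string): string {
  const raw = Buffer.from(stored, 'base64');
  const decipher = createDecipheriv('aes-256-gcm', key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString('utf8');
}

const stored = encryptSecret(providerKey, 'p@$$word123');
const recovered = decryptSecret(providerKey, stored);
```

Only `stored` would ever be committed or written to state in this model; anyone without access to the key sees an opaque string.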

Terraform recommends that you store secrets in a separate variable file that is not committed to version control — this means that you will need to add your own integration for any secret management provider you are using. The secret values will be available in the state, but it's recommended to use a state backend that will encrypt the entire state at rest.

My choice: Pulumi — Tightly integrated secrets management is super easy to get started with.

Testing

Testing IaC is often neglected, but as the complexity grows it becomes more and more important. Since Pulumi uses regular programming languages, it supports unit tests using any test framework supported by those languages. For integration tests, however, Pulumi currently only supports writing them in Go.
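As a small sketch of what unit testing infrastructure logic can look like, the example below checks a tagging rule written as a plain TypeScript function. The `Declared` shape and the `missingTags` rule are hypothetical, and `expectEqual` is only a minimal stand-in for the assertions a real framework such as Mocha or Jest would provide.

```typescript
// Hypothetical shape for a declared resource; not a Pulumi type.
interface Declared {
  name: string;
  tags: Record<string, string>;
}

// Logic under test: return the names of resources missing any required tag.
function missingTags(resources: Declared[], required: string[]): string[] {
  return resources
    .filter(r => required.some(tag => !(tag in r.tags)))
    .map(r => r.name);
}

// Minimal stand-in for a test framework's assertion.
function expectEqual(actual: string[], expected: string[], label: string): void {
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    throw new Error(`${label}: expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`);
  }
}

const declared: Declared[] = [
  { name: 'web-0', tags: { Name: 'web-0', Owner: 'platform' } },
  { name: 'web-1', tags: { Name: 'web-1' } },
];

expectEqual(missingTags(declared, ['Name', 'Owner']), ['web-1'], 'flags resources missing a tag');
expectEqual(missingTags(declared, ['Name']), [], 'passes when every tag is present');
const untagged = missingTags(declared, ['Name', 'Owner']);
```

Tests like this run in milliseconds and never touch a cloud account, which makes them a useful complement to slower integration tests.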

Terraform has no official testing support, but there are third-party libraries for integration tests: Terratest, Kitchen-Terraform.

I'm not aware of any method of unit testing Terraform, but because HCL is a declarative language with fewer imperative features, this may be less important than it is for Pulumi, which allows more complicated logic.

My choice: Pulumi. There are still improvements needed, and most of the focus seems to be on unit testing at the moment, but it seems to be heading in the right direction.

Documentation

Pulumi have recently updated their documentation format, which solves my biggest pain of having to navigate through some incredibly large pages to find the correct resources. However, there are still some rough spots in the documentation, and since Pulumi doesn't yet have the same community as Terraform, finding good examples can be difficult. The best resources I've found so far are the Pulumi examples on GitHub and the Pulumi Slack.

I've generally found the Terraform documentation to be a lot more manageable and clearer, and they also have the advantage of the larger community.

My choice: Terraform — Clearly laid out, no shortage of examples.

Conclusion

Okay, let's tally up the points! 4 points to Pulumi and 4 points to Terraform — there is no clear winner. Both Pulumi and Terraform offer fairly equivalent functionality. You can't go wrong with Terraform — it's more mature, better documented and has a huge ecosystem and helpful community.

I like Pulumi's philosophy — expressing infrastructure using real languages helps bridge the gap between development and operations, and makes it easier to embrace good software engineering practices. I'm excited to see where it goes!

If you're hiring a DevOps or SRE and want to test candidates on their Pulumi or Terraform knowledge, then CircuitOps can help you with live running scenarios. Sign up for a free CircuitOps account and get started!
