25 Apr 2024 · Software Engineering

    Taming Cloud Costs with Infracost

    When we combine the cloud with IaC tools like Terraform and continuous deployment we get the almost magical ability to create resources on demand. For all its benefits, however, the cloud is not without its share of problems — like estimating cloud costs accurately. Can we automate this tedious task? Yes, thanks to open source Infracost.

    Why Cloud Costs Are Hard

    Cloud providers have complex cost structures that are constantly changing. AWS, for example, offers more than 700 types of Linux machines on EC2. Many of them have similar names and features. Take for example “m6g.2xlarge” and “m6gd.2xlarge” (one comes with an SSD while the other doesn’t). Making a mistake in a Terraform file can cause your bill to balloon unexpectedly.

    Man in front of a blackboard filled with equations. The text says: Calculating AWS Resources Cost
    It’s so easy to go above budget.

    We can set up billing alerts, but they are reactive, and there is no guarantee that they will work. Surprise bills have already hit developers and companies multiple times.

    What is Infracost?

    Infracost is an open-source project that helps us understand how and where we’re spending our money. It gives a detailed breakdown of our infrastructure costs and calculates the impact of changes. In a nutshell, Infracost is a git diff for cloud billing.

    Infracost comes in two versions: a VSCode extension and a command-line program. Both do the same thing: parse Terraform or Terragrunt code, pull the current prices from the Infracost cloud pricing API, and output an estimate.

    How Infracost works. It reads code from the repository and pulls the appropriate service costs from an API service. The IDE version prints the estimates on the screen while editing. The CLI version prints the result in the terminal, posts comments on GitHub, Bitbucket, or GitLab, and can stop a CI/CD deployment if limits are exceeded.
    You can use the Infracost pricing API for free or host your own. The paid tier includes a cloud dashboard to track changes over time.

    With the extension, we can see the estimates right in the IDE as we make changes.

    A GIF showing how Infracost shows cost estimates in real-time as a developer changes a Terraform file.
    Real-time cost estimation on VSCode.

    The command line tool lets us post comments in pull requests and commits in our repos.

    A GitHub Pull Request conversation showing an automated message from Infracost with the cost estimate.
    Cost change information in the PR.

    Infracost also offers an optional paid service, Infracost Cloud, which acts as a central hub for cost management. Here, we can set usage estimates, define custom price books, and keep track of costs over time.

    A Screenshot of the Infracost Cloud Dashboard. It shows percentage usages for compute, storage, logging as well as pie charts on cost impact and cost issue tracking.
    The Infracost Cloud Dashboard

    Infracost Cloud also gives us some extra perks, such as Slack integration, FinOps policies, and guardrails that prevent deployments beyond a preset budget.

    Slack Alert generated from the Infracost App. The alert shows an increase of costs of 1152% in a project.

    Setting up Infracost

    To try out Infracost, we’ll need the following:

    • An Infracost API key. You can get one for free by installing the Infracost CLI and running: infracost auth login
    • Some Terraform files.
    • A GitHub account to post the cost estimates.

    The first command we’ll try is infracost breakdown. It analyzes Terraform plans and prints out a cost estimate. The --path option must point to the folder containing your Terraform files. For example, imagine we want to provision an “a1.medium” EC2 instance with the following:

    provider "aws" {
      region                      = "us-east-1"
      skip_credentials_validation = true
      skip_requesting_account_id  = true
    }
    
    resource "aws_instance" "myserver" {
      ami           = "ami-674cbc1e"
      instance_type = "a1.medium"
    
      root_block_device {
        volume_size = 100
      }
    }

    When we run `infracost breakdown --path .` we get an estimate for this instance:

    $ infracost breakdown --path .
    
    Evaluating Terraform directory at .
      ✔ Downloading Terraform modules 
      ✔ Evaluating Terraform directory 
      ✔ Retrieving cloud prices to calculate costs 
    
    Project: TomFern/infracost-demo/ec2
    
     Name                                                  Monthly Qty  Unit   Monthly Cost 
                                                                                            
     aws_instance.myserver                                                                  
     ├─ Instance usage (Linux/UNIX, on-demand, a1.medium)          730  hours        $18.62 
     └─ root_block_device                                                                   
        └─ Storage (general purpose SSD, gp2)                      100  GB           $10.00 
                                                                                            
     OVERALL TOTAL                                                                   $28.62 
    ──────────────────────────────────
    1 cloud resource was detected:
    ∙ 1 was estimated
    
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
    ┃ Project                                            ┃ Monthly cost ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━━━━┫
    ┃ TomFern/infracost-demo/ec2                         ┃ $29          ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━┛

    If we add some extra storage (600 GB of EBS), the cost increases to $89, as shown below:

    $ infracost breakdown --path .
    
    Evaluating Terraform directory at .
      ✔ Downloading Terraform modules 
      ✔ Evaluating Terraform directory 
      ✔ Retrieving cloud prices to calculate costs 
    
    Project: TomFern/infracost-demo/ec2-disk
    
     Name                                                  Monthly Qty  Unit   Monthly Cost 
                                                                                            
     aws_ebs_volume.extra_storage                                                           
     └─ Storage (general purpose SSD, gp2)                         600  GB           $60.00 
                                                                                            
     aws_instance.myserver                                                                  
     ├─ Instance usage (Linux/UNIX, on-demand, a1.medium)          730  hours        $18.62 
     └─ root_block_device                                                                   
        └─ Storage (general purpose SSD, gp2)                      100  GB           $10.00 
                                                                                            
     OVERALL TOTAL                                                                   $88.62 
    ──────────────────────────────────
    3 cloud resources were detected:
    ∙ 2 were estimated
    ∙ 1 was free
    
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
    ┃ Project                                            ┃ Monthly cost ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━━━━┫
    ┃ TomFern/infracost-demo/ec2-disk                    ┃ $89          ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━┛
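
    The arithmetic behind both estimates is simple enough to check by hand. The sketch below reproduces the two totals. The hourly rate for a1.medium is an assumption back-derived from the output above ($18.62 over 730 hours), and the function and constant names are made up for illustration; they are not part of Infracost.

```python
from decimal import Decimal, ROUND_HALF_UP

HOURS_PER_MONTH = 730                  # monthly quantity shown for an always-on instance
A1_MEDIUM_HOURLY = Decimal("0.0255")   # assumption: derived from $18.62 / 730 hours
GP2_PER_GB_MONTH = Decimal("0.10")     # derived from $10.00 / 100 GB in the output


def to_cents(amount: Decimal) -> Decimal:
    """Round a dollar amount to two decimals, rounding halves up."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def monthly_cost(root_gb: int, extra_ebs_gb: int = 0) -> Decimal:
    """Instance hours plus gp2 storage for the root disk and any extra volume."""
    instance = to_cents(A1_MEDIUM_HOURLY * HOURS_PER_MONTH)
    root = to_cents(GP2_PER_GB_MONTH * root_gb)
    extra = to_cents(GP2_PER_GB_MONTH * extra_ebs_gb)
    return instance + root + extra


print(monthly_cost(100))        # 28.62
print(monthly_cost(100, 600))   # 88.62
```

    The summary tables round these figures to whole dollars, which is why they show $29 and $89.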

    Infracost can also calculate usage-based resources like AWS Lambda. Let’s see what happens when we swap the EC2 instance for serverless functions:

    provider "aws" {
      region                      = "us-east-1"
      skip_credentials_validation = true
      skip_requesting_account_id  = true
    }
    
    resource "aws_lambda_function" "my_lambda" {
      function_name = "my_lambda"
      role          = "arn:aws:lambda:us-east-1:account-id:resource-id"
      handler       = "exports.test"
      runtime       = "nodejs12.x"
      memory_size   = 1024
    }

    Running infracost breakdown yields a total cost of 0 dollars:

    $ infracost breakdown --path .
    
    Evaluating Terraform directory at .
      ✔ Downloading Terraform modules 
      ✔ Evaluating Terraform directory 
      ✔ Retrieving cloud prices to calculate costs 
    
    Project: TomFern/infracost-demo/lambda
    
     Name                                   Monthly Qty  Unit                        Monthly Cost 
                                                                                                  
     aws_lambda_function.my_lambda                                                                
     ├─ Requests                    Monthly cost depends on usage: $0.20 per 1M requests          
     ├─ Ephemeral storage           Monthly cost depends on usage: $0.0000000309 per GB-seconds   
     └─ Duration (first 6B)         Monthly cost depends on usage: $0.0000166667 per GB-seconds   
                                                                                                  
     OVERALL TOTAL                                                                          $0.00 
    ──────────────────────────────────
    1 cloud resource was detected:
    ∙ 1 was estimated
    
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
    ┃ Project                                            ┃ Monthly cost ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━━━━┫
    ┃ TomFern/infracost-demo/lambda                      ┃ $0.00        ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━┛

    That can’t be right unless no one uses our Lambda function, which is precisely what the tool assumed here. We can fix this by providing an estimate via a usage file. Run this command to create the usage file:

    $ infracost breakdown --sync-usage-file --usage-file usage.yml --path .

    We can now provide estimates by editing usage.yml. The following example consists of 5 million requests with an average runtime of 300 ms:

    # usage.yml
    
    resource_usage:
      aws_lambda_function.my_lambda:
        monthly_requests: 5000000 
        request_duration_ms: 300 

    We’ll tell infracost to use the usage file with --usage-file to get a proper cost estimate:

    $ infracost breakdown --path . --usage-file usage.yml
    
    Evaluating Terraform directory at .
      ✔ Downloading Terraform modules 
      ✔ Evaluating Terraform directory 
      ✔ Syncing usage data from cloud 
        └─ Synced 0 of 1 resource
      ✔ Downloading Terraform modules 
      ✔ Evaluating Terraform directory
      ✔ Retrieving cloud prices to calculate costs 
    
    Project: TomFern/infracost-demo/lambda
    
     Name                           Monthly Qty  Unit         Monthly Cost 
                                                                           
     aws_lambda_function.my_lambda                                         
     ├─ Requests                              5  1M requests         $1.00 
     └─ Duration (first 6B)           1,500,000  GB-seconds         $25.00 
                                                                           
     OVERALL TOTAL                                                  $26.00 
    ──────────────────────────────────
    1 cloud resource was detected:
    ∙ 1 was estimated
    
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
    ┃ Project                                            ┃ Monthly cost ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━━━━┫
    ┃ TomFern/infracost-demo/lambda                      ┃ $26          ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━┛
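
    To see where the $26 comes from, we can redo the calculation ourselves. The sketch below uses the two unit prices printed in the earlier breakdown and assumes that GB-seconds are computed as requests × duration in seconds × memory in GB. The function name is hypothetical, and the free tier is ignored.

```python
REQUEST_PRICE_PER_MILLION = 0.20             # from the breakdown output
DURATION_PRICE_PER_GB_SECOND = 0.0000166667  # from the breakdown output


def lambda_monthly_cost(monthly_requests, request_duration_ms, memory_mb):
    """Estimate the monthly Lambda bill from the values in usage.yml."""
    # Assumption: GB-seconds = requests x seconds per request x memory in GB
    gb_seconds = monthly_requests * request_duration_ms / 1000 * memory_mb / 1024
    requests_cost = monthly_requests / 1_000_000 * REQUEST_PRICE_PER_MILLION
    duration_cost = gb_seconds * DURATION_PRICE_PER_GB_SECOND
    return {
        "gb_seconds": gb_seconds,
        "requests": round(requests_cost, 2),
        "duration": round(duration_cost, 2),
        "total": round(requests_cost + duration_cost, 2),
    }


estimate = lambda_monthly_cost(5_000_000, 300, 1024)
print(f"{estimate['gb_seconds']:,.0f} GB-seconds")  # 1,500,000 GB-seconds
print(f"${estimate['total']:.2f}")                  # $26.00
```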

    That’s much better. Of course, this is accurate as long as our usage file is correct. If you’re unsure, you can integrate Infracost with the cloud provider and pull the utilization metrics from the source.

    Estimating The Impact of Changes

    Infracost can save its results as JSON when we pass the --format json and --out-file options. This gives us a baseline file we can check in source control:

    $ infracost breakdown --path . --format json --usage-file usage.yml --out-file baseline.json

    We can now compare changes by running infracost diff. Let’s see what happens if the Lambda execution time goes from 300 to 350 ms:

    $ infracost diff --path . --compare-to baseline.json --usage-file usage.yml
    
    Evaluating Terraform directory at .
      ✔ Downloading Terraform modules 
      ✔ Evaluating Terraform directory 
      ✔ Retrieving cloud prices to calculate costs 
    
    Key: * usage cost, ~ changed, + added, - removed
    
    ──────────────────────────────────
    Project: TomFern/infracost-demo/lambda
    
    ~ aws_lambda_function.my_lambda
      +$4 ($26 → $30)
    
        ~ Duration (first 6B)
          +$4 ($25 → $29), +250,000 GB-seconds (1,500,000 → 1,750,000)*
    
    Monthly cost change for TomFern/infracost-demo/lambda
    Amount:  +$4 ($26 → $30)
    Percent: +16%
    
    ──────────────────────────────────
    Key: * usage cost, ~ changed, + added, - removed
    
    1 cloud resource was detected:
    ∙ 1 was estimated
    
    Infracost estimate: Monthly cost will increase by $4 ↑
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
    ┃ Project                                            ┃ Cost change ┃ New monthly cost ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╋━━━━━━━━━━━━━╋━━━━━━━━━━━━━━━━━━┫
    ┃ TomFern/infracost-demo/lambda                      ┃  +$4 (+16%) ┃ $30              ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━┛

    As you can see, the impact is a 16% increase.
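
    We can double-check the 16% figure with a few lines of arithmetic. This is only a sketch under the same assumptions as the usage file (5 million requests, 1024 MB of memory) and the unit prices printed by infracost breakdown; the function name is made up for illustration.

```python
REQUEST_PRICE_PER_MILLION = 0.20             # from the breakdown output
DURATION_PRICE_PER_GB_SECOND = 0.0000166667  # from the breakdown output


def total_cost(duration_ms, monthly_requests=5_000_000, memory_mb=1024):
    """Monthly Lambda cost for a given average request duration."""
    gb_seconds = monthly_requests * duration_ms / 1000 * memory_mb / 1024
    requests_cost = monthly_requests / 1_000_000 * REQUEST_PRICE_PER_MILLION
    return requests_cost + gb_seconds * DURATION_PRICE_PER_GB_SECOND


baseline = total_cost(300)   # what baseline.json recorded
current = total_cost(350)    # after the slowdown
change = current - baseline
percent = change / baseline * 100

print(f"+${change:.2f} (${baseline:.2f} -> ${current:.2f}), +{percent:.0f}%")
# +$4.17 ($26.00 -> $30.17), +16%
```

    Infracost rounds the amounts in its diff to whole dollars, hence the +$4 ($26 → $30) in the output.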

    Running Infracost on CI/CD

    We’ve seen how this tool can help us estimate cloud costs. That’s valuable information, but what role does Infracost take in continuous integration? To answer that, we must understand what infracost comment does.

    The comment command takes a JSON file generated by infracost diff and posts its contents directly into GitHub, Bitbucket, Azure Repos, or GitLab. Running Infracost inside CI makes relevant cost information available to everyone on the team.

    An automated comment on GitHub with cost differences caused by the commit.
    Infracost comment on the cost difference in a GitHub commit.

    How to run Infracost on Semaphore

    ⚠️ Before going into pipeline configuration, commit any files you’ve created into the repo. Also, if this is your first time using Semaphore, I suggest signing up for a free Semaphore account and going through the getting started guide.

    In this section, we’ll add two cost-control jobs to our CI pipeline. They will:

    • Analyze pull requests and post a comment in GitHub with the cost difference.
    • Comment on every commit that changes the infrastructure. The job will fail if it breaks the pre-established policy.

    Pipeline running two Infracost jobs: Analyze PRs (which is skipped) and Infracost comment commits (which failed because the cost policy was exceeded).
    Infracost preventing the deployment of resources exceeding a policy.

    For this section of the tutorial, you’ll need to create a GitHub token (choose the classic token) or a Bitbucket app password with read+write access to the repository.

    We’ll save the Git provider password or token as a Semaphore Secret. To do that, go to your Organization menu in the top right corner and click Settings. Then, go to Secrets > Create new.

    Define the following secret environment variables:

    • INFRACOST_API_KEY: you can retrieve it by running infracost configure get api_key on your machine.
    • GITHUB_API_KEY or BITBUCKET_API_KEY: an API token with read and write access to the repository. Note that, for Bitbucket, the key takes the form of username:api-key.

    A screenshot of Semaphore secret showing two environment variables: INFRACOST_API_KEY and BITBUCKET_API_KEY
    Semaphore Secret with Infracost and Bitbucket access tokens.

    Commenting commits

    We’ll begin by adding a job that comments on every commit changing a Terraform file.

    A GitHub comment posted by Infracost.
    Infracost comment on GitHub.

    To do so, open or add your project to Semaphore. To keep things simple, we’ll assume we already have a CI pipeline that builds and tests the project.

    An example CI pipeline with a build and test blocks.
    Our starting pipeline builds and tests the code.

    Add a new block with the following commands:

    curl -fsSL https://raw.githubusercontent.com/infracost/infracost/master/scripts/install.sh | sh
    checkout
    infracost diff --path . --usage-file usage.yml --compare-to baseline.json --format json --out-file /tmp/infracost-diff-commit.json
    infracost comment github --path=/tmp/infracost-diff-commit.json --repo=$SEMAPHORE_GIT_REPO_SLUG --commit=$SEMAPHORE_GIT_SHA --github-token=$GITHUB_API_KEY --behavior=update

    Let’s see what the job does:

    • The first two commands install Infracost and clone the repository into the CI machine.
    • The third line compares the current costs with the ones stored in baseline.json (which should have been already committed to the repository) and saves the difference as JSON.
    • The last line posts that difference as a comment on the commit in GitHub.

    If you’re using Bitbucket instead of GitHub, the commands should be:

    curl -fsSL https://raw.githubusercontent.com/infracost/infracost/master/scripts/install.sh | sh
    checkout
    infracost diff --path . --usage-file usage.yml --compare-to baseline.json --format json --out-file /tmp/infracost-diff-commit.json
    infracost comment bitbucket --path=/tmp/infracost-diff-commit.json --repo=$SEMAPHORE_GIT_REPO_SLUG --commit=$SEMAPHORE_GIT_SHA --bitbucket-token=$BITBUCKET_API_KEY --behavior=update

    Remember to enable the secret you created earlier to ensure that the job has access to your API Keys.

    A new block was added to the pipeline with a job that posts the comments. The screen shows the job commands and the infracost secret created earlier enabled on the block.
    Add a new block with the commands and enable the infracost secret. This job does not need to depend on any others in the pipeline.

    Conditional execution

    Our Infracost job does not need to run on every commit, only when a Terraform file changes. We can detect that with change-based conditions.

    To turn on conditional execution, open the Skip/Run conditions section of the block and type: change_in('/**/*.tf') or change_in('/**/*.tfvars') so the job does not run unless a file ending with a tf or tfvars extension changes in the codebase.

    A Semaphore screenshot showing how to configure skip/run conditions. The relevant section is opened, and the change_in condition is filled in.
    With this condition, we only run the job when a Terraform file changes.

    ⚠️ If your project’s main branch is not called master, you need to provide additional options. For example, if the main branch is called main, use: change_in('/**/*.tf',{default_branch: 'main'}) or change_in('/**/*.tfvars',{default_branch: 'main'}).
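
    If the glob patterns look cryptic, the following sketch approximates the effect of the condition. It is not how Semaphore implements change_in; it only shows the decision being made, given a list of changed files. The function name and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath

TERRAFORM_SUFFIXES = {".tf", ".tfvars"}


def should_run_infracost(changed_files):
    """Return True when at least one changed file is a Terraform file."""
    return any(
        PurePosixPath(path).suffix in TERRAFORM_SUFFIXES
        for path in changed_files
    )


print(should_run_infracost(["README.md", "infra/main.tf"]))   # True
print(should_run_infracost(["src/app.js", "docs/costs.md"]))  # False
```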

    Commenting on pull requests

    Instead of comparing the costs against a baseline, we can compare changes across branches on pull requests. This will give the reviewer a summary of the new costs.

    An automated Infracost post on a pull request.
    Comments on pull requests give the reviewer insights on how the change impacts costs.

    To add comments on pull requests, we’ll create a new block with the following commands:

    curl -fsSL https://raw.githubusercontent.com/infracost/infracost/master/scripts/install.sh | sh
    checkout
    git checkout master
    infracost breakdown --path . --format json --out-file /tmp/infracost-master.json
    git checkout FETCH_HEAD
    infracost diff --path . --format json --compare-to /tmp/infracost-master.json --out-file /tmp/infracost-diff-master.json
    infracost comment github --path=/tmp/infracost-diff-master.json --repo=$SEMAPHORE_GIT_REPO_SLUG --pull-request=$SEMAPHORE_GIT_PR_NUMBER --github-token=$GITHUB_API_KEY --behavior=update

    Like in the previous job, we use infracost comment to post a comment, but this time we reference a pull request number and compare changes between the trunk and the committed branch.

    To finish configuring the block:

    1. Enable the infracost secret.
    2. Set the run condition to pull_request =~ '.*'. As before, we can enable change detection with: pull_request =~ '.*' and (change_in('/**/*.tf') or change_in('/**/*.tfvars')).

    A Semaphore editor screenshot showing the new block added. It features a new job with the commands, run conditions, and the infracost secret enabled.
    The pull request comment job does not depend on any other blocks in the pipeline.

    Check out Semaphore’s built-in support for monorepos to learn more about this feature.

    Infracost with Monorepos

    You will likely have separate Terraform files for each subproject if you work with a monorepo. In this case, you should add an infracost config file at the project’s root. This allows you to specify the project names and where Terraform and usage files are located. You can also set environment variables and other options.

    # infracost-config.yml
    
    version: 0.1
    
    projects:
      - path: dev
        usage_file: dev/infracost-usage.yml
        env:
          NODE_ENV: dev
    
      - path: prod
        usage_file: prod/infracost-usage.yml
        env:
          AWS_ACCESS_KEY_ID: ${PROD_AWS_ACCESS_KEY_ID}
          AWS_SECRET_ACCESS_KEY: ${PROD_AWS_SECRET_ACCESS_KEY}
          NODE_ENV: production

    When the config file is involved, you must replace the --path argument with --config-file in all your commands.

    Setting up policies

    One more trick the Infracost CLI has up its sleeve is enforcing policies. Policies are rules that evaluate the output of infracost diff and stop the CI pipeline if a resource goes over budget. This feature allows managers and team leads to enforce limits. When the policy fails, the CI/CD pipeline stops with an error, preventing the infrastructure from being provisioned.

    A pull request on GitHub with a warning that a policy has been broken.
    When a policy is in place, Infracost warns us if any limits are exceeded.

    Infracost implements policies using Open Policy Agent (OPA), which uses the Rego language to encode policy rules.

    Rego has a ton of features, and it’s worth digging in to learn it thoroughly, but for our purposes, we only need to learn a few keywords:

    • deny[out]: defines a policy rule that fails when the out object contains failed: true.
    • msg: defines the error message shown when the policy fails.
    • out: defines the logic that makes the policy pass or fail.
    • input: references the contents of the JSON object generated with infracost diff.

    The following example shows a policy that fails when the total monthly cost reaches or exceeds $1,000:

    # policy.rego
    
    package infracost
    
    deny[out] {
    
      # maxMonthlyCost defines the ceiling for the total monthly cost estimate
      maxMonthlyCost = 1000.0
    
      msg := sprintf(
        "Total monthly cost must be less than $%.2f (actual cost is $%.2f)",
        [maxMonthlyCost, to_number(input.totalMonthlyCost)],
      )
    
      out := {
        "msg": msg,
        "failed": to_number(input.totalMonthlyCost) >= maxMonthlyCost
      }
    }

    This is another example that fails if the cost difference is equal to or greater than $500.

    package infracost
    
    deny[out] {
    
      # maxDiff defines the threshold that you require the cost estimate to be below
      maxDiff = 500.0
    
      msg := sprintf(
        "Total monthly cost diff must be less than $%.2f (actual diff is $%.2f)",
        [maxDiff, to_number(input.diffTotalMonthlyCost)],
      )
    
      out := {
        "msg": msg,
        "failed": to_number(input.diffTotalMonthlyCost) >= maxDiff
      }
    }
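
    If Rego is new to you, it may help to see the same two checks written in an imperative style. The sketch below mirrors the logic of both policies in Python. It is not how Infracost evaluates them (that happens in OPA); the function name and the sample values are hypothetical, and only the two field names come from the policies above.

```python
def evaluate_policies(infracost_output, max_monthly_cost=1000.0, max_diff=500.0):
    """Return the error messages of every policy that fails."""
    # The Rego policies call to_number() on these fields, so we convert them too.
    total = float(infracost_output["totalMonthlyCost"])
    diff = float(infracost_output["diffTotalMonthlyCost"])

    failures = []
    if total >= max_monthly_cost:
        failures.append(
            f"Total monthly cost must be less than ${max_monthly_cost:.2f} "
            f"(actual cost is ${total:.2f})"
        )
    if diff >= max_diff:
        failures.append(
            f"Total monthly cost diff must be less than ${max_diff:.2f} "
            f"(actual diff is ${diff:.2f})"
        )
    return failures


# Made-up sample: over the total budget, but the change itself is small.
sample = {"totalMonthlyCost": "1200.50", "diffTotalMonthlyCost": "120.00"}
for message in evaluate_policies(sample):
    print(message)
# Total monthly cost must be less than $1000.00 (actual cost is $1200.50)
```

    An empty list means every policy passed, which corresponds to the pipeline continuing.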

    You can experiment and try several examples online on the OPA playground. To enforce a policy, you must add the --policy-path option in any of the infracost comment commands like this:

    curl -fsSL https://raw.githubusercontent.com/infracost/infracost/master/scripts/install.sh | sh
    checkout
    infracost diff --path . --usage-file usage.yml --compare-to baseline.json --format json --out-file /tmp/infracost-diff-commit.json
    infracost comment github --path=/tmp/infracost-diff-commit.json --repo=$SEMAPHORE_GIT_REPO_SLUG --commit=$SEMAPHORE_GIT_SHA --github-token=$GITHUB_API_KEY --policy-path policy.rego --behavior=update

    Conclusion

    The power to spin up resources instantly is a double-edged sword: a typo in a Terraform file can be a costly mistake. If you’re already automating deployment and managing services with Terraform, you may as well add Infracost to the mix to help you make more informed decisions and avoid surprises. Setting this up takes only a few minutes and can save thousands of dollars down the road.

    Thanks for reading!

    Written by:
    I picked up most of my skills during the years I worked at IBM. Was a DBA, developer, and cloud engineer for a time. After that, I went into freelancing, where I found the passion for writing. Now, I'm a full-time writer at Semaphore.