Making Leaked Credentials Useless for Attackers

How to safely prevent your AWS users from making calls outside of your ecosystem.

Building a secure environment revolves around making incremental improvements targeted to thwart specific types of threats. A common way in which an intruder can infiltrate your network is through obtaining credentials which grant access to a restricted system. Credentials can leak in a myriad of ways, from developers accidentally committing them to source control, to services logging them in a widely accessible location, to users leaving them unencrypted on a compromised host. They could even leak through being sent over a channel incorrectly assumed to be secure.

Fortunately, there are several ways to combat these sorts of mistakes. The most common of these attempt to limit the window in which a leaked credential could be used by an attacker, either through frequently rotating credentials or by regularly scanning through source code and logs. But what if you could make the credentials effectively worthless outside of the environment in which you expect them to be used? And arguably more importantly, how can you enable such a restriction without causing pain to existing automation and users who may use their credentials in ways you hadn’t anticipated?

This guide dives into some technical examples, so we expect it to be most valuable to system administrators looking to improve their security posture.

Before we begin: Preparing your systems

Before starting this IAM credential lockdown, you'll need to set up the following:

All AWS resources must be running inside of one or more VPCs (Virtual Private Clouds)
VPC endpoints must be accessible to every host on which your services are running. Here are details on configuring them.
A CloudTrail Trail needs to be enabled for all API activity on your account

Already, we’ve thrown around quite a few AWS service names. This overall strategy can be adapted to work on other popular cloud providers like Google Cloud Platform or Microsoft Azure. Some service name mappings can be found in the table below:

Amazon Web Services (AWS)	Google Cloud Platform (GCP)	Microsoft Azure
Virtual Private Cloud	Virtual Private Cloud	Azure Virtual Network
CloudTrail	Cloud Audit Logs	Azure Audit Logs
Athena	BigQuery	Azure Synapse Analytics
Identity and Access Management (IAM)	Identity and Access Management	Azure Identity Management
Simple Storage Service (S3)	Cloud Storage	Azure Storage

With the above satisfied, the effectiveness of this strategy will depend on the percentage of IAM users calling resources in the same region as the request’s origin. We’ll dive more into that limitation later.

Building the policy

AWS allows access control to be configured through policies attached to the user making a request or through policies attached directly to the target resource. In this case, it makes the most sense to apply the restriction via a policy assigned to IAM users: if you limit power at the source, you don’t have to worry about limiting it at every destination. Say you have a user whose manageable S3 buckets are defined through a wildcard pattern. It’s much easier to apply this restriction to the user itself than to apply the same restriction to every current and future bucket matching that pattern.

We’ll dive right in by introducing the most basic version of the policy we wish to enforce, then gradually add elements as we try to account for more cases.

	{
	  "Version": "2012-10-17",
	  "Id": "restrict-to-set-of-vpcs",
	  "Statement": [
	    {
	      "Sid": "DenyAllOutsideOfVpcsAndNotAllowlisted",
	      "Effect": "Deny",
	      "Action": "*",
	      "Resource": "*",
	      "Condition": {
	        "StringNotEquals": {
	          "aws:sourceVpc": [
	            "vpc-foo",
	            "vpc-bar",
	            "vpc-baz"
	          ]
	        },
	        "StringNotEqualsIgnoreCaseIfExists": {
	          "aws:PrincipalTag/canMakeRequestsOutsideOfVpc": "true"
	        }
	      }
	    }
	  ]
	}

The primary function of this policy is to deny all requests not originating from a trusted VPC. In the StringNotEquals condition, we take advantage of the aws:sourceVpc metadata AWS passes along with requests made through a VPC endpoint. The second condition, StringNotEqualsIgnoreCaseIfExists, allows us to quickly and easily maintain an allowlist of users who are exempt from this policy. We simply check for the presence of a boolean tag on the caller, and if it is not present or is explicitly set to false, we enforce the first condition as usual.¹
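To make the interaction between the two conditions concrete, here is a minimal Python sketch that mimics how the Deny statement is triggered. It is a toy model for reasoning about the policy, not AWS's actual evaluator; the function name `is_denied` and its arguments are hypothetical, and the VPC IDs are the placeholders used in the policy.

```python
# Toy model of the Deny statement's Condition block; not AWS's evaluator.
TRUSTED_VPCS = {"vpc-foo", "vpc-bar", "vpc-baz"}  # placeholder IDs from the policy
EXEMPT_TAG = "canMakeRequestsOutsideOfVpc"


def is_denied(source_vpc, principal_tags):
    """Return True when both conditions hold, i.e. the Deny applies.

    source_vpc is None when the request did not pass through a VPC endpoint.
    principal_tags is a dict of the caller's IAM tags.
    """
    # StringNotEquals on aws:sourceVpc: true when the key is absent or
    # matches none of the listed VPC IDs.
    outside_trusted_vpcs = source_vpc not in TRUSTED_VPCS

    # StringNotEqualsIgnoreCaseIfExists on the principal tag: true when the
    # tag is absent, or present with any value other than "true".
    tag_value = principal_tags.get(EXEMPT_TAG)
    not_allowlisted = tag_value is None or tag_value.lower() != "true"

    # Separate condition operators are combined with a logical AND.
    return outside_trusted_vpcs and not_allowlisted
```

For example, `is_denied(None, {})` models an untagged user calling from outside any VPC endpoint, while `is_denied("vpc-foo", {})` models the same user calling from a trusted VPC.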

So now we’re ready to apply this policy to every user in our account, right? Not quite.

Obtaining and assessing usage patterns of your IAM users

Before enforcing this rule on a given IAM user, you first want to be confident that you are not going to block an existing workflow and suddenly cause their requests to return 403s. Here’s where having a CloudTrail trail configured to record AWS API activity comes in handy.

To recap, CloudTrail is AWS’s audit logging service that pipes logs into S3. Athena provides a way to make queries against S3 buckets. Together, they provide a powerful way to increase visibility into the events that take place in your account. More details, including how to partition CloudTrail logs for performance and scalability can be found in these AWS docs.

Using Athena to query CloudTrail’s dataset, we can determine with a level of confidence whether a given user is likely to be making its requests exclusively through a set of VPCs. If that is the case, they are a good candidate for the policy defined above.

Querying Athena

The table name will vary based on your CloudTrail and Athena configuration, but the basic idea is to query Athena for the specific fields needed to fully understand how users are calling out to AWS services:

	select distinct account, useridentity.username, vpcendpointid, sourceipaddress
	from cloudtrail_logs
	where useridentity.type = 'IAMUser'

We created an automated job to run these queries hourly against the previous hour's data and store the results in a database. This gave us the ability to quickly answer questions like:

Does user x exclusively make calls through VPC endpoints?
When was the last time user x made a call from outside of our tracked VPCs?
If user y is making calls from outside our tracked VPCs, which IP addresses or VPC endpoints are the requests originating from?
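As a rough illustration of how stored query results could answer the first and third questions, here is a small Python sketch. It assumes each row is a dict keyed by the query's column names (`username`, `vpcendpointid`, `sourceipaddress`); the function names and the set of tracked endpoint IDs are hypothetical, and the actual storage and job scheduling are left out.

```python
from collections import defaultdict


def summarize_origins(rows):
    """Group rows by user into VPC-endpoint and non-endpoint origins.

    A row with an empty or missing vpcendpointid is treated as a call
    made outside of any VPC endpoint.
    """
    summary = defaultdict(lambda: {"endpoints": set(), "outside_ips": set()})
    for row in rows:
        user = row["username"]
        endpoint = row.get("vpcendpointid")
        if endpoint:
            summary[user]["endpoints"].add(endpoint)
        else:
            summary[user]["outside_ips"].add(row["sourceipaddress"])
    return dict(summary)


def uses_only_endpoints(summary, username, tracked_endpoints):
    """True if every recorded call came through a tracked VPC endpoint."""
    seen = summary.get(username)
    if seen is None:
        return False  # no data yet, so no basis for confidence
    return not seen["outside_ips"] and seen["endpoints"] <= set(tracked_endpoints)
```

A user for whom `uses_only_endpoints` stays true over a long enough window is a candidate for the policy; the `outside_ips` set shows where any remaining outside calls originate.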

Current limitations of CloudTrail

At the time of this writing, CloudTrail only logs control plane calls for most services. This means that, with some exceptions like S3 for which CloudTrail logs both data and control plane actions, only a subset of calls to all other AWS services will be logged. Details on what events CloudTrail records can be found in the official documentation. In practice, this does not pose a real concern to our approach, given that if a user makes a control plane request to service A through a VPC endpoint, and their request is logged via CloudTrail, all data plane requests from the same user and client configuration can be expected to go through that same VPC endpoint for service A.

Safely locking down users

As data accumulates using the process highlighted above, the confidence in gauging how a user generally behaves will also increase. After defining a threshold at which to lock down a user, applying the policy is as easy as:

Attaching the policy to an IAM group
In batches, adding users as members to the group once deemed safe

It’s essential to roll out to the first few users in small batches, with extensive monitoring, before ramping up the lockdown rate.
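One way to sketch this ramp-up in Python is shown below. The batch sizes, the growth factor, and the `add_to_group` and `is_healthy` hooks are all assumptions for illustration; in practice `add_to_group` might wrap the IAM AddUserToGroup API and `is_healthy` might check your monitoring for a spike in 403s.

```python
def rollout_batches(users, first_size=2, growth=2, max_size=50):
    """Yield lists of users, starting small and growing each round."""
    size = first_size
    start = 0
    while start < len(users):
        yield users[start:start + size]
        start += size
        size = min(size * growth, max_size)


def lock_down(users, add_to_group, is_healthy):
    """Add users batch by batch; stop after the first unhealthy check.

    add_to_group(username) and is_healthy() are caller-supplied hooks.
    Returns the list of users that were added.
    """
    added = []
    for batch in rollout_batches(users):
        for username in batch:
            add_to_group(username)
            added.append(username)
        if not is_healthy():
            break  # pause the rollout so the last batch can be investigated
    return added
```

Stopping at the first failed health check keeps the set of affected users limited to the most recent batch.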

Digging into special cases

What would an engineering blog post be without highlighting some edge cases we came across? No doubt every organization will use different sets of AWS services in slightly different ways, but the following were some of the use cases we considered.

Adding an additional dimension: Trusting VPCs across your organization’s AWS accounts and regions

There are use cases in which you may want to allow IAM users belonging to one account to be used from within another. The VPC metadata passed with each request (i.e. aws:sourceVpc), with which the policy constructed above is concerned, is passed along in calls made from VPC endpoints within the same region as the target resource. In practice, it appears AWS even passes along aws:sourceVpc for calls made from one account to another, as long as they are owned by the same underlying organization and the calls are made within a single region. Due to the number of differences in how organizations manage their AWS accounts, we decided to omit solving for cross-region calls in this guide.

Considering local development?

Developers often need to be able to run code locally that makes calls out to their AWS account. Unless they are doing so from an SSH session in an EC2 instance placed within one of their VPCs, they will be blocked by our policy in its current form. The simplest way we found to solve this was to direct our locally running AWS clients to route their requests through a proxy we created within our VPC. Properly restricting access to this proxy is essential to not introducing an unintended backdoor through this policy.

Making S3 objects publicly accessible

It may be desirable in some cases to allow public access to S3 objects directly. This can be accomplished through the use of presigned URLs. The current policy just needs a tweak to account for this specific type of request:

	{
	  "Version": "2012-10-17",
	  "Id": "restrict-to-set-of-vpcs-allow-presigned-get-object",
	  "Statement": [
	    {
	      "Sid": "DenyAllOutsideOfVpcsAndNotAllowlisted",
	      "Effect": "Deny",
	      "NotAction": "s3:getObject",
	      "Resource": "*",
	      "Condition": {
	        "StringNotEquals": {
	          "aws:sourceVpc": [
	            "vpc-foo",
	            "vpc-bar",
	            "vpc-baz"
	          ]
	        },
	        "StringNotEqualsIgnoreCaseIfExists": {
	          "aws:PrincipalTag/canMakeRequestsOutsideOfVpc": "true"
	        }
	      }
	    },
	    {
	      "Sid": "DenyGetObjectCallsOutsideOfVpcsAndNotPresignedAndNotAllowlisted",
	      "Effect": "Deny",
	      "Action": "s3:getObject",
	      "Resource": "*",
	      "Condition": {
	        "StringNotEquals": {
	          "aws:sourceVpc": [
	            "vpc-foo",
	            "vpc-bar",
	            "vpc-baz"
	          ],
	          "s3:authtype": "REST-QUERY-STRING"
	        },
	        "StringNotEqualsIgnoreCaseIfExists": {
	          "aws:PrincipalTag/canMakeRequestsOutsideOfVpc": "true"
	        }
	      }
	    }
	  ]
	}

Here, we want different logic depending on whether the user is calling s3:getObject using presigned URL auth. To do this, we split our policy into two statements: the first (on line 8) denies all actions except s3:getObject, and the second (line 26) denies s3:getObject exclusively. Besides using Action in place of NotAction, the second statement adds a third condition that lets presigned s3:getObject calls avoid the deny: the deny only applies when the request’s s3:authtype is not REST-QUERY-STRING, which is the value carried by calls authenticated through a presigned URL.
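The split between the two statements can be modeled with a short Python sketch. As before, this is a toy model rather than AWS's evaluator; the function name `is_denied_with_presign` and its parameters are hypothetical, and the VPC IDs are the placeholders from the policy.

```python
TRUSTED_VPCS = {"vpc-foo", "vpc-bar", "vpc-baz"}  # placeholder IDs from the policy


def is_denied_with_presign(action, source_vpc=None, auth_type=None, exempt_tag=None):
    """Toy model of the two Deny statements; not AWS's evaluator.

    source_vpc is None for calls made outside a VPC endpoint, auth_type is
    the request's s3:authtype (if any), and exempt_tag is the value of the
    canMakeRequestsOutsideOfVpc tag (None when the tag is absent).
    """
    outside = source_vpc not in TRUSTED_VPCS
    not_allowlisted = exempt_tag is None or exempt_tag.lower() != "true"

    # IAM action names are matched case-insensitively.
    if action.lower() != "s3:getobject":
        # First statement: NotAction s3:getObject covers every other action.
        return outside and not_allowlisted

    # Second statement: s3:getObject is denied only if the request is also
    # not authenticated via query string (i.e. not a presigned URL).
    not_presigned = auth_type != "REST-QUERY-STRING"
    return outside and not_presigned and not_allowlisted
```

Under this model, a presigned s3:getObject from outside the VPCs passes, while a presigned call to any other action is still caught by the first statement.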

Proxying API calls through other AWS services

Through experimenting with the above policies, we found that some Athena calls were still being denied despite having a properly configured Athena VPC endpoint in the region. Upon inspecting the metadata of these requests in CloudTrail, we found that they were actually being proxied through the public Athena AWS endpoint. Simply adding the additional condition on line 9 below allowed us to avoid hitting the deny condition on calls made on behalf of Athena.

	{
	  "Condition": {
	    "StringNotEquals": {
	      "aws:sourceVpc": [
	        "vpc-foo",
	        "vpc-bar",
	        "vpc-baz"
	      ],
	      "aws:CalledVia": "athena.amazonaws.com"
	    },
	    "StringNotEqualsIgnoreCaseIfExists": {
	      "aws:PrincipalTag/canMakeRequestsOutsideOfVpc": "true"
	    }
	  }
	}

Alternatively, you may want to use the key aws:ViaAWSService with a value of true to generalize this rule to allow any calls made on behalf of another AWS service.

Conclusion: Continuing to add layers of security

These safeguards alone will not magically solve all of the threats that can arise from leaked credentials. They instead act as another step that, when taken, minimizes the risk credentials pose when they fall into the wrong hands.

Going beyond the strategies already discussed, it’s equally important to consider how one can limit the options for attackers inside of a set of VPCs. This policy, for example, would not block an attacker from executing calls from a compromised host with unencrypted credentials, so practicing least-privilege on your IAM users can limit the potential impact of a compromised user. One should also consider whether hosts within a VPC all need access to the public internet given that if an attacker is able to break in and access sensitive data, preventing them from exfiltrating it could make a significant difference in the severity of the impact.

While it may seem daunting to consider every vector of attack, it’s important to remember that security is a layered objective in which every layer built upon the last has a meaningful impact on protecting your customers and their data.

Figures/References

Figure 1: How several conditions and/or condition keys are evaluated. Source: the official AWS IAM policy documentation.

If your policy has multiple condition operators or multiple keys attached to a single condition operator, the conditions are evaluated using a logical AND. If a single condition operator includes multiple values for one key, that condition operator is evaluated using a logical OR. All conditions must resolve to true to trigger the desired Allow or Deny effect.

¹ As a recap of IAM policy evaluation, when multiple condition operators or multiple keys are specified in one Condition block, they are evaluated with a logical AND, so in this case both conditions must evaluate to true for this policy to deny the request. Note that for a negated operator such as StringNotEquals, a list of values matches only when the key equals none of them. For more details, see Figure 1, or refer to the official IAM policy documentation.
