Build: use scoped credentials for interacting with S3 #12078
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
AWS temporary credentials | ||
========================= | ||
|
||
Builders run arbitrary commands provided by the user. While we run the commands in a sandboxed environment (Docker), | ||
that shouldn't be the only line of defense, as we still interact with the files generated by the user outside Docker for some operations. | ||
|
||
This is why instead of using credentials that have access to all the resources in AWS, | ||
we are using credentials that are generated by the `AWS STS service <https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html>`__, | ||
which are temporary and scoped to the resources that are needed for the build. | ||
|
||
Local development | ||
----------------- | ||
|
||
In order to make use of STS, you need to: | ||
|
||
- Create a role in IAM with a trusted entity type set to the AWS account that is going to be used to generate the temporary credentials. | ||
- Create an inline policy for the role. The policy should allow access to all S3 buckets and paths that are going to be used. | ||
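As a rough illustration of the two steps above, the trust policy and the inline role policy could be drafted as plain Python dictionaries before pasting them into the IAM console. This is only a sketch: the account ID, the bucket name, and the helper names `build_trust_policy` and `build_role_policy` are hypothetical, and the action list is an assumption based on the actions requested elsewhere in this PR.

```python
import json

# Hypothetical placeholder: the account that owns the access keys used to call STS.
ACCOUNT_ID = "123456789012"


def build_trust_policy(account_id: str) -> dict:
    """Trust policy that lets principals in ``account_id`` call sts:AssumeRole."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_role_policy(bucket_names: list[str]) -> dict:
    """Inline role policy granting object and list access on the given buckets."""
    resources = []
    for name in bucket_names:
        bucket_arn = f"arn:aws:s3:::{name}"
        # ListBucket applies to the bucket ARN, object actions to "<bucket>/*".
        resources.extend([bucket_arn, f"{bucket_arn}/*"])
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                ],
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(ACCOUNT_ID), indent=2))
    # "example-media-dev" is a made-up bucket name.
    print(json.dumps(build_role_policy(["example-media-dev"]), indent=2))
```

The role policy is the upper bound for the temporary credentials, so it should list every bucket the builders may need.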
|
||
You can use :ref:`environment variables <settings:AWS configuration>` to set the credentials for AWS. Make sure to set the value of ``RTD_S3_PROVIDER`` to ``AWS``. | ||
|
||
.. note:: | ||
|
||
If you are part of the development team, you should be able to use the credentials from the ``storage-dev`` user, | ||
which is already configured to make use of STS, and the ARN from the ``RTDSTSAssumeRoleDev`` role. | ||
|
||
.. note:: | ||
|
||
You should use AWS only when you are testing the AWS integration; | ||
use the default MinIO provider for local development. | ||
Otherwise, files may be overwritten if multiple developers are using the same credentials. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -167,6 +167,22 @@ providers using the following environment variables: | |
.. envvar:: RTD_SOCIALACCOUNT_PROVIDERS_GOOGLE_CLIENT_ID | ||
.. envvar:: RTD_SOCIALACCOUNT_PROVIDERS_GOOGLE_SECRET | ||
|
||
AWS configuration | ||
~~~~~~~~~~~~~~~~~ | ||
|
||
The following variables can be used to configure AWS in your local environment. | ||
|
||
Useful for testing :doc:`temporary credentials </aws-temporary-credentials>`. | ||
|
||
.. envvar:: RTD_S3_PROVIDER | ||
.. envvar:: RTD_AWS_ACCESS_KEY_ID | ||
.. envvar:: RTD_AWS_SECRET_ACCESS_KEY | ||
.. envvar:: RTD_AWS_STS_ASSUME_ROLE_ARN | ||
Review comments on lines +176 to +179:
- This changes the current semantic of the We have been using
- All environment variables that are passed to our application are prefixed with RTD, settings related to the platform are prefixed with RTD (settings.py). |
||
.. envvar:: RTD_S3_MEDIA_STORAGE_BUCKET | ||
.. envvar:: RTD_S3_BUILD_COMMANDS_STORAGE_BUCKET | ||
.. envvar:: RTD_S3_BUILD_TOOLS_STORAGE_BUCKET | ||
.. envvar:: RTD_S3_STATIC_STORAGE_BUCKET | ||
.. envvar:: RTD_AWS_S3_REGION_NAME | ||
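A quick pre-flight check for a local environment could look like the sketch below. The variable names come from the list above; treating exactly these four as required for STS, and the helper name `check_aws_env`, are assumptions made for illustration only.

```python
import os

# Names taken from the envvar list above. Treating these four as the
# required set for STS testing is an assumption of this sketch.
REQUIRED_FOR_STS = (
    "RTD_S3_PROVIDER",
    "RTD_AWS_ACCESS_KEY_ID",
    "RTD_AWS_SECRET_ACCESS_KEY",
    "RTD_AWS_STS_ASSUME_ROLE_ARN",
)


def check_aws_env(env=None) -> list[str]:
    """Return a list of human-readable problems found in the environment."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_FOR_STS if not env.get(name)]
    provider = env.get("RTD_S3_PROVIDER")
    if provider and provider != "AWS":
        problems.append(f"RTD_S3_PROVIDER is {provider!r}, expected 'AWS'")
    return problems


if __name__ == "__main__":
    for problem in check_aws_env():
        print(problem)
```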
|
||
Stripe secrets | ||
~~~~~~~~~~~~~~ | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,233 @@ | ||
""" | ||
Module to interact with AWS STS (Security Token Service) to assume a role and get temporary scoped credentials. | ||
|
||
This is mainly used to generate temporary credentials to interact with S3 buckets from the builders. | ||
|
||
In order to make use of STS, we need: | ||
|
||
- A role in IAM with a trusted entity type set to the AWS account that is going to be used to generate the temporary credentials. | ||
- A policy that allows access to all S3 buckets and paths that are going to be used, | ||
which should be attached to the role. | ||
- The permissions of the temporary credentials are the result of the intersection of the role policy and the inline policy that is passed to the AssumeRole API. | ||
This means that the inline policy can be used to limit the permissions of the temporary credentials, but not to expand them. | ||
|
||
See: | ||
|
||
- https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html | ||
- https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_sts-comparison.html | ||
- https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_control-access_assumerole.html | ||
- https://docs.readthedocs.com/dev/latest/aws-temporary-credentials.html | ||
""" | ||
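The intersection rule described in the module docstring can be pictured with plain sets. This is a deliberately simplified model that only looks at action names: real IAM evaluation also considers resources, conditions, and explicit denies. The function name `effective_actions` is hypothetical.

```python
def effective_actions(role_actions: set[str], session_actions: set[str]) -> set[str]:
    """Actions allowed by both the role policy and the inline session policy."""
    return role_actions & session_actions


role = {"s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"}
# The session policy asks for s3:CreateBucket, which the role does not grant.
session = {"s3:GetObject", "s3:ListBucket", "s3:CreateBucket"}

print(sorted(effective_actions(role, session)))
# → ['s3:GetObject', 's3:ListBucket']
```

The extra `s3:CreateBucket` in the session policy is silently dropped, which matches the statement that an inline policy can narrow permissions but not expand them.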
|
||
import json | ||
from dataclasses import dataclass | ||
|
||
import boto3 | ||
import structlog | ||
from django.conf import settings | ||
|
||
|
||
log = structlog.get_logger(__name__) | ||
|
||
|
||
class AWSTemporaryCredentialsError(Exception): | ||
"""Exception raised when there is an error getting AWS S3 credentials.""" | ||
|
||
|
||
@dataclass | ||
class AWSTemporaryCredentials: | ||
"""Dataclass to hold AWS temporary credentials.""" | ||
|
||
access_key_id: str | ||
secret_access_key: str | ||
session_token: str | None | ||
|
||
|
||
@dataclass | ||
class AWSS3TemporaryCredentials(AWSTemporaryCredentials): | ||
"""Subclass of AWSTemporaryCredentials to include S3 specific fields.""" | ||
|
||
bucket_name: str | ||
region_name: str | ||
|
||
|
||
def get_sts_client(): | ||
return boto3.client( | ||
"sts", | ||
aws_access_key_id=settings.AWS_ACCESS_KEY_ID, | ||
aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY, | ||
# TODO: should this be its own setting? | ||
region_name=settings.AWS_S3_REGION_NAME, | ||
) | ||
Review thread on the ``region_name`` argument above:
- Is this TODO still valid? It seems it's its own setting already.
- This setting is linked to storage, not the region name of the account used to interact with STS. Not sure if it matters anyway...
- I could try testing without region and see 🤷♂️
- Looks like it's recommended to explicitly set a region. In our case all the resources are from the same account that is in the same region, so it's valid to share the same region with the S3 setting.
- We could probably have an AWS_REGION_NAME setting, and AWS_S3_REGION_NAME defaults to that one, but could also just use AWS_S3_REGION_NAME everywhere.
|
||
|
||
def _get_scoped_credentials(*, session_name, policy, duration) -> AWSTemporaryCredentials: | ||
""" | ||
:param session_name: An identifier to attach to the generated credentials, useful to identify who requested them. | ||
AWS limits the session name to 64 characters, so if the session_name is too long, it will be truncated. | ||
:param policy: The inline policy document used to limit the permissions of the temporary credentials. | ||
:param duration: The duration of the credentials in seconds. | ||
|
||
Note that the minimum duration time is 15 minutes and the maximum is given by the role (defaults to 1 hour). | ||
|
||
.. note:: | ||
|
||
If USING_AWS is set to False, this function will return | ||
the values of the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY settings. | ||
Useful for local development where we don't have a service like AWS STS. | ||
""" | ||
if not settings.USING_AWS: | ||
return AWSTemporaryCredentials( | ||
access_key_id=settings.AWS_ACCESS_KEY_ID, | ||
secret_access_key=settings.AWS_SECRET_ACCESS_KEY, | ||
# A session token is not needed for the default credentials. | ||
session_token=None, | ||
) | ||
|
||
# Limit to 64 characters, as per AWS limitations. | ||
session_name = session_name[:64] | ||
try: | ||
sts_client = get_sts_client() | ||
response = sts_client.assume_role( | ||
RoleArn=settings.AWS_STS_ASSUME_ROLE_ARN, | ||
RoleSessionName=session_name, | ||
Policy=json.dumps(policy), | ||
DurationSeconds=duration, | ||
) | ||
except Exception: | ||
log.exception( | ||
"Error while assuming role to generate temporary credentials", | ||
session_name=session_name, | ||
policy=policy, | ||
duration=duration, | ||
) | ||
raise AWSTemporaryCredentialsError | ||
|
||
credentials = response["Credentials"] | ||
return AWSTemporaryCredentials( | ||
access_key_id=credentials["AccessKeyId"], | ||
secret_access_key=credentials["SecretAccessKey"], | ||
session_token=credentials["SessionToken"], | ||
) | ||
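The limits mentioned in the docstring above (64-character session name, 15-minute minimum duration) could be checked before calling STS. The sketch below is not part of this PR: `normalize_session_name` and `check_duration` are hypothetical helpers, replacing unsupported characters is an extra step the PR does not perform, and the one-hour default maximum simply mirrors the role default quoted in the docstring.

```python
import re

MIN_DURATION = 15 * 60  # Minimum duration in seconds, per the docstring above.
SESSION_NAME_MAX_LENGTH = 64


def normalize_session_name(session_name: str) -> str:
    """Replace unsupported characters and truncate to 64 characters.

    AWS documents RoleSessionName as letters, digits, and ``_+=,.@-``.
    Replacing anything else with "-" is an assumption of this sketch;
    the PR itself only truncates.
    """
    cleaned = re.sub(r"[^\w+=,.@-]", "-", session_name, flags=re.ASCII)
    return cleaned[:SESSION_NAME_MAX_LENGTH]


def check_duration(duration: int, max_duration: int = 60 * 60) -> int:
    """Return ``duration`` if it is within the allowed range, else raise."""
    if not MIN_DURATION <= duration <= max_duration:
        raise ValueError(
            f"duration must be between {MIN_DURATION} and {max_duration} seconds, got {duration}"
        )
    return duration


if __name__ == "__main__":
    print(normalize_session_name("rtd-123-my project-latest"))
    print(check_duration(60 * 15))
```

Failing early with a `ValueError` would give a clearer message than the generic error raised when the `assume_role` call is rejected.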
|
||
|
||
def get_s3_build_media_scoped_credentials( | ||
*, | ||
build, | ||
duration=60 * 15, | ||
) -> AWSS3TemporaryCredentials: | ||
""" | ||
Get temporary credentials with read/write access to the build media bucket. | ||
|
||
The credentials are scoped to the paths that the build needs to access. | ||
|
||
:param build: The build to get the credentials for. | ||
:param duration: The duration of the credentials in seconds. Default is 15 minutes. | ||
Note that the minimum duration time is 15 minutes and the maximum is given by the role (defaults to 1 hour). | ||
""" | ||
project = build.project | ||
version = build.version | ||
bucket_arn = f"arn:aws:s3:::{settings.S3_MEDIA_STORAGE_BUCKET}" | ||
storage_paths = version.get_storage_paths() | ||
# Generate the list of allowed prefix resources. | ||
# The resulting prefixes look like: | ||
# - html/project/latest/* | ||
# - pdf/project/latest/* | ||
allowed_prefixes = [f"{storage_path}/*" for storage_path in storage_paths] | ||
|
||
# Generate the list of allowed object resources in ARN format. | ||
# The resulting ARNs look like: | ||
# arn:aws:s3:::readthedocs-media/html/project/latest/* | ||
# arn:aws:s3:::readthedocs-media/pdf/project/latest/* | ||
allowed_objects_arn = [f"{bucket_arn}/{prefix}" for prefix in allowed_prefixes] | ||
|
||
# Inline policy document to limit the permissions of the temporary credentials. | ||
policy = { | ||
"Version": "2012-10-17", | ||
"Statement": [ | ||
{ | ||
"Effect": "Allow", | ||
"Action": [ | ||
"s3:GetObject", | ||
"s3:PutObject", | ||
"s3:DeleteObject", | ||
], | ||
"Resource": allowed_objects_arn, | ||
}, | ||
# In order to list the objects in a path, we need to allow the ListBucket action. | ||
# But since that action is not scoped to a path, we need to limit it using a condition. | ||
{ | ||
"Effect": "Allow", | ||
"Action": ["s3:ListBucket"], | ||
"Resource": [ | ||
bucket_arn, | ||
], | ||
"Condition": { | ||
"StringLike": { | ||
"s3:prefix": allowed_prefixes, | ||
} | ||
}, | ||
}, | ||
], | ||
} | ||
|
||
session_name = f"rtd-{build.id}-{project.slug}-{version.slug}" | ||
credentials = _get_scoped_credentials( | ||
session_name=session_name, | ||
policy=policy, | ||
duration=duration, | ||
) | ||
return AWSS3TemporaryCredentials( | ||
access_key_id=credentials.access_key_id, | ||
secret_access_key=credentials.secret_access_key, | ||
session_token=credentials.session_token, | ||
region_name=settings.AWS_S3_REGION_NAME, | ||
bucket_name=settings.S3_MEDIA_STORAGE_BUCKET, | ||
) | ||
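To see what the inline policy built above looks like for concrete inputs, the same construction can be pulled into a pure function and exercised without AWS. This is a sketch under stated assumptions: `build_media_policy` and `is_object_allowed` are hypothetical names, the storage paths are sample values, and `fnmatchcase` only approximates how IAM matches `*` in resource ARNs.

```python
import json
from fnmatch import fnmatchcase


def build_media_policy(bucket: str, storage_paths: list[str]) -> dict:
    """Build the inline policy for the given bucket and storage paths."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    allowed_prefixes = [f"{path}/*" for path in storage_paths]
    allowed_objects_arn = [f"{bucket_arn}/{prefix}" for prefix in allowed_prefixes]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": allowed_objects_arn,
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
                "Condition": {"StringLike": {"s3:prefix": allowed_prefixes}},
            },
        ],
    }


def is_object_allowed(policy: dict, bucket: str, key: str) -> bool:
    """Approximate check: does the object ARN match any allowed resource?"""
    object_arn = f"arn:aws:s3:::{bucket}/{key}"
    patterns = policy["Statement"][0]["Resource"]
    return any(fnmatchcase(object_arn, pattern) for pattern in patterns)


policy = build_media_policy(
    "readthedocs-media", ["html/project/latest", "pdf/project/latest"]
)

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
    # AssumeRole documents a 2,048-character plaintext limit for session policies.
    print(len(json.dumps(policy)) <= 2048)
```

Because the policy grows with the number of storage paths, a length check like the one at the end may be worth keeping in mind for versions that have many paths.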
|
||
|
||
def get_s3_build_tools_scoped_credentials( | ||
*, | ||
build, | ||
duration=60 * 15, | ||
) -> AWSS3TemporaryCredentials: | ||
""" | ||
Get temporary credentials with read-only access to the build-tools bucket. | ||
|
||
:param build: The build to get the credentials for. | ||
:param duration: The duration of the credentials in seconds. Default is 15 minutes. | ||
|
||
Note that the minimum duration time is 15 minutes and the maximum is given by the role (defaults to 1 hour). | ||
""" | ||
project = build.project | ||
version = build.version | ||
bucket = settings.S3_BUILD_TOOLS_STORAGE_BUCKET | ||
bucket_arn = f"arn:aws:s3:::{bucket}" | ||
|
||
# Inline policy to limit the permissions of the temporary credentials. | ||
# The build-tools bucket is publicly readable, so we don't need to limit the permissions to a specific path. | ||
policy = { | ||
"Version": "2012-10-17", | ||
"Statement": [ | ||
{ | ||
"Effect": "Allow", | ||
"Action": [ | ||
"s3:GetObject", | ||
"s3:ListBucket", | ||
], | ||
"Resource": [ | ||
bucket_arn, | ||
f"{bucket_arn}/*", | ||
], | ||
}, | ||
], | ||
} | ||
session_name = f"rtd-{build.id}-{project.slug}-{version.slug}" | ||
credentials = _get_scoped_credentials( | ||
session_name=session_name, | ||
policy=policy, | ||
duration=duration, | ||
) | ||
return AWSS3TemporaryCredentials( | ||
access_key_id=credentials.access_key_id, | ||
secret_access_key=credentials.secret_access_key, | ||
session_token=credentials.session_token, | ||
region_name=settings.AWS_S3_REGION_NAME, | ||
bucket_name=bucket, | ||
) |