Skip to content

Build: support cloning private repos with token #12115

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 14, 2025

Conversation

stsewd
Copy link
Member

@stsewd stsewd commented Apr 15, 2025

First I wanted to pass the env var just in the clone step, but we don't allow passing additional env vars once the environment is created, so it's available in the whole "clone" environment. The access token we create is read-only, and should be scoped to just one project as well (waiting on PyGithub/PyGithub#3287).

Once the clone is done, the token is stored in the .git/config file, so that token isn't always kept secret from the rest of the build like ssh keys, but since the token is read-only and scoped to the current project, and temporary (1 hour). It should be fine. Additionally, the token is only created for private repos, meaning that only people with explicit access to the repo may be able to extract the token, but again, since they already have access to the repo, there is no additional permissions the token is granting to the user (will document this in #12114).

@stsewd stsewd marked this pull request as ready for review April 23, 2025 22:54
@stsewd stsewd requested a review from a team as a code owner April 23, 2025 22:54
@stsewd stsewd requested a review from ericholscher April 23, 2025 22:54
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the clone is done, the token is stored in the .git/config file, so that token isn't always kept secret from the rest of the build like ssh keys, but since the token is read-only and scoped to the current project, and temporary (1 hour). It should be fine.

Can we disable this behavior? Do we actually need it stored locally for some reason, or is this an anti-feature that we don't want?

@stsewd
Copy link
Member Author

stsewd commented May 13, 2025

Can we disable this behavior? Do we actually need it stored locally for some reason, or is this an anti-feature that we don't want?

Storing the token is done by Git in order for git operations to continue working after the clone (fetch, pull, etc). This is kind of similar when using ssh keys, after clone, you can still interact with Git. But there are a couple of differences:

  • When using SSH keys, we don't expose the private key, but the agent keeps running in the "clone" environment, outside that the agent isn't exposed anymore. This allows for operations like unshallow a repo.
  • When using a token, the token is stored for the whole build. Is this a security issue? No, as long as the token is very scoped, the users executing the build already have access to the files from the repo, so there are no additional permissions granted, also, the token expires after an hour, so there is no permanent access granted.

Of course, we can delete the token after the "clone" environment is done, so it behaves similar to the SSH agent, but that doesn't offer any additional protection, as the token is already exposed in the post_checkout step. But also, looks like the token grants access to some metadata by default as well, like the repo description and topics.

Comment on lines +474 to +488
# TODO: Use self.gh_app_client.get_access_token instead,
# once https://github.com/PyGithub/PyGithub/pull/3287 is merged.
_, response = self.gh_app_client.requester.requestJsonAndCheck(
"POST",
f"/app/installations/{self.installation.installation_id}/access_tokens",
headers=self.gh_app_client._get_headers(),
input={
"repository_ids": [int(project.remote_repository.remote_id)],
"permissions": {
"contents": "read",
},
},
)
token = response["token"]
return f"x-access-token:{token}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the PR from pygithub hasn't been merged, so I went ahead and used the internal request object to make the raw request to get the token scoped a single repo.

@stsewd stsewd merged commit abf58ae into main May 14, 2025
5 checks passed
@stsewd stsewd deleted the support-clonning-private-repos-with-token branch May 14, 2025 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants