Skip to content

Commit a012077

Browse files
authored
Add GitHub App service (#12072)
This is the big one :D - After this PR is merged, we will start keeping in sync installations and remote repositories, but they won't be exposed to users yet (we can manually set up our projects to use a remote repo to test things out, private repos are not supported with this PR). - All operations from the old integration are supported, except for cloning private repos (will open another PR for that). - PyGitHub is an okay solution, it abstracts dealing with the JWT and has types for everything. But it can also be inefficient in some cases (like to interact with a commit, you need to retrieve the commit from the API first), but I think that's fine for now, we can always fallback to do this manually, and hopefully shouldn't be that much work, or the other option is keep using PyGitHub and use a raw call (requester.getJsonResponse("url")). - Webhooks always recover nicely in case the installation wasn't tracked, so if one operation fails for whatever reason (a bug, github is down, our app is down), we can recover the state in the next operation. Users can also manually hit "sync". - The webhook is exposed from the `/webhook/githubapp/` URL, we can change that if needed. - We need to add the secrets and IDs on the ops repos. - Could I have divided this into smaller PRs? Maybe, but that may require more time removing the tests that are not needed... Extracted from #11942
1 parent c551a26 commit a012077

25 files changed

+3155
-17
lines changed

docs/dev/github-app.rst

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
GitHub App
2+
==========
3+
4+
Our GitHub App integration consists of a GitHub App (one for each platform, readthedocs.org and readthedocs.com),
5+
which can be installed on a user's account or organization.
6+
7+
After installing the GitHub App, users can grant acccess to all repositories or select specific repositories,
8+
this allows Read the Docs to access the repositories and perform actions on them, such as reporting build statuses,
9+
and subscribe to events like push and pull request events.
10+
11+
Unlike our other Git integrations, this doesn't rely on user's OAuth2 tokens to interact with the repositories, instead it makes use of the installation.
12+
As long as the installation is active and has access to the repositories, Read the Docs can perform actions on them.
13+
14+
Installations can be `suspended <https://docs.github.com/en/apps/maintaining-github-apps/suspending-a-github-app-installation>`__,
15+
which means that the GitHub App is still installed but cannot access any resources, this action is reversible.
16+
Installations can also be uninstalled, which means that the GitHub App is removed from the account or organization,
17+
and cannot access any resources, this action is irreversible.
18+
19+
You can read more about GitHub Apps in `GitHub's documentation <https://docs.github.com/en/apps/overview>`__.
20+
21+
Modeling
22+
--------
23+
24+
Since we no longer make use of user's OAuth2 tokens, we need to keep track of all the installations of our GitHub App in the database (``GitHubAppInstallation`` model).
25+
Each remote repository has a foreign key to the installation that has access to it,
26+
so we can use that installation to perform actions on the repository.
27+
28+
Keeping data in sync
29+
--------------------
30+
31+
Once a GitHub app is installed, GitHub will send webhook events to our application (``readthedocs.oauth.views``) related to the installation
32+
(even when the installation is uninstalled or revoked), repositories, and organizations.
33+
This allows us to always have the ``RemoteRepository`` and ``RemoteOrganization`` models and its permissions in sync with GitHub.
34+
35+
The only use case for using the user's OAuth2 token is to get all the installations the user has access to,
36+
this helps us to keep permissions in sync with GitHub.
37+
38+
Security
39+
--------
40+
41+
- All webhooks are signed with a secret, which we verify before processing each event.
42+
- Since we make use of the installation to perform actions on the repositories instead of the user's OAuth2 token,
43+
we make sure that only users with admin permissions on the repository can link the repository to a Read the Docs project.
44+
- Once we lose access to a repository (e.g. the installation is uninstalled or revoked, or the project was deselected from the installation),
45+
we remove the remote repository from the database, as we don't want to keep the relation bettween the project and the repository.
46+
This is to prevent connecting the repository to the project again without the user's consent if they grant access to the repository again.

docs/dev/index.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,5 +29,6 @@ or taking the open source Read the Docs codebase for your own custom installatio
2929
server-side-search
3030
search-integration
3131
subscriptions
32+
github-app
3233
settings
3334
tests

docs/dev/install.rst

Lines changed: 23 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -262,10 +262,32 @@ For others, the webhook will simply fail to connect when there are new commits t
262262

263263
* Configure the applications on GitHub, Bitbucket, and GitLab.
264264
For each of these, the callback URI is ``http://devthedocs.org/accounts/<provider>/login/callback/``
265-
where ``<provider>`` is one of ``github``, ``gitlab``, or ``bitbucket_oauth2``.
265+
where ``<provider>`` is one of ``github``, ``githubapp``, ``gitlab``, or ``bitbucket_oauth2``.
266266
When setup, you will be given a "Client ID" (also called an "Application ID" or just "Key") and a "Secret".
267267
* Take the "Client ID" and "Secret" for each service and set them as :ref:`environment variables <settings:Allauth secrets>`.
268268

269+
Configuring GitHub App
270+
~~~~~~~~~~~~~~~~~~~~~~
271+
272+
- Create a new GitHub app from https://github.com/settings/apps/new.
273+
- Callback URL should be ``http://devthedocs.org/accounts/githubapp/login/callback/``.
274+
- Keep marked "Expire user authorization tokens"
275+
- Activate the webhook, and set the URL to one provided by a service like `Webhook.site <https://docs.webhook.site/cli.html>`__ to forward all incoming webhooks to your local development instance.
276+
You should forward all events to ``http://devthedocs.org/webhook/githubapp/``.
277+
- In permissions, select the following:
278+
279+
- Repository permissions: Commit statuses (read and write, so we can create commit statuses),
280+
Contents (read only, so we can clone repos with a token),
281+
Metadata (read only, so we read the repo collaborators),
282+
Pull requests (read and write, so we can post a comment on PRs in the future).
283+
- Organization permissions: Members (read only so we can read the organization members).
284+
- Account permissions: Email addresses (read only, so allauth can fetch all verified emails).
285+
286+
- Subscribe to the following events: Installation target, Member, Organization, Membership, Pull request, Push, and Repository.
287+
- Copy the "Client ID" and "Client Secret" and set them as :ref:`environment variables <settings:Allauth secrets>`.
288+
- Generate a webhook secret and a private key from the GitHub App settings,
289+
and set them as :ref:`environment variables <settings:GitHub App secrets>`.
290+
269291
Troubleshooting
270292
---------------
271293

docs/dev/settings.rst

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -167,6 +167,16 @@ providers using the following environment variables:
167167
.. envvar:: RTD_SOCIALACCOUNT_PROVIDERS_GOOGLE_CLIENT_ID
168168
.. envvar:: RTD_SOCIALACCOUNT_PROVIDERS_GOOGLE_SECRET
169169

170+
GitHub App
171+
~~~~~~~~~~
172+
173+
You can use the following environment variables to set the settings used by the GitHub App:
174+
175+
.. envvar:: RTD_GITHUB_APP_ID
176+
.. envvar:: RTD_GITHUB_APP_NAME
177+
.. envvar:: RTD_GITHUB_APP_PRIVATE_KEY
178+
.. envvar:: RTD_GITHUB_APP_WEBHOOK_SECRET
179+
170180
Stripe secrets
171181
~~~~~~~~~~~~~~
172182

readthedocs/oauth/clients.py

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
from datetime import datetime
22

33
import structlog
4+
from django.conf import settings
45
from django.utils import timezone
6+
from github import Auth
7+
from github import GithubIntegration
58
from requests_oauthlib import OAuth2Session
69

710

@@ -70,3 +73,15 @@ def get_oauth2_client(account):
7073
token_updater=_get_token_updater(token),
7174
)
7275
return session
76+
77+
78+
def get_gh_app_client() -> GithubIntegration:
79+
"""Return a client authenticated as the GitHub App to interact with the API."""
80+
app_auth = Auth.AppAuth(
81+
app_id=settings.GITHUB_APP_CLIENT_ID,
82+
private_key=settings.GITHUB_APP_PRIVATE_KEY,
83+
# 10 minutes is the maximum allowed by GitHub.
84+
# PyGithub will handle the token expiration and renew it automatically.
85+
jwt_expiry=60 * 10,
86+
)
87+
return GithubIntegration(auth=app_auth)

readthedocs/oauth/models.py

Lines changed: 83 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
11
"""OAuth service models."""
22

3+
from functools import cached_property
4+
35
import structlog
46
from allauth.socialaccount.models import SocialAccount
57
from django.contrib.auth.models import User
@@ -12,6 +14,7 @@
1214
from readthedocs.projects.constants import REPO_CHOICES
1315
from readthedocs.projects.models import Project
1416

17+
from .constants import GITHUB_APP
1518
from .constants import VCS_PROVIDER_CHOICES
1619
from .querysets import RemoteOrganizationQuerySet
1720
from .querysets import RemoteRepositoryQuerySet
@@ -84,6 +87,86 @@ class GitHubAppInstallation(TimeStampedModel):
8487
class Meta(TimeStampedModel.Meta):
8588
verbose_name = _("GitHub app installation")
8689

90+
@cached_property
91+
def service(self):
92+
"""Return the service for this installation."""
93+
from readthedocs.oauth.services import GitHubAppService
94+
95+
return GitHubAppService(self)
96+
97+
def delete(self, *args, **kwargs):
98+
"""Override delete method to remove orphaned remote organizations."""
99+
self.delete_repositories()
100+
return super().delete(*args, **kwargs)
101+
102+
def delete_repositories(self, repository_ids: list[int] | None = None):
103+
"""
104+
Delete repositories linked to this installation.
105+
When an installation is deleted, we delete all its remote repositories
106+
and relations, users will need to manually link the projects to each repository again.
107+
We also remove organizations that don't have any repositories after removing the repositories.
108+
:param repository_ids: List of repository ids (remote ID) to delete.
109+
If None, all repositories will be considered for deletion.
110+
"""
111+
# repository_ids is optional (None, which means all repositories),
112+
# but if it's an empty list, we don't want to delete anything.
113+
if repository_ids is not None and not repository_ids:
114+
log.info("No remote repositories to delete")
115+
return
116+
117+
remote_organizations = RemoteOrganization.objects.filter(
118+
repositories__github_app_installation=self,
119+
vcs_provider=GITHUB_APP,
120+
)
121+
remote_repositories = self.repositories.filter(vcs_provider=GITHUB_APP)
122+
if repository_ids:
123+
remote_organizations = remote_organizations.filter(
124+
repositories__remote_id__in=repository_ids
125+
)
126+
remote_repositories = remote_repositories.filter(remote_id__in=repository_ids)
127+
128+
# Fetch all IDs before deleting the repositories, so we can filter the organizations later.
129+
remote_organizations_ids = list(remote_organizations.values_list("id", flat=True))
130+
131+
count, deleted = remote_repositories.delete()
132+
log.info(
133+
"Deleted remote repositories",
134+
count=count,
135+
deleted=deleted,
136+
installation_id=self.installation_id,
137+
target_id=self.target_id,
138+
target_type=self.target_type,
139+
)
140+
141+
count, deleted = RemoteOrganization.objects.filter(
142+
id__in=remote_organizations_ids,
143+
repositories=None,
144+
).delete()
145+
log.info(
146+
"Deleted orphaned remote organizations",
147+
count=count,
148+
deleted=deleted,
149+
installation_id=self.installation_id,
150+
target_id=self.target_id,
151+
target_type=self.target_type,
152+
)
153+
154+
def delete_organization(self, organization_id: int):
155+
"""Delete a remote organization and all its remote repositories and relations from the database."""
156+
count, deleted = RemoteOrganization.objects.filter(
157+
remote_id=str(organization_id),
158+
vcs_provider=GITHUB_APP,
159+
).delete()
160+
log.info(
161+
"Deleted remote organization",
162+
count=count,
163+
deleted=deleted,
164+
organization_id=organization_id,
165+
installation_id=self.installation_id,
166+
target_id=self.target_id,
167+
target_type=self.target_type,
168+
)
169+
87170

88171
class RemoteOrganization(TimeStampedModel):
89172
"""

readthedocs/oauth/querysets.py

Lines changed: 12 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@
33
from django.db import models
44

55
from readthedocs.core.querysets import NoReprQuerySet
6+
from readthedocs.oauth.constants import GITHUB_APP
67

78

89
class RelatedUserQuerySet(NoReprQuerySet, models.QuerySet):
@@ -12,7 +13,11 @@ def api(self, user=None):
1213
"""Return objects for user."""
1314
if not user.is_authenticated:
1415
return self.none()
15-
return self.filter(users=user)
16+
queryset = self.filter(users=user)
17+
# TODO: Once we are migrated into GitHub App we should include these repositories/organizations.
18+
# Exclude repositories/organizations from the GitHub App for now to avoid duplicated entries.
19+
queryset = queryset.exclude(vcs_provider=GITHUB_APP)
20+
return queryset
1621

1722
def api_v2(self, *args, **kwargs):
1823
# API v2 is the same as API v3 for .org, but it's
@@ -28,10 +33,14 @@ def for_project_linking(self, user):
2833
Repositories can be linked to a project only if the user has admin access
2934
to the repository on the VCS service.
3035
"""
31-
return self.filter(
36+
queryset = self.filter(
3237
remote_repository_relations__user=user,
3338
remote_repository_relations__admin=True,
34-
).distinct()
39+
)
40+
# TODO: Once we are migrated into GitHub App we should include these repositories/organizations.
41+
# Exclude repositories/organizations from the GitHub App for now to avoid duplicated entries.
42+
queryset = queryset.exclude(vcs_provider=GITHUB_APP)
43+
return queryset.distinct()
3544

3645

3746
class RemoteOrganizationQuerySet(RelatedUserQuerySet):

readthedocs/oauth/services/__init__.py

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,16 @@
44
from readthedocs.oauth.services import bitbucket
55
from readthedocs.oauth.services import github
66
from readthedocs.oauth.services import gitlab
7+
from readthedocs.oauth.services.githubapp import GitHubAppService
8+
9+
10+
__all__ = [
11+
"GitHubService",
12+
"BitbucketService",
13+
"GitLabService",
14+
"GitHubAppService",
15+
"registry",
16+
]
717

818

919
class GitHubService(SettingsOverrideObject):
@@ -21,4 +31,4 @@ class GitLabService(SettingsOverrideObject):
2131
_override_setting = "OAUTH_GITLAB_SERVICE"
2232

2333

24-
registry = [GitHubService, BitbucketService, GitLabService]
34+
registry = [GitHubService, BitbucketService, GitLabService, GitHubAppService]

readthedocs/oauth/services/base.py

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -34,7 +34,7 @@ class Service:
3434
vcs_provider_slug: str
3535
allauth_provider = type[OAuth2Provider]
3636

37-
url_pattern: re.Pattern | None
37+
url_pattern: re.Pattern | None = None
3838
default_user_avatar_url = settings.OAUTH_AVATAR_USER_DEFAULT_URL
3939
default_org_avatar_url = settings.OAUTH_AVATAR_ORG_DEFAULT_URL
4040
supports_build_status = False
@@ -106,6 +106,10 @@ def send_build_status(self, *, build, commit, status):
106106
"""
107107
raise NotImplementedError
108108

109+
def get_clone_token(self, project):
110+
"""Get a token used for cloning the repository."""
111+
raise NotImplementedError
112+
109113
@classmethod
110114
def is_project_service(cls, project):
111115
"""
@@ -326,3 +330,7 @@ def sync_repositories(self):
326330

327331
def sync_organizations(self):
328332
raise NotImplementedError
333+
334+
def get_clone_token(self, project):
335+
"""User services make use of SSH keys only for cloning."""
336+
return None

0 commit comments

Comments
 (0)