-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Git backend: Improve heuristic checking if GITLAB_MR_PULL_PATTERN is needed #10397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Git backend: Improve heuristic checking if GITLAB_MR_PULL_PATTERN is needed #10397
Conversation
be2922b
to
f230881
Compare
Since design decision are needed to have a complete integration of self-hosted GitLab instances, the scope of this changes is purposefully kept small. Unclear how to address the remaining
|
…needed This commit is an incremental improvement toward having full support for self-hosted GitLab instance. It ensures that the expected sources are fetched when a build is triggered after a merge request event is processed in the context of a GitLab integration webhook associated with a self-hosted GitLab instance.
f230881
to
0d4e6cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have the full context of this PR yet, but I just skimmed it pretty quick and made some questions. It doesn't seem like a lot of code, but I just need to find the time to prioritize it.
On the other hand, it would be great to have some tests cases for this.
# ImportError: cannot import name 'Project' from partially initialized module | ||
# 'readthedocs.projects.models' (most likely due to a circular import) | ||
_gitlab_webhook = "gitlab_webhook" # Integration.GITLAB_WEBHOOK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's a circular import issue, we could import this constant from inside this static method to solve this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I will update accordingly
if ( | ||
project.integrations.count() == 1 | ||
and project.integrations.first().integration_type == _gitlab_webhook | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure to understand the context behind checking for .count() == 1
here? Isn't there another way to be 100% sure it was triggered by a merge request? Can't we check the body of the original request to decide this or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have access to this webhook data during the build process?
The current behavior will guess the provider type by pattern matching the repo URL with supported oauth integrations.
It does seem a bit strange to have to look up which webhook integrations exist in order to resolve the Git provider's type in each and every build. But I'm not sure if there's currently much else available for the build process.
It could seem like a good approach to store the type of Git provider either when the webhook is added or when it's called. Or have a manually configured Project setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review 🙏
To provide some background, the current approach was added to minimize the amount of changes, avoid breaking existing workflow and slightly improve the support for self-hosted GitLab instance.
What is needed ?
Add support for "triggering context"
To properly support this, we would probably need to support querying details about the triggering context.
This information would then then be used to:
- customize constants like
GITLAB_URL
,GITLAB_COMMIT_URL
,GITLAB_MERGE_REQUEST_COMMIT_URL
andGITLAB_MERGE_REQUEST_URL
currently set inprojects/constants.py
1 - query the source of the trigger in
vcs_support/backends/git.py
2
Support for updating build status
To support updating status on instance different from the official GitLab.com
and GitHub.com
, I think we following could be done:
- the
Integration
model could be updated to have an additional field calledapi_token
3 to be used in inoauth.services.base.(create_session|get_session)
4 itself called fromoauth.services.gitlab.send_build_status
5 - the integration details UI could then be updated to include a new "API Token" field
Footnotes
-
https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/projects/constants.py#L368-L383 ↩
-
https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/vcs_support/backends/git.py#L153-L172 ↩
-
https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/integrations/models.py#L264-L327 ↩
-
https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/oauth/services/base.py#L77-L116 ↩
-
https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/oauth/services/gitlab.py#L522-L587 ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I 'm not sure to understand the context behind checking for .count() == 1 here?
Since it seems there is no way to have details about the triggering context1, the proposed patch is done by assuming that:
- if there is only one integration
- and it the integration version type is
EXTERNAL
- and if the integration type is
Integration.GITLAB_WEBHOOK
- and if it is NOT coming from the
gitlab.com
provider
In that specific case, it is then "safe" to consider it was triggered by a merge request coming from a GitLab hosted instance.
Hence the function _is_trigger_integration_gitlab
in this PR
Isn't there another way to be 100% sure it was triggered by a merge request?
Given the current implementation, I didn't find a way.
Can't we check the body of the original request to decide this or similar?
This could indeed be done where the webhook is handled2 by looking for X-Gitlab-Instance
(e.g X-Gitlab-Instance: https://gitlab.kitware.com
).
Possible patch to readthedocs/api/v2/views/integrations.py for adding get_external_provider_url()
diff --git a/readthedocs/api/v2/views/integrations.py b/readthedocs/api/v2/views/integrations.py
index 3ff940dd9..c610b7e71 100644
--- a/readthedocs/api/v2/views/integrations.py
+++ b/readthedocs/api/v2/views/integrations.py
@@ -50,6 +50,7 @@ GITLAB_MERGE_REQUEST_MERGE = "merge"
GITLAB_MERGE_REQUEST_OPEN = "open"
GITLAB_MERGE_REQUEST_REOPEN = "reopen"
GITLAB_MERGE_REQUEST_UPDATE = "update"
+GITLAB_INSTANCE_HEADER= "HTTP_X_GITLAB_INSTANCE"
GITLAB_TOKEN_HEADER = "HTTP_X_GITLAB_TOKEN"
GITLAB_PUSH = "push"
GITLAB_NULL_HASH = "0" * 40
@@ -138,6 +139,10 @@ class WebhookMixin:
"""Handle webhook payload."""
raise NotImplementedError
+ def get_external_provider_url(self):
+ """Get External provider URL from payload header."""
+ raise NotImplementedError
+
def get_external_version_data(self):
"""Get External Version data from payload."""
raise NotImplementedError
@@ -237,6 +242,7 @@ class WebhookMixin:
:param project: Project instance
:type project: readthedocs.projects.models.Project
"""
+ provider_url = self.get_external_provider_url()
version_data = self.get_external_version_data()
# create or get external version object using `verbose_name`.
external_version = get_or_create_external_version(
@@ -569,6 +575,10 @@ class GitLabWebhookView(WebhookMixin, APIView):
return False
return token == secret
+ def get_external_provider_url(self):
+ """Get External provider URL from payload header."""
+ return self.request.META.get(GITLAB_INSTANCE_HEADER, "https://gitlab.com")
+
def get_external_version_data(self):
"""Get commit SHA and merge request number from payload."""
try:
What is not clear is how to pass the all the way down to the git backend. Hence the discussion above to support passing down the triggering context details.
Footnotes
-
That information does not seem to be associated with any of the objects currently available from
vcs_support.backends.git.fetch()
(see https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/vcs_support/base.py#L57-L75 and https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/vcs_support/backends/git.py) ↩ -
https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/api/v2/views/integrations.py. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, it would be great to have some tests cases for this.
Ditto. Not being familiar with the code base, I would appreciate some guidance (e.g similar to test to use an example)
This commit is an incremental improvement toward having full support for self-hosted GitLab instance.
It ensures that the expected sources are fetched when a build is triggered after a merge request event is processed in the context of a GitLab integration webhook associated with a self-hosted GitLab instance.
Related issues: