-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Git backend: make default_branch
to point to VCS' default branch
#9424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Currently, importing a project manually without setting the `default_branch`, will make Git VCS to use `master` as the default branch. However, with the last swap about `master` and `main` branches, repositories defaulting to `main` branch fail when being imported on Read the Docs. Leaving the user in a blocked state they can get out. This commit allows `Version.identifier` to be NULL meaning that "it won't try to guess what's the default branch of the VCS" and just leave the repository in the immediate state after being cloned (which is already the default branch). To do this, I'm removing the command `git checkout --force <branch>` from the steps executed for this case. Closes #9367
This works fine when importing and building. However, when a webhook is pushed, we need to do some work to add the mapping between the branch/tag updated and the |
GH webhook sends Or even, checking if |
As long as that's not going to become an infinite loop of sorts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like the way to go! Do we have test cases where no git identifiers are specified or Version.identifier
isn't set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a reasonable approach. I think just doing the clone and then using the branc that we pull without trying to checkout a specific branch is a reasonable approach for version syncing as well.
defaul_branch
to point to VCS' default branchdefault_branch
to point to VCS' default branch
Still remaining:
|
Use `default_branch` coming from the Webhook itself to decide if a build for Latest Version with `identifier=None` has to be triggered or not. Note that this feature only works for GitHub and GitLab since Bitbucket does not send the `default_branch` of the repository.
Doing a little house cleaning for sprints: what is the status of this PR? Blocked on anything in particular? |
@agjohnson we need to keep thinking how to relate the incoming webhook for the default version with the "latest" version itself |
Noted for later - tried to test-run this by importing a new project without setting the default branch. Migration issue needs to be fixed:
and (not sure if related to the missing migration), I encounter this on project import:
|
…mitos/vcs-default-branch
The VCS's `default_branch` comes in the incoming webhooks that GitHub and GitLab sends to us when the user realizes an action in the repository. Note that in this commit BitBucket is not supported and will keep having the issues this PR solves. To solve this problem, we would need to update the `Version.identifier` from the builder immediately after clonning the default branch (e.g. `git clone <URL>`). I put some commented code/ideas in place to come back to this once we feel it's the right time.
I updated this branch with the latest changes in After that, I worked on solving the problem for GitHub and GitLab since they provide the I tried to be clear by writing comments in the code explaining the changes in meaning and logic behind this. It's tricky to follow and it's easy to get lost, even after you think you understand the things. So, I'm happy to jump into a call and do some pair review for this if you think it's gonna be easier. |
This is related to #9736 in some degree. We can continue following this path and remove this command completely by adding an extra argument to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a great approach, and pretty straightforward! It makes a lot of sense to me.
# The idea is to hit the APIv2 here to update the `latest` version with | ||
# the `default_branch` we just got from the repository itself, | ||
# after clonning it. | ||
# However, we don't know the PK for the version we want to update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just add a get_default_branch
method to the class here, and hit the API from the caller where we have the version?
readthedocs.org/readthedocs/doc_builder/director.py
Lines 200 to 204 in 239d890
self.vcs_repository.update() | |
identifier = self.data.build_commit or self.data.version.identifier | |
log.info("Checking out.", identifier=identifier) | |
self.vcs_repository.checkout(identifier) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm... We could, yes. I'm not sure if it's better than just initializing this class with a read Version
object instead of the version_slug
only.
- the
clone()
method's caller is the same class; from outside, the method called isupdate()
--but I understand it's the same for the purpose of your comment - I do see hitting the API from the
clone()
method a little more organic, since the caller doesn't need to think about updating thedefault_branch
each time after callingclone()
--However, I don't see it bad for now.
There are other potential solutions as well:
- initialize the class with
version_pk
too - initialize the class with a complete
Version
object - use APIv3 (instead of APIv2) from the builder that supports getting a
Version
object without thepk
: https://docs.readthedocs.io/en/stable/api/v3.html#version-detail
It probably makes sense to use APIv3 since it's part of the organic transition to remove APIv2 internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm, thinking a little more about this... These ides will interfere with this other improvement #9736 where we will be cloning the exact branch directly, instead of the default_branch
and then switching to the one wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be part of another issue to investigate a little more, if you think it's not mandatory for this PR for now. It only affects Bitbucket currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to not include, but I just wanted to note that we have options here to get the data we want by thinking about getting it another way. 🤷
Meh, I made a mistake. This commit should have gone to a different branch. This reverts commit fc90d36.
Co-authored-by: Eric Holscher <[email protected]>
I replied to all the feedback that I receive. There are some good ideas that may require some extra time investigating and decided how to proceed. I suggested to open new issues for those instead of blocking this PR again. Other than that, this is ready for a re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably good to go ahead with. I do think we should probably not try to deploy it this week since we already have other big build changes going out. I'd be 👍 shipping it next deploy tho.
People is able to pass `default_branch=main` so Read the Docs can decide whether or not it has to trigger a build for `LATEST` in case its `identifier=None` (the default branch of the repository)
All these tests are expecting `master` to be the default branch. We have to force this now since it was automatically "guess" previously, but now we are not guessing anymore.
I don't understand why we were checking for this in this test, but I think it's not necessary since it's an old feature.
I made some latest changes to make the tests accurate to the changes made on this PR. I also added support for Note that 90% of the QA was successful and it's working fine. However, there is something that's not perfect still. When the $ curl -X POST devthedocs.org/api/v2/webhook/test-builds/1/ -H "Content-Type: application/json" --data @webhook.json
{"build_triggered":false,"project":"test-builds","versions":[]} Being {
"created": false,
"deleted": false,
"forced": false,
"default_branch": "main",
"ref": "refs/heads/main"
} I don't think this case is terrible and should not cause troubles, but it's not 100% correct, tho. I suppose there may be other edge cases like this one that we could continue researching and working on to fix them, but I don't think they are a huge priority now, considering that "Importing a repository manually and do not specify a curl -X POST devthedocs.org/api/v2/webhook/test-builds/1/ -H "Content-Type: application/json" --data @webhook.json
{"build_triggered":true,"project":"test-builds","versions":["latest"]} @ericholscher please, review these changes again and let me know if you think we are moving in the right direction. These changes are getting more complex that we all thought for sure 😩 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we care about the generic build API, tbh. I think it still looks fine.
I removed because executing tests with `--nomigrations` made it fail.
Great work @humitos! I haven't tried it out yet, but I'll probably be doing some related stuff with my own development environment, so will send back feedback if anything comes up.
I agree and it was also a proposed solution: ask the user to fill in the default branch, then verify that it exists as part of validating form input. We could open up an issue on adding this validation? I don't think there is any contradiction to the current implementation.. we can still allow the field to be left blank and try to resolve the default branch automatically. |
I'm not sure how to do this. We cannot verify the branch exists while the user is filling the form. We would need to clone the repository in the background before being able to validate the form. It seems complicated to me, at the moment.
In theory, this should work with the implementation done in this PR. It should use whatever is the branch cloned after executing |
I think there are others, such as sending HEAD requests to known HTTP endpoints or using |
Oh, yeah. In fact, we are already using that and I've implemented it in our code 😅 , ha ha. Manually importing repositories is a discouraged action as we were saying and we may remove it in the future. However, feel free to open a issue with the proposed feature/form validation for this and we can come back to it when needed 👍🏼 |
Sounds good! I'll open an issue about validation of the form field for the default branch 👍 |
Currently, importing a project manually without setting the
default_branch
,will make Git VCS to use
master
as the default branch. However, with the lastswap about
master
andmain
branches, repositories defaulting tomain
branch fail when being imported on Read the Docs. Leaving the user in a blocked
state they can get out.
This commit allows
Version.identifier
to be NULL meaning that "it won't try toguess what's the default branch of the VCS" and just leave the repository in the
immediate state after being cloned (which is already the default branch). To do
this, I'm removing the command
git checkout --force <branch>
from the stepsexecuted for this case.
Closes #9367