Skip to content

Remote Repository and Remote Organization Normalization #7949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 113 commits into from
Mar 22, 2021

Conversation

saadmk11
Copy link
Member

@saadmk11 saadmk11 commented Feb 24, 2021

This PR contains the full implementation for the Remote Repository and Remote Organization Normalization.

This PR creates separate database tables for the RemoteRepository and RemoteOrganization with the new modeling.
This also updates the codebase with the changes for the new modeling and adds a management command to re-sync all the remote repositories and remoter organizations.

The tasks done in this PR can be found here: https://github.com/orgs/readthedocs/projects/74#column-11164063

Closes #7572
Closes #7727

saadmk11 and others added 30 commits November 14, 2020 14:55
@humitos
Copy link
Member

humitos commented Mar 1, 2021

I did some research and tests for the database constraint that we will need to deal with. The plan here will be:

  1. remove all the constraints from the new tables (oauth_remoterepository_2020, oauth_remoterepository_relations, oauth_remoteorganization_relations) that reference to current used tables (socialaccount_socialaccount, auth_user, etc) that could be deleted while doing the 1-week resync
  2. do the resync during that week
  3. re-add these constraints as NOT VALID (more below) so deleted users/accounts are not validated, but all new data is validated before insert/update
  4. remove all rows from new tables that are not valid (the row referenced in the other table was deleted) --this step is not strict, but is a good cleanup
  5. run ALTER TABLE ... VALIDATE CONSTRAINT ... after the table is cleanup from invalid rows

PostgreSQL documentation: https://www.postgresql.org/docs/9.1/sql-altertable.html

ADD table_constraint [ NOT VALID ]
This form adds a new constraint to a table using the same syntax as CREATE TABLE, plus the option NOT VALID, which is currently only allowed for foreign key constraints. If the constraint is marked NOT VALID, the potentially-lengthy initial check to verify that all rows in the table satisfy the constraint is skipped. The constraint will still be enforced against subsequent inserts or updates (that is, they'll fail unless there is a matching row in the referenced table). But the database will not assume that the constraint holds for all rows in the table, until it is validated by using the VALIDATE CONSTRAINT option.

As example in my local instance:

  • DROP CONSTRAINT
# Drop constraints from RemoteRepositoryRelation
ALTER TABLE oauth_remoterepositoryrelation DROP CONSTRAINT oauth_remotereposito_account_id_02d56944_fk_socialacc;
ALTER TABLE oauth_remoterepositoryrelation DROP CONSTRAINT oauth_remoterepositoryrelation_user_id_3c2baf85_fk_auth_user_id;

# Drop constraints from RemoteOrganizationRelation
ALTER TABLE oauth_remoteorganizationrelation DROP CONSTRAINT oauth_remoteorganiza_account_id_8dbef288_fk_socialacc;
ALTER TABLE oauth_remoteorganizationrelation DROP CONSTRAINT oauth_remoteorganiza_user_id_0603492b_fk_auth_user;
  • ADD CONSTRAINT
# Add constraints from RemoteRepositoryRelation
ALTER TABLE "oauth_remoterepositoryrelation" ADD CONSTRAINT "oauth_remoterepositoryrelation_user_id_3c2baf85_fk_auth_user_id" FOREIGN KEY ("user_id") REFERENCES "auth_user" ("id") DEFERRABLE INITIALLY DEFERRED;

ALTER TABLE "oauth_remoterepositoryrelation" ADD CONSTRAINT "oauth_remotereposito_account_id_02d56944_fk_socialacc" FOREIGN KEY ("account_id") REFERENCES "socialaccount_socialaccount" ("id") DEFERRABLE INITIALLY DEFERRED;


# Add constraints for RemoteOrganizationRelation
ALTER TABLE "oauth_remoteorganizationrelation" ADD CONSTRAINT "oauth_remoteorganiza_user_id_0603492b_fk_auth_user" FOREIGN KEY ("user_id") REFERENCES "auth_user" ("id") DEFERRABLE INITIALLY DEFERRED;

ALTER TABLE "oauth_remoteorganizationrelation" ADD CONSTRAINT "oauth_remoteorganiza_account_id_8dbef288_fk_socialacc" FOREIGN KEY ("account_id") REFERENCES "socialaccount_socialaccount" ("id") DEFERRABLE INITIALLY DEFERRED;

First, try to ADD CONSTRAINT as they are here (changing the name, probably for the production ones) and if they fail saying that there are some invalid data, add NOT VALID at the end. After that, do the cleanup step.

Note I found the ADD CONSTRAINT commands by running inv docker.manage sqlmigrate oauth 0013 (and 0012)

@ericholscher
Copy link
Member

ericholscher commented Mar 1, 2021

NOT VALID is awesome! This sounds like a good plan 👍

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good!

@humitos
Copy link
Member

humitos commented Mar 2, 2021

Just to mention that I linked docs from 9.1 and we are using v12. The documentation is a updated and better: https://www.postgresql.org/docs/12/sql-altertable.html

First, try to ADD CONSTRAINT as they are here (changing the name, probably for the production ones) and if they fail saying that there are some invalid data, add NOT VALID at the end. After that, do the cleanup step.

After reading the v12 docs it seems it's better to use NOT VALID at first (to avoid a huge scan on the table that will lock it) and once it's done use VALIDATE CONSTRAINT that won't lock the table. However, from the Notes section,

validation acquires only a SHARE UPDATE EXCLUSIVE lock on the table being altered. (If the constraint is a foreign key then a ROW SHARE lock is also required on the table referenced by the constraint.)

it seems we may need to lock the other tables? (Project and SocialAcctount)

@humitos humitos added the PR: hotfix Pull request applied as hotfix to release label Mar 16, 2021
@humitos
Copy link
Member

humitos commented Mar 16, 2021

This PR is merged in rel and already deployed in Community. I'm not merging it into master yet because we don't want to be deployed in Commercial yet. We are still doing some QA on .org and we haven't triggered the re-sync task on .com yet.

@humitos humitos removed the Status: blocked Issue is blocked on another issue label Mar 22, 2021
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is ready to be merged and it will go out tomorrow in commercial.

@humitos humitos merged commit b076906 into master Mar 22, 2021
@humitos humitos deleted the remote-repository-normalization branch March 22, 2021 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: hotfix Pull request applied as hotfix to release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race condition with get_or_create in remote repositories repository import filter doesn't
3 participants