Skip to content

Remote Repository and Remote Organization Normalization #7949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 113 commits into from
Mar 22, 2021
Merged
Show file tree
Hide file tree
Changes from 97 commits
Commits
Show all changes
113 commits
Select commit Hold shift + click to select a range
307a1cb
Add Initial Modeling with Through Model and Data Migration for Remote…
saadmk11 Oct 5, 2020
7bd1a5f
Improve data migration performance
saadmk11 Oct 7, 2020
9476a36
Logging, performance optimization for data migration
saadmk11 Oct 10, 2020
1bd03c7
Use TimeStampedModel model and follow django docs for migrating throu…
saadmk11 Oct 10, 2020
c3592eb
Updated data migrations to only migrate RemoteRepositories of recentl…
saadmk11 Nov 14, 2020
c6699fb
Do not remove fields from RemoteRepository Model
saadmk11 Nov 14, 2020
bf5c6b1
Do not Migrate Active Field
saadmk11 Nov 16, 2020
d90ecad
Rename RemoteRelation model to RemoteRepositoryRelation
saadmk11 Nov 16, 2020
2c74027
Merge pull request #7536 from saadmk11/remote-repo-normalization
humitos Nov 16, 2020
9f30059
Update oauth services to use the RemoteRepositoryRelation model
saadmk11 Nov 18, 2020
42a516d
Use remote relations on update_webhook function
saadmk11 Nov 19, 2020
48114e6
Lint fix
saadmk11 Nov 19, 2020
421bcae
Fix tests
saadmk11 Nov 19, 2020
afc7af0
lint fix
saadmk11 Nov 19, 2020
28c6a18
Add TODO for removing json field from RemoteRepository serializer
saadmk11 Nov 20, 2020
37d89af
Optimize remote_relations query
saadmk11 Nov 21, 2020
8d126db
Filter remote_relations according to loggedin user on update_webhook
saadmk11 Nov 24, 2020
dff138e
Only delete RemoteRepositoryRelation objects on remote repository sync
saadmk11 Nov 24, 2020
5d7c56f
Change RemoteRepositoryRelation related_name to remote_repository_rel…
saadmk11 Nov 24, 2020
c194ac7
Delete remote repository relation on sync without filtering with proj…
saadmk11 Nov 24, 2020
76eff0d
Make remoterepository and account unique together
saadmk11 Nov 24, 2020
77c1777
Fix sync_delete test
saadmk11 Nov 24, 2020
076fe93
Add Initial Modeling with Through Model and Data Migration for Remote…
saadmk11 Oct 5, 2020
3376795
Improve data migration performance
saadmk11 Oct 7, 2020
9f7924d
Logging, performance optimization for data migration
saadmk11 Oct 10, 2020
b375696
Use TimeStampedModel model and follow django docs for migrating throu…
saadmk11 Oct 10, 2020
c463880
Updated data migrations to only migrate RemoteRepositories of recentl…
saadmk11 Nov 14, 2020
f72a220
Do not remove fields from RemoteRepository Model
saadmk11 Nov 14, 2020
cff0bfd
Do not Migrate Active Field
saadmk11 Nov 16, 2020
2df6a4e
Rename RemoteRelation model to RemoteRepositoryRelation
saadmk11 Nov 16, 2020
e714315
Merge branch 'remote-repository-normalization' into update-oauth-serv…
saadmk11 Nov 25, 2020
bf8a167
Rename oauth migrations
saadmk11 Nov 25, 2020
69df3d8
Merge pull request #7675 from saadmk11/update-oauth-services
humitos Dec 1, 2020
e1f4ac7
Add remote_id and vcs_provider field to RemoteRepository model and Re…
saadmk11 Dec 2, 2020
570e064
DB Index remote_id Field
saadmk11 Dec 3, 2020
ac0c863
Populate remote_id and vcs_provider on repository sync
saadmk11 Dec 3, 2020
8354b66
Do not merge VCS provider brand with vcs_provider choices
saadmk11 Dec 3, 2020
41bebc9
Fix Tests
saadmk11 Dec 3, 2020
7230e01
Merge pull request #7722 from saadmk11/add_remote_id_field
humitos Dec 3, 2020
1d48b93
Update parts of code that were using the onld model fields
saadmk11 Dec 4, 2020
8df0032
Use same variable naming for remote_repository_relation everywhere
saadmk11 Dec 5, 2020
bddaca2
Do not set remote_repository_relation.account again
saadmk11 Dec 6, 2020
3575d5c
Move redundant code to base Service class
saadmk11 Dec 6, 2020
8b89483
Rename get_remote_relation method
saadmk11 Dec 6, 2020
15ddb3d
update lint
saadmk11 Dec 7, 2020
42e1d52
Update send build status task to use the new RemoteRepositoryRelation…
saadmk11 Dec 8, 2020
ef99b33
Use remote_id and vcs_provider to get RemoteRepository on sync
saadmk11 Dec 8, 2020
6c0b22d
Update sync() method to use remote_id and vcs_provider to delete repos
saadmk11 Dec 8, 2020
56682ae
Merge pull request #7728 from saadmk11/use-remote-relation
humitos Dec 10, 2020
eb4c625
Update provider slug class variable name
saadmk11 Dec 10, 2020
5c1fd68
Merge pull request #7734 from saadmk11/remove_full_name_ref
humitos Dec 10, 2020
3e97c21
Use path_with_namespace for GitLab RemoteRepository full_name Field
saadmk11 Dec 10, 2020
fde005b
Update GitLabOAuthTests
saadmk11 Dec 10, 2020
018dc6b
Merge pull request #7746 from saadmk11/gitlab_repo_full_name
humitos Dec 14, 2020
bae9166
Update remoterepository field name to remote_repository
saadmk11 Dec 19, 2020
9b8e829
Merge pull request #7771 from saadmk11/change-remoterepository-field-…
humitos Dec 22, 2020
39cd6fa
Add migration to create a new table for RemoteRepository and RemoteRe…
saadmk11 Dec 22, 2020
b8d3c09
use full_name for RemoteRepository ordering
saadmk11 Dec 22, 2020
2c830e5
rename remoterepository field in migration
saadmk11 Dec 22, 2020
7950ca5
Merge pull request #7777 from saadmk11/create-remoterepository-table
humitos Dec 22, 2020
e748df6
Updated RemoteRepository API and import.js
saadmk11 Dec 22, 2020
c0a9a16
Fix unauthenticated user issue
saadmk11 Dec 22, 2020
0c04f9f
Fix import sorting
saadmk11 Dec 22, 2020
feeb903
Update readthedocs/api/v2/serializers.py
saadmk11 Dec 29, 2020
53d8f75
return empty queryset if the user is not authenticated
saadmk11 Dec 29, 2020
3ec81be
Add todo in RemoteRepositoryViewSet queryset for future optimization
saadmk11 Dec 29, 2020
d6c5948
Merge pull request #7778 from saadmk11/fix-import-js
humitos Dec 30, 2020
7b64f37
Create new table for RemoteOrganization and RemoteOrganizationRelation
saadmk11 Dec 23, 2020
58973c9
lint fix
saadmk11 Dec 23, 2020
ab7e093
Use TimeStampedModel for RemoteRepository and RemoteOrganizarion
saadmk11 Dec 29, 2020
e36719e
Make account field not nullable
saadmk11 Dec 29, 2020
88d896d
Remove unused import
saadmk11 Dec 29, 2020
629f355
Update VCS service classes to use the new RemoteOrganization modeling
saadmk11 Jan 1, 2021
055138a
Update tests for new RemoteOrganization modeling
saadmk11 Jan 2, 2021
578008f
update tests
saadmk11 Jan 2, 2021
233ac8e
do not delete remote organization on user delete
saadmk11 Jan 4, 2021
9d41148
Merge pull request #7783 from saadmk11/remote-org-normalization
humitos Jan 4, 2021
53d705b
fix merge conflict
saadmk11 Jan 5, 2021
7726375
lint fix
saadmk11 Jan 5, 2021
89960c3
Merge pull request #7795 from saadmk11/merge-master-to-remote-reposit…
humitos Jan 5, 2021
be8282a
Change query on `send_build_status` task for compatibility with .com
humitos Jan 5, 2021
8dcdd42
Move `users` variable definition where it was
humitos Jan 5, 2021
3489bca
Merge pull request #7797 from readthedocs/humitos/send-build-status-q…
humitos Jan 5, 2021
e9619cd
Add manafement command to Sync RemoteRepositories and RemoteOrganizat…
saadmk11 Jan 6, 2021
f275bbb
Add force option to force re-sync VCS provider data
saadmk11 Jan 7, 2021
097ba39
update comment
saadmk11 Jan 8, 2021
705cded
at least one username needs to be provided for users and skip users o…
saadmk11 Jan 8, 2021
1e5dd5e
added --no-dry-run option and added more user count output
saadmk11 Jan 12, 2021
6b2f38a
remove unwanted file
saadmk11 Jan 12, 2021
6f7eefd
use --dry-run insteand of --no-dry-run
saadmk11 Jan 12, 2021
dc40044
Merge pull request #7803 from saadmk11/oauth-sync-command
humitos Jan 12, 2021
265cde0
Merge branch 'master' of github.com:readthedocs/readthedocs.org into …
humitos Feb 23, 2021
ec7729c
Apply suggestions from code review
saadmk11 Feb 26, 2021
a718e78
Add comments and move remote relation methods to model
saadmk11 Feb 28, 2021
cea826b
Add management command to Load Project and RemoteRepository Relations…
saadmk11 Feb 28, 2021
29c437e
Check if the remote_repo was updated or not and log error
saadmk11 Mar 1, 2021
920db54
Merge pull request #7966 from saadmk11/load_project_remote_repo_relation
humitos Mar 1, 2021
d7a0445
update RemoteRepositorySerializer.is_admin comment about cached value
saadmk11 Mar 2, 2021
76604af
Merge branch 'master' of github.com:readthedocs/readthedocs.org into …
humitos Mar 2, 2021
349ba5a
Remove json field from RemoteRepositoryRelation and RemoteOrganizatio…
saadmk11 Mar 8, 2021
36dd7fe
Handle exception on _get_repo_id explicitly
saadmk11 Mar 8, 2021
12e6f64
Handle exception on _get_repo_id explicitly
saadmk11 Mar 8, 2021
5317c80
Merge pull request #7993 from saadmk11/remove-json-field
humitos Mar 8, 2021
99cbb6f
Merge tag '5.12.0' into remote-repository-normalization
humitos Mar 9, 2021
a449fd8
Remove duplicate results from RemoteOrganization API
saadmk11 Mar 10, 2021
08a4641
Add __str__ to RemoteRepositoryRelation and RemoteOrganizationRelatio…
saadmk11 Mar 10, 2021
17621bc
Merge pull request #8000 from saadmk11/fix-remote-org-api
humitos Mar 10, 2021
7fe6116
Add raw_id_fields to RemoteRepositoryAdmin
saadmk11 Mar 10, 2021
1eea07f
Merge pull request #8001 from saadmk11/remote-relation-admin
humitos Mar 10, 2021
f5e01a6
Improvements to sync_vcs_data.py script
humitos Mar 13, 2021
53a1b95
Update readthedocs/oauth/management/commands/sync_vcs_data.py
humitos Mar 15, 2021
20549f5
Merge pull request #8017 from readthedocs/humitos/sync-vcs-script
humitos Mar 15, 2021
73c0447
Merge branch 'master' of github.com:readthedocs/readthedocs.org into …
humitos Mar 16, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 16 additions & 2 deletions readthedocs/api/v2/serializers.py
Original file line number Diff line number Diff line change
Expand Up @@ -177,7 +177,7 @@ class RemoteOrganizationSerializer(serializers.ModelSerializer):

class Meta:
model = RemoteOrganization
exclude = ('json', 'email', 'users')
exclude = ('email', 'users',)


class RemoteRepositorySerializer(serializers.ModelSerializer):
Expand All @@ -188,16 +188,30 @@ class RemoteRepositorySerializer(serializers.ModelSerializer):

# This field does create an additional query per object returned
matches = serializers.SerializerMethodField()
admin = serializers.SerializerMethodField('is_admin')

class Meta:
model = RemoteRepository
exclude = ('json', 'users')
exclude = ('users',)

def get_matches(self, obj):
request = self.context['request']
if request.user is not None and request.user.is_authenticated:
return obj.matches(request.user)

def is_admin(self, obj):
request = self.context['request']

# Use cached value
if hasattr(obj, 'admin'):
return obj.admin

if request.user and request.user.is_authenticated:
return obj.remote_repository_relations.filter(
user=request.user, admin=True
).exists()
return False


class ProviderSerializer(serializers.Serializer):

Expand Down
35 changes: 28 additions & 7 deletions readthedocs/api/v2/views/model_views.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@

from allauth.socialaccount.models import SocialAccount
from django.conf import settings
from django.db.models import BooleanField, Case, Value, When
from django.shortcuts import get_object_or_404
from django.template.loader import render_to_string
from rest_framework import decorators, permissions, status, viewsets
Expand All @@ -20,7 +21,11 @@
from readthedocs.projects.models import Domain, Project
from readthedocs.storage import build_commands_storage

from ..permissions import APIPermission, APIRestrictedPermission, IsOwner
from ..permissions import (
APIPermission,
APIRestrictedPermission,
IsOwner,
)
from ..serializers import (
BuildAdminSerializer,
BuildCommandSerializer,
Expand Down Expand Up @@ -298,7 +303,7 @@ class RemoteOrganizationViewSet(viewsets.ReadOnlyModelViewSet):
def get_queryset(self):
return (
self.model.objects.api(self.request.user).filter(
account__provider__in=[
remote_organization_relations__account__provider__in=[
service.adapter.provider_id for service in registry
],
)
Expand All @@ -313,7 +318,21 @@ class RemoteRepositoryViewSet(viewsets.ReadOnlyModelViewSet):
pagination_class = RemoteProjectPagination

def get_queryset(self):
query = self.model.objects.api(self.request.user)
if not self.request.user.is_authenticated:
return self.model.objects.none()

# TODO: Optimize this query after deployment
query = self.model.objects.api(self.request.user).annotate(
admin=Case(
When(
remote_repository_relations__user=self.request.user,
remote_repository_relations__admin=True,
then=Value(True)
),
default=Value(False),
output_field=BooleanField()
)
)
full_name = self.request.query_params.get('full_name')
if full_name is not None:
query = query.filter(full_name__icontains=full_name)
Expand All @@ -324,18 +343,20 @@ def get_queryset(self):
own = self.request.query_params.get('own', None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is own here? Could probably use a better name 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why it's called own here, I think it refers to the VCS provider names ('github', 'gitlab' etc). We can change the variable name but the quey param name is for the API V2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably make a note on this for the APIv3 where these endpoints are not created yet: #7510

if own is not None:
query = query.filter(
account__provider=own,
remote_repository_relations__account__provider=own,
organization=None,
)

query = query.filter(
account__provider__in=[
remote_repository_relations__account__provider__in=[
service.adapter.provider_id for service in registry
],
)
).distinct()

# optimizes for the RemoteOrganizationSerializer
query = query.select_related('organization')
query = query.select_related('organization').order_by(
'organization__name', 'full_name'
)

return query

Expand Down
1 change: 0 additions & 1 deletion readthedocs/builds/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,6 @@
)
from readthedocs.builds.version_slug import VersionSlugField
from readthedocs.config import LATEST_CONFIGURATION_VERSION
from readthedocs.core.utils import broadcast
from readthedocs.projects.constants import (
BITBUCKET_COMMIT_URL,
BITBUCKET_URL,
Expand Down
11 changes: 1 addition & 10 deletions readthedocs/core/signals.py
Original file line number Diff line number Diff line change
Expand Up @@ -86,7 +86,7 @@ def decide_if_cors(sender, request, **kwargs): # pylint: disable=unused-argumen


@receiver(pre_delete, sender=settings.AUTH_USER_MODEL)
def delete_projects_and_organizations(sender, instance, *args, **kwargs):
def delete_projects(sender, instance, *args, **kwargs):
# Here we count the owner list from the projects that the user own
# Then exclude the projects where there are more than one owner
# Add annotate before filter
Expand All @@ -98,16 +98,7 @@ def delete_projects_and_organizations(sender, instance, *args, **kwargs):
).exclude(num_users__gt=1)
)

# Here we count the users list from the organization that the user belong
# Then exclude the organizations where there are more than one user
oauth_organizations = (
RemoteOrganization.objects.annotate(num_users=Count('users')
).filter(users=instance.id
).exclude(num_users__gt=1)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we remove this code? Do we run it somewhere else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not deleting RemoteOrganization on user delete as we will only delete the RemoteOrganizationRelation in this case and that is handled by model on_delete=CASECADE.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want, we could do a cleanup of organizations/repositories at some point by deleting those that has 0 users connected. I don't think we need this now, tho.


projects.delete()
oauth_organizations.delete()


signals.check_request_enabled.connect(decide_if_cors)
40 changes: 19 additions & 21 deletions readthedocs/core/tests/test_signals.py
Original file line number Diff line number Diff line change
@@ -1,52 +1,50 @@
# -*- coding: utf-8 -*-
import django_dynamic_fixture
import pytest

from django.contrib.auth.models import User

from readthedocs.oauth.models import RemoteOrganization
from readthedocs.projects.models import Project


@pytest.mark.django_db
class TestProjectOrganizationSignal:

@pytest.mark.parametrize('model_class', [Project, RemoteOrganization])
def test_project_organization_get_deleted_upon_user_delete(self, model_class):
"""If the user has Project or RemoteOrganization where he is the only
user, upon deleting his account, the Project or RemoteOrganization
def test_project_get_deleted_upon_user_delete(self):
"""If the user has Project where he is the only
user, upon deleting his account, the Project
should also get deleted."""

obj = django_dynamic_fixture.get(model_class)
project = django_dynamic_fixture.get(Project)
user1 = django_dynamic_fixture.get(User)
obj.users.add(user1)
project.users.add(user1)

obj.refresh_from_db()
assert obj.users.all().count() == 1
project.refresh_from_db()
assert project.users.all().count() == 1

# Delete the user
user1.delete()
# The object should not exist
obj = model_class.objects.all().filter(id=obj.id)
assert not obj.exists()
project = Project.objects.all().filter(id=project.id)
assert not project.exists()

@pytest.mark.parametrize('model_class', [Project, RemoteOrganization])
def test_multiple_users_project_organization_not_delete(self, model_class):
"""Check Project or RemoteOrganization which have multiple users do not
def test_multiple_users_project_not_delete(self):
"""Check Project which have multiple users do not
get deleted when any of the user delete his account."""

obj = django_dynamic_fixture.get(model_class)
project = django_dynamic_fixture.get(Project)
user1 = django_dynamic_fixture.get(User)
user2 = django_dynamic_fixture.get(User)
obj.users.add(user1, user2)
project.users.add(user1, user2)

obj.refresh_from_db()
assert obj.users.all().count() > 1
project.refresh_from_db()
assert project.users.all().count() > 1
# Delete 1 user of the project
user1.delete()

# The project should still exist and it should have 1 user
obj.refresh_from_db()
assert obj.id
obj_users = obj.users.all()
project.refresh_from_db()
assert project.id
obj_users = project.users.all()
assert len(obj_users) == 1
assert user2 in obj_users
9 changes: 8 additions & 1 deletion readthedocs/oauth/admin.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,12 @@

from django.contrib import admin

from .models import RemoteOrganization, RemoteRepository
from .models import (
RemoteOrganization,
RemoteOrganizationRelation,
RemoteRepository,
RemoteRepositoryRelation,
)


class RemoteRepositoryAdmin(admin.ModelAdmin):
Expand All @@ -22,4 +27,6 @@ class RemoteOrganizationAdmin(admin.ModelAdmin):


admin.site.register(RemoteRepository, RemoteRepositoryAdmin)
admin.site.register(RemoteRepositoryRelation)
admin.site.register(RemoteOrganization, RemoteOrganizationAdmin)
admin.site.register(RemoteOrganizationRelation)
9 changes: 9 additions & 0 deletions readthedocs/oauth/constants.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
GITHUB = 'github'
GITLAB = 'gitlab'
BITBUCKET = 'bitbucket'

VCS_PROVIDER_CHOICES = (
(GITHUB, 'GitHub'),
(GITLAB, 'GitLab'),
(BITBUCKET, 'Bitbucket'),
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
import json

from django.core.management.base import BaseCommand

from readthedocs.oauth.models import RemoteRepository


class Command(BaseCommand):
help = "Load Project and RemoteRepository Relationship from JSON file"

def add_arguments(self, parser):
# File path of the json file containing relationship data
parser.add_argument(
'--file',
required=True,
nargs=1,
type=str,
help='File path of the json file containing relationship data.',
)

def handle(self, *args, **options):
file = options.get('file')[0]

try:
# Load data from the json file
with open(file, 'r') as f:
data = json.load(f)
except Exception as e:
self.stdout.write(
self.style.ERROR(
f'Exception occurred while trying to load the file "{file}". '
f'Exception: {e}.'
)
)
return

for item in data:
try:
update_count = RemoteRepository.objects.filter(
remote_id=item['remote_id']
).update(project_id=item['project_id'])

if update_count < 1:
self.stdout.write(
self.style.ERROR(
f"Could not update {item['slug']}'s "
f"relationship with {item['html_url']}, "
f"remote_id {item['remote_id']}, "
f"username: {item['username']}."
)
)

except Exception as e:
self.stdout.write(
self.style.ERROR(
f"Exception occurred while trying to update {item['slug']}'s "
f"relationship with {item['html_url']}, "
f"username: {item['username']}, Exception: {e}."
)
)
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ def _connect_repositories(self, organization, no_dry_run, only_owners):
Q(ssh_url__in=Subquery(organization.projects.values('repo'))) |
Q(clone_url__in=Subquery(organization.projects.values('repo')))
)
for remote in RemoteRepository.objects.filter(remote_query).order_by('pub_date'):
for remote in RemoteRepository.objects.filter(remote_query).order_by('created'):
admin = json.loads(remote.json).get('permissions', {}).get('admin')

if only_owners and remote.users.first() not in organization.owners.all():
Expand Down
Loading