Skip to content

SSO: re-sync VCS accounts for SSO organization daily #8601

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 21, 2021
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions readthedocs/oauth/tasks.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
from django.contrib.auth.models import User
from django.utils.translation import ugettext_lazy as _

from readthedocs.core.permissions import AdminPermission
from readthedocs.core.utils.tasks import PublicTask, user_id_matches
from readthedocs.oauth.notifications import (
AttachWebhookNotification,
Expand Down Expand Up @@ -49,6 +50,30 @@ def sync_remote_repositories(user_id):
)


@app.task(queue='web')
def sync_remote_repositories_organizations():
"""
Re-sync users member of organizations with SSO enabled.

It will trigger one `sync_remote_repositories` task per user.
"""
query = (
SSOIntegration.objects
.filter(provider=SSOIntegration.PROVIDER_ALLAUTH)
.values_list('organization', flat=True)
)
log.info('Triggering scheduled SSO re-sync for all organizations. count=%s', query.count())
for organization in query:
members = AdminPermission.members(organization)
log.info(
'Triggering scheduled SSO re-sync for organization. organization=%s users=%s',
organization.slug,
members.count(),
)
for user in members:
sync_remote_repositories.delay(user.pk)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this might back up the queue a good bit. I don't know a good way to avoid it, but might make sense to delay this a little bit in the task or something? (eg. sleep 1 second between organizations or something?) Though even that will probably totally overwhelm the queue. I wish we had a good way to do a low priority task via celery.

Copy link
Member Author

@humitos humitos Oct 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can delay the execution of the task by passing countdown=seconds (https://docs.celeryproject.org/en/stable/userguide/calling.html#eta-and-countdown). So, we can trigger all the task with seconds being 0, 30, 60, 90... or similar. Does this make sense to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can try it if that works. Perhaps split them up by 5 seconds or something? But I'm not sure if that will overload the queue (or if our monitoring will take it into account), so not sure the best solution.



@app.task(queue='web')
def attach_webhook(project_pk, user_pk, integration=None):
"""
Expand Down
5 changes: 5 additions & 0 deletions readthedocs/settings/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -422,6 +422,11 @@ def TEMPLATES(self):
'schedule': crontab(minute=0, hour=1),
'options': {'queue': 'web'},
},
'every-day-resync-sso-organization-users': {
'task': 'readthedocs.oauth.tasks.sync_remote_repositories_organizations',
'schedule': crontab(minute=0, hour=4),
'options': {'queue': 'web'},
},
'hourly-archive-builds': {
'task': 'readthedocs.builds.tasks.archive_builds',
'schedule': crontab(minute=30),
Expand Down