Deprecation: improve Celery task db query #10414

humitos · 2023-06-08T22:00:06Z

This commit excludes users who already have an unread notification with this message. The only goal is to avoid performing db queries for users for whom our notification backend won't add a new notification anyway. This should improve the performance of this task and make it run faster (hopefully).

readthedocs/projects/tasks/utils.py

humitos · 2023-06-08T23:19:50Z

I found the slow query. It's because it passes 85k slugs into the query:

readthedocs.org/readthedocs/projects/tasks/utils.py

Lines 294 to 298 in dc08e98

    
    user_projects = (
        AdminPermission.projects(user, admin=True)
        .filter(slug__in=projects)
        .only("slug")
    )

Instead of passing 82k slugs on each query to get the user's projects, we perform a set() intersection in Python which is fast. This reduces the time for this query in a pretty noticeable way.
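A minimal sketch of the set() intersection idea, not the actual code from this PR. It assumes the large collection of slugs is loaded once and each user's slugs are fetched without the big `slug__in` filter; the function name and the sample slugs are hypothetical.

```python
def projects_to_notify(user_project_slugs, deprecated_slugs):
    """Return the slugs that the user administers AND that are deprecated.

    Both arguments are plain iterables of slug strings. In the real task
    they would come from the database; here they are hypothetical values.
    """
    # Set intersection runs in Python, so the large collection of slugs
    # never has to be sent to the database as an IN (...) clause.
    return set(user_project_slugs) & set(deprecated_slugs)


# Hypothetical stand-in for the ~82k slugs, built once and reused per user.
deprecated_slugs = {"docs-a", "docs-b", "docs-c"}

# Hypothetical slugs of the projects one user administers.
user_slugs = ["docs-b", "unrelated-project", "docs-c"]

matching = projects_to_notify(user_slugs, deprecated_slugs)
print(sorted(matching))
```

The trade-off is one smaller query per user plus an in-memory intersection, instead of one query per user that carries tens of thousands of slugs.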

The .exclude of unread notifications is removed because it would interfere with the logic for sending email notifications. We do want to send an email even if the user hasn't read the onsite notification.

humitos · 2023-06-09T08:57:32Z

I tried this code in production and it's a lot faster than the previous implementation. Using Python set() is way better than dumping 82k slugs into the SQL query. So, I'm going to merge this PR 👍🏼

benjaoming

Makes sense 👍

Definitely okay to have a slow query for this case, but I totally understand if it got too slow. And that query does seem very slow :)

The code is also pretty readable, and it's definitely preferable to have correctness over performance :)

humitos requested a review from a team as a code owner June 8, 2023 22:00

humitos requested a review from benjaoming June 8, 2023 22:00

auto-assign bot assigned humitos Jun 8, 2023


humitos added 4 commits June 9, 2023 01:22

Deprecation: improve slow db query

a914e7c

Instead of passing 82k slugs on each query to get the user's projects, we perform a set() intersection in Python which is fast. This reduces the time for this query in a pretty noticeable way.

Deprecation: adapt comment

8ef9264

Minor fix

a933c70

Remove .exclude of unread notifications

5b5c5b9

This will interfere with the logic for sending email notifications. We do want to send an email even if the user didn't read the onsite notification.

benjaoming approved these changes Jun 9, 2023


humitos merged commit 098fc34 into main Jun 12, 2023

humitos deleted the humitos/deprecation-task-query branch June 12, 2023 08:42
