-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
DB: do not fetch data
and others when deleting rows
#10446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This task was cancelled again. The query shows a SELECT first that fetchs the whole rows. I think we can reduce this time/memory by only fetching the ids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be great to do something about this, but I think we need a different approach, judging from the docs?
We need to find a way to avoid doing a |
I think raw SQL makes sense for tasks like these generally, as it explicit won't do any signal processing, and makes it really clear what we're doing. 👍 |
We are facing an issue with this query because it takes too long to execute (more than 30s) making our DB to kill the query. This is because Django performs a `SELECT` first to be able to trigger pre_ and post_ delete signals on each object delete. We don't really need this here, so we are using raw SQL to bypass this and make the query to execute faster. This is not ideal, but we didn't find a better approach.
I updated this PR to use the DB cursor directly. Please, take another look and let me know. If you approve this PR and think the SQL is fine, I can manually run this code in production before merging to be 100% it will work fine. I tested the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, 👍 on running the query manually as a test to start.
What about running these tasks more often and with a limit in the querysets? |
The query is executed without requiring `.fetchall()`
@stsewd yeah, I'm not super happy with this solution either -- but I don't care too much about this data since it's just old data we need to clean up with some frequency anyways. I tried other nicer approaches but I wasn't able to make them work without doing an auto |
This task was cancelled again. The query shows a SELECT first that fetchs the whole rows. I think we can reduce this time/memory by only fetching the ids.
Sentry https://read-the-docs.sentry.io/issues/4006261064/?project=148442&query=is%3Aunresolved+%21message%3A%22%21message%3A%22%21message%3A%22SystemExit%22+%21message%3A%22frame-ancestors%22&referrer=issue-stream&stream_index=9
Related https://github.com/readthedocs/readthedocs-ops/issues/1291