Skip to content

Cache 404 pages per-version #6797

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Cache 404 pages per-version #6797

wants to merge 4 commits into from

Conversation

humitos
Copy link
Member

@humitos humitos commented Mar 19, 2020

Caches are our most expensive requests since at this point we check
for multiple dirhtml files in the storage, for redirects, for a custom
404 page and we finally render the default Read the Docs' 404
page (Maze).

This new code checks for these scenarios (dirhtml, redirect, custom,
default) and return the cached response if it exists before performing
any other operation.

The default 404 page response is cached globally since it's the same
for all the projects. Although, before returning it, we check that was
the response we served the first time and not the custom 404 one.

@@ -225,7 +279,9 @@ def get(self, request, proxito_path, template_name='404.html'):

# TODO: decide if we need to check for infinite redirect here
# (from URL == to URL)
return HttpResponseRedirect(redirect_url)
response = HttpResponseRedirect(redirect_url)
cache.set('dirhtml_404+{proxito_path}', response, 60)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure we can't cache the object itself. We need to cache some string representation of it, then recreate the Python object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this locally and it worked properly. I didn't find anything about HttpResponse not being serializable, but I'll take a deeper look for potential issues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Django's internals do the same: https://github.com/django/django/blob/4ed534758cb6a11df9f49baddecca5a6cdda9311/django/middleware/cache.py#L116

Although, that method check for some security concerns before caching the response, like Cache-Control: private, Cookie and caches only 200 and 304.

I don't think we need anyone of those in community at least. For commercial, I'm not sure yet, but I will think about.

Caches are our most expensive requests since at this point we check
for multiple dirhtml files in the storage, for redirects, for a custom
404 page and we finally render the default Read the Docs' 404
page (Maze).

This new code checks for these scenarios (dirhtml, redirect, custom,
default) and return the cached response if it exists before performing
any other operation.

The default 404 page response is cached globally since it's the same
for all the projects. Although, before returning it, we check that was
the response we served the first time and not the custom 404 one.
@humitos humitos force-pushed the humitos/cache-404-pages branch from bf74e76 to 9a55553 Compare March 23, 2020 09:15
@@ -34,6 +35,7 @@ def proxito_404_page_handler(request, exception=None, template_name='404.html'):

resp = render(request, template_name)
resp.status_code = 404
cache.set('default_404_page', resp, 60)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This time should probably be bigger than 60 seconds.

@humitos
Copy link
Member Author

humitos commented May 25, 2020

@ericholscher do we want to explore a little more on this or should we close it for now? I think it's a nice to have, but we haven't hit this issue of 404 pages in a while now.

@ericholscher
Copy link
Member

Yea, I think we can close this for now since we're behind the CDN with autoscaling, it's much less of an issue 👍

@stsewd stsewd deleted the humitos/cache-404-pages branch May 26, 2020 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants