Skip to content

Default URL redirects are cached too aggressively #9488

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
KevinMGranger opened this issue Aug 10, 2022 · 14 comments
Closed

Default URL redirects are cached too aggressively #9488

KevinMGranger opened this issue Aug 10, 2022 · 14 comments
Assignees
Labels
Improvement Minor improvement to code

Comments

@KevinMGranger
Copy link

KevinMGranger commented Aug 10, 2022

Details

Browser: chrome Version 103.0.5060.134 (Official Build) (arm64)
OS: macOS 12.4

Steps

  1. Set up automation to make latest release the default url:
    • Match: SemVer versions
    • Version type: tag
    • Action: set version as default
  2. Set up a github incoming webhook
  3. Tag a release
  4. View the docs at /
  5. Tag another release within 24 hours
  6. Verify that the default version in Advanced Settings / Global Settings / Default version points to the release from the previous step
  7. View the docs at /

Expected Result

I am redirected to /en/$tag_from_step_5

Actual Result

Chrome loads the 302 FOUND from cache, redirecting me to /en/$tag_from_step_3.

Analysis

Here's the response headers, I assume from the initial request at step 4:

Response Headers (some omitted, if I don't know how private they are)
cache-control: public, max-age=86400
cf-cache-status: MISS
content-language: en
content-length: 0
content-type: text/html; charset=utf-8
date: Tue, 09 Aug 2022 18:33:36 GMT
expires: Wed, 10 Aug 2022 18:33:36 GMT
location: https://pelorus.readthedocs.io/en/v1.6.0/
referrer-policy: no-referrer-when-downgrade
server: cloudflare
vary: Accept-Language, Cookie, Accept-Encoding
x-content-type-options: nosniff
x-rtd-domain: pelorus.readthedocs.io
x-rtd-project: pelorus
x-rtd-project-method: subdomain
x-rtd-redirect: system
x-rtd-version-method: path
x-served: Django-Proxito

Chrome caches redirects, but will honor cache-control headers on the redirect response. If I understand how this works, max-age=86400 means the redirect will be cached for 24 hours.

Perhaps conditional cache headers should be used for the top-level URL, instead of max-age in cache-control. Last-Modified or ETag (generated based on the default version) should work, if cloudflare can handle those. Note: The cache for the Location (the actual version, /en/$version) can stay the same! This isn't about that.

I do not believe this is related to #9167. The tagged version is fresh and served correctly. According to chrome, it doesn't even make the request because of the cache-control header. That issue says "a few minutes"-- the redirect is still pointing to the older version, despite the automation changing the settings 1 hour and 16 minutes ago at time of writing.

I'm not sure how this affects other browsers. Perhaps they're less aggressive at caching redirects.

If needed, I can try to make a smaller (perhaps even automated) reproduction of the bug.

@stsewd
Copy link
Member

stsewd commented Aug 10, 2022

I think we should reduce the max-age value, maybe 10 or 5 minutes is enough, cc @ericholscher

Perhaps conditional cache headers should be used for the top-level URL, instead of max-age in cache-control. Last-Modified or ETag (generated based on the default version) should work

I think you still need to set a max-age value, from my understanding ETag and Last-Modified are just the methods the server uses to response if the response should still be cached or not. CF should already handle this for us.

@KevinMGranger
Copy link
Author

Based on my skimming (not a deep read, I admit) of MDN and RFC9111, I don't think you need max-age. Unless that's a cloudflare-specific requirement.

@humitos
Copy link
Member

humitos commented Aug 11, 2022

@stsewd

I think we should reduce the max-age value, maybe 10 or 5 minutes is enough, cc @ericholscher

Are we setting this value at the application or is it done at Cloudflare? Can we reduce the max-cache only on redirect responses but keep it as is for regular pages?

@stsewd
Copy link
Member

stsewd commented Aug 11, 2022

@humitos that's done at Cloudflare, but we have had the same problem with normal pages, we usually ask users to do a hard refresh.

@humitos
Copy link
Member

humitos commented Aug 16, 2022

I'm not too worried about normal pages because it's easily solved with a hard refresh which is more likely for users to know about. However, telling the browser to forget a redirect is not common knowledge and may be hard to figure it out how to do it.

I'd think it makes sense to reduce the redirects cache time to 10 minutes or similar to avoid this problem. However, I wouldn't reduce too much the for regular resources because readers will have to re-download the full page each time they switch pages, tho.

@ericholscher
Copy link
Member

I don't know if we have the ability to specify redirect-only headers, but we might be able to. We should see if there's a Page Rule or similar in the CDN that can handle this.

Looks like CF already does a version of this? https://developers.cloudflare.com/cache/how-to/configure-cache-status-code/#edge-ttl

@stsewd
Copy link
Member

stsewd commented Aug 17, 2022

Looks like CF already does a version of this? https://developers.cloudflare.com/cache/how-to/configure-cache-status-code/#edge-ttl

@ericholscher that's the edge caching, here the problem is with the browser caching level, which we have set to 1 day

Screenshot 2022-08-16 at 21-11-07 Configuration Caching readthedocs io Read the Docs Production Cloudflare

@stsewd
Copy link
Member

stsewd commented Aug 17, 2022

Also, just checked the page rules and transform rules, and I don't see something that let us filter by status code https://developers.cloudflare.com/rules/transform/response-header-modification/reference/fields-functions/, there is only that option to set the cache at the edge level, but maybe that rule also overrides the age for the browser cache level?

@ericholscher
Copy link
Member

ericholscher commented Aug 17, 2022

The Browser Cache TTL sets the expiration for resources cached in a visitor’s browser. By default, Cloudflare honors the cache expiration set in your Expires and Cache-Control headers but overrides those headers if:

  • The value of the Cache-Control header from the origin web server is less than the Browser Cache TTL Cloudflare setting.

From: https://developers.cloudflare.com/cache/about/edge-browser-cache-ttl/

So I think we can lower the default, and set it in our application.

@stsewd stsewd added the Improvement Minor improvement to code label Aug 18, 2022
@github-project-automation github-project-automation bot moved this to Planned in 📍Roadmap Feb 1, 2023
@ericholscher
Copy link
Member

I've just updated this to be 5 minutes, which should hopefully work around this issue.

@effigies
Copy link

effigies commented Feb 2, 2023

Just ran across this issue, and thanks for dropping the TTL. I got bit by this because we serve JSON resources on our site and Deno respects the Cache-Control max-age when loading Javascript or JSON documents from the internet. Days or even months is fine for our versioned docs, as we will almost never rebuild, but /en/latest/ is useful to have on the order of 5-30 minutes.

Just sharing in case this use-case is not one you'd considered. It might even be nice to be separately configurable in the readthedocs.yml for tags and latest/stable.

@ericholscher ericholscher self-assigned this Feb 2, 2023
@ericholscher ericholscher moved this from Planned to Needs review in 📍Roadmap Feb 14, 2023
@ericholscher
Copy link
Member

Refs #10035

@stsewd
Copy link
Member

stsewd commented Feb 15, 2023

We saw an increase in our load, so we set the browser cache TTL to 20min, but we are discussing increasing the edge cache instead.

@ericholscher
Copy link
Member

I think we can probably call this issue done, and focus on any additional TTL configuration in #10035. The original problem is solved, and we have a path forward on being a lot smarter about how we handle different files and versions, which would solve this bit of feedback:

Just ran across this issue, and thanks for dropping the TTL. I got bit by this because we serve JSON resources on our site and Deno respects the Cache-Control max-age when loading Javascript or JSON documents from the internet. Days or even months is fine for our versioned docs, as we will almost never rebuild, but /en/latest/ is useful to have on the order of 5-30 minutes.

I will make sure we consider that in #10035.

@github-project-automation github-project-automation bot moved this from Needs review to Done in 📍Roadmap Feb 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Improvement Minor improvement to code
Projects
Archived in project
Development

No branches or pull requests

5 participants