-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Proxito: handle http to https redirects for all requests #10199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 5 commits
cdb88ad
72fe380
523a010
a663462
c5a54a9
2434760
4f76856
97a5f09
15e0806
dd7eb94
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
import structlog | ||
|
||
log = structlog.get_logger(__name__) | ||
|
||
CACHE_TAG_HEADER = "Cache-Tag" | ||
CDN_CACHE_CONTROL_HEADER = "CDN-Cache-Control" | ||
Comment on lines
+5
to
+6
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We are using a |
||
|
||
|
||
def add_cache_tags(response, cache_tags): | ||
""" | ||
Add cache tags to the response. | ||
|
||
New cache tags will be appended to the existing ones. | ||
|
||
:param response: The response to add cache tags to. | ||
:param cache_tags: A list of cache tags to add to the response. | ||
""" | ||
cache_tags = cache_tags.copy() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we copy them for any particular reason? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We are extending this list below, we don't want to change the original list. |
||
current_cache_tag = response.headers.get(CACHE_TAG_HEADER) | ||
if current_cache_tag: | ||
cache_tags.append(current_cache_tag) | ||
|
||
response.headers[CACHE_TAG_HEADER] = ",".join(cache_tags) | ||
|
||
|
||
def cache_response(response, cache_tags=None, force=True): | ||
""" | ||
Cache the response at the CDN level. | ||
|
||
We add the ``Cache-Tag`` header to the response, to be able to purge | ||
the cache by a given tag. And we set the ``CDN-Cache-Control`` header | ||
to public, to cache the response at the CDN level only. This doesn't | ||
stsewd marked this conversation as resolved.
Show resolved
Hide resolved
|
||
affect caching at the browser level (``Cache-Control``). | ||
|
||
See: | ||
|
||
- https://developers.cloudflare.com/cache/how-to/purge-cache/#cache-tags-enterprise-only | ||
- https://developers.cloudflare.com/cache/about/cdn-cache-control. | ||
|
||
:param response: The response to cache. | ||
:param cache_tags: A list of cache tags to add to the response. | ||
:param force: If ``True``, the header will be set to public even if it | ||
was already set to private. | ||
Comment on lines
+42
to
+43
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it's not clear what There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For me force is more clear than public or private, since we are forcing the request to be cached or not depending on the method being called. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My point is that sometimes Something more explicit here would be better in my opinion. If you like the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But they are used in different functions, the name of the function already specifies if the request will be forced to be public or private (cache_response, private_response), that's why it's just a boolean, we aren't using one function to cache the response or make it private, we are using two different functions. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I saw that. IMO, it's still hard to realize what I am forcing to when passing the boolean, that's why I raised this comment here. I got confused while doing the review. |
||
""" | ||
if cache_tags: | ||
add_cache_tags(response, cache_tags) | ||
if force or CDN_CACHE_CONTROL_HEADER not in response.headers: | ||
if not response.headers.get(CACHE_TAG_HEADER): | ||
log.warning("Caching response without cache tags.") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this a real case or we are logging this here just for debugging purposes? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It doesn't make sense to cache a response if we don't have a way to purge it using cache tags. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This would be good as a comment, or even better in the logging message. |
||
|
||
response.headers[CDN_CACHE_CONTROL_HEADER] = "public" | ||
|
||
|
||
def private_response(response, force=True): | ||
""" | ||
Prevent the response from being cached at the CDN level. | ||
|
||
We do this by explicitly setting the ``CDN-Cache-Control`` header to private. | ||
|
||
:param response: The response to mark as private. | ||
:param force: If ``True``, the header will be set to private even if it | ||
was already set to public. | ||
Comment on lines
+63
to
+64
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here. As a user calling There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What other name do you suggest? To me force is clear, especially since the default is always True. |
||
""" | ||
if force or CDN_CACHE_CONTROL_HEADER not in response.headers: | ||
response.headers[CDN_CACHE_CONTROL_HEADER] = "private" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
from urllib.parse import urlparse | ||
|
||
import structlog | ||
from django.http import HttpResponseRedirect | ||
|
||
from readthedocs.core.resolver import resolver | ||
from readthedocs.core.utils.url import unsafe_join_url_path | ||
from readthedocs.proxito.cache import cache_response | ||
from readthedocs.proxito.constants import RedirectType | ||
from readthedocs.redirects.exceptions import InfiniteRedirectException | ||
|
||
log = structlog.get_logger(__name__) | ||
|
||
|
||
def redirect_to_https(request, project): | ||
""" | ||
Shortcut to get a https redirect. | ||
|
||
:param request: Request object. | ||
:param project: The current project being served | ||
""" | ||
return canonical_redirect(request, project, RedirectType.http_to_https) | ||
|
||
|
||
def canonical_redirect(request, project, redirect_type, external_version_slug=None): | ||
""" | ||
Return a canonical redirect response. | ||
|
||
All redirects are cached, since the final URL will be checked for authorization. | ||
|
||
The following cases are covered: | ||
|
||
- Redirect a custom domain from http to https | ||
http://project.rtd.io/ -> https://project.rtd.io/ | ||
- Redirect a domain to a canonical domain (http or https). | ||
http://project.rtd.io/ -> https://docs.test.com/ | ||
http://project.rtd.io/foo/bar/ -> https://docs.test.com/foo/bar/ | ||
- Redirect from a subproject domain to the main domain | ||
https://subproject.rtd.io/en/latest/foo -> https://main.rtd.io/projects/subproject/en/latest/foo # noqa | ||
https://subproject.rtd.io/en/latest/foo -> https://docs.test.com/projects/subproject/en/latest/foo # noqa | ||
|
||
Raises ``InfiniteRedirectException`` if the redirect is the same as the current URL. | ||
|
||
:param request: Request object. | ||
:param project: The current project being served | ||
:param redirect_type: The type of canonical redirect (https, canonical-cname, subproject-main-domain) | ||
:param external_version_slug: The version slug if the request is from a pull request preview. | ||
""" | ||
from_url = request.build_absolute_uri() | ||
parsed_from = urlparse(from_url) | ||
|
||
if redirect_type == RedirectType.http_to_https: | ||
# We only need to change the protocol. | ||
to = parsed_from._replace(scheme="https").geturl() | ||
elif redirect_type == RedirectType.to_canonical_domain: | ||
# We need to change the domain and protocol. | ||
canonical_domain = project.get_canonical_custom_domain() | ||
protocol = "https" if canonical_domain.https else "http" | ||
to = parsed_from._replace( | ||
scheme=protocol, netloc=canonical_domain.domain | ||
).geturl() | ||
elif redirect_type == RedirectType.subproject_to_main_domain: | ||
# We need to get the subproject root in the domain of the main | ||
# project, and append the current path. | ||
project_doc_prefix = resolver.get_url_prefix( | ||
project=project, | ||
external_version_slug=external_version_slug, | ||
) | ||
parsed_doc_prefix = urlparse(project_doc_prefix) | ||
to = parsed_doc_prefix._replace( | ||
path=unsafe_join_url_path(parsed_doc_prefix.path, parsed_from.path), | ||
query=parsed_from.query, | ||
).geturl() | ||
else: | ||
raise NotImplementedError | ||
|
||
if from_url == to: | ||
# check that we do have a response and avoid infinite redirect | ||
log.warning( | ||
"Infinite Redirect: FROM URL is the same than TO URL.", | ||
url=to, | ||
) | ||
raise InfiniteRedirectException() | ||
|
||
log.info( | ||
"Canonical Redirect.", host=request.get_host(), from_url=from_url, to_url=to | ||
) | ||
resp = HttpResponseRedirect(to) | ||
resp["X-RTD-Redirect"] = redirect_type.name | ||
cache_response(resp, cache_tags=[project.slug]) | ||
return resp |
Uh oh!
There was an error while loading. Please reload this page.