Skip to content

Use correct Cache-Tag (CDN) and X-RTD-Project header on subprojects #7593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 26, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion readthedocs/proxito/middleware.py
Original file line number Diff line number Diff line change
Expand Up @@ -120,7 +120,7 @@ class ProxitoMiddleware(MiddlewareMixin):
def add_proxito_headers(self, request, response):
"""Add debugging headers to proxito responses."""

project_slug = getattr(request, 'host_project_slug', '')
project_slug = getattr(request, 'path_project_slug', '')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need to use host_project_slug sometimes as well? It seems like this is breaking existing logic, unless path_project_slug is getting set other places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need to use host_project_slug sometimes as well?

As a fallback, you mean? In case that path_project_slug is not available? Why it would not be available?

It seems like this is breaking existing logic, unless path_project_slug is getting set other places?

What existing logic do you think it's breaking? If you have an example on mind I can try to write a test case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using the same logic for version_path_project that follows this line. It's used here and set in _get_project_data_from_request (

request.path_version_slug = version_slug
) which is the first function that is called when serving a page at
final_project, lang_slug, version_slug, filename = _get_project_data_from_request( # noqa
request,
project_slug=project_slug,
subproject_slug=subproject_slug,
lang_slug=lang_slug,
version_slug=version_slug,
filename=filename,
)

The host_project_slug is returned by map_host_project_slug which is accurate,

ret = map_host_to_project_slug(request)
. However, the project slug we get from the host sometimes (in subprojects or translations, for example) is not the project slug being served.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, are other uses of host_project_slug unaffected by this issue? It just seems bad to have 2 different variables for the project_slug, where we use them in different places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just seems bad to have 2 different variables for the project_slug, where we use them in different places.

Well, if I'm understand correctly this they are different things:

  • host_project_slug is the project slug that we infer from the host/domain's URL
    • not accurate for subdomain/translations
    • it's fine for custom domains and .readthedocs.io
    • it's set at the beginning of the request in the middleware
  • path_project_slug is the project slug that we can infer from the path's URL mainly
    • it's good for subdomain/translations
    • it's the same as what we have been calling final_project (maybe it's better to call it final_project_slug?)
    • it's set in the _get_project_data_from_request which is called from inside the ServeDocs view in Proxito as the first step of the request after all the middlewares

It seems that both are required. We use host_project_slug as first step trying to get the Project object (map_project_slug decorator). Then, inside the _get_project_data_from_request we use host_project_slug to check single project, translation and subproject and set the path_project_slug which in those cases could be different than host_project_slug.

Gotcha, are other uses of host_project_slug unaffected by this issue?

Grepping the code I didn't find anything that clearly needs to be adapted. Maybe there is something deeper that I'm not seeing, tho

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. We should probably document all these variables (as you have done above) in the code/docs, so we don't get confused in the future.

version_slug = getattr(request, 'path_version_slug', '')
path = getattr(response, 'proxito_path', '')

Expand Down
16 changes: 16 additions & 0 deletions readthedocs/proxito/tests/test_headers.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,22 @@ def test_serve_headers(self):
self.assertEqual(r['X-RTD-version-Method'], 'path')
self.assertEqual(r['X-RTD-Path'], '/proxito/media/html/project/latest/index.html')

def test_subproject_serve_headers(self):
r = self.client.get('/projects/subproject/en/latest/', HTTP_HOST='project.dev.readthedocs.io')
self.assertEqual(r.status_code, 200)
self.assertEqual(r['Cache-Tag'], 'subproject,subproject-latest')
self.assertEqual(r['X-RTD-Domain'], 'project.dev.readthedocs.io')
self.assertEqual(r['X-RTD-Project'], 'subproject')

# I think it's not accurate saying that it's `subdomain` the method
# that we use to get the project slug here, since it was in fact the
# URL's path but we don't have that feature built
self.assertEqual(r['X-RTD-Project-Method'], 'subdomain')
Comment on lines +48 to +51
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not related to this PR, but I wanted to make a note here, tho. Maybe a TODO where we set this header is better?

if hasattr(request, 'rtdheader'):
response['X-RTD-Project-Method'] = 'rtdheader'
elif hasattr(request, 'subdomain'):
response['X-RTD-Project-Method'] = 'subdomain'
elif hasattr(request, 'cname'):
response['X-RTD-Project-Method'] = 'cname'

This header is wrong to all the subprojects when they are accessed via the domain of their parent's, eg. project.readthedocs.io/projects/subproject/en/latest/


self.assertEqual(r['X-RTD-Version'], 'latest')
self.assertEqual(r['X-RTD-version-Method'], 'path')
self.assertEqual(r['X-RTD-Path'], '/proxito/media/html/subproject/latest/index.html')

def test_404_headers(self):
r = self.client.get('/foo/bar.html', HTTP_HOST='project.dev.readthedocs.io')
self.assertEqual(r.status_code, 404)
Expand Down
3 changes: 2 additions & 1 deletion readthedocs/proxito/views/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,8 @@ def _get_project_data_from_request(
# * Subproject
# * Translations

# Set the version slug on the request so we can log it in middleware
# Set the project and version slug on the request so we can log it in middleware
request.path_project_slug = final_project.slug
request.path_version_slug = version_slug

return final_project, lang_slug, version_slug, filename