Skip to content

Contextualize 404 page #9657

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 97 commits into from
Apr 26, 2023
Merged
Show file tree
Hide file tree
Changes from 83 commits
Commits
Show all changes
97 commits
Select commit Hold shift + click to select a range
fd380a5
WIP: Adds context to 404 page
benjaoming Oct 11, 2022
ad1cb2f
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Oct 12, 2022
aca278d
WIP: Adds a test 404 page
benjaoming Oct 12, 2022
7f86767
Contextualization gives us 3 interesting scenarios for the 404 page
benjaoming Oct 12, 2022
31ec5f5
404 handler: Receive exception keyword
benjaoming Oct 12, 2022
8cbff08
Improve contextualized messages, not ready for review, needs further …
benjaoming Oct 27, 2022
bc73398
Remove search form, that was a bad idea, since it will only work on s…
benjaoming Nov 1, 2022
a8573f9
Early work on an approach that uses a custom Http404 exception to sav…
benjaoming Nov 2, 2022
8bd53b6
Refactor error templates to separate contextualized handling + some t…
benjaoming Nov 3, 2022
7d1c69a
Apply new exceptions in some of the cases
benjaoming Nov 4, 2022
a30fa3c
Update readthedocs/proxito/exceptions.py
benjaoming Nov 7, 2022
c550ade
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Nov 23, 2022
0ef9c4d
Merge branch 'contextualize-404' of github.com:benjaoming/readthedocs…
benjaoming Nov 23, 2022
c505544
Refactor unresolver
stsewd Feb 16, 2023
d2eadec
Merge branch 'main' into refactor-unresolver
stsewd Feb 16, 2023
14bdbd2
Move external version checks inside unresolver
stsewd Feb 21, 2023
b807b2d
Merge branch 'main' into refactor-unresolver
stsewd Feb 21, 2023
b5006dc
Merge branch 'main' into refactor-unresolver
stsewd Feb 21, 2023
34d0cea
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Feb 22, 2023
dd43631
!fixup 8cbff084845a9a7ad0fd660f96be8bab245ec171
benjaoming Feb 22, 2023
0c07025
Update link to Custom 404 docs, remove rediraffe from tips
benjaoming Feb 22, 2023
9367e9e
Apply suggestions from code review
benjaoming Feb 22, 2023
d38491e
Raise custom 404 if subproject isn't found
benjaoming Feb 22, 2023
33f685b
Merge branch 'contextualize-404' of github.com:benjaoming/readthedocs…
benjaoming Feb 22, 2023
7f6f1b7
WIP: Conclude structure of exception classes and templates, needs mor…
benjaoming Feb 22, 2023
36b9f02
Updates from review
stsewd Feb 22, 2023
a4dee70
Reuse 404 view between proxito and general app, remove debug logging
benjaoming Feb 23, 2023
797e96b
Removes log.debug in views.serve
benjaoming Feb 23, 2023
8d38b8f
More work on text copy, fix bugs, raise ProxitoProjectVersionHttp404 …
benjaoming Feb 23, 2023
b60a109
Contextualize when subprojects aren't found. The disadvantages of the…
benjaoming Feb 23, 2023
b557590
Clean up
benjaoming Feb 23, 2023
3f7f2fd
Add project:slug search scope when pages aren't found
benjaoming Feb 23, 2023
63e64c2
clean up
benjaoming Feb 23, 2023
18823ba
Missed some linting there
benjaoming Feb 23, 2023
6dfde4c
fix pylint errors
benjaoming Feb 23, 2023
47338e9
Merge branch 'main' into refactor-unresolver
stsewd Feb 27, 2023
38496bc
Proxito: use unresolved in 404 handler
stsewd Feb 27, 2023
9fabc3d
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Feb 28, 2023
bece466
Don't use request.path as the "not found" path for proxito 404 except…
benjaoming Feb 28, 2023
46496df
Merge branch 'main' into refactor-unresolver
stsewd Feb 28, 2023
69f4886
Updates from review
stsewd Feb 28, 2023
b25608a
Merge remote-tracking branch 'upstream/use-proxito-v2-in-404-handler'…
benjaoming Mar 1, 2023
2184112
Missing imports
benjaoming Mar 1, 2023
ce28e12
Linting: Too many blank lines
benjaoming Mar 1, 2023
a9e7b09
Merge branch 'use-proxito-v2-in-404-handler' into contextualize-404
benjaoming Mar 1, 2023
7e12e2e
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Mar 1, 2023
3c8b76b
Merge branch 'main' of github.com:readthedocs/readthedocs.org into us…
benjaoming Mar 1, 2023
29a560c
Merge branch 'refactor-unresolver' of github.com:readthedocs/readthed…
benjaoming Mar 1, 2023
01f8cee
Add exception type for new 404 translation pages
benjaoming Mar 1, 2023
3085d95
Merge branch 'main' into use-proxito-v2-in-404-handler
stsewd Mar 2, 2023
05384be
Remove duplicated imports
stsewd Mar 2, 2023
7af491a
Black
stsewd Mar 2, 2023
86f67b2
Fix exc
stsewd Mar 2, 2023
2d92061
Update tests
stsewd Mar 2, 2023
f5b2db0
Merge branch 'main' into use-proxito-v2-in-404-handler
stsewd Mar 2, 2023
c017503
Move check to new implementation only
stsewd Mar 2, 2023
4c15d16
Merge branch 'use-proxito-v2-in-404-handler' of github.com:readthedoc…
benjaoming Mar 3, 2023
b5d96f4
Remove double error message for project 404s
benjaoming Mar 3, 2023
a36eb90
Add missing kwarg proxito_path for subproject 404 exception
benjaoming Mar 3, 2023
be22848
Updates from review
stsewd Mar 7, 2023
7e02a55
Merge branch 'use-proxito-v2-in-404-handler' of github.com:readthedoc…
benjaoming Mar 7, 2023
e3ff836
Add a new 404 exception class and targeted template for missing proje…
benjaoming Mar 7, 2023
03bbc6d
Updates from review
stsewd Mar 7, 2023
5ae6b6d
Merge branch 'use-proxito-v2-in-404-handler' of github.com:readthedoc…
benjaoming Mar 8, 2023
af07796
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Mar 8, 2023
d597b09
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Apr 11, 2023
b98dd0b
WIP: Simplify contextualized 404 messages into a set of exceptions. T…
benjaoming Apr 11, 2023
9a366bb
Revert changes on old proxito decorators implementation
benjaoming Apr 11, 2023
651161e
WIP: refactor
benjaoming Apr 12, 2023
3b7f4c0
Remove custom HTTP responses from most old proxito code, include CNAM…
benjaoming Apr 12, 2023
7c0c00c
Remove more changes to old proxito code
benjaoming Apr 12, 2023
93d7c48
Remove more old proxito implementation changes
benjaoming Apr 12, 2023
a1361d4
Use PRODUCTION_DOMAIN for search form action
benjaoming Apr 12, 2023
d20187c
Remove custom 404 handler from general app, have only that logic in p…
benjaoming Apr 12, 2023
b0e8db4
Remove a couple of additional unused patterns
benjaoming Apr 12, 2023
f0747c5
Comment accuracy update
benjaoming Apr 12, 2023
883aa50
Update another comment
benjaoming Apr 12, 2023
e70cec4
Revert DEBUG=False
benjaoming Apr 12, 2023
f195419
Move exceptions from core to proxito + use "path" from 404 serve view…
benjaoming Apr 12, 2023
62df795
Rename a few things
benjaoming Apr 12, 2023
11d099a
Removed unused kwarg
benjaoming Apr 12, 2023
2fa8a52
Update test cases to check for expected 404 exception and HTTP status
benjaoming Apr 12, 2023
92abfb9
Adds a test case variant to check the exception handling stack
benjaoming Apr 12, 2023
084ccc7
Apply suggestions from @stsewd code review
benjaoming Apr 19, 2023
607c735
use pgettext with context for 404 subjects
benjaoming Apr 19, 2023
344a845
change kwarg name, use some explicit kwargs
benjaoming Apr 19, 2023
6156040
Fix imports to be absolute
benjaoming Apr 19, 2023
9799f7e
Remove h3 from search bar when included with a search term
benjaoming Apr 19, 2023
239a0af
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Apr 19, 2023
81e46a9
explicitly require certain exception arguments, trim the ones we're n…
benjaoming Apr 20, 2023
d73d397
Apply suggestions from @stsewd code review
benjaoming Apr 26, 2023
18b5788
Remove |default filter and assume a value exists
benjaoming Apr 26, 2023
8945bff
Update readthedocs/templates/errors/404/include_tips.html
benjaoming Apr 26, 2023
a4816e4
Merge branch 'contextualize-404' of github.com:benjaoming/readthedocs…
benjaoming Apr 26, 2023
5e9017e
Add more context for translators of not_found_subject
benjaoming Apr 26, 2023
f5cf9d9
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Apr 26, 2023
8cd0404
Merge branch 'main' of github.com:readthedocs/readthedocs.org into co…
benjaoming Apr 26, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions media/css/core.css
Original file line number Diff line number Diff line change
Expand Up @@ -1331,6 +1331,9 @@ div.module.project-subprojects li.subproject a.subproject-edit:before {
content: "\f044";
}

#content ul.normal_list {list-style: disc; margin-left: 20px;}
#content code {background: #eee; border: 1px solid #ccc; padding: 3px; display: inline-block;}

/* Pygments */
div.highlight pre .hll { background-color: #ffffcc }
div.highlight pre .c { color: #60a0b0; font-style: italic } /* Comment */
Expand Down
100 changes: 100 additions & 0 deletions readthedocs/proxito/exceptions.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
from django.http import Http404
from django.utils.translation import gettext_lazy as _


class ContextualizedHttp404(Http404):

"""
Base class for contextualized HTTP 404 handling.

Subclasses may define their own template name,
HTTP status and object that was not found.

The contextualized exception is handled by the project's 404 view.
"""

template_name = "errors/404/base.html"
not_found_subject = _("page")

def __init__(self, **kwargs):
"""
Constructor that all subclasses should call.

:param kwargs: all kwargs are added as page context for rendering the 404 template
:param http_status: 404 view should respect this and set the HTTP status.
:param path_not_found: Inform the template and 404 view about a different path from
request.path
"""
self.http_status = kwargs.pop("http_status", 404)
self.path_not_found = kwargs.pop("path_not_found", None)
self.kwargs = kwargs

def get_context(self):
c = {
"not_found_subject": self.not_found_subject,
"path_not_found": self.path_not_found,
}
c.update(self.kwargs)
return c


class DomainDNSHttp404(ContextualizedHttp404):

"""Raised if a DNS record points to us but we do not know the domain."""

template_name = "errors/404/dns.html"
not_found_subject = _("domain matching DNS record")


class ProjectHttp404(ContextualizedHttp404):

"""
Raised if a domain did not resolve to a project.

This is currently used very broadly.
It indicates a number of reasons for the user.
"""

template_name = "errors/404/no_project.html"
not_found_subject = _("project")


class SubprojectHttp404(ContextualizedHttp404):

"""Raised if a subproject was not found."""

template_name = "errors/404/no_subproject.html"
not_found_subject = _("sub project")


class ProxitoProjectFilenameHttp404(ContextualizedHttp404):

"""Raised if a page inside an existing project was not found."""

template_name = "errors/404/no_project_page.html"
not_found_subject = _("documentation page")


class ProjectTranslationHttp404(ContextualizedHttp404):

"""
Raised if a translation of a project was not found.

This means that the project does not exist for requested language.
If a page isn't found, raise a ProjectPageHttp404.
"""

template_name = "errors/404/no_language.html"
not_found_subject = _("translation")


class ProjectVersionHttp404(ContextualizedHttp404):

"""
Raised if a version was not found.

Note: The containing project can be both a project or a subproject inside another project.
"""

template_name = "errors/404/no_version.html"
not_found_subject = _("documentation version")
26 changes: 15 additions & 11 deletions readthedocs/proxito/middleware.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,7 @@
import structlog
from django.conf import settings
from django.core.exceptions import SuspiciousOperation
from django.http import Http404
from django.shortcuts import redirect, render
from django.shortcuts import redirect
from django.urls import reverse
from django.utils.deprecation import MiddlewareMixin

Expand All @@ -28,6 +27,8 @@
from readthedocs.core.utils import get_cache_tag
from readthedocs.projects.models import Feature, Project

from .exceptions import DomainDNSHttp404, ProjectHttp404

log = structlog.get_logger(__name__)


Expand Down Expand Up @@ -213,20 +214,23 @@ def process_request(self, request): # noqa
unresolved_domain = unresolver.unresolve_domain_from_request(request)
except SuspiciousHostnameError as exc:
log.warning("Weird variation on our hostname.", domain=exc.domain)
return render(
request,
"core/dns-404.html",
context={"host": exc.domain},
status=400,
# Raise a contextualized 404 that will be handled by proxito's 404 handler
raise DomainDNSHttp404(
http_status=400,
host=exc.domain,
)
except (InvalidSubdomainError, InvalidExternalDomainError):
except (InvalidSubdomainError, InvalidExternalDomainError) as exc:
log.debug("Invalid project set on the subdomain.")
raise Http404
# Raise a contextualized 404 that will be handled by proxito's 404 handler
raise ProjectHttp404(
domain=exc.domain,
)
except InvalidCustomDomainError as exc:
# Some person is CNAMEing to us without configuring a domain - 404.
log.debug("CNAME 404.", domain=exc.domain)
return render(
request, "core/dns-404.html", context={"host": exc.domain}, status=404
# Raise a contextualized 404 that will be handled by proxito's 404 handler
raise DomainDNSHttp404(
host=exc.domain,
)
except InvalidXRTDSlugHeaderError:
raise SuspiciousOperation("Invalid X-RTD-Slug header.")
Expand Down
31 changes: 25 additions & 6 deletions readthedocs/proxito/tests/test_middleware.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,15 @@
import pytest
from django.core.exceptions import SuspiciousOperation
from django.http import HttpRequest
from django.test import TestCase
from django.test import Client, TestCase
from django.test.utils import override_settings
from django_dynamic_fixture import get

from readthedocs.builds.models import Version
from readthedocs.projects.constants import PUBLIC
from readthedocs.projects.models import Domain, Feature, Project, ProjectRelationship
from readthedocs.proxito.constants import RedirectType
from readthedocs.proxito.exceptions import DomainDNSHttp404
from readthedocs.proxito.middleware import ProxitoMiddleware
from readthedocs.rtd_tests.base import RequestFactoryTestMixin
from readthedocs.rtd_tests.storage import BuildMediaFileSystemStorageTest
Expand Down Expand Up @@ -139,9 +140,18 @@ def test_proper_cname_uppercase(self):
def test_invalid_cname(self):
self.assertFalse(Domain.objects.filter(domain='my.host.com').exists())
request = self.request(method='get', path=self.url, HTTP_HOST='my.host.com')
r = self.run_middleware(request)
# We show the 404 error page
self.assertContains(r, 'my.host.com', status_code=404)

with self.assertRaises(DomainDNSHttp404) as cm:
self.run_middleware(request)

assert cm.exception.http_status == 404

# test all the exception handling combined
client_without_exception_handling = Client(rause_request_exception=False)
r = client_without_exception_handling.request(
method="get", path=self.url, HTTP_HOST="my.host.com"
)
assert r.status_code == 404

def test_proper_subdomain(self):
request = self.request(method='get', path=self.url, HTTP_HOST='pip.dev.readthedocs.io')
Expand Down Expand Up @@ -194,8 +204,17 @@ def test_request_header_not_allowed(self):
def test_long_bad_subdomain(self):
domain = 'www.pip.dev.readthedocs.io'
request = self.request(method='get', path=self.url, HTTP_HOST=domain)
res = self.run_middleware(request)
self.assertEqual(res.status_code, 400)
with self.assertRaises(DomainDNSHttp404) as cm:
self.run_middleware(request)

assert cm.exception.http_status == 400

# test all the exception handling combined
client_without_exception_handling = Client(rause_request_exception=False)
r = client_without_exception_handling.request(
method="get", path=self.url, HTTP_HOST=domain
)
assert r.status_code == 400

def test_front_slash(self):
domain = 'pip.dev.readthedocs.io'
Expand Down
2 changes: 1 addition & 1 deletion readthedocs/proxito/views/decorators.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ def map_subproject_slug(view_func):

@wraps(view_func)
def inner_view( # noqa
request, subproject=None, subproject_slug=None, *args, **kwargs
request, subproject=None, subproject_slug=None, *args, **kwargs
):
if subproject is None and subproject_slug:
# Try to fetch by subproject alias first, otherwise we might end up
Expand Down
58 changes: 48 additions & 10 deletions readthedocs/proxito/views/serve.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,12 @@
from readthedocs.redirects.exceptions import InfiniteRedirectException
from readthedocs.storage import build_media_storage

from ..exceptions import (
ContextualizedHttp404,
ProjectTranslationHttp404,
ProjectVersionHttp404,
ProxitoProjectFilenameHttp404,
)
from .mixins import (
InvalidPathError,
ServeDocsMixin,
Expand Down Expand Up @@ -139,7 +145,13 @@ def get(

original_version_slug = version_slug
version_slug = self.get_version_from_host(request, version_slug)
final_project, lang_slug, version_slug, filename = _get_project_data_from_request( # noqa

(
final_project,
lang_slug,
version_slug,
filename,
) = _get_project_data_from_request( # noqa
request,
project_slug=project_slug,
subproject_slug=subproject_slug,
Expand Down Expand Up @@ -241,7 +253,7 @@ def get(
'Invalid URL for project with versions.',
filename=filename,
)
raise Http404('Invalid URL for project with versions')
raise Http404("Invalid URL for project with versions")

redirect_path, http_status = self.get_redirect(
project=final_project,
Expand Down Expand Up @@ -341,6 +353,9 @@ def get_using_unresolver(self, request):
if unresolved_domain.is_from_external_domain:
self.version_type = EXTERNAL

# 404 errors aren't contextualized because they are sent to the HTTP proxy
# The path will be 'unresolved' again when HTTP server handles the 404 error
# See: ServeError404Base
try:
unresolved = unresolver.unresolve_path(
unresolved_domain=unresolved_domain,
Expand Down Expand Up @@ -516,13 +531,20 @@ def get(self, request, proxito_path, template_name="404.html"):

version_slug = kwargs.get('version_slug')
version_slug = self.get_version_from_host(request, version_slug)
final_project, lang_slug, version_slug, filename = _get_project_data_from_request( # noqa
# This special treatment of Http404 happens because the decorator that
# resolves a project doesn't know if it's resolving a subproject or a normal project
(
final_project,
lang_slug,
version_slug,
filename,
) = _get_project_data_from_request( # noqa
request,
project_slug=kwargs.get('project_slug'),
subproject_slug=kwargs.get('subproject_slug'),
lang_slug=kwargs.get('lang_slug'),
project_slug=kwargs.get("project_slug"),
subproject_slug=kwargs.get("subproject_slug"),
lang_slug=kwargs.get("lang_slug"),
version_slug=version_slug,
filename=kwargs.get('filename', ''),
filename=kwargs.get("filename", ""),
)

log.bind(
Expand Down Expand Up @@ -585,7 +607,7 @@ def get(self, request, proxito_path, template_name="404.html"):
if response:
return response

raise Http404('No custom 404 page found.')
raise Http404("No custom 404 page found.")

def _register_broken_link(self, project, version, path, full_path):
try:
Expand Down Expand Up @@ -750,6 +772,9 @@ def get_using_unresolver(self, request, path):
# Try to map the current path to a project/version/filename.
# If that fails, we fill the variables with the information we have
# available in the exceptions.

contextualized_404_class = ContextualizedHttp404

try:
unresolved = unresolver.unresolve_path(
unresolved_domain=unresolved_domain,
Expand All @@ -761,21 +786,26 @@ def get_using_unresolver(self, request, path):
filename = unresolved.filename
lang_slug = project.language
version_slug = version.slug
contextualized_404_class = ProxitoProjectFilenameHttp404
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just initialize the error classes here with what we need, instead of passing everything at the end. This is since I'm thinking of reducing the number of variables we use here (#10112).

Suggested change
contextualized_404_class = ProxitoProjectFilenameHttp404
contextualized_404_error = ProxitoProjectFilenameHttp404(project=..., filename=...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's simpler to keep the current pattern for now, maybe this change can happen in a later refactor?

except VersionNotFoundError as exc:
project = exc.project
lang_slug = project.language
version_slug = exc.version_slug
filename = exc.filename
contextualized_404_class = ProjectVersionHttp404
except TranslationNotFoundError as exc:
project = exc.project
lang_slug = exc.language
version_slug = exc.version_slug
filename = exc.filename
contextualized_404_class = ProjectTranslationHttp404
except InvalidExternalVersionError as exc:
project = exc.project
# TODO: Use a contextualized 404
except InvalidPathForVersionedProjectError as exc:
project = exc.project
filename = exc.path
# TODO: Use a contextualized 404

log.bind(
project_slug=project.slug,
Expand Down Expand Up @@ -838,7 +868,15 @@ def get_using_unresolver(self, request, path):
)
if response:
return response
raise Http404("No custom 404 page found.")

# No custom 404 page, use our contextualized 404 response
raise contextualized_404_class(
project=project,
version=version,
filename=filename,
lang_slug=project.language,
path_not_found=path,
)


class ServeError404(SettingsOverrideObject):
Expand Down Expand Up @@ -1034,7 +1072,7 @@ def changefreqs_generator():
only_active=True,
)
if not public_versions.exists():
raise Http404
raise Http404()

sorted_versions = sort_version_aware(public_versions)

Expand Down
Loading