Skip to content

New unresolver implementation #9500

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Aug 23, 2022
Merged

New unresolver implementation #9500

merged 17 commits into from
Aug 23, 2022

Conversation

stsewd
Copy link
Member

@stsewd stsewd commented Aug 15, 2022

This doesn't use this new implementation to handle the URLs from proxito nor has support for a custom urlconf, those things will be implemented in another PR.

This is on top of #9462


📚 Documentation preview 📚: https://docs--9500.org.readthedocs.build/en/9500/


📚 Documentation preview 📚: https://dev--9500.org.readthedocs.build/en/9500/

@stsewd stsewd changed the title Move to unresolver New unresolver implementation Aug 15, 2022
@stsewd stsewd marked this pull request as ready for review August 15, 2022 23:32
@stsewd stsewd requested a review from a team as a code owner August 15, 2022 23:32
@stsewd stsewd requested a review from humitos August 15, 2022 23:32
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic cleans up the existing code nicely. I want to better understand the whole picture of this work. Is the next step to use the unresolver in proxito to standardize how we're parsing this in each place in the code?

version = canonical_project.versions.filter(
slug=canonical_project.default_version
).first()
if version:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case will this be False? I think when they have a default_version that doesn't exist?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this is when a use doesn't have a default version or the default version is invalid

If the subproject exists, we try to resolve the rest of the path
with the subproject as the canonical project.
"""
match = self.subproject_pattern.match(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
match = self.subproject_pattern.match(path)
# TODO: Allow the canonical_project to override this url somehow..
# if canonical_project.subproject_pattern:
# match = canonical_project.subproject_pattern.match(path)
match = self.subproject_pattern.match(path)

Is this the longer term goal for customization?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, basically on each call to match, will be some extra logic to use the custom pattern from the project instead of the default one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same idea should be used for multiversion patterns 👍🏼

@stsewd
Copy link
Member Author

stsewd commented Aug 18, 2022

Is the next step to use the unresolver in proxito to standardize how we're parsing this in each place in the code?

Yes, proxito will eventually use this to serve the documentation, but first I'll add the support for custom urlconfs (or a port of that), since we have users using that feature already on .com.

Base automatically changed from refactor-middleware to main August 18, 2022 15:36
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these changes look good. @humitos not sure if you wanted to review this.

@stsewd This is safe to merge and use before we implement it in proxito right? If so, I'm fine with it going on this deploy.

@stsewd
Copy link
Member Author

stsewd commented Aug 22, 2022

@stsewd This is safe to merge and use before we implement it in proxito right? If so, I'm fine with it going on this deploy.

Yeah, it's safe to deploy.

@ericholscher
Copy link
Member

@stsewd Great, I'd say let's get it merged then so we can do dev on it this week to catch any outstanding issues.

@stsewd stsewd merged commit cd72e84 into main Aug 23, 2022
@stsewd stsewd deleted the move-to-unresolver branch August 23, 2022 18:08
@humitos
Copy link
Member

humitos commented Aug 23, 2022

@ericholscher

I think these changes look good. @humitos not sure if you wanted to review this.

Yeah, I want to review this still. I got pretty overload of things and wasn't able to jump into this yet.

I'll try to do it tomorrow.

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! 💯

I added some comments about styles and naming. My idea here is to keep consistency through all the codebase and make the names pretty explicit. I think it's important to read the name of the variable and immediately realize what its refers to. As example, subproject_alias works better than project_slug; since project_slug is a lot more generic and could refer to something different than the subproject's alias.

Also, I'd like see those returned tuples converted into dataclasses or similar objects that we can access them via dot notation with clear names.

It's also worth to note that this implementation supports the following cases:

  • single version
  • multi-language multi-version
  • subprojects with single version
  • subprojects with multi-language multi-version

However, it does not support "multi-language single-version".

request = RequestFactory().get(path=parsed.path, HTTP_HOST=domain)
project_slug = map_host_to_project_slug(request)
domain = self.get_domain_from_host(parsed.netloc)
project_slug, domain_object, external = self.unresolve_domain(domain)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using project_slug to refer to the parent_project_slug here. Below, we have parent_project and project too.

I think it would be good to refactor this to use more descriptive names:

Probably makes sense to change the UnresolvedURL to use current_project as well so we keep consistency all across our code.

filename = "/" + filename
return filename

def _match_multiversion_project(self, parent_project, path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better name for this is _match_multiversion_url or _match_multiversion_path since it works over the url/path

Comment on lines +237 to +238
:returns: A tuple with: the project slug, domain object, and if the domain
is from an external version.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to make this a dataclass or similar. Returning these structured tuples is always hard to parse mentally and keep that in mind while working with this code and switching between files

:param check_subprojects: If we should check for subprojects,
this is used to call this function recursively.

:returns: A tuple with: project, version, and file name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to make this a dataclass or similar here as well and all these related methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is okay to have this as a tuple, they are internal methods, and we already have one dataclass for the final result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if they are internal methods, real objects are a lot easier to read and follow while debugging/reading this code.

class Unresolver(SettingsOverrideObject):
# TODO: This can catch some possibly valid domains (docs.readthedocs.io.com)
# for example, but these might be phishing, so let's ignore them for now.
log.warning("Weird variation of our domain.", domain=domain)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at this at New Relic and it seems we don't have any log for this: https://onenr.io/0oQD4oYZ5Qy 👍🏼


_default_class = UnresolverBase
log.info("Invalid domain.", domain=domain)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do have some of these on New Relic: https://onenr.io/0Zw06DrADjv

@humitos
Copy link
Member

humitos commented Sep 7, 2022

I added some comments about styles and naming. My idea here is to keep consistency through all the codebase and make the names pretty explicit.

Also, I just noticed that we are calling root_project in the design doc, but in the implementation we are just calling it project. I think it's important to keep consistency here so we all speak the same language and refer to the same thing with the same name. It seems that root_project is more specific and I think it makes sense to use it in the code as well: https://dev.readthedocs.io/en/stable/design/better-doc-urls-handling.html#alternative-implementation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants