Skip to content

Unresolver: strict validation for external versions and other fixes #9534

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Sep 26, 2022

Conversation

stsewd
Copy link
Member

@stsewd stsewd commented Aug 22, 2022

  • At first I was thinking to add this validation
    outside the unresolver, but I think it makes
    sense to be in the unresolver.
  • This also has a fix for partial subproject matches.
  • Instead of returning the query and the fragment
    separately, just return the parsed URL, I think
    it can useful to have the full path without
    having to call urllib.parse again.
  • Just return the extracted external version
    from unresolve_domain, so we don't have to parse the
    subdomain again.

This is on top of #9500.


📚 Documentation previews 📚

stsewd and others added 18 commits August 1, 2022 10:38
- At first I was thinking to add this validation
  outside the unresolver, but I think it makes
  sense to be in the unresolver.
- This also has a fix for partial subproject matches.
- Instead of returning the query and the fragment
  separately, just return the parsed URL, I think
  it can useful to have the full path without
  having to call urllib.parse again.
- Just return the extracted external version
  from unresolve_domain, so we don't have to parse the
  subdomain again.
@stsewd stsewd requested a review from a team as a code owner August 22, 2022 23:28
@stsewd stsewd requested a review from humitos August 22, 2022 23:28
Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good change, but I do worry a bit about some of the response magic that I don't fully understand here, and what it's handling.

Comment on lines +94 to +95
version = None
filename = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do something more than set these to None here? Is raising an exception better, rather than having the caller need to know what None means on these fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting version to None will be interpreted as like the version wasn't found, but we could return None, or raise an exception, since those URLs will be generated by manually changing the URL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this as well, and I'm not sure what's my position here.

I'd like to raise exceptions for all these cases were we return None for not found things. However, I'm not sure it's possible for all the cases because IIRC there are other places were project, None, None is useful to continue moving forward.

If that's not the case and I'm confused here, I'm 👍🏼 on standardizing this and always raise a detailed exception for each case that we can clearly handle from outside.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, basically this would be interpreted as a 404, like if the version wouldn't exist.

Base automatically changed from move-to-unresolver to main August 23, 2022 18:08
@ericholscher
Copy link
Member

ericholscher commented Aug 31, 2022

I think this is close once we fix up the conflicts 👍

Also would be useful to have a @humitos review as well.

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting better 👍🏼

I think it's not ready to get merged yet, but it's pretty close. I left some comments that I'd like to see considered. Some of these comments are related with the review I've done in #9500 (review)

I'm not opposed to move forward without refactoring all the naming and others, but I'd really appreciate if we do that work now that we are all of us with this code in mind.

Comment on lines +94 to +95
version = None
filename = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this as well, and I'm not sure what's my position here.

I'd like to raise exceptions for all these cases were we return None for not found things. However, I'm not sure it's possible for all the cases because IIRC there are other places were project, None, None is useful to continue moving forward.

If that's not the case and I'm confused here, I'm 👍🏼 on standardizing this and always raise a detailed exception for each case that we can clearly handle from outside.

Comment on lines +94 to +95
version = None
filename = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stsewd
Copy link
Member Author

stsewd commented Sep 14, 2022

I have updated this PR with some of comments from the review from
#9425.

There are a couple I didn't address, since I'm -0.
But we can continue discussing if you feel strongly.
I'd merge this for now, since all other things are related to refactoring.

Renames

  • UnresolvedURL.project -> UnresolvedURL.current_project.

    Probably makes sense to change the UnresolvedURL to use current_project as
    well so we keep consistency all across our code.

    I think it's fine to refer to it as just project inside the dataclass
    (similar we have just version instead of current_version)

  • _match_multiversion_project -> _match_multiversion_path, _match_subproject -> _match_subproject_path.

    I think a better name for this is _match_multiversion_url or
    _match_multiversion_path since it works over the url/path

    Same here, I think it's better to call this _match_subproject_path

    _match_single_version_project -> _match_single_version_path?

    For me all those methods work over the project and the path, so I'm fine with any name,
    but slightly preference for the current one.

Dataclass instead of a tuple

  • _unresolve_path and friends

    It would be good to make this a dataclass or similar. Returning these
    structured tuples is always hard to parse mentally and keep that in mind
    while working with this code and switching between files

  • unresolve_domain

    It would be good to make this a dataclass or similar here as well and all
    these related methods.

I started the implementation using 3 dataclasses,
but it felt complex instead of returning the 3 elements
(if we are returning more than 3, than I think it makes sense to use a dataclass),
since we already have one dataclass for the final result, having more makes it confusing where they are used,
and all of these are private methods, so we won't use them outside the class.

Others

Also, I just noticed that we are calling root_project in the design doc, but
in the implementation we are just calling it project.

We are calling it parent project (#9500 (comment)),
I think it was my mistake while updating the design doc to use root project instead of parent project
(2658ec6),
we were using parent project before I updated the document.

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @stsewd for considering my comments from the other PR as well. I think we can move forward with this PR and merge it. I didn't see anything blocking it at this point. As you said, there are things about naming and refactoring that we can do in another PR. The core logic is done 👍🏼

@stsewd stsewd merged commit 141f89d into main Sep 26, 2022
@stsewd stsewd deleted the strict-validation-for-external-versions branch September 26, 2022 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants