-
-
Notifications
You must be signed in to change notification settings - Fork 933
Fix blob filter types #1459
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix blob filter types #1459
Conversation
Fix the types and type annotations of some of the blob filter code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, the change will be appreciated!
There is some usage of str(…)
conversions which requires attention, but with that fix there should be nothing in the way of merging.
git/index/typ.py
Outdated
def __call__(self, stage_blob: Blob) -> bool: | ||
path = stage_blob[1].path | ||
def __call__(self, stage_blob: Tuple[StageType, Blob]) -> bool: | ||
path: str = str(stage_blob[1].path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paths can't be converted to str
as this implies an interpreter defined encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what you mean by this or give an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before the change it looked like this:
path = stage_blob[1].path
And after the change it should still be looking like this but possibly with type annotations added to please the type checker. Creating new strings with str(…)
changes the way the program functions, and does so in a breaking way due to encodings being unspecified for paths on linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to use paths instead of strs. Still not really sure what you mean by encodings being unspecified for paths on linux.
git/index/typ.py
Outdated
for p in self.paths: | ||
if path.startswith(p): | ||
if path.startswith(str(p)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There has to be a way to do this without str(…)
, please see above for the reason.
Remove usage of `PosixPath.is_relative_to` because it was added in Python 3.9 and earlier versions of Python are supported by `GitPython`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the changes and the continued effort! I think using Path
should be correct in terms of required encodings, at least it's as good as it gets.
I also understand that just converting to a known type temporarily is probably the easiest than having to try to support all possible input types, so let's go with that even though it seems potentially wasteful.
There is one more issue I seem to be seeing, and am looking forward to hearing your opinion.
git/index/typ.py
Outdated
for pathlike in self.paths: | ||
path: Path = pathlike if isinstance(pathlike, Path) else Path(pathlike) | ||
# TODO: Change to use `PosixPath.is_relative_to` once Python 3.8 is no longer supported. | ||
if all(i == j for i, j in zip(path.parts, blob_path.parts)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not another way of writing path1 == path2
, since all path components (here .parts
) have to be equivalent? The benefit of doing so of course is that it ignores which path separator is used, but I am having trouble to understand how this checks that blob_path
starts with one of self.paths
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should work because zip
stops when the shortest of the arguments is exhausted. So it is only checking if all the path components are equivalent in the case that the length of the parts
are equal. But, when one path has more parts than the other, then it is checking that the longer path starts with the shorter path.
Example:
def startswith(a, b): return all(i == j for i, j in zip(a, b))
a = [1, 2, 3]
b = [1, 2]
assert startswith(a, b)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, the implementation of PosixPath.is_relative_to
looks like this:
def is_relative_to(self, *other):
"""Return True if the path is relative to another path or False.
"""
try:
self.relative_to(*other)
return True
except ValueError:
return False
We could do something similar to this in-place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, I see how this works now and why this is correct. The alternative has to remain a TODO until it becomes available, and looking at the code I am happy to suggest to not ever go there - the try:except clause seems like a deal-breaker to me. It's entirely up to you though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, let me undo what I last said. I once again think this implementation is incorrect, as these paths are not equivalent.
base = [1, 2, 3]
added_path = [1, 2]
The above would match even though added_path
is not contained in base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I pushed a change that addresses this and adds a test too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the test and the fix. It's definitely an under-tested area of GitPython as the issue wasn't detected by any test prior.
I left a comment, but when clarified, this PR is ready to be merged.
Thanks a lot for your contribution and going through the review process with me! I hope the code is more robust now, also knowing that the current implementation (independently of your contribution) generally lacks beyond repair compared to what git actually does. |
Fix the types and type annotations of some of the blob filter code.