
diff incorrectly populates submodule blobs #891

Closed
hefee opened this issue Jul 8, 2019 · 4 comments
@hefee

hefee commented Jul 8, 2019

The blobs returned by diff are somehow different from the blobs returned by commit.tree/path for submodule blobs. For example, creating an IndexEntry from a blob fails with a ValueError when the blob comes from diff. As you can see, I added an assert (line 28) to make sure that I really access the same blob!

This runs successfully:

sub_blob = r.head.commit.tree/"sub"
git.IndexEntry.from_blob(sub_blob)

but this fails:

d = r.commit('1').diff(r.commit('2'))[0]
git.IndexEntry.from_blob(d.b_blob)

$ rm -rf /tmp/sub /tmp/foo && python3 /tmp/test.py
Traceback (most recent call last):
  File "/tmp/test.py", line 29, in <module>
    git.IndexEntry.from_blob(d.b_blob)
  File "/usr/lib/python3/dist-packages/git/index/typ.py", line 176, in from_blob
    time, time, 0, 0, 0, 0, blob.size))
  File "/usr/lib/python3/dist-packages/gitdb/util.py", line 253, in __getattr__
    self._set_cache_(attr)
  File "/usr/lib/python3/dist-packages/git/objects/base.py", line 166, in _set_cache_
    super(IndexObject, self)._set_cache_(attr)
  File "/usr/lib/python3/dist-packages/git/objects/base.py", line 72, in _set_cache_
    oinfo = self.repo.odb.info(self.binsha)
  File "/usr/lib/python3/dist-packages/git/db.py", line 37, in info
    hexsha, typename, size = self._git.get_object_header(bin_to_hex(sha))
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 1077, in get_object_header
    return self.__get_object_header(cmd, ref)
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 1066, in __get_object_header
    return self._parse_object_header(cmd.stdout.readline())
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 1030, in _parse_object_header
    raise ValueError("SHA %s could not be resolved, git returned: %r" % (tokens[0], header_line.strip()))
ValueError: SHA b'e03cac5f0551bfd7cc9030a7bf862aee43937ae0' could not be resolved, git returned: b'e03cac5f0551bfd7cc9030a7bf862aee43937ae0 missing'

Here is my test script (/tmp/test.py):

import git

sub = git.Repo.init("/tmp/sub")
open("/tmp/sub/subfile", "w").write("")
sub.index.add(["subfile"])
sub.index.commit("first commit")

r = git.Repo.init("/tmp/foo")
open("/tmp/foo/test", "w").write("")
r.index.add(['test'])
git.Submodule.add(r, "subtest", "sub", url="file:///tmp/sub")
r.index.commit("first commit")
r.create_tag('1')
submodule = r.submodule('subtest')
open("/tmp/foo/sub/subfile", "w").write("blub")
submodule.module().index.add(["subfile"])
submodule.module().index.commit("changed subfile")
submodule.binsha = submodule.module().head.commit.binsha 
r.index.add([submodule])
r.index.commit("submodule changed")
r.create_tag('2')

sub_blob = r.head.commit.tree/"sub"
git.IndexEntry.from_blob(sub_blob)

d = r.commit('1').diff(r.commit('2'))[0]

assert(sub_blob.hexsha == d.b_blob.hexsha)
git.IndexEntry.from_blob(d.b_blob)
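
The traceback shows the failure happening when `blob.size` is looked up in the parent repository's object database. Until a fix is available, one possible guard is to check the entry's mode before touching `size`. This is only a sketch, assuming that submodule entries carry git's gitlink mode (160000 octal) and that the object exposes an integer `mode` attribute. The names `GITLINK_MODE`, `is_gitlink`, and `size_or_none` are hypothetical helpers, not part of GitPython.

```python
import stat

# Git stores a submodule in a tree as a "gitlink" entry with mode 160000 (octal).
GITLINK_MODE = stat.S_IFDIR | stat.S_IFLNK  # 0o040000 | 0o120000 == 0o160000


def is_gitlink(mode):
    """Return True if the file-type bits of `mode` mark a submodule entry."""
    return stat.S_IFMT(mode) == GITLINK_MODE


def size_or_none(obj):
    """Return obj.size, or None for a submodule entry.

    Assumption: `obj` has an integer `mode` attribute. The `size` attribute
    is only read for non-submodule entries, so no object-database lookup is
    attempted for a SHA that belongs to the submodule's repository.
    """
    if is_gitlink(obj.mode):
        return None
    return obj.size
```

If the assumption about `mode` holds, `size_or_none(d.b_blob)` in the script above should return `None` rather than raise.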
@hefee
Author

hefee commented Jul 8, 2019

The original issue is tracked in the Tails bug tracker: https://redmine.tails.boum.org/code/issues/16862

@Byron
Member

Byron commented Jul 20, 2019

Thanks a lot for this fantastic issue! I could reproduce it and hope someone can dig in deeper for a PR with a fix.

asharov pushed a commit to asharov/git-hammer that referenced this issue Oct 28, 2019
By default, git diff includes entries for submodules in its output.
This breaks when trying to extract information from such a file
since the submodule directory is not in fact a file known to git.
This may be related to a GitPython issue:
gitpython-developers/GitPython#891

But for git-hammer, the sensible course of action seems to be
ignoring submodules. They are not code in the main repository,
and if their code should be in a project, the actual repository
can be included. So add to diff the option ignore-submodules that
causes it to skip submodules.

Also add a unit test that creates a submodule in the test
repository and verifies that git-hammer works correctly on it.
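
The git-hammer change itself is not shown in this thread. As a rough illustration of the idea in the commit message, the sketch below passes git's `--ignore-submodules` option to `git diff` through the command line. The helper names `diff_argv` and `changed_entries` are hypothetical, and calling the git CLI via `subprocess` is an assumption made to keep the example self-contained. It is not how git-hammer is known to do it.

```python
import subprocess


def diff_argv(old, new, ignore_submodules=True):
    """Build a `git diff --raw` command line comparing two revisions."""
    argv = ["git", "diff", "--raw"]
    if ignore_submodules:
        # git option that leaves submodule changes out of the diff output.
        argv.append("--ignore-submodules")
    # The trailing "--" marks the preceding arguments as revisions, not paths.
    argv += [old, new, "--"]
    return argv


def changed_entries(repo_dir, old, new):
    """Run the diff inside `repo_dir` and return its raw output lines."""
    result = subprocess.run(
        diff_argv(old, new),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

For the test repository from the report, `changed_entries("/tmp/foo", "1", "2")` would run the diff between the two tags with submodule entries skipped.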
@intrigeri

Hi,

@hefee reported elsewhere that this bug is fixed in 3.0.5. I suppose that's thanks to #947.

Shall this issue be closed?

@hefee
Author

hefee commented Jan 28, 2021

Ack, we can close this one because #974 got merged.

@hefee hefee closed this as completed Jan 28, 2021