Closed
Description
The result of git.Git.show (...) misses newlines ("\n") at the end of a file.
Reproduction
- setup git repo / use existing
- add empty textfile "foo.txt", commit file
- modify file by inserting string "foo\r\n", commit changes
- run git cmd "show [commit sha]:foo.txt"
- ... via GitPython, example:
# commits = ...
repo.git.show(f"{commits[0].hexsha}:{testfile}")
- ... from Git terminal, pipe through cat etc. to see control characters, example:
git show 74e70e7:foo.txt | cat -vet | less
Observation
The string returned from show via GitPython has a trailing "\r", missing the "\n". The result from terminal execution has a trailing "\r\n".
Expected
Both outputs have the correct trailing "\r\n".
Addendum
-
run GitPython cmd with option "with_extended_output" -> same result
-
run GitPython cmd with option "stdout_as_string" -> same result (but in binary format)
-
crosscheck: run "git show" via subprocess.popen -> output has correct "\r\n"
# commits = ... p = subprocess.Popen(["git", "show", f"{commits[0].hexsha}:{testfile}"], stdout=subprocess.PIPE) out, err = p.communicate() print(repr(out))
Activity
Byron commentedon Nov 13, 2021
Thanks a lot for the wonderful description of the issue and for spending the time that undoubtedly when into it.
It's likely that GitPython strips
git
output somewhere in its command execution machinery, and I wonder why it was added in the first place.toku-sa-n commentedon Apr 4, 2022
Hello, I've also encountered this problem (see my SO question and the answer). I found the line where GitPython strips the last
\n
.GitPython/git/cmd.py
Lines 946 to 948 in 0b33576
The test below fails for the current main branch, and succeeds after removing
[:-1]
.The last
\n
is unnecessary for most operations, and this process is reasonable. However, this is problematic forgit show
, especially for the binary files.I think there are three options:
\n
at any time.\n
for thegit show
output.I'd like to send a PR to resolve this issue, so could you tell me your intention toward this?
Byron commentedon Apr 5, 2022
Thanks a lot for contributing a test and additional analysis, @toku-sa-n, I'd love to see this culminate in a PR.
Here are my thoughts:
This would be a potentially breaking change even though it would certainly be the correct way to do it.
This would require a parameter to be added to
execute()
to control whether or not the newline should be stripped. This is backwards compatible if the default isTrue
and allows callers to avoid the issue after running into it.Providing a means to alleviate this seems like the minimum action to perform here.
If it's never desirable to strip newlines for binary output it might be possible to detect if something is binary by looking at the first 1024 bytes or so of the output and avoid stripping newlines then. There is probably a package for that, similar to the
file
program in unix. Maybe this could be investigated and further discussed in the PR. I'd be particularly interested in the performance impact of such doing so.Thanks everyone.
strip_newline
flag #1423SonOfLilit commentedon Jul 25, 2024
Adding a flag and keeping the default as
False
even forshow
, which is what ended up happening in this thread, is not really a solution. People will keep hitting this issue, debugging for a few hours, and some of the times not figuring out there is a flag to address this and doing some horrible workaround instead.Will you accept a PR to toggle the default for
show
(and only forshow
because it's the only place it matters)?Will you accept a PR to fire a deprecation warning if it's
False
onshow
(saying the default is about to change), and then in say a year another PR to toggle the default?I'm speaking fro personal pain here:
I just encountered a vendor-provided codebase that failed on some edge case in git show, and examining the issue I noticed they used GitPython for everything except
git show
and a separate library for that, so I reverted to GitPython and after an additional half an hour of debuggning figured out why the file hashes sometimes came out different and googled "gitpython show trailing newlines". Evidently the person who wrote that codebase didn't think to google this, or perhaps they encountered it before this bug was reported and "fixed".Byron commentedon Jul 25, 2024
Please feel free to submit a PR, ideally with an accompanying test to show what it's doing. Breaking changes can't be done though.