CatFileContentStream.execute() should probably safe_decode() stdout and stderr #470

Closed
warsaw opened this issue Jun 13, 2016 · 0 comments · Fixed by #475
warsaw commented Jun 13, 2016

FTR, using Python 3.5 here.

In a Debian project, we essentially want to run git show <ref>:debian/changelog, but the changelog contains some bogus non-UTF-8 characters. Here's an excerpt (not sure whether it will come through in the GH issue):

sbuild (0.24) unstable; urgency=low

  * remove -qq from apt-get call in the updatechroot script
  * fix upgradechroot output and add -u to -y
  * added oldstable to distribution options
  * fix for dependency calculation for --arch-all builds from
    Martin K<F6>gler (Closes: #180859)
  * libpng-dev => libpng12-0-dev in sbuild.conf
  * add dpkg-dev to package dependencies - thanks Michael Banck
    (Closes: #182234)
  * chroot building fix and waldi's patch still to come

 -- Rick Younie <[email protected]>  Sat, 19 Apr 2003 14:41:03 -0700

However, the command fails with a traceback (notice the weird <F6> in the changelog entry above):

Traceback (most recent call last):
  File "/home/barry/projects/ubuntu/uddgit/usd-importer/usd-import", line 144, in get_changelog_versions_from_treeish
    ref, self._local_repo.git.show('%s:debian/changelog' % ref)))
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 459, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 920, in _call_process
    return self.execute(make_call(), **_kwargs)
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 708, in execute
    stdout_value = stdout_value.decode(defenc)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 1341: invalid start byte

where defenc is utf-8. Since git/compat.py already has a safe_decode() function, it should probably be used on stdout_value and stderr_value instead, so that bogus data does not raise an exception.
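
To illustrate the difference, here is a minimal sketch of strict versus tolerant decoding using only the standard library. The helper name tolerant_decode is hypothetical and is not GitPython's safe_decode(); it only assumes that a tolerant decoder would pass an error handler such as "replace" to bytes.decode().

```python
def tolerant_decode(data, encoding="utf-8"):
    """Hypothetical helper (not GitPython's API): decode without raising."""
    if isinstance(data, bytes):
        # errors="replace" substitutes U+FFFD for each undecodable byte
        # instead of raising UnicodeDecodeError.
        return data.decode(encoding, errors="replace")
    # Already text (or None): return unchanged.
    return data


# The offending changelog line: 0xF6 is not a valid UTF-8 start byte.
raw = b"Martin K\xf6gler (Closes: #180859)"

# Strict decoding fails the same way the traceback above does.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Tolerant decoding keeps the rest of the line readable.
decoded = tolerant_decode(raw)
print(strict_failed, decoded)
```

The trade-off is that the original byte is lost in the decoded text; callers that need the exact bytes would have to work with the undecoded output instead.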

warsaw added a commit to warsaw/GitPython that referenced this issue Jun 15, 2016
@warsaw warsaw mentioned this issue Jun 15, 2016
Byron added a commit that referenced this issue Jun 20, 2016
@Byron Byron added this to the v2.0.6 - Bugfixes milestone Jun 20, 2016
yarikoptic pushed a commit to yarikoptic/GitPython that referenced this issue Sep 8, 2017
 UTF-8 will cause a UnicodeDecodeError.
Author: Barry Warsaw <[email protected]>
Bug: gitpython-developers#470

Patch-Name: issue470-safe-decode.patch