Skip to content

adding patch property to Commit class #1416

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions AUTHORS
Original file line number Diff line number Diff line change
@@ -45,4 +45,5 @@ Contributors are:
-Alba Mendez <me _at_ alba.sh>
-Robert Westman <robert _at_ byteflux.io>
-Hugo van Kemenade
-Durai Pandian <dduraipandian _at_ gmail.com>
Portions derived from other open source works and are clearly marked.
12 changes: 12 additions & 0 deletions git/objects/commit.py
Original file line number Diff line number Diff line change
@@ -315,6 +315,18 @@ def stats(self) -> Stats:
text = self.repo.git.diff(self.parents[0].hexsha, self.hexsha, '--', numstat=True)
return Stats._list_from_string(self.repo, text)

@property
def patch(self) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably would be worth describing that this method can fail if the processed bytes aren't in the encoding that python expects (as it returns str and not bytes). Hence this method can throw an exception and from my experience it's more than likely that this happens to some.

This also means that at some point in the future and if there is demand one could provide a variant of this method that returns bytes instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this note. I will check to see if i can return bytes than string and leave the choice to users to convert them to string. Is that okay?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be okay, yes. If it doesn't work with returning bytes right now then the possibility of decoding errors could be highlighted in the documentation specifically.

"""Get a git patch from changes between this commit and its first parent
or from all changes done if this is the very first commit.
:return: String"""
if not self.parents:
text = self.repo.git.diff(self.hexsha)
else:
text = self.repo.git.diff(self.parents[0].hexsha, self.hexsha, '--', numstat=False)
return text

@property
def trailers(self) -> Dict:
"""Get the trailers of the message as dictionary
75 changes: 75 additions & 0 deletions test/test_commit.py
Original file line number Diff line number Diff line change
@@ -149,6 +149,81 @@ def check_entries(d):
self.assertEqual(commit.committer_tz_offset, 14400, commit.committer_tz_offset)
self.assertEqual(commit.message, "initial project\n")

def test_patch_with_parent(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a patch with parent, I think it should be easy to find a patch with is much smaller. If there truly is no way, the expected value could be read from a file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing it. I will try to get commit with a smaller patch.

expected_path_result = """diff --git a/doc/source/tutorial.rst b/doc/source/tutorial.rst
index bc386e7c..fcbc18bf 100644
--- a/doc/source/tutorial.rst
+++ b/doc/source/tutorial.rst
@@ -66,7 +66,7 @@ Archive the repository contents to a tar file.
Advanced Repo Usage
===================
-And of course, there is much more you can do with this type, most of the following will be explained in greater detail in specific tutorials. Don't worry if you don't understand some of these examples right away, as they may require a thorough understanding of gits inner workings.
+And of course, there is much more you can do with this type, most of the following will be explained in greater detail in specific tutorials. Don't worry if you don't understand some of these examples right away, as they may require a thorough understanding of git's inner workings.
Query relevant repository paths ...
@@ -363,7 +363,7 @@ Handling Remotes
:start-after: # [25-test_references_and_objects]
:end-before: # ![25-test_references_and_objects]
-You can easily access configuration information for a remote by accessing options as if they where attributes. The modification of remote configuration is more explicit though.
+You can easily access configuration information for a remote by accessing options as if they were attributes. The modification of remote configuration is more explicit though.
.. literalinclude:: ../../test/test_docs.py
:language: python
@@ -391,7 +391,7 @@ Here's an example executable that can be used in place of the `ssh_executable` a
ID_RSA=/var/lib/openshift/5562b947ecdd5ce939000038/app-deployments/id_rsa
exec /usr/bin/ssh -o StrictHostKeyChecking=no -i $ID_RSA "$@"
-Please note that the script must be executable (i.e. `chomd +x script.sh`). `StrictHostKeyChecking=no` is used to avoid prompts asking to save the hosts key to `~/.ssh/known_hosts`, which happens in case you run this as daemon.
+Please note that the script must be executable (i.e. `chmod +x script.sh`). `StrictHostKeyChecking=no` is used to avoid prompts asking to save the hosts key to `~/.ssh/known_hosts`, which happens in case you run this as daemon.
You might also have a look at `Git.update_environment(...)` in case you want to setup a changed environment more permanently.
@@ -509,14 +509,14 @@ The type of the database determines certain performance characteristics, such as
GitDB
=====
-The GitDB is a pure-python implementation of the git object database. It is the default database to use in GitPython 0.3. Its uses less memory when handling huge files, but will be 2 to 5 times slower when extracting large quantities small of objects from densely packed repositories::
+The GitDB is a pure-python implementation of the git object database. It is the default database to use in GitPython 0.3. It uses less memory when handling huge files, but will be 2 to 5 times slower when extracting large quantities of small objects from densely packed repositories::
repo = Repo("path/to/repo", odbt=GitDB)
GitCmdObjectDB
==============
-The git command database uses persistent git-cat-file instances to read repository information. These operate very fast under all conditions, but will consume additional memory for the process itself. When extracting large files, memory usage will be much higher than the one of the ``GitDB``::
+The git command database uses persistent git-cat-file instances to read repository information. These operate very fast under all conditions, but will consume additional memory for the process itself. When extracting large files, memory usage will be much higher than ``GitDB``::
repo = Repo("path/to/repo", odbt=GitCmdObjectDB)
"""
commit = self.rorepo.commit('c0740570b31f0f0fe499bf4fc5abbf89feb1757d')
patch = commit.patch
self.assertEqual(patch[:200], expected_path_result[:200])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why it would slice the result into lines like this. This also means it would possibly only test for equivalence on a subset of lines, missing some lines behind the 600 mark, for example.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't test with whole patch text. I have to set maxDiff = None to disable limit and test. For test_with_parent, I will try to get a commit with smaller patch. For test_with_no_parent, which is first commit, I will save patch text to file and set maxDiff in setup and test it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I think it would be important to make clear how maxDiff works in the property's documentation so callers know how to deal with it. To me for instance it wasn't clear this exists.

Or said differently, if there is a technical limitation there is less of a need to work around it from my side, as long as it can be made clear in the docs how to handle it.

self.assertEqual(patch[200:400], expected_path_result[200:400])
self.assertEqual(patch[400:600], expected_path_result[400:600])

def test_patch_with_no_parent(self):
expected_path_result = """diff --git a/.flake8 b/.flake8
new file mode 100644
index 00000000..c55fe35d
--- /dev/null
+++ b/.flake8
@@ -0,0 +1,35 @@
+[flake8]
+show-source = True
+count= True
+statistics = True
+# E265 = comment blocks like @{ section, which it can't handle
+# E266 = too many leading '#' for block comment
+# E731 = do not assign a lambda expression, use a def
+# W293 = Blank line contains whitespace
+# W504"""
commit = self.rorepo.commit('33ebe7acec14b25c5f84f35a664803fcab2f7781')
patch = commit.patch
self.assertEqual(patch[:200], expected_path_result[:200])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I don't think the slice syntax should be necessary here. If it is, a description on why that is would be necessary, it could be passed as third argument to assertEqual I believe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have to set maxDiff = None to disable limit and test. I will do the change and test it.


def test_unicode_actor(self):
# assure we can parse unicode actors correctly
name = "Üäöß ÄußÉ"