Hang in git.Repo.clone_from() [Inter-process Communication Problem] #72

Closed
@mandaarp

Description

Task:
Clone from a remote repository with a URL of the form ssh://abc@domain.com/git/remote_repo, which includes submodules hosted outside domain.com.

Problem:
current_repository = git.Repo.clone_from(r"ssh://abc@domain.com/git/remote_repo", r"d:\temp")

The remote_repo has 313 objects (files and folders). The method call above hangs after fetching 294 objects. The total size of remote_repo is 82 MB.

Observations:

  1. Command Prompt: "git clone -v ssh://abc@domain.com/git/remote_repo d:\temp" works
  2. Git Bash: "git clone -v ssh://abc@domain.com/git/remote_repo d:\temp" works
  3. Python interpreter: os.system("git clone -v ssh://abc@domain.com/git/remote_repo d:\temp") works

Possible but incomplete solution:

There is some problem with the stderr=PIPE parameter of the subprocess.Popen call in the execute method of Python27\Lib\site-packages\GitPython-0.3.2.RC1-py2.7.egg\git\cmd.py.

If I change the code as follows:

    # Workaround: send stderr to a file instead of a pipe
    err_file = open("d:\\err.txt", "w")
    # Start the process
    proc = Popen(command,
                 cwd=cwd,
                 stdin=istream,
                 stderr=err_file,  # was: stderr=PIPE
                 stdout=PIPE,
                 close_fds=(os.name == 'posix'),  # unsupported on Windows
                 **subprocess_kwargs
                 )

The process does not hang.
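
The observation above is consistent with a full pipe: if nothing reads stderr while the child keeps writing to it, the child blocks once the OS pipe buffer is full. The following is a minimal standalone sketch of the same workaround, not GitPython's code. It assumes a hypothetical child command (`CHILD`) that writes about 1 MB to stderr, and it uses a temporary file rather than a fixed path such as d:\err.txt.

```python
import subprocess
import sys
import tempfile

# Hypothetical child: writes 1 MiB to stderr, then one line to stdout.
CHILD = "import sys; sys.stderr.write('x' * (1 << 20)); print('done')"


def run_with_stderr_file(cmd):
    """Run cmd with stderr sent to a temporary file instead of a pipe."""
    with tempfile.TemporaryFile() as err_file:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=err_file)
        # Reading stdout to EOF is safe here: stderr goes to a file,
        # so the child can never block on a full stderr pipe.
        out = proc.stdout.read()
        proc.stdout.close()
        status = proc.wait()
        # Rewind and collect whatever the child wrote to stderr.
        err_file.seek(0)
        err = err_file.read()
    return status, out, err


if __name__ == "__main__":
    status, out, err = run_with_stderr_file([sys.executable, "-c", CHILD])
    print(status, out.strip(), len(err))
```

Unlike the patch above, this keeps the stderr contents available to the caller after the process exits, at the cost of not seeing them while the process is still running.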

Activity

hagdog commented on Aug 28, 2013

I am having the same problem. I tried the solution posted above and the clone command starts working.

I am sure that this workaround cannot be used permanently since stderr would always be compromised.

Is there, perhaps, a workaround that can be applied in a script that uses gitpython? I tried various things like closing or re-directing stderr in my script. Nothing worked.

Environment: 64-bit, Windows 7, Service Pack 1, gitpython-0.3.2.rc1.

Byron (Member) commented on Aug 29, 2013

I have recently written other process-handling code and used a select loop to read both channels in a single thread without blocking. This is what one should do here as well.
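
A select loop of the kind described could look roughly like the sketch below. This is an illustration only, not the code referred to in the comment, and it relies on `select.select` accepting pipe file descriptors, which holds on POSIX systems but not on Windows. The function name `run_with_select` and the child command are hypothetical.

```python
import os
import select
import subprocess
import sys

# Hypothetical child: writes 1 MiB to each of stdout and stderr.
CHILD = (
    "import sys; "
    "sys.stdout.write('o' * (1 << 20)); "
    "sys.stderr.write('e' * (1 << 20))"
)


def run_with_select(cmd):
    """Drain stdout and stderr in a single thread (POSIX only)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]
    while open_fds:
        # Block until at least one pipe has data or has reached EOF.
        readable, _, _ = select.select(open_fds, [], [])
        for fd in readable:
            data = os.read(fd, 65536)
            if data:
                chunks[fd].append(data)
            else:
                open_fds.remove(fd)  # empty read means EOF on this pipe
    proc.stdout.close()
    proc.stderr.close()
    status = proc.wait()
    return status, b"".join(chunks[out_fd]), b"".join(chunks[err_fd])


if __name__ == "__main__":
    status, out, err = run_with_select([sys.executable, "-c", CHILD])
    print(status, len(out), len(err))
```

Because both pipes are serviced as soon as either has data, neither can fill up while the parent is waiting on the other.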

hagdog commented on Sep 4, 2013

Thanks for the suggestion, Byron. Unfortunately, this will not work on Windows. Well, at least the way I tried it:

  File "./a_clone_test.py", line 20, in main
    rlist,wlist,xlist = select.select([], [sys.stderr], [])
select.error: (10038, 'An operation was attempted on something that is not a socket')
my@myhost /c/cygwin/home/me/workspaces/infra/scripts (master)

So the documentation is accurate: on Windows, select only works on sockets, i.e. objects that come from WinSock.
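
Since select cannot be used on pipes on Windows, one portable alternative is to drain each pipe from its own thread. The sketch below shows that idea with the standard library only. It is an assumption about how a script could work around the hang, not something GitPython is confirmed to do, and the names `_drain`, `run_with_threads`, and `CHILD` are hypothetical.

```python
import subprocess
import sys
import threading

# Hypothetical child: writes 1 MiB to each of stdout and stderr.
CHILD = (
    "import sys; "
    "sys.stdout.write('o' * (1 << 20)); "
    "sys.stderr.write('e' * (1 << 20))"
)


def _drain(stream, sink):
    """Read stream to EOF in chunks, appending each chunk to sink."""
    for chunk in iter(lambda: stream.read(65536), b""):
        sink.append(chunk)
    stream.close()


def run_with_threads(cmd):
    """Drain stdout and stderr concurrently, one reader thread per pipe."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_chunks, err_chunks = [], []
    readers = [
        threading.Thread(target=_drain, args=(proc.stdout, out_chunks)),
        threading.Thread(target=_drain, args=(proc.stderr, err_chunks)),
    ]
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()  # both pipes have reached EOF after this
    status = proc.wait()
    return status, b"".join(out_chunks), b"".join(err_chunks)


if __name__ == "__main__":
    status, out, err = run_with_threads([sys.executable, "-c", CHILD])
    print(status, len(out), len(err))
```

Blocking reads in separate threads work on both Windows and POSIX, which is why this approach does not need select at all.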

added this to the v0.3.5 - bugfixes milestone on Nov 19, 2014

Byron (Member) commented on Nov 19, 2014

The git process handling is a major point of improvement in 0.3.5, as it is causing plenty of issues and has not been fixed sufficiently well yet.

Byron (Member) commented on Jan 7, 2015

Process handling was improved to the point where pipes can no longer fill up and block.
Please see the related ticket #145.

timblechmann commented on Feb 5, 2016

@Byron unfortunately we've still seen this issue with gitpython 1.0.1 on windows. i didn't follow all the codepaths, but using quiet=True for my fetch and clone_from calls resolves this issue, so there must still be something fishy :/

i'm not exactly familiar with the implementation, but https://github.com/gitpython-developers/GitPython/blob/master/git/cmd.py#L661 still seems to have a codepath, which doesn't use the .communicate API ...
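
For reference, the .communicate API mentioned here reads both pipes to EOF before waiting for the process, so neither pipe can fill up. Below is a minimal standalone sketch using only the standard library. It does not reproduce GitPython's cmd.py, and the child command is a hypothetical stand-in for a chatty git invocation.

```python
import subprocess
import sys

# Hypothetical child: writes 1 MiB to stderr, then one line to stdout.
CHILD = "import sys; sys.stderr.write('x' * (1 << 20)); print('done')"


def run_with_communicate(cmd):
    """Run cmd and collect all of stdout and stderr via communicate()."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() drains both pipes and then waits for the process to exit.
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    status, out, err = run_with_communicate([sys.executable, "-c", CHILD])
    print(status, out.strip(), len(err))
```

The trade-off is that all output is buffered in memory and only becomes available once the process has finished, so this does not suit code that needs incremental progress output.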

Byron (Member) commented on Feb 7, 2016

Indeed, looking at the code you linked, I think a deadlock is possible there. However, that functionality appears to be used just once, in an unrelated command.

Looking at the code path taken by fetch/pull, it becomes apparent that it always opens a stdout pipe even though it never reads from it. This could be the deadlock you are experiencing.

The clone_from code path could have the same problem unless a progress handler is used.

A few changes have been made, including one that could fix the clone issue, and you are welcome to test them with your setup. If it works, please let me know.
