doc/intro.rst
+5-7
Original file line number
Diff line number
Diff line change
@@ -4,20 +4,18 @@
4
4
Overview / Install
5
5
==================
6
6
7
-
GitPython is a python library used to interact with Git repositories.
7
+
GitPython is a Python library used to interact with git repositories, either at a high level like git-porcelain or at a low level like git-plumbing.
8
8
9
-
GitPython was a port of the grit_ library in Ruby created by
10
-
Tom Preston-Werner and Chris Wanstrath, but grew beyond its heritage through its improved design and performance.
9
+
It provides abstractions of git objects for easy access to repository data, and additionally allows you to access the git repository more directly using either a pure Python implementation or the faster but more resource-intensive git command implementation.
11
10
12
-
.. _grit: http://grit.rubyforge.org
11
+
The object database implementation is optimized for handling large quantities of objects and large datasets, which is achieved by using low-level structures and data streaming.
13
12
14
13
Requirements
15
14
============
16
15
17
-
* Git_ tested with 1.5.3.7
18
-
* Requires Git_ 1.7.0 or newer
16
+
* Tested with `Git`_ 1.7.0 or newer
19
17
* `Python Nose`_ - used for running the tests
20
-
* `Mock by Michael Foord`_ used for tests. Requires 0.5
18
+
* `Mock by Michael Foord`_ used for tests. Requires version 0.5
doc/tutorial.rst
+48-31
@@ -8,7 +8,7 @@
8
8
GitPython Tutorial
9
9
==================
10
10
11
-
GitPython provides object model access to your git repository. This tutorial is composed of multiple sections, each of which explain a real-life usecase.
11
+
GitPython provides object model access to your git repository. This tutorial is composed of multiple sections, each of which explains a real-life use case.
12
12
13
13
Initialize a Repo object
14
14
************************
@@ -17,10 +17,12 @@ The first step is to create a ``Repo`` object to represent your repository::
In the above example, the directory ``/Users/mtrier/Development/git-python`` is my working repository and contains the ``.git`` directory. You can also initialize GitPython with a bare repository::
22
+
In the above example, the directory ``/Users/mtrier/Development/git-python`` is my working repository and contains the ``.git`` directory. You can also initialize GitPython with a *bare* repository::
22
23
23
24
repo = Repo.create("/var/git/git-python.git")
25
+
assert repo.bare == True
24
26
25
27
A repo object provides high-level access to your data, it allows you to create and delete heads, tags and remotes and access the configuration of the repository::
26
28
@@ -43,6 +45,25 @@ Archive the repository contents to a tar file::
43
45
44
46
repo.archive(open("repo.tar",'wb'))
45
47
48
+
49
+
Object Databases
50
+
****************
51
+
Each ``Repo`` instance is powered by its object database instance, which is used when extracting any data or when writing new objects.
52
+
53
+
The type of the database determines certain performance characteristics, such as the quantity of objects that can be read per second, the resource usage when reading large data files, as well as the average memory footprint of your application.
54
+
55
+
GitDB
56
+
=====
57
+
The GitDB is a pure-python implementation of the git object database. It is the default database to use in GitPython 0.3. It uses less memory when handling huge files, but will be 2 to 5 times slower when extracting large quantities of small objects from densely packed repositories::
58
+
59
+
repo = Repo("path/to/repo", odbt=GitDB)
60
+
61
+
GitCmdObjectDB
62
+
==============
63
+
The git command database uses persistent git-cat-file instances to read repository information. These operate very fast under all conditions, but will consume additional memory for the process itself. When extracting large files, memory usage will be much higher than that of the ``GitDB``::
64
+
65
+
repo = Repo("path/to/repo", odbt=GitCmdObjectDB)
66
+
46
67
Examining References
47
68
********************
48
69
@@ -88,46 +109,44 @@ Change the symbolic reference to switch branches cheaply ( without adjusting the
88
109
89
110
Understanding Objects
90
111
*********************
91
-
An Object is anything storable in git's object database. Objects contain information about their type, their uncompressed size as well as the actual data. Each object is uniquely identified by a SHA1 hash, being 40 hexadecimal characters in size or 20 bytes in size.
112
+
An Object is anything storable in git's object database. Objects contain information about their type, their uncompressed size as well as the actual data. Each object is uniquely identified by a binary SHA1 hash, being 20 bytes in size.
92
113
93
114
Git only knows 4 distinct object types being Blobs, Trees, Commits and Tags.
94
115
95
-
In Git-Pyhton, all objects can be accessed through their common base, compared and hashed, as shown in the following example::
116
+
In GitPython, all objects can be accessed through their common base, compared and hashed. They are usually not instantiated directly, but through references or specialized repository functions::
96
117
97
118
hc = repo.head.commit
98
119
hct = hc.tree
99
120
hc != hct
100
121
hc != repo.tags[0]
101
122
hc == repo.head.reference.commit
102
123
103
-
Basic fields are::
124
+
Common fields are::
104
125
105
126
hct.type
106
127
'tree'
107
128
hct.size
108
129
166
109
-
hct.sha
130
+
hct.hexsha
110
131
'a95eeb2a7082212c197cabbf2539185ec74ed0e8'
111
-
hct.data # returns string with pure uncompressed data
112
-
'...'
113
-
len(hct.data) == hct.size
132
+
hct.binsha
133
+
'binary 20 byte sha1'
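The relationship between the 20 byte binary sha and its 40 character hexadecimal form can be illustrated with the standard library alone. The following is a minimal sketch, not GitPython code: ``git_blob_sha`` is a hypothetical helper, and it assumes git's usual blob hashing scheme of a ``blob <size>`` header, a NUL byte, and then the raw content.

```python
import binascii
import hashlib


def git_blob_sha(data):
    """Return (binsha, hexsha) for ``data`` stored as a git blob (hypothetical helper)."""
    # Assumption: git hashes a small header ("<type> <size>" plus a NUL byte)
    # followed by the raw, uncompressed content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data)
    return digest.digest(), digest.hexdigest()


binsha, hexsha = git_blob_sha(b"hello\n")
print(len(binsha), len(hexsha))  # 20 bytes versus 40 hexadecimal characters

# Both forms represent the same hash and convert into each other losslessly.
print(binascii.hexlify(binsha).decode("ascii") == hexsha)
print(binascii.unhexlify(hexsha) == binsha)
```

The binary form is half the size of the hexadecimal one, which matters when many object ids are kept in memory at once.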
114
134
115
-
Index Objects are objects that can be put into git's index. These objects are treesand blobs which additionally know about their path in the filesystem as well as their mode::
135
+
Index Objects are objects that can be put into git's index. These objects are trees, blobs and submodules which additionally know about their path in the filesystem as well as their mode::
116
136
117
137
hct.path # root tree has no path
118
138
''
119
139
hct.trees[0].path # the first subdirectory has one though
120
140
'dir'
121
-
htc.mode # trees have mode 0
122
-
0
141
+
hct.mode # trees have the mode of a linux directory
142
+
040000
123
143
'%o' % hct.blobs[0].mode # blobs have a specific mode, comparable to a standard linux fs
124
144
100644
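The modes shown above are ordinary file system mode bits, so they can be inspected with Python's ``stat`` module. This is a small sketch under that assumption; ``describe_mode`` is a hypothetical helper and not part of GitPython.

```python
import stat

TREE_MODE = 0o040000  # the mode reported for trees above
BLOB_MODE = 0o100644  # the mode reported for a regular, non-executable blob


def describe_mode(mode):
    """Return the octal string, the entry kind and the permission bits of a mode."""
    if stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISREG(mode):
        kind = "regular file"
    else:
        kind = "other"
    # S_IMODE strips the file type bits and keeps only the permissions.
    return "%o" % mode, kind, "%o" % stat.S_IMODE(mode)


print(describe_mode(TREE_MODE))
print(describe_mode(BLOB_MODE))
```

Note that ``'%o'`` formatting drops the leading zero, so a tree mode prints as ``40000``.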
125
145
126
146
Access blob data (or any object data) directly or using streams::
127
147
128
-
htc.data # binary tree data as string ( inefficient )
129
-
htc.blobs[0].data_stream # stream object to read data from
130
-
htc.blobs[0].stream_data(my_stream) # write data to given stream
148
+
hct.blobs[0].data_stream.read() # stream object to read data from
149
+
hct.blobs[0].stream_data(open("blob_data", "wb")) # write data to given stream
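Streaming matters because it avoids holding a whole blob in memory at once. The idea can be sketched with plain file-like objects; ``copy_stream`` is a hypothetical helper and the in-memory ``BytesIO`` objects merely stand in for a blob's data stream and an output file.

```python
import io


def copy_stream(source, target, chunk_size=64 * 1024):
    """Copy ``source`` to ``target`` in fixed-size chunks and return the byte count."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # an empty read signals the end of the stream
        target.write(chunk)
        total += len(chunk)
    return total


source = io.BytesIO(b"x" * 200000)  # stand-in for a large blob's data stream
target = io.BytesIO()               # stand-in for open("blob_data", "wb")
print(copy_stream(source, target))
```

At most ``chunk_size`` bytes are held by the loop at any time, regardless of the total size of the data.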
131
150
132
151
133
152
The Commit object
@@ -153,11 +172,11 @@ The above will return commits 21-30 from the commit list.::
@@ -178,7 +197,7 @@ The above will return commits 21-30 from the commit list.::
178
197
'cleaned up a lot of test information. Fixed escaping so it works with
179
198
subprocess.'
180
199
181
-
Note: date time is represented in a ``seconds since epock`` format. Conversion to human readable form can be accomplished with the various time module methods::
200
+
Note: date time is represented in a ``seconds since epoch`` format. Conversion to human readable form can be accomplished with the various `time module <http://docs.python.org/library/time.html>`_ methods::
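As a minimal sketch of such a conversion using only the standard library, the hypothetical helper ``format_epoch`` below turns a seconds-since-epoch value into a UTC timestamp string. The input values are illustrative and not taken from a real commit.

```python
import time


def format_epoch(seconds_since_epoch):
    """Format a seconds-since-epoch value as a UTC timestamp string."""
    # gmtime interprets the value as UTC; use time.localtime for local time.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(seconds_since_epoch))


print(format_epoch(0))
print(format_epoch(86400))
```

A purely numeric format string is used here so that the result does not depend on the locale.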
You can also get a tree directly from the repository if you know its name::
242
259
@@ -252,7 +269,7 @@ As trees only allow direct access to their direct entries, use the traverse met
252
269
253
270
tree.traverse()
254
271
<generator object at 0x7f6598bd65a8>
255
-
for entry in traverse(): do_something_with(entry)
272
+
for entry in tree.traverse(): do_something_with(entry)
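The generator returned by a traversal yields entries lazily instead of building a full list first. The principle can be sketched as follows; ``Node`` and ``traverse`` are hypothetical stand-ins for illustration and do not reflect GitPython's actual implementation.

```python
class Node(object):
    """Hypothetical stand-in for a tree entry; not a GitPython class."""

    def __init__(self, path, children=()):
        self.path = path
        self.children = list(children)


def traverse(node):
    """Yield every entry below ``node`` depth-first, one at a time."""
    for child in node.children:
        yield child
        # Descend into the child before moving on to its siblings.
        for descendant in traverse(child):
            yield descendant


root = Node("", [Node("dir", [Node("dir/file.txt")]), Node("README")])
for entry in traverse(root):
    print(entry.path)
```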
256
273
257
274
258
275
The Index Object
@@ -263,15 +280,15 @@ The git index is the stage containing changes to be written with the next commit
263
280
264
281
Access objects and add/remove entries. Commit the changes::
265
282
266
-
for stage,blob in index.iter_blobs(): do_something(...)
267
-
Access blob objects
268
-
for (path,stage),entry in index.entries.iteritems: pass
269
-
Access the entries directly
283
+
for stage,blob in index.iter_blobs(): do_something(...)
284
+
# Access blob objects
285
+
for (path,stage),entry in index.entries.iteritems(): pass
286
+
# Access the entries directly
270
287
index.add(['my_new_file']) # add a new file to the index
271
288
index.remove(['dir/existing_file'])
272
289
new_commit = index.commit("my commit message")
273
290
274
-
Create new indices from other trees or as result of a merge. Write that result to a new index::
291
+
Create new indices from other trees or as result of a merge. Write that result to a new index file::
275
292
276
293
tmp_index = Index.from_tree(repo, 'HEAD~1') # load a tree into a temporary index
277
294
merge_index = Index.from_tree(repo, 'base', 'HEAD', 'some_branch') # merge two trees three-way
@@ -303,7 +320,7 @@ Change configuration for a specific remote only::
303
320
Obtaining Diff Information
304
321
**************************
305
322
306
-
Diffs can generally be obtained by Subclasses of ``Diffable`` as they provide the ``diff`` method. This operation yields a DiffIndex allowing you to easily access diff information about paths.
323
+
Diffs can generally be obtained by subclasses of ``Diffable`` as they provide the ``diff`` method. This operation yields a DiffIndex allowing you to easily access diff information about paths.
307
324
308
325
Diffs can be made between the Index and Trees, Index and the working tree, trees and trees as well as trees and the working copy. If commits are involved, their tree will be used implicitly::
309
326
@@ -346,7 +363,7 @@ The return value will by default be a string of the standard output channel prod
346
363
Keyword arguments translate to short and long keyword arguments on the commandline.
347
364
The special notation ``git.command(flag=True)`` will create a flag without value like ``command --flag``.
348
365
349
-
If ``None`` is found in the arguments, it will be dropped silently. Lists and tuples passed as arguments will be unpacked to individual arguments. Objects are converted to strings using the str(...) function.
366
+
If ``None`` is found in the arguments, it will be dropped silently. Lists and tuples passed as arguments will be unpacked recursively to individual arguments. Objects are converted to strings using the str(...) function.
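The argument handling described above can be sketched in a few lines. This is an illustration under stated assumptions rather than GitPython's real code: ``unpack_args``, ``transform_kwargs`` and ``build_command`` are hypothetical names, and the way values are attached to short and long options here is an assumption.

```python
def unpack_args(args):
    """Flatten nested lists and tuples, drop None, and convert the rest with str()."""
    flat = []
    for arg in args:
        if arg is None:
            continue  # None is dropped silently
        if isinstance(arg, (list, tuple)):
            flat.extend(unpack_args(arg))  # unpack recursively
        else:
            flat.append(str(arg))
    return flat


def transform_kwargs(**kwargs):
    """Turn keyword arguments into short (-x) and long (--name) options."""
    flags = []
    for name, value in sorted(kwargs.items()):
        prefix = "-" if len(name) == 1 else "--"
        if value is True:
            flags.append(prefix + name)  # flag without a value
        elif value is not None and value is not False:
            # Assumption: short options take a separate value, long ones use "=".
            if len(name) == 1:
                flags.extend([prefix + name, str(value)])
            else:
                flags.append("%s%s=%s" % (prefix, name, value))
    return flags


def build_command(command, *args, **kwargs):
    """Assemble the argument list for a hypothetical git invocation."""
    return ["git", command] + transform_kwargs(**kwargs) + unpack_args(args)


print(build_command("log", "master", None, ["--", ("doc", "lib")], n=5, flag=True))
```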