Commit e34d726

author
Yue Tu
committed
address comments about documentation
1 parent d11fa17 commit e34d726

1 file changed

+50
-46
lines changed

doc/overview.rst

+50-46
Original file line number | Diff line number | Diff line change
@@ -183,77 +183,57 @@ Here is an example:
183183
# When you are done using your endpoint
184184
algo.delete_endpoint()
185185
186-
Git Support
187-
-----------
188-
If you have your training scripts or in your GitHub (or other Git) repository, you can use them directly without the
189-
trouble to download them locally. Git support can be enabled simply by providing ``git_config`` parameter
186+
Use Scripts Stored in a Git Repository
187+
--------------------------------------
188+
When you create an estimator, you can specify a training script that is stored in a GitHub or other Git repository as the entry point for the estimator, so that you don't have to download the scripts locally.
189+
If you do so, the source directory and dependencies, if needed, should be in the same repo. To enable Git support, provide the ``git_config`` parameter
190190
when creating an ``Estimator`` object. If Git support is enabled, then ``entry_point``, ``source_dir`` and ``dependencies``
191-
should all be relative paths in the Git repo if provided. Note that if you decided to use Git support, then all your
192-
training scripts should be in a single Git repo.
191+
should be relative paths in the Git repo if provided.
193192

194-
The ``git_config`` parameter includes arguments ``repo``, ``branch``, ``commit``, ``2FA_enabled``, ``username``,
195-
``password`` and ``token``. Except for ``repo``, the other arguments are optional. ``repo`` specifies the Git repository
193+
The ``git_config`` parameter includes fields ``repo``, ``branch``, ``commit``, ``2FA_enabled``, ``username``,
194+
``password`` and ``token``. The ``repo`` field is required. All other fields are optional. ``repo`` specifies the Git repository
196195
that you want to use. If ``branch`` is not provided, master branch will be used. If ``commit`` is not provided,
197196
the latest commit in the required branch will be used.
198197

199-
``2FA_enabled``, ``username``, ``password`` and ``token`` are for authentication purpose. ``2FA_enabled`` should
200-
be 'True' or 'False', providing the information whether two-factor authentication is enabled for the GitHub (or other Git) account.
201-
If ``2FA_enabled`` is not provided, we consider 2FA as disabled.
198+
``2FA_enabled``, ``username``, ``password`` and ``token`` are used for authentication. Set ``2FA_enabled`` to 'True' if
199+
two-factor authentication is enabled for the GitHub (or other Git) account, otherwise set it to 'False'.
200+
If you do not provide a value for ``2FA_enabled``, a default value of 'False' is used.
202201

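The defaults described above can be illustrated with a small sketch. This is not the SDK's implementation: ``resolve_git_defaults`` is a hypothetical helper, and using ``None`` to stand for "latest commit in the branch" is an assumption made only for this illustration.

```python
def resolve_git_defaults(git_config):
    """Return a copy of git_config with the documented defaults filled in.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    if "repo" not in git_config:
        # 'repo' is the only required field.
        raise ValueError("'repo' is required in git_config")
    resolved = dict(git_config)
    # If 'branch' is not provided, the master branch is used.
    resolved.setdefault("branch", "master")
    # Assumption: None stands for "latest commit in the chosen branch".
    resolved.setdefault("commit", None)
    # If '2FA_enabled' is not provided, 2FA is treated as disabled.
    resolved.setdefault("2FA_enabled", False)
    return resolved


example = resolve_git_defaults(
    {"repo": "https://github.com/username/repo-with-training-scripts.git"}
)
print(example["branch"], example["commit"], example["2FA_enabled"])
```

The helper copies the input rather than mutating it, so the dictionary you pass to an estimator stays exactly as you wrote it.
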
203-
If ``repo`` is an ssh url, you should either have no passphrase for the ssh key pairs, or have the ssh-agent configured
204-
so that you will not be prompted for ssh passphrase when you do 'git clone' command with ssh urls. For ssh urls, it
205-
makes no difference whether the 2FA is enabled or disabled.
202+
If ``repo`` is an SSH URL, you should either have no passphrase for the SSH key pairs, or have the ``ssh-agent`` configured
203+
so that you are not prompted for the SSH passphrase when you run a ``git clone`` command with SSH URLs. For SSH URLs, it
204+
does not matter whether two-factor authentication is enabled.
206205

207-
If ``repo`` is an https url, 2FA matters. When 2FA is disabled, either ``token`` or ``username``+``password`` will be
206+
If ``repo`` is an https URL, 2FA matters. When 2FA is disabled, either ``token`` or ``username``+``password`` will be
208207
used for authentication if provided (``token`` prioritized). When 2FA is enabled, only token will be used for
209208
authentication if provided. If required authentication info is not provided, python SDK will try to use local
210209
credentials storage to authenticate. If that also fails, an error is raised.
211210

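The authentication rules above can be summarized as a small decision function. This is a sketch under stated assumptions, not the SDK's code: ``choose_auth_method`` and the returned labels are hypothetical, and detecting an SSH URL by its ``git@`` or ``ssh://`` prefix is a simplification.

```python
def choose_auth_method(git_config):
    """Pick the credential source implied by the documented rules.

    Hypothetical helper for illustration only; the return labels are made up.
    """
    repo = git_config["repo"]
    # Simplifying assumption: SSH URLs start with 'git@' or 'ssh://'.
    if repo.startswith("git@") or repo.startswith("ssh://"):
        # For SSH URLs, 2FA does not matter; the SSH key pair is used.
        return "ssh_key"
    two_factor = git_config.get("2FA_enabled", False)
    # A token is usable whether or not 2FA is enabled, and it is prioritized.
    if git_config.get("token"):
        return "token"
    # Username and password are only usable when 2FA is disabled.
    if not two_factor and git_config.get("username") and git_config.get("password"):
        return "username_password"
    # Otherwise fall back to local credential storage.
    return "local_credentials"


print(choose_auth_method({
    "repo": "https://github.com/username/repo-with-training-scripts.git",
    "username": "username",
    "password": "passw0rd!",
    "token": "your-token",
}))
```

Note that with 2FA enabled, a username and password are ignored and the fallback is local credential storage.
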
212-
Here are some ways to specify ``git_config``:
211+
Here are some examples of creating estimators with Git support:
213212

214213
.. code:: python
215214
216-
# The following three examples do not provide Git credentials, so python SDK will try to use
217-
# local credential storage.
218-
219-
# Specifies the git_config parameter
215+
# Specifies the git_config parameter. This example does not provide Git credentials, so the Python SDK will try
216+
# to use local credential storage.
220217
git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git',
221218
'branch': 'branch1',
222219
'commit': '4893e528afa4a790331e1b5286954f073b0f14a2'}
223220
224-
# Alternatively, you can also specify git_config by providing only 'repo' and 'branch'.
225-
# If this is the case, the latest commit in the branch will be used.
226-
git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git',
227-
'branch': 'branch1'}
228-
229-
# Only providing 'repo' is also allowed. If this is the case, latest commit in
230-
# 'master' branch will be used.
231-
git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git'}
232-
233-
# This example does not provide '2FA_enabled', so 2FA is treated as disabled by default. 'username' and
234-
# 'password' are provided for authentication
235-
git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git',
236-
'username': 'username',
237-
'password': 'passw0rd!'}
238-
239-
# This example specifies that 2FA is enabled, and token is provided for authentication
240-
git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git',
241-
'2FA_enabled': True,
242-
'token': 'your-token'}
243-
244-
The following are some examples to define estimators with Git support:
245-
246-
.. code:: python
247-
248221
# In this example, the source directory 'pytorch' contains the entry point 'mnist.py' and other source code.
249-
# and it is relative path inside the Git repo.
222+
# It is a relative path inside the Git repo.
250223
pytorch_estimator = PyTorch(entry_point='mnist.py',
251224
role='SageMakerRole',
252225
source_dir='pytorch',
253226
git_config=git_config,
254227
train_instance_count=1,
255228
train_instance_type='ml.c4.xlarge')
256229
230+
.. code:: python
231+
232+
# You can also specify git_config by providing only 'repo' and 'branch'.
233+
# If this is the case, the latest commit in that branch will be used.
234+
git_config = {'repo': 'git@github.com:username/repo-with-training-scripts.git',
235+
'branch': 'branch1'}
236+
257237
# In this example, the entry point 'mnist.py' is all we need for source code.
258238
# We need to specify the path to it in the Git repo.
259239
mx_estimator = MXNet(entry_point='mxnet/mnist.py',
@@ -262,6 +242,15 @@ The following are some examples to define estimators with Git support:
262242
train_instance_count=1,
263243
train_instance_type='ml.c4.xlarge')
264244
245+
.. code:: python
246+
247+
# Providing only 'repo' is also allowed. In this case, the latest commit in the 'master' branch will be used.
248+
# This example does not provide '2FA_enabled', so 2FA is treated as disabled by default. 'username' and
249+
# 'password' are provided for authentication
250+
git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git',
251+
'username': 'username',
252+
'password': 'passw0rd!'}
253+
265254
# In this example, besides entry point and other source code in source directory, we still need some
266255
# dependencies for the training job. Dependencies should also be paths inside the Git repo.
267256
pytorch_estimator = PyTorch(entry_point='mnist.py',
@@ -272,8 +261,23 @@ The following are some examples to define estimators with Git support:
272261
train_instance_count=1,
273262
train_instance_type='ml.c4.xlarge')
274263
264+
.. code:: python
265+
266+
# This example specifies that 2FA is enabled, and token is provided for authentication
267+
git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git',
268+
'2FA_enabled': True,
269+
'token': 'your-token'}
270+
271+
# In this example, besides the entry point, we also need some dependencies for the training job.
272+
pytorch_estimator = PyTorch(entry_point='pytorch/mnist.py',
273+
role='SageMakerRole',
274+
dependencies=['dep.py'],
275+
git_config=git_config,
276+
train_instance_count=1,
277+
train_instance_type='local')
278+
275279
Git support can be used not only for training jobs, but also for hosting models. The usage is the same as the above,
276-
and ``git_config`` should be provided when creating the ``FrameworkModel`` object.
280+
and ``git_config`` should be provided when creating model objects, e.g. ``TensorFlowModel``, ``MXNetModel``, ``PyTorchModel``.
277281

278282
Training Metrics
279283
----------------
