Commit fe4bdd3

Merge branch 'aws:master' into jeniyat/hf-inf-neuron
2 parents: e590bf5 + dfc6eee

File tree: 88 files changed (+6318, -3893 lines)


.readthedocs.yml (+1, -1)

@@ -5,7 +5,7 @@
 version: 2

 python:
-  version: 3.6
+  version: 3.9
   install:
     - method: pip
       path: .

CHANGELOG.md (+53)

@@ -1,5 +1,58 @@
 # Changelog

+## v2.78.0 (2022-03-07)
+
+### Features
+
+ * TensorFlow 2.4 for Neo
+ * Data Serializer
+
+### Bug Fixes and Other Changes
+
+ * Style update in DataSerializer
+ * Remove sagemaker_job_name from hyperparameters in TrainingStep
+ * reorganize test files for workflow
+ * update code to get commit_id in codepipeline
+
+## v2.77.1 (2022-02-25)
+
+### Bug Fixes and Other Changes
+
+ * jumpstart model table
+
+## v2.77.0 (2022-02-22)
+
+### Features
+
+ * override jumpstart content bucket
+ * jumpstart model id suggestions
+ * adding customer metadata support to registermodel step
+
+### Bug Fixes and Other Changes
+
+ * Improve Pipeline workflow unit test branch coverage
+ * update lineage_trial_compoment get pipeline execution arn
+ * Add lineage doc
+ * Support primitive types for left value of ConditionSteps
+
+## v2.76.0 (2022-02-17)
+
+### Features
+
+ * Add FailStep Support for Sagemaker Pipeline
+
+### Bug Fixes and Other Changes
+
+ * use recommended inference image uri from Neo API
+ * pin test dependencies
+ * Add exception in test_action
+ * Update Static Endpoint
+ * Add CMH to the non-P3 list
+
+### Documentation Changes
+
+ * Support for generation of Jumpstart model table on build
+
 ## v2.75.1 (2022-02-08)

 ### Bug Fixes and Other Changes
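
Two of the features listed above map to new public APIs in the Python SDK. The snippets below are illustrative sketches based on the release notes, not code from this commit; class paths, content types, and step names are assumptions.

The v2.78.0 "Data Serializer" entry refers to a serializer for raw payloads such as audio or image files:

# Sketch only: assumes the class is exposed as sagemaker.serializers.DataSerializer
# and accepts either a local file path or raw bytes.
from sagemaker.serializers import DataSerializer

audio_serializer = DataSerializer(content_type="audio/x-audio")  # assumed content type
payload = audio_serializer.serialize("sample.wav")               # hypothetical local file
print(type(payload))

The v2.76.0 "Add FailStep Support for Sagemaker Pipeline" entry adds a step that ends a pipeline execution with a failed status, typically placed behind a condition:

# Sketch only: step names, the threshold, and the empty if_steps list are
# illustrative; a real pipeline would register or deploy the model in if_steps.
from sagemaker.workflow.parameters import ParameterFloat
from sagemaker.workflow.conditions import ConditionGreaterThanOrEqualTo
from sagemaker.workflow.condition_step import ConditionStep
from sagemaker.workflow.fail_step import FailStep

accuracy = ParameterFloat(name="Accuracy", default_value=0.0)

fail_step = FailStep(
    name="FailIfLowAccuracy",
    error_message="Model accuracy is below the acceptance threshold.",
)

condition_step = ConditionStep(
    name="CheckAccuracy",
    conditions=[ConditionGreaterThanOrEqualTo(left=accuracy, right=0.8)],
    if_steps=[],             # e.g. a RegisterModel step would go here
    else_steps=[fail_step],  # stop the execution and mark it as Failed
)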

VERSION (+1, -1)

@@ -1 +1 @@
-2.75.2.dev0
+2.78.1.dev0

doc/api/training/distributed.rst (+18, -1)

@@ -4,8 +4,25 @@ SageMaker distributed training libraries offer both data parallel and model para
 They combine software and hardware technologies to improve inter-GPU and inter-node communications.
 They extend SageMaker’s training capabilities with built-in options that require only small code changes to your training scripts.

+.. _sdp_api_docs_toc:
+
+The SageMaker Distributed Data Parallel Library
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. toctree::
+   :maxdepth: 3
+
+   smd_data_parallel
+   sdp_versions/latest
+   smd_data_parallel_use_sm_pysdk
+   smd_data_parallel_release_notes/smd_data_parallel_change_log
+
+.. _smp_api_docs_toc:
+
+The SageMaker Distributed Model Parallel Library
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 .. toctree::
    :maxdepth: 3

-   smd_data_parallel
    smd_model_parallel
New file (+8)

@@ -0,0 +1,8 @@
+.. _smddp-version-archive:
+
+.. toctree::
+   :maxdepth: 1
+
+   v1_2_x.rst
+   v1_1_x.rst
+   v1_0_0.rst
(+41, -3)

@@ -1,9 +1,47 @@
+.. _sdp_api_docs:

-Version 1.2.x (Latest)
+#############################################
+Use the Library to Adapt Your Training Script
+#############################################
+
+This section contains the SageMaker distributed data parallel API documentation.
+If you are a new user of this library, it is recommended you use this guide alongside
+`SageMaker's Distributed Data Parallel Library
+<https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel.html>`_.
+
+The library provides framework-specific APIs for TensorFlow and PyTorch.
+
+Select the latest or one of the previous versions of the API documentation
+depending on the version of the library you use.
+
+.. important::
+
+   The distributed data parallel library supports training jobs using CUDA 11 or later.
+   When you define a :class:`sagemaker.tensorflow.estimator.TensorFlow` or
+   :class:`sagemaker.pytorch.estimator.PyTorch`
+   estimator with the data parallel library enabled,
+   SageMaker uses CUDA 11. When you extend or customize your own training image,
+   you must use a base image with CUDA 11 or later. See
+   `SageMaker Python SDK's distributed data parallel library APIs
+   <https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-use-python-skd-api>`_
+   for more information.
+
+Version 1.4.0 (Latest)
 ======================

 .. toctree::
    :maxdepth: 1

-   latest/smd_data_parallel_pytorch.rst
-   latest/smd_data_parallel_tensorflow.rst
+   latest/smd_data_parallel_pytorch
+   latest/smd_data_parallel_tensorflow
+
+Documentation Archive
+=====================
+
+To find the API documentation for the previous versions of the library,
+choose one of the following:
+
+.. toctree::
+   :maxdepth: 1
+
+   archives
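
The .. important:: block added above describes enabling the data parallel library on a framework estimator. As a rough sketch of what that looks like with the Python SDK (entry point, role, versions, and instance type below are placeholders, not taken from this commit):

# Sketch only: enabling the SageMaker distributed data parallel library on a
# PyTorch estimator; substitute your own script, role, and a supported
# framework_version/instance_type combination.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                               # hypothetical training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder IAM role
    framework_version="1.10",
    py_version="py38",
    instance_count=2,
    instance_type="ml.p3.16xlarge",                       # multi-GPU instance (CUDA 11 capable)
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)

estimator.fit("s3://my-bucket/my-training-data")          # placeholder S3 prefix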
