Commit c9db4cc

add smdmp v1.10 release note
1 parent 38b4916 commit c9db4cc

1 file changed, 48 insertions(+), 6 deletions(-)

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

@@ -5,14 +5,29 @@ Release Notes
 New features, bug fixes, and improvements are regularly made to the SageMaker
 distributed model parallel library.
 
-SageMaker Distributed Model Parallel 1.9.0 Release Notes
-========================================================
+SageMaker Distributed Model Parallel 1.10.0 Release Notes
+=========================================================
 
-*Date: May. 3. 2022*
+*Date: July. 19. 2022*
 
-**Currency Updates**
+**New Features**
 
-* Added support for PyTorch 1.11.0
+The following new features are added for PyTorch.
+
+* Added support for FP16 training by implementing smdistributed.modelparallel
+  modification of Apex FP16_Module and FP16_Optimizer. To learn more, see
+  ` <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-fp16.html>`_.
+* New checkpoint APIs for CPU memory usage optimization. To learn more, see
+  ` <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-checkpoint.html>`_.
+
+**Improvements**
+
+* The SageMaker distributed model parallel library manages and optimizes CPU
+  memory by garbage-collecting non-local parameters in general and during checkpointing.
+* Changes in the `GPT-2 translate functions
+  <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-hugging-face.html>`_
+  (``smdistributed.modelparallel.torch.nn.huggingface.gpt2``)
+  to save memory by not maintaining two copies of weights at the same time.
 
 **Migration to AWS Deep Learning Containers**
 
@@ -28,7 +43,7 @@ Binary file of this version of the library for custom container users:
 
 .. code::
 
-   https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.11.0/build-artifacts/2022-04-20-17-05/smdistributed_modelparallel-1.9.0-cp38-cp38-linux_x86_64.whl
+   https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.11.0/build-artifacts/2022-07-11-19-23/smdistributed_modelparallel-1.10.0-cp38-cp38-linux_x86_64.whl
 
 
 
@@ -37,6 +52,33 @@ Binary file of this version of the library for custom container users:
 Release History
 ===============
 
+SageMaker Distributed Model Parallel 1.9.0 Release Notes
+--------------------------------------------------------
+
+*Date: May. 3. 2022*
+
+**Currency Updates**
+
+* Added support for PyTorch 1.11.0
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
+
+- PyTorch 1.11.0 DLC
+
+.. code::
+
+   763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.11.0-gpu-py38-cu113-ubuntu20.04-sagemaker
+
+Binary file of this version of the library for custom container users:
+
+.. code::
+
+   https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.11.0/build-artifacts/2022-04-20-17-05/smdistributed_modelparallel-1.9.0-cp38-cp38-linux_x86_64.whl
+
+
+
 SageMaker Distributed Model Parallel 1.8.1 Release Notes
 --------------------------------------------------------
 
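For custom container users, the wheel file name in the binary URL encodes the library version and the Python build it targets. The following is a minimal sketch, not part of the commit, that splits the 1.10.0 URL from the diff into those fields. The helper name ``parse_wheel_url`` is hypothetical, and the sketch assumes the file name carries no optional build tag, which holds for the URLs shown here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Wheel URL copied from the 1.10.0 release note in the diff above.
WHEEL_URL = (
    "https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/"
    "pytorch-1.11.0/build-artifacts/2022-07-11-19-23/"
    "smdistributed_modelparallel-1.10.0-cp38-cp38-linux_x86_64.whl"
)


def parse_wheel_url(url):
    """Return the tags encoded in a wheel file name (hypothetical helper)."""
    filename = PurePosixPath(urlparse(url).path).name
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    stem = filename[: -len(".whl")]
    # Standard wheel naming without a build tag:
    # {distribution}-{version}-{python tag}-{abi tag}-{platform tag}
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel name layout: {filename}")
    keys = ("distribution", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


info = parse_wheel_url(WHEEL_URL)
for key, value in info.items():
    print(f"{key}: {value}")
```

The ``cp38`` tags indicate a CPython 3.8 build, which lines up with the ``py38`` tag of the PyTorch 1.11.0 DLC image listed in the release notes, so a custom container would need a matching interpreter.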