diff --git a/doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst b/doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst
index d65efd5022..12ed10049a 100644
--- a/doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst
+++ b/doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst
@@ -5,14 +5,31 @@ Release Notes
 New features, bug fixes, and improvements are regularly made to the SageMaker
 distributed model parallel library.
 
-SageMaker Distributed Model Parallel 1.9.0 Release Notes
-========================================================
+SageMaker Distributed Model Parallel 1.10.0 Release Notes
+=========================================================
 
-*Date: May. 3. 2022*
+*Date: July. 19. 2022*
 
-**Currency Updates**
+**New Features**
 
-* Added support for PyTorch 1.11.0
+The following new features are added for PyTorch.
+
+* Added support for FP16 training by implementing smdistributed.modelparallel
+  modification of Apex FP16_Module and FP16_Optimizer. To learn more, see
+  `FP16 Training with Model Parallelism
+  `_.
+* New checkpoint APIs for CPU memory usage optimization. To learn more, see
+  `Checkpointing Distributed Models and Optimizer States
+  `_.
+
+**Improvements**
+
+* The SageMaker distributed model parallel library manages and optimizes CPU
+  memory by garbage-collecting non-local parameters in general and during checkpointing.
+* Changes in the `GPT-2 translate functions
+  `_
+  (``smdistributed.modelparallel.torch.nn.huggingface.gpt2``)
+  to save memory by not maintaining two copies of weights at the same time.
 
 **Migration to AWS Deep Learning Containers**
 
@@ -28,7 +45,7 @@ Binary file of this version of the library for custom container users:
 
   .. code::
 
-    https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.11.0/build-artifacts/2022-04-20-17-05/smdistributed_modelparallel-1.9.0-cp38-cp38-linux_x86_64.whl
+    https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.11.0/build-artifacts/2022-07-11-19-23/smdistributed_modelparallel-1.10.0-cp38-cp38-linux_x86_64.whl
 
 
 
@@ -37,6 +54,33 @@ Binary file of this version of the library for custom container users:
 Release History
 ===============
 
+SageMaker Distributed Model Parallel 1.9.0 Release Notes
+--------------------------------------------------------
+
+*Date: May. 3. 2022*
+
+**Currency Updates**
+
+* Added support for PyTorch 1.11.0
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
+
+- PyTorch 1.11.0 DLC
+
+  .. code::
+
+    763104351884.dkr.ecr..amazonaws.com/pytorch-training:1.11.0-gpu-py38-cu113-ubuntu20.04-sagemaker
+
+Binary file of this version of the library for custom container users:
+
+  .. code::
+
+    https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.11.0/build-artifacts/2022-04-20-17-05/smdistributed_modelparallel-1.9.0-cp38-cp38-linux_x86_64.whl
+
+
+
 SageMaker Distributed Model Parallel 1.8.1 Release Notes
 --------------------------------------------------------
 