Commit 3bf949e

Merge branch 'master' into PT113Release
2 parents: 87cbed1 + 0eade55

File tree

9 files changed (+134, -21 lines)

CHANGELOG.md

Lines changed: 19 additions & 0 deletions
@@ -1,5 +1,24 @@
 # Changelog
 
+## v2.151.0 (2023-04-27)
+
+### Features
+
+* Update Transformers 4.26 - TensorFlow 2.11.0 Image URI
+* Add Extra Parameters to Lambda Function Wrapper
+
+### Bug Fixes and Other Changes
+
+* Add kms key support for Model registration
+* Enable inference recommender slow tests
+* Pass sagemaker session to downstream s3 calls
+* Add ap-south-1 to no p3 regions
+* skip test for p2 instance for TF2.12 and above
+
+### Documentation Changes
+
+* Fix minor misses from the remote function doc release
+
 ## v2.150.0 (2023-04-26)
 
 ### Features

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.150.1.dev0
+2.151.1.dev0

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

Lines changed: 78 additions & 6 deletions
@@ -3,12 +3,88 @@ Release Notes
 #############
 
 New features, bug fixes, and improvements are regularly made to the SageMaker
-distributed model parallel library.
+model parallelism library.
 
 
-SageMaker Distributed Model Parallel 1.14.0 Release Notes
+SageMaker Distributed Model Parallel 1.15.0 Release Notes
 =========================================================
 
+*Date: Apr. 27. 2023*
+
+**Currency Updates**
+
+* Added support for PyTorch v2.0.0.
+  Note that the library does not support ``torch.compile`` in this release.
+
+**New Features**
+
+* Using sharded data parallelism with tensor parallelism together is now
+  available for PyTorch 1.13.1. It allows you to train with smaller global batch
+  sizes while scaling up to large clusters. For more information, see `Sharded
+  data parallelism with tensor parallelism <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-with-tensor-parallelism>`_
+  in the *Amazon SageMaker Developer Guide*.
+* Added support for saving and loading full model checkpoints when using sharded
+  data parallelism. This is enabled by using the standard checkpointing API,
+  ``smp.save_checkpoint`` with ``partial=False``.
+  Previously, full checkpoints had to be created by merging partial checkpoint
+  files after training finished.
+* `DistributedTransformer <https://sagemaker.readthedocs.io/en/stable/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.html#smdistributed.modelparallel.torch.nn.DistributedTransformerLayer>`_
+  now supports the ALiBi position embeddings.
+  When using DistributedTransformer, you can set the ``use_alibi`` parameter
+  to ``True`` to use the Triton-based flash attention kernels. This helps
+  evaluate sequences longer than those used for training.
+
+**Bug Fixes**
+
+* When using tensor parallelism, parameters were initialized multiple times
+  unnecessarily. This release fixes the multiple initialization of parameters
+  so that each parameter is initialized exactly once.
+  This not only saves time, but also ensures that the random generator behavior
+  matches the non-tensor-parallelism case.
+
+**Known Issues**
+
+* Model initialization might take longer with PyTorch 2.0 than with PyTorch 1.13.
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
+
+- SageMaker training container for PyTorch v2.0.0
+
+  .. code::
+
+    763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.0.0-gpu-py310-cu118-ubuntu20.04-sagemaker
+
+- SageMaker training container for PyTorch v1.13.1
+
+  .. code::
+
+    763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.13.1-gpu-py39-cu117-ubuntu20.04-sagemaker
+
+Binary file of this version of the library for `custom container
+<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-sm-sdk.html#model-parallel-bring-your-own-container>`_ users:
+
+- For PyTorch v2.0.0
+
+  .. code::
+
+    https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-2.0.0/build-artifacts/2023-04-14-20-14/smdistributed_modelparallel-1.15.0-cp310-cp310-linux_x86_64.whl
+
+- For PyTorch v1.13.1
+
+  .. code::
+
+    https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.13.1/build-artifacts/2023-04-17-15-49/smdistributed_modelparallel-1.15.0-cp39-cp39-linux_x86_64.whl
+
+----
+
+Release History
+===============
+
+SageMaker Distributed Model Parallel 1.14.0 Release Notes
+---------------------------------------------------------
 
 *Date: Jan. 30. 2023*
 
 **Currency Updates**
@@ -39,10 +115,6 @@ Binary file of this version of the library for `custom container
 
 https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.13.1/build-artifacts/2023-01-19-18-35/smdistributed_modelparallel-1.14.0-cp39-cp39-linux_x86_64.whl
 
-----
-
-Release History
-===============
 
 SageMaker Distributed Model Parallel 1.13.0 Release Notes
 ---------------------------------------------------------
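
For readers of the checkpointing change above: a minimal sketch of saving a full (non-partial) checkpoint under sharded data parallelism. Only ``smp.save_checkpoint`` and ``partial=False`` come from these release notes; the keyword names follow the library's documented checkpointing API, and the path, tag, and helper name are illustrative assumptions.

import smdistributed.modelparallel.torch as smp

def save_full_checkpoint(model, optimizer, step):
    # Hypothetical helper. With partial=False the library writes one full
    # checkpoint instead of per-rank partial files, so no post-training
    # merge step is needed (new in v1.15.0).
    smp.save_checkpoint(
        path="/opt/ml/checkpoints",  # assumed output location
        tag=f"step-{step}",          # assumed checkpoint tag
        partial=False,
        model=model,
        optimizer=optimizer,
    )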

doc/api/training/smp_versions/latest.rst

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@ depending on which version of the library you need to use.
 To use the library, reference the
 **Common API** documentation alongside the framework specific API documentation.
 
-Version 1.11.0, 1.13.0, 1.14.0 (Latest)
-=======================================
+Version 1.11.0, 1.13.0, 1.14.0, 1.15.0 (Latest)
+===============================================
 
 To use the library, reference the Common API documentation alongside the framework specific API documentation.

doc/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.rst

Lines changed: 14 additions & 0 deletions
@@ -302,6 +302,20 @@ Tensor Parallelism Module APIs
 - ``post_layernorm``: If ``True``, inserts layer normalization at
   the output. At least one of ``pre_layernorm`` and
   ``post_layernorm`` must be ``True``.
+- ``use_alibi`` (bool, default False): Activates Attention with
+  Linear Biases (ALiBi) for attention computation.
+  ALiBi facilitates efficient extrapolation on input sequences
+  and thus improves training efficiency.
+  The library enables ALiBi by using the `Triton
+  flash attention kernel
+  <https://github.com/HazyResearch/flash-attention>`_.
+  Refer to https://arxiv.org/abs/2108.12409 for more
+  details on the technique.
+  (Available from
+  the SageMaker model parallelism library v1.15.0.)
+- ``alibi_bias_max`` (int, default 8): Defines the ALiBi base
+  value for mask generation. (Available from
+  the SageMaker model parallelism library v1.15.0.)
 
 - **Methods:**
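
A hypothetical sketch of the new parameters in use: ``use_alibi`` and ``alibi_bias_max`` come from the doc addition above, while the layer's other constructor arguments (hidden size, head count, head size) are assumed values for illustration and may not match your configuration.

import smdistributed.modelparallel.torch as smp

# Enable ALiBi position embeddings on a tensor-parallel transformer layer
# (SageMaker model parallelism library v1.15.0+).
layer = smp.nn.DistributedTransformerLayer(
    hidden_size=1024,        # assumed model width
    num_attention_heads=16,  # assumed head count
    attention_head_size=64,  # assumed per-head dimension
    use_alibi=True,          # route attention through the Triton-based
                             # flash attention kernels with ALiBi biases
    alibi_bias_max=8,        # ALiBi base value for mask generation (default)
)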

src/sagemaker/clarify.py

Lines changed: 4 additions & 2 deletions
@@ -94,6 +94,8 @@
                     {object: object},
                 )
             ],
+            # Arbitrary JSON object as baseline
+            {object: object},
         ),
         SchemaOptional("num_clusters"): int,
         SchemaOptional("use_logit"): bool,
@@ -1211,7 +1213,7 @@ class SHAPConfig(ExplainabilityConfig):
 
     def __init__(
        self,
-        baseline: Optional[Union[str, List]] = None,
+        baseline: Optional[Union[str, List, Dict]] = None,
         num_samples: Optional[int] = None,
         agg_method: Optional[str] = None,
         use_logit: bool = False,
@@ -1224,7 +1226,7 @@ def __init__(
         """Initializes config for SHAP analysis.
 
         Args:
-            baseline (None or str or list): `Baseline dataset <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-feature-attribute-shap-baselines.html>`_
+            baseline (None or str or list or dict): `Baseline dataset <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-feature-attribute-shap-baselines.html>`_
                 for the Kernel SHAP algorithm, accepted in the form of:
                 S3 object URI, a list of rows (with at least one element),
                 or None (for no input baseline). The baseline dataset must have the same format
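
With ``Dict`` added to the accepted types, a ``SHAPConfig`` can now be built with a JSON-object baseline directly; the dict below mirrors the one exercised by the updated unit test at the end of this diff, and the other arguments echo that test's values.

from sagemaker.clarify import SHAPConfig

# Baseline passed as an arbitrary JSON object rather than an S3 URI or
# a list of rows; enabled by the Union[str, List, Dict] widening above.
shap_config = SHAPConfig(
    baseline={
        "instances": [
            {"features": [0.26124998927116394, 0.2824999988079071, 0.06875000149011612]}
        ]
    },
    num_samples=100,
    agg_method="mean_sq",
    use_logit=True,
)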

src/sagemaker/image_uri_config/djl-deepspeed.json

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@
             "us-west-2": "763104351884"
         },
         "repository": "djl-inference",
-        "tag_prefix": "0.21.0-deepspeed0.8.0-cu117"
+        "tag_prefix": "0.21.0-deepspeed0.8.3-cu117"
     },
     "0.20.0": {
         "registries": {

tests/unit/sagemaker/image_uris/test_djl.py

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@
     "0.19.0": {"djl-deepspeed": "deepspeed0.7.3-cu113"},
     "0.20.0": {"djl-deepspeed": "deepspeed0.7.5-cu116"},
     "0.21.0": {
-        "djl-deepspeed": "deepspeed0.8.0-cu117",
+        "djl-deepspeed": "deepspeed0.8.3-cu117",
        "djl-fastertransformer": "fastertransformer5.3.0-cu117",
     },
 }
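
For context, the ``tag_prefix`` updated above is what the SDK's image URI lookup resolves; a sketch of retrieving the bumped DJL DeepSpeed image, assuming the standard ``image_uris.retrieve`` entry point (the region is illustrative):

from sagemaker import image_uris

# After this change, version 0.21.0 resolves to a tag starting with
# "0.21.0-deepspeed0.8.3-cu117" (previously deepspeed0.8.0).
uri = image_uris.retrieve(
    framework="djl-deepspeed",
    region="us-west-2",
    version="0.21.0",
)
print(uri)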

tests/unit/test_clarify.py

Lines changed: 14 additions & 8 deletions
@@ -563,14 +563,20 @@ def test_invalid_model_predicted_label_config():
     )
 
 
-def test_shap_config():
-    baseline = [
-        [
-            0.26124998927116394,
-            0.2824999988079071,
-            0.06875000149011612,
-        ]
-    ]
+@pytest.mark.parametrize(
+    "baseline",
+    [
+        ([[0.26124998927116394, 0.2824999988079071, 0.06875000149011612]]),
+        (
+            {
+                "instances": [
+                    {"features": [0.26124998927116394, 0.2824999988079071, 0.06875000149011612]}
+                ]
+            }
+        ),
+    ],
+)
+def test_valid_shap_config(baseline):
     num_samples = 100
     agg_method = "mean_sq"
     use_logit = True

Comments (0)