doc/overview.rst (+92)
@@ -684,6 +684,98 @@ For more detailed explanations of the classes that this library provides for aut
- `API docs for HyperparameterTuner and parameter range classes <https://sagemaker.readthedocs.io/en/stable/tuner.html>`__
- `API docs for analytics classes <https://sagemaker.readthedocs.io/en/stable/analytics.html>`__

**********************************
SageMaker Asynchronous Inference
**********************************
Amazon SageMaker Asynchronous Inference is a new capability in SageMaker that queues incoming requests and processes them asynchronously.
This option is ideal for requests with large payloads (up to 1 GB), long processing times, and near real-time latency requirements.
You can configure Asynchronous Inference to scale the instance count to zero when there are no requests to process, thereby saving costs.
More information about SageMaker Asynchronous Inference can be found in the `AWS documentation <https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference.html>`__.
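
Scaling to zero is typically configured through Application Auto Scaling rather than through the SageMaker SDK itself. The following is a minimal sketch, not part of the original text, that only builds the keyword arguments for a ``RegisterScalableTarget`` call with a minimum capacity of zero. The helper name ``scale_to_zero_target``, the endpoint name, the variant name ``AllTraffic``, and the maximum capacity are all illustrative assumptions.

```python
def scale_to_zero_target(endpoint_name, variant_name="AllTraffic", max_capacity=5):
    """Build keyword arguments for Application Auto Scaling's RegisterScalableTarget.

    Hypothetical helper for illustration; the endpoint and variant names are
    placeholders that must match an existing endpoint.
    """
    if max_capacity < 1:
        raise ValueError("max_capacity must be at least 1")
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": 0,  # allows the instance count to drop to zero when idle
        "MaxCapacity": max_capacity,
    }


target = scale_to_zero_target("my-async-endpoint")

# With boto3 installed and AWS credentials configured, the dictionary could be
# passed on as:
#   boto3.client("application-autoscaling").register_scalable_target(**target)
```

Registering the target only sets the allowed capacity range; a scaling policy is still needed to decide when the instance count actually changes.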

To deploy an asynchronous inference endpoint, you need to create an ``AsyncInferenceConfig`` object.
If you create ``AsyncInferenceConfig`` without specifying its arguments, the default ``S3OutputPath`` will
be ``s3://sagemaker-{REGION}-{ACCOUNTID}/async-endpoint-outputs/{UNIQUE-JOB-NAME}`` (example shown below):

.. code:: python

    from sagemaker.async_inference import AsyncInferenceConfig

    # Create an empty AsyncInferenceConfig object to use default values
    async_config = AsyncInferenceConfig()
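
To make the default output location concrete, the sketch below fills in the ``s3://sagemaker-{REGION}-{ACCOUNTID}/async-endpoint-outputs/{UNIQUE-JOB-NAME}`` template by hand. It is illustrative only and does not call the SDK: the function name ``default_async_output_path`` is hypothetical, and the region, account ID, and job name are placeholder values.

```python
def default_async_output_path(region, account_id, unique_job_name):
    """Fill in the default S3OutputPath template described above.

    Hypothetical helper for illustration; the SDK derives these values itself.
    """
    bucket = f"sagemaker-{region}-{account_id}"
    return f"s3://{bucket}/async-endpoint-outputs/{unique_job_name}"


# Placeholder values, not taken from a real account
example_path = default_async_output_path("us-west-2", "123456789012", "my-model-example")
print(example_path)
```
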

Alternatively, you can set any of the configuration parameters in ``AsyncInferenceConfig`` yourself. All of them
are optional, but if you don't specify ``output_path``, Amazon SageMaker uses the default ``S3OutputPath``
described above (example shown below):

.. code:: python

    # Specify S3OutputPath, MaxConcurrentInvocationsPerInstance and NotificationConfig in the async config object