@@ -620,24 +620,24 @@ To deploy a Multi-Model endpoint with TFS container, please start the container
### Multi-Model Interfaces
We provide five interfaces for users to interact with a Multi-Model Mode container:

- +---------------------+---------------------------------+---------------------------------------------+
- | Functionality       | Request                         | Response/Actions                            |
- +---------------------+---------------------------------+---------------------------------------------+
- | List A Single Model | GET /models/{model_name}        | Information about the specified model       |
- +---------------------+---------------------------------+---------------------------------------------+
- | List All Models     | GET /models                     | List of Information about all loaded models |
- +---------------------+---------------------------------+---------------------------------------------+
- |                     | POST /models                    | Load model with "model_name" from           |
- |                     | data = {                        | specified url                               |
- | Load A Model        |     "model_name": <model-name >,|                                             |
- |                     |     "url": <path to model data >|                                             |
- |                     | }                               |                                             |
- +---------------------+---------------------------------+---------------------------------------------+
- | Make Invocations    | POST /models/{model_name}/invoke| Return inference result from                |
- |                     | data = <invocation payload >    | the specified model                         |
- +---------------------+---------------------------------+---------------------------------------------+
- | Unload A Model      | DELETE /models/{model_name}     | Unload the specified model                  |
- +---------------------+---------------------------------+---------------------------------------------+
+ +---------------------+---------------------------------+---------------------------------------------+
+ | Functionality       | Request                         | Response/Actions                            |
+ +---------------------+---------------------------------+---------------------------------------------+
+ | List A Single Model | GET /models/{model_name}        | Information about the specified model       |
+ +---------------------+---------------------------------+---------------------------------------------+
+ | List All Models     | GET /models                     | List of Information about all loaded models |
+ +---------------------+---------------------------------+---------------------------------------------+
+ |                     | POST /models                    | Load model with "model_name" from           |
+ |                     | data = {                        | specified url                               |
+ | Load A Model        |     "model_name": <model-name>, |                                             |
+ |                     |     "url": <path to model data> |                                             |
+ |                     | }                               |                                             |
+ +---------------------+---------------------------------+---------------------------------------------+
+ | Make Invocations    | POST /models/{model_name}/invoke| Return inference result from                |
+ |                     | data = <invocation payload>     | the specified model                         |
+ +---------------------+---------------------------------+---------------------------------------------+
+ | Unload A Model      | DELETE /models/{model_name}     | Unload the specified model                  |
+ +---------------------+---------------------------------+---------------------------------------------+

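As a rough illustration of the interfaces listed in the table, the sketch below builds one HTTP request per interface using only the Python standard library. The base address `http://localhost:8080`, the JSON `Content-Type` header, and the helper names (`build_request`, `load_model`, `send`, and so on) are assumptions made for this example and are not defined by the container; only the methods, paths, and the `model_name`/`url` fields come from the table.

```python
import json
import urllib.request

# Assumption: the container is reachable at this address; adjust for your setup.
BASE_URL = "http://localhost:8080"


def build_request(method, path, payload=None):
    """Build (but do not send) a request for one of the Multi-Model interfaces."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    # Assumption: request bodies are sent as JSON.
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def list_model(model_name):
    # GET /models/{model_name}
    return build_request("GET", f"/models/{model_name}")


def list_models():
    # GET /models
    return build_request("GET", "/models")


def load_model(model_name, url):
    # POST /models with the model name and the path to the model data
    return build_request("POST", "/models", {"model_name": model_name, "url": url})


def invoke(model_name, payload):
    # POST /models/{model_name}/invoke with the invocation payload
    return build_request("POST", f"/models/{model_name}/invoke", payload)


def unload_model(model_name):
    # DELETE /models/{model_name}
    return build_request("DELETE", f"/models/{model_name}")


def send(request):
    """Send a prepared request and return (status code, response body as text)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

With a container running, a call such as `send(load_model("my_model", "/path/to/model"))` followed by `send(invoke("my_model", payload))` would exercise the load and invoke interfaces; the model name, path, and payload here are placeholders, and the payload shape depends on the model being served.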
### Maximum Number of Models
Note that the environment variable `SAGEMAKER_SAFE_PORT_RANGE` limits the number of models that can be loaded on the endpoint at the same time.