-where `inferencePool.targetPortNumber` is the pod that vllm backends served on and `inferencePool.modelServers.matchLabels` is the selector to match the vllm backends.
 To install via the latest published chart in staging (--version v0 indicates latest dev version), you can run the following command:
-|`inferencePool.name`| Name for the InferencePool, and inference extension will be named as `${inferencePool.name}-epp`.|
-|`inferencePool.targetPortNumber`| Target port number for the vllm backends, will be used to scrape metrics by the inference extension. |
-|`inferencePool.modelServers.matchLabels`| Label selector to match vllm backends managed by the inference pool. |
-|`inferenceExtension.replicas`| Number of replicas for the inference extension service. Defaults to `1`.|
-|`inferenceExtension.image.name`| Name of the container image used for the inference extension.|
-|`inferenceExtension.image.hub`| Registry URL where the inference extension image is hosted. |
-|`inferenceExtension.image.tag`| Image tag of the inference extension.|
-|`inferenceExtension.image.pullPolicy`| Image pull policy for the container. Possible values: `Always`, `IfNotPresent`, or `Never`. Defaults to `Always`. |
-|`inferenceExtension.extProcPort`| Port where the inference extension service is served for external processing. Defaults to `9002`. |
+|`inferencePool.name`| Name for the InferencePool, and endpoint picker deployment and service will be named as `{.Release.name}-epp`. |
+|`inferencePool.targetPortNumber`| Target port number for the vllm backends, will be used to scrape metrics by the inference extension. Defaults to 8000.|
+|`inferencePool.modelServers.matchLabels`| Label selector to match vllm backends managed by the inference pool. |
+|`inferenceExtension.replicas`| Number of replicas for the endpoint picker extension service. Defaults to `1`. |
+|`inferenceExtension.image.name`| Name of the container image used for the endpoint picker. |
+|`inferenceExtension.image.hub`| Registry URL where the endpoint picker image is hosted.|
+|`inferenceExtension.image.tag`| Image tag of the endpoint picker. |
+|`inferenceExtension.image.pullPolicy`| Image pull policy for the container. Possible values: `Always`, `IfNotPresent`, or `Never`. Defaults to `Always`. |
+|`inferenceExtension.extProcPort`| Port where the endpoint picker service is served for external processing. Defaults to `9002`.|
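
For orientation, the parameters in the updated table could be set together in a values file along the lines of the sketch below. The nesting is inferred only from the dotted parameter names. The pool name, the label, and the image name, hub, and tag are hypothetical placeholders, not values taken from the chart. The port, replica count, and pull policy simply restate the defaults documented above.

```yaml
# Illustrative values file; key layout inferred from the dotted parameter names in the table.
inferencePool:
  name: my-pool                      # hypothetical InferencePool name
  targetPortNumber: 8000             # documented default: port the vllm backends serve on
  modelServers:
    matchLabels:
      app: my-vllm-backend           # hypothetical label carried by the vllm backend pods

inferenceExtension:
  replicas: 1                        # documented default
  image:
    name: epp                        # hypothetical image name
    hub: registry.example.com/my-org # hypothetical registry URL
    tag: example-tag                 # hypothetical image tag
    pullPolicy: Always               # documented default; IfNotPresent and Never are also accepted
  extProcPort: 9002                  # documented default external-processing port
```

A file like this would typically be passed to the install command with Helm's `-f` flag, or individual keys overridden with `--set`.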