Commit 42eb5ff

cleaning up inferencePool helm docs (#665)
1 parent ae3df87 commit 42eb5ff

1 file changed (+4, -1 lines changed)

config/charts/inferencepool/README.md

Lines changed: 4 additions & 1 deletion
@@ -17,9 +17,12 @@ To install via the latest published chart in staging (--version v0 indicates la
 ```txt
 $ helm install vllm-llama3-8b-instruct \
 --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+--set provider.name=[none|gke] \
 oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool --version v0
 ```
 
+Note that the provider name is needed to deploy provider-specific resources. If no provider is specified, then only the InferencePool object and the EPP are deployed.
+
 ## Uninstall
 
 Run the following command to uninstall the chart:
@@ -34,7 +37,6 @@ The following table list the configurable parameters of the chart.
 
 | **Parameter Name** | **Description** |
 |---------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
-| `inferencePool.name` | Name for the InferencePool, and endpoint picker deployment and service will be named as `{.Release.name}-epp`. |
 | `inferencePool.targetPortNumber` | Target port number for the vllm backends, will be used to scrape metrics by the inference extension. Defaults to 8000. |
 | `inferencePool.modelServers.matchLabels` | Label selector to match vllm backends managed by the inference pool. |
 | `inferenceExtension.replicas` | Number of replicas for the endpoint picker extension service. Defaults to `1`. |
@@ -43,6 +45,7 @@ The following table list the configurable parameters of the chart.
 | `inferenceExtension.image.tag` | Image tag of the endpoint picker. |
 | `inferenceExtension.image.pullPolicy` | Image pull policy for the container. Possible values: `Always`, `IfNotPresent`, or `Never`. Defaults to `Always`. |
 | `inferenceExtension.extProcPort` | Port where the endpoint picker service is served for external processing. Defaults to `9002`. |
+| `provider.name` | Name of the Inference Gateway implementation being used. Possible values: `gke`. Defaults to `none`. |
 
 ## Notes
 
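
To illustrate the new `provider.name` setting, the sketch below assembles the README's install command for a chosen provider. The helper name `build_install_cmd` and its validation step are hypothetical additions, not part of the chart. The accepted values `none` and `gke` come from the README text in this commit. The function only prints the command, so nothing is installed until the printed line is actually run.

```shell
# Hypothetical helper (not part of the chart): print the install command
# from the README with provider.name filled in.
build_install_cmd() {
  # Fall back to "none", the documented default, when no argument is given.
  provider="${1:-none}"

  # Accept only the provider values listed in the README.
  case "$provider" in
    none|gke) ;;
    *) echo "unsupported provider: $provider" >&2; return 1 ;;
  esac

  # echo joins its arguments with single spaces, giving a one-line command.
  echo "helm install vllm-llama3-8b-instruct" \
    "--set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct" \
    "--set provider.name=$provider" \
    "oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool" \
    "--version v0"
}

# Print the command for a GKE-backed deployment.
build_install_cmd gke
```

Calling `build_install_cmd` with no argument prints the same command with `provider.name=none`, which per the README note deploys only the InferencePool object and the EPP.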
