Commit 8bd3485

Update site-src/guides/model-server.md

Authored by: liu-congahg-g
Co-authored-by: Abdullah Gharaibeh <[email protected]>
1 parent: 25a61d8

File tree

1 file changed: 1 addition, 1 deletion


site-src/guides/model-server.md (1 addition, 1 deletion)
````diff
@@ -18,7 +18,7 @@ vLLM is configured as the default in the [endpoint picker extension](https://git
 
 ## Triton with TensorRT-LLM Backend
 
-You need to specify the metric names when starting the EPP container. Add the following to the `args` of the [EPP deployment](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/296247b07feed430458b8e0e3f496055a88f5e89/config/manifests/inferencepool.yaml#L48).
+Specify the metric names when starting the EPP container by adding the following to the `args` of the [EPP deployment](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/296247b07feed430458b8e0e3f496055a88f5e89/config/manifests/inferencepool.yaml#L48).
 
 ```
 - -totalQueuedRequestsMetric
 - "nv_trt_llm_request_metrics{request_type=waiting}"
````
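For context, here is a sketch of how such flags could sit inside the EPP Deployment manifest. The Deployment name, container name, and image below are illustrative placeholders, not taken from this commit; only the two `args` entries come from the documentation being edited.

```yaml
# Illustrative fragment only: metadata names and image are assumptions.
# The real manifest lives at config/manifests/inferencepool.yaml in the repo.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: epp            # placeholder name
spec:
  template:
    spec:
      containers:
        - name: epp    # placeholder container name
          image: epp-image:latest  # placeholder image
          args:
            # Metric names for the Triton/TensorRT-LLM backend, as in the diff:
            - -totalQueuedRequestsMetric
            - "nv_trt_llm_request_metrics{request_type=waiting}"
```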
