site-src/guides/model-server.md
Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Any model server that conform to the [model server protocol](https://github.com/
|vLLM V1|v0.8.0 and above|[commit bc32bc7](https://github.com/vllm-project/vllm/commit/bc32bc73aad076849ac88565cff745b01b17d89c)||
|Triton(TensorRT-LLM)| TODO| Pending [PR](https://github.com/triton-inference-server/tensorrtllm_backend/pull/725). |LoRA affinity feature is not available as the required LoRA metrics haven't been implemented in Triton yet.|

-## Use vLLM
+## vLLM

vLLM is configured as the default in the [endpoint picker extension](https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/pkg/epp). No further configuration is required.