Commit b839b74

Jeffwan authored and kfswain committed
Remove outdated configurations and ensure the tutorial runs smoothly (kubernetes-sigs#136)
* Fix the outdated fields in inference pool

Signed-off-by: Jiaxin Shan <[email protected]>

* fix model routing configuration issues

Signed-off-by: Jiaxin Shan <[email protected]>

* Update to use latest gateway name

Signed-off-by: Jiaxin Shan <[email protected]>

---------

Signed-off-by: Jiaxin Shan <[email protected]>

1 parent 8e7b4c2 commit b839b74

2 files changed, 8 insertions(+), 6 deletions(-)

examples/poc/manifests/inferencepool-with-model.yaml (+7 -5)

@@ -4,8 +4,8 @@ metadata:
   labels:
   name: vllm-llama2-7b-pool
 spec:
-  targetPort: 8000
-  modelServerSelector:
+  targetPortNumber: 8000
+  selector:
     "app": "vllm-llama2-7b-pool"
 ---
 apiVersion: inference.networking.x-k8s.io/v1alpha1
@@ -16,7 +16,7 @@ metadata:
     app.kubernetes.io/managed-by: kustomize
   name: inferencemodel-sample
 spec:
-  modelName: sql-lora
+  modelName: tweet-summary
   criticality: Critical
   poolRef:
     # this is the default val:
@@ -25,6 +25,8 @@ spec:
     kind: InferencePool
     name: vllm-llama2-7b-pool
   targetModels:
-  - name: sql-lora-1fdg2
-    weight: 100
+  - name: tweet-summary-0
+    weight: 50
+  - name: tweet-summary-1
+    weight: 50
 
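
The two InferencePool renames in this file (`targetPort` to `targetPortNumber`, `modelServerSelector` to `selector`) may also affect other manifests written against the older schema. The sketch below is one way to spot them; it is not part of this commit. The function name `find_outdated_fields` is hypothetical, and the check assumes each field sits on its own line, as it does in this manifest.

```bash
# Hypothetical helper, not part of the repository.
# Prints "line-number:content" for every line that still uses a field name
# renamed by this commit; exits non-zero when no such line is found.
find_outdated_fields() {
  # The trailing ':' keeps the new name targetPortNumber from matching.
  grep -nE '^[[:space:]]*(targetPort|modelServerSelector):' "$1"
}

# Usage:
#   find_outdated_fields examples/poc/manifests/inferencepool-with-model.yaml \
#     || echo "no outdated fields found"
```

Because the function passes through the exit status of `grep`, it can be used directly in an `if` or chained with `||`, as in the usage comment.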

pkg/README.md (+1 -1)

@@ -43,7 +43,7 @@ The current manifests rely on Envoy Gateway [v1.2.1](https://gateway.envoyproxy.
 Wait until the gateway is ready.
 
 ```bash
-IP=$(kubectl get gateway/instance-gateway -o jsonpath='{.status.addresses[0].value}')
+IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')
 PORT=8081
 
 curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
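
The README step "Wait until the gateway is ready" could be scripted by polling the same `kubectl` query until it returns an address. This is a minimal sketch, not part of the README: the function names `get_gateway_ip` and `wait_for_gateway_ip`, the default of 30 attempts, and the 2-second delay are all assumptions, and it presumes the Gateway is named `inference-gateway` in the current namespace.

```bash
# Hypothetical helpers, not part of the README.

# Print the first address reported by the Gateway (same query as the README).
get_gateway_ip() {
  kubectl get gateway/inference-gateway \
    -o jsonpath='{.status.addresses[0].value}' 2>/dev/null
}

# Poll until an address appears; give up after $1 attempts, $2 seconds apart.
wait_for_gateway_ip() {
  wfg_attempts="${1:-30}"
  wfg_delay="${2:-2}"
  wfg_i=0
  while [ "$wfg_i" -lt "$wfg_attempts" ]; do
    # Treat a failed query the same as an empty address and retry.
    wfg_ip="$(get_gateway_ip)" || wfg_ip=""
    if [ -n "$wfg_ip" ]; then
      printf '%s\n' "$wfg_ip"
      return 0
    fi
    sleep "$wfg_delay"
    wfg_i=$((wfg_i + 1))
  done
  return 1
}

# Usage (requires cluster access):
#   IP=$(wait_for_gateway_ip) || { echo "gateway has no address yet" >&2; exit 1; }
```

Keeping the `kubectl` call in its own function means the polling loop can be exercised without a cluster by redefining `get_gateway_ip`.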
