Description
What happened:
When I followed https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_2_2 for the CPU deployment and then ran:

```shell
IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')
PORT=80

curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
  "model": "Qwen/Qwen2.5-1.5B-Instruct",
  "prompt": "Write as if you were a critic: San Francisco",
  "max_tokens": 100,
  "temperature": 0
}'
```
the request returned a 500 Internal Server Error.
What you expected to happen:
The request should return a valid completion response.
How to reproduce it (as minimally and precisely as possible):
Follow https://gateway-api-inference-extension.sigs.k8s.io/guides/#__tabbed_2_2 for the CPU deployment.
Anything else we need to know?:
I debugged it, and I think the error was introduced by the change in 2f72a8a#diff-cabeb9ea1c075199163242f9adca3bdaad2cf1dda1aafb600c5d57885066e471. The epp deployment logged:

```
2025-05-06T20:41:17Z LEVEL(-2) health epp/health.go:38 gRPC health check requested unknown service {"available-services": ["envoy.service.ext_proc.v3.ExternalProcessor"], "requested-service": ""}
```

It looks like the requested service name does not match the registered one. I suspect https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/config/manifests/inferencepool-resources.yaml#L51 is still using the old image, which points to the old inference-extension service.
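For anyone trying to reproduce the mismatch directly, the epp health server can be queried by service name with `grpcurl`; this is just a debugging sketch, and the port (9003) and deployment name are my assumptions based on the quickstart manifests, so adjust them for your setup:

```shell
# Forward the epp gRPC health port locally (port and deployment name
# are assumptions from the quickstart manifests).
kubectl port-forward deployment/vllm-llama3-8b-instruct-epp 9003:9003 &

# An empty "service" field reproduces the "unknown service" log line above;
# naming the registered ext_proc service should report SERVING on a healthy epp.
grpcurl -plaintext \
  -d '{"service": "envoy.service.ext_proc.v3.ExternalProcessor"}' \
  localhost:9003 grpc.health.v1.Health/Check
```

The log line shows the probe arrived with `"requested-service": ""`, so whichever client is probing (the gateway's health check, per the linked diff) is not sending the service name the epp registers.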
The main branch seems to be broken right now.
Things got fixed when I installed the released v0.3.0 chart instead of the manifests from main:

```shell
helm install vllm-llama3-8b-instruct \
  --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
  --set provider.name=gke \
  --version v0.3.0 \
  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```
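To check my "old image" guess, one way is to compare the image the running epp Deployment actually uses against what the manifest on main pins; a sketch (deployment name and default namespace are assumptions from the quickstart):

```shell
# Print the image of the first container in the epp deployment
# (deployment name assumed from the quickstart manifests).
kubectl get deployment vllm-llama3-8b-instruct-epp \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```

If that tag predates the linked commit, the deployment is running an epp build whose health service names no longer match what the gateway probes.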