Commit cfb702a

Adding terminationGracePeriodSeconds to match vLLMs
1 parent 16ded66 commit cfb702a

2 files changed: 3 additions, 1 deletion

config/manifests/inferencemodel.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 apiVersion: inference.networking.x-k8s.io/v1alpha2
 kind: InferenceModel
 metadata:
-  name: tweet-summarizer
+  name: food-review
 spec:
   modelName: food-review
   criticality: Standard

config/manifests/inferencepool.yaml

Lines changed: 2 additions & 0 deletions
@@ -41,6 +41,8 @@ spec:
       labels:
         app: vllm-llama3-8b-instruct-epp
     spec:
+      # Conservatively, this timeout should mirror the longest grace period of the pods within the pool
+      terminationGracePeriodSeconds: 130
       containers:
       - name: epp
         image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
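
For context, the value the new comment refers to lives on the model-server pods in the pool. A minimal sketch of what such a pod template might look like, assuming a vLLM Deployment whose grace period is 130 seconds (the Deployment name, labels, container name, and image below are illustrative assumptions and do not come from this commit):

```yaml
# Hypothetical model-server Deployment fragment; all names are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-llama3-8b-instruct        # assumed name
spec:
  selector:
    matchLabels:
      app: vllm-llama3-8b-instruct     # assumed label
  template:
    metadata:
      labels:
        app: vllm-llama3-8b-instruct
    spec:
      # The endpoint picker's terminationGracePeriodSeconds is set to match
      # the longest value used by any pod in the pool, here 130 seconds.
      terminationGracePeriodSeconds: 130
      containers:
      - name: vllm                     # assumed container name
        image: vllm/vllm-openai:latest # assumed image
```

If the pool mixed pods with different grace periods, the comment in the diff suggests that the endpoint picker should use the largest of them.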
