Commit 6a3b25e

added limits to cpu deployment
Signed-off-by: Nir Rozenbaum <[email protected]>
1 parent e2c381b commit 6a3b25e

1 file changed: +9 -0 lines changed

config/manifests/vllm/cpu-deployment.yaml

Lines changed: 9 additions & 0 deletions
@@ -31,6 +31,8 @@ spec:
           value: "8000"
         - name: VLLM_ALLOW_RUNTIME_LORA_UPDATING
           value: "true"
+        - name: VLLM_CPU_KVCACHE_SPACE
+          value: "4"
         ports:
         - containerPort: 8000
           name: http
@@ -55,6 +57,13 @@ spec:
           periodSeconds: 5
           successThreshold: 1
           timeoutSeconds: 1
+        resources:
+          limits:
+            cpu: "12"
+            memory: "9000Mi"
+          requests:
+            cpu: "12"
+            memory: "9000Mi"
         volumeMounts:
         - mountPath: /data
           name: data
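
As a rough sanity check on the values in this change, the sketch below compares the KV cache reservation with the container memory limit. It assumes VLLM_CPU_KVCACHE_SPACE is interpreted in GiB, as the vLLM CPU backend documentation describes. The helper `mebibytes` is hypothetical, written only for this illustration, and handles just the `Mi` and `Gi` suffixes.

```python
# Values copied from the diff above.
KVCACHE_SPACE_GIB = 4      # VLLM_CPU_KVCACHE_SPACE (assumed to be in GiB)
MEMORY_LIMIT = "9000Mi"    # resources.limits.memory


def mebibytes(quantity: str) -> int:
    """Convert a Kubernetes memory quantity to MiB.

    Hypothetical helper: supports only the Mi and Gi suffixes.
    """
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    if quantity.endswith("Gi"):
        return int(quantity[:-2]) * 1024
    raise ValueError(f"unsupported memory quantity: {quantity!r}")


limit_mib = mebibytes(MEMORY_LIMIT)
kv_cache_mib = KVCACHE_SPACE_GIB * 1024
headroom_mib = limit_mib - kv_cache_mib  # left for weights, runtime, etc.

print(f"memory limit: {limit_mib} MiB")
print(f"KV cache:     {kv_cache_mib} MiB")
print(f"headroom:     {headroom_mib} MiB")
```

Under these assumptions the KV cache takes 4096 MiB of the 9000 MiB limit, leaving 4904 MiB for model weights and everything else in the container. Whether that is enough depends on the model being served, which this diff does not show.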
