Commit a19e7a3

minor updates to cpu deployment
Signed-off-by: Nir Rozenbaum <[email protected]>
1 parent ee35a93 commit a19e7a3

1 file changed: +4 -2 lines changed

config/manifests/vllm/cpu-deployment.yaml

Lines changed: 4 additions & 2 deletions
@@ -26,8 +26,8 @@ spec:
   - "--max-loras"
   - "4"
   - "--lora-modules"
-  - '{"name": "tweet-summary-0", "path": "/adapters/hub/models--ai-blond--Qwen-Qwen2.5-Coder-1.5B-Instruct-lora/snapshots/9cde18d8ed964b0519fb481cca6acd936b2ca811"}'
-  - '{"name": "tweet-summary-1", "path": "/adapters/hub/models--ai-blond--Qwen-Qwen2.5-Coder-1.5B-Instruct-lora/snapshots/9cde18d8ed964b0519fb481cca6acd936b2ca811"}'
+  - '{"name": "tweet-summary-0", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_0"}'
+  - '{"name": "tweet-summary-1", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_1"}'
   env:
     - name: PORT
       value: "8000"
@@ -38,6 +38,8 @@ spec:
           key: token
     - name: VLLM_ALLOW_RUNTIME_LORA_UPDATING
       value: "true"
+    - name: VLLM_CPU_KVCACHE_SPACE
+      value: "4"
   ports:
     - containerPort: 8000
       name: http
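
The first hunk replaces two `--lora-modules` entries that shared one snapshot path with entries that each point at their own adapter directory. As a rough illustration of that change, the sketch below parses the two new JSON values (copied verbatim from the diff) and checks that adapter names and paths are distinct. The helper `check_lora_modules` is hypothetical and not part of the repository or of vLLM, and the distinct-path rule is an assumption made for this sketch: the previous manifest pointed both adapters at the same path, so vLLM itself evidently does not require it.

```python
import json

# The two --lora-modules values added by this commit, copied from the diff.
LORA_MODULE_ARGS = [
    '{"name": "tweet-summary-0", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_0"}',
    '{"name": "tweet-summary-1", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_1"}',
]


def check_lora_modules(raw_args):
    """Hypothetical helper: parse each JSON value and reject duplicate names or paths."""
    modules = [json.loads(arg) for arg in raw_args]
    for module in modules:
        # Every entry in the manifest carries exactly these two keys.
        missing = {"name", "path"} - module.keys()
        if missing:
            raise ValueError(f"entry {module!r} is missing keys: {sorted(missing)}")
    names = [m["name"] for m in modules]
    paths = [m["path"] for m in modules]
    if len(set(names)) != len(names):
        raise ValueError("duplicate adapter names")
    if len(set(paths)) != len(paths):
        # Assumption of this sketch only; the old manifest reused one path.
        raise ValueError("duplicate adapter paths")
    return modules


modules = check_lora_modules(LORA_MODULE_ARGS)
```

Run against the two removed entries instead, the same check would raise on the shared snapshot path, which is the difference this hunk introduces.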
