Commit ba867c5

switch to using formal vllm-cpu image (#511)
* switch to formal vllm-cpu image
  Signed-off-by: Nir Rozenbaum <[email protected]>
* documentation of formal vllm-cpu image
  Signed-off-by: Nir Rozenbaum <[email protected]>
* minor updates to cpu deployment
  Signed-off-by: Nir Rozenbaum <[email protected]>
---------
Signed-off-by: Nir Rozenbaum <[email protected]>
1 parent 7f839ae commit ba867c5

1 file changed: +7 -3 lines changed

config/manifests/vllm/cpu-deployment.yaml

Lines changed: 7 additions & 3 deletions
@@ -14,7 +14,7 @@ spec:
     spec:
       containers:
         - name: lora
-          image: "seedjeffwan/vllm-cpu-env:bb392af4-20250203"
+          image: "public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.7.2" # formal images can be found in https://gallery.ecr.aws/q9t5s3a7/vllm-cpu-release-repo
           imagePullPolicy: Always
           command: ["python3", "-m", "vllm.entrypoints.openai.api_server"]
           args:
@@ -23,9 +23,11 @@ spec:
           - "--port"
           - "8000"
           - "--enable-lora"
+          - "--max-loras"
+          - "4"
           - "--lora-modules"
-          - '{"name": "tweet-summary-0", "path": "/adapters/hub/models--ai-blond--Qwen-Qwen2.5-Coder-1.5B-Instruct-lora/snapshots/9cde18d8ed964b0519fb481cca6acd936b2ca811"}'
-          - '{"name": "tweet-summary-1", "path": "/adapters/hub/models--ai-blond--Qwen-Qwen2.5-Coder-1.5B-Instruct-lora/snapshots/9cde18d8ed964b0519fb481cca6acd936b2ca811"}'
+          - '{"name": "tweet-summary-0", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_0"}'
+          - '{"name": "tweet-summary-1", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_1"}'
           env:
             - name: PORT
               value: "8000"
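
Review note: the hunk above adds `--max-loras 4` and points the two statically registered adapters at new paths. A quick way to sanity-check such an args list before applying the manifest is sketched below. The helper names `parse_lora_args` and `check` are hypothetical, and the rule "no more static adapters than `--max-loras`" is an assumption made for this sketch rather than something this commit or vLLM is known to enforce.

```python
import json

# Args copied from the updated manifest, starting at "--enable-lora".
ARGS = [
    "--enable-lora",
    "--max-loras", "4",
    "--lora-modules",
    '{"name": "tweet-summary-0", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_0"}',
    '{"name": "tweet-summary-1", "path": "/adapters/ai-blond/Qwen-Qwen2.5-Coder-1.5B-Instruct-lora_1"}',
]


def parse_lora_args(args):
    """Return (max_loras, modules) parsed from a container args list (hypothetical helper)."""
    max_loras = None
    modules = []
    i = 0
    while i < len(args):
        if args[i] == "--max-loras":
            max_loras = int(args[i + 1])
            i += 2
        elif args[i] == "--lora-modules":
            i += 1
            # Consume JSON values until the next flag starts.
            while i < len(args) and not args[i].startswith("--"):
                modules.append(json.loads(args[i]))
                i += 1
        else:
            i += 1
    return max_loras, modules


def check(args):
    """Return a list of problems found in the LoRA-related args (empty if none)."""
    max_loras, modules = parse_lora_args(args)
    names = [m.get("name") for m in modules]
    problems = []
    if len(set(names)) != len(names):
        problems.append("duplicate adapter names")
    if any(not m.get("path") for m in modules):
        problems.append("adapter without a path")
    # Assumed rule for this sketch only: static adapters should fit within --max-loras.
    if max_loras is not None and len(modules) > max_loras:
        problems.append("more static adapters than --max-loras")
    return problems


print(check(ARGS))  # prints []
```

With the values from this commit, two adapters are registered against a limit of four, so the check reports no problems.
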
@@ -36,6 +38,8 @@ spec:
                   key: token
             - name: VLLM_ALLOW_RUNTIME_LORA_UPDATING
               value: "true"
+            - name: VLLM_CPU_KVCACHE_SPACE
+              value: "4"
           ports:
             - containerPort: 8000
               name: http
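
Review note: the container runs vLLM's OpenAI-compatible server on port 8000, and an adapter registered through `--lora-modules` is selected by passing its name in the `model` field of a request. The sketch below only builds such a request, so it can be inspected without a running pod. The base URL `http://localhost:8000` assumes a local port-forward to the deployment, and `build_completion_request` is a hypothetical helper name, not part of this repository.

```python
import json
import urllib.request


def build_completion_request(adapter, prompt, base_url="http://localhost:8000"):
    """Build (but do not send) a completion request that targets a LoRA adapter by name.

    base_url is an assumption: it presumes the pod's port 8000 is forwarded locally.
    """
    body = json.dumps(
        {"model": adapter, "prompt": prompt, "max_tokens": 32}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_completion_request("tweet-summary-0", "Write a short tweet about Kubernetes.")
print(req.get_method(), req.full_url)
# To actually send it once the server is reachable:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Swapping the first argument for `tweet-summary-1` would target the second adapter from the manifest.
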
