Commit 95ddf2e

Reduced GPU requirements
1 parent d66b732 commit 95ddf2e

1 file changed: +2 -2 lines changed
pkg/manifests/vllm/deployment.yaml

Lines changed: 2 additions & 2 deletions
@@ -16,7 +16,7 @@ kind: Deployment
 metadata:
   name: vllm-llama2-7b-pool
 spec:
-  replicas: 3
+  replicas: 1
   selector:
     matchLabels:
       app: vllm-llama2-7b-pool
@@ -39,7 +39,7 @@ spec:
           - "8000"
           - "--enable-lora"
           - "--max-loras"
-          - "4"
+          - "2"
           - "--max-cpu-loras"
           - "12"
           - "--lora-modules"
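
A rough sketch of what the two changed values mean for the pool, not taken from the repository. The `PoolSettings` class and both helper functions are hypothetical names introduced for illustration. The default of one GPU per replica is an assumption, since the diff does not show the container's resource limits. The check that `--max-cpu-loras` is at least `--max-loras` reflects my understanding of vLLM's LoRA configuration and is worth confirming against the vLLM version in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSettings:
    """Hypothetical container for the values touched by this commit."""
    replicas: int        # spec.replicas in the Deployment
    max_loras: int       # value passed to --max-loras
    max_cpu_loras: int   # value passed to --max-cpu-loras


def total_gpus(settings: PoolSettings, gpus_per_replica: int = 1) -> int:
    """Total GPUs requested by the pool.

    gpus_per_replica=1 is an assumption; the diff does not show the
    container's GPU resource limits.
    """
    return settings.replicas * gpus_per_replica


def check_lora_limits(settings: PoolSettings) -> None:
    """Raise if fewer adapters fit in CPU memory than in a GPU batch."""
    if settings.max_cpu_loras < settings.max_loras:
        raise ValueError("--max-cpu-loras must be >= --max-loras")


# Values before and after commit 95ddf2e, read from the diff above.
before = PoolSettings(replicas=3, max_loras=4, max_cpu_loras=12)
after = PoolSettings(replicas=1, max_loras=2, max_cpu_loras=12)

for settings in (before, after):
    check_lora_limits(settings)

print(total_gpus(before), "->", total_gpus(after))  # prints "3 -> 1"
```

Under that one-GPU-per-replica assumption, the replica change alone cuts the pool from three GPUs to one. Lowering `--max-loras` from 4 to 2 halves the number of adapters that can be active in a single batch on each GPU, while the CPU-side limit of 12 is unchanged.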
