Commit cb83786

committed
documentation cpu platform
Signed-off-by: Nir Rozenbaum <[email protected]>
1 parent 9b9f288 commit cb83786

1 file changed

+1
-0
lines changed

site-src/guides/index.md

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv

 #### CPU-Based Model Server

+This setup is using the formal `vllm-cpu` image, which according to the documentation can run vLLM on x86 CPU platform.
 For this setup, we use approximately 9.5GB of memory and 12 CPUs for each replica.
 While it is possible to deploy the model server with less resources, this is not recommended.
 For example, in our tests, loading the model using 8GB of memory and 1 CPU was possible but took almost 3.5 minutes and inference requests took unreasonable time.
