`site-src/guides/index.md`
#### CPU-Based Model Server
This setup uses the official `vllm-cpu` image, which according to the documentation can run vLLM on x86 CPU platforms.
For this setup, we use approximately 9.5GB of memory and 12 CPUs for each replica.
While it is possible to deploy the model server with fewer resources, this is not recommended.
For example, in our tests, loading the model with 8GB of memory and 1 CPU was possible, but it took almost 3.5 minutes, and inference requests took an unreasonably long time.
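As a sketch, the per-replica figures above would appear under the model server's container spec in its Deployment manifest roughly as follows. The container name is illustrative (not taken from the guide); only the memory and CPU figures come from the text above, with 9.5GB expressed as `9500Mi`:

```yaml
# Illustrative resources block for a CPU-based vLLM model server container.
# Only the quantities (about 9.5GB memory, 12 CPUs) come from the guide;
# the container name is a placeholder.
containers:
  - name: vllm-cpu
    resources:
      requests:
        cpu: "12"
        memory: 9500Mi
      limits:
        cpu: "12"
        memory: 9500Mi
```

Setting requests equal to limits gives the pod the Guaranteed QoS class, which helps avoid eviction and CPU throttling for a latency-sensitive model server.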