pkg/README.md: 3 additions & 3 deletions
````diff
@@ -13,10 +13,10 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 
 1. **Deploy Sample Model Server**
 
-   Create a Hugging Face secret to download the [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) model. Ensure that the token grants access to this model.
-
-   Replace `$HF_TOKEN` in `./manifests/vllm/deployment.yaml` with your Hugging Face secret and then deploy the model server.
+   Create a Hugging Face secret to download the model [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf). Ensure that the token grants access to this model.
+
+   Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway.
 
   ```bash
+  kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Llama2
````
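Putting the revised step together, the full deploy sequence would be roughly the following sketch. The manifest path `./manifests/vllm/deployment.yaml` comes from the removed line of the diff; the `app=vllm` label selector is an assumption for illustration, not confirmed by this change.

```shell
# Create the Hugging Face token secret; the token must grant access to Llama-2-7b-hf
kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN

# Deploy the sample vLLM model server (manifest path taken from the removed diff line)
kubectl apply -f ./manifests/vllm/deployment.yaml

# Wait for the model server pod to become ready
# (the app=vllm label is a hypothetical selector for this sketch)
kubectl wait --for=condition=Ready pod -l app=vllm --timeout=600s
```

Creating the secret first means the deployment can mount or reference it immediately, so the pod does not crash-loop waiting for credentials.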