Commit 104d4bc

docs: add the Hugging Face secret to readme

Signed-off-by: Kay Yan <[email protected]>
Parent: f34e7f6

1 file changed: pkg/README.md (+10 −3 lines)
````diff
@@ -7,12 +7,19 @@ The current manifests rely on Envoy Gateway [v1.2.1](https://gateway.envoyproxy.

 1. **Deploy Sample vLLM Application**

-   A sample vLLM deployment with the proper protocol to work with LLM Instance Gateway can be found [here](https://github.com/kubernetes-sigs/llm-instance-gateway/tree/main/examples/poc/manifests/vllm/vllm-lora-deployment.yaml#L18).
+   Create a Hugging Face secret to download the model [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf). Ensure that the token grants access to this model.
+   Deploy a sample vLLM deployment with the proper protocol to work with LLM Instance Gateway.
+   ```bash
+   kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face token with access to Llama2
+   kubectl apply -f ../examples/poc/manifests/vllm/vllm-lora-deployment.yaml
+   ```

 1. **Deploy InferenceModel and InferencePool**

-   You can find a sample InferenceModel and InferencePool configuration, based on the vLLM deployments mentioned above, [here](https://github.com/kubernetes-sigs/llm-instance-gateway/tree/main/examples/poc/manifests/inferencepool-with-model.yaml).
-
+   Deploy a sample InferenceModel and InferencePool configuration, based on the vLLM deployments mentioned above.
+   ```bash
+   kubectl apply -f ../examples/poc/manifests/inferencepool-with-model.yaml
+   ```

 1. **Update Envoy Gateway Config to enable Patch Policy**
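As context for the `kubectl create secret generic --from-literal` step above: Kubernetes stores Secret values base64-encoded in the Secret object, so reading the token back involves decoding. A minimal sketch of that round trip, using a hypothetical placeholder value rather than a real token:

```shell
# Hypothetical placeholder; a real Hugging Face token typically starts with "hf_".
token="hf_example_token"

# kubectl create secret base64-encodes the literal, equivalent to:
encoded=$(printf '%s' "$token" | base64)

# Reading the value back (e.g. via `kubectl get secret -o jsonpath`) requires decoding:
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```

This is why inspecting a Secret with `kubectl get secret hf-token -o yaml` shows an encoded string rather than the plain token.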
