
Commit 2cb8be2

Replaces main refs with tag for POC Readme
Signed-off-by: Daneyon Hansen <[email protected]>
1 parent bf461e0 commit 2cb8be2

File tree

1 file changed (+8 -8 lines changed)


pkg/README.md

Lines changed: 8 additions & 8 deletions
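All eight changes follow the same pattern: a `.../raw/main/...` URL is replaced with a `raw.githubusercontent.com` URL pinned to the `v0.1.0-rc.1` tag. As a minimal sketch (not part of this commit; the `IGW_*` variable names are made up for illustration), the pinned base URL can be factored out so a future tag bump only touches one line:

```bash
# Pin the release once; every manifest URL is derived from it.
IGW_TAG=v0.1.0-rc.1
IGW_MANIFESTS=https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/${IGW_TAG}/pkg/manifests

kubectl apply -f ${IGW_MANIFESTS}/vllm/deployment.yaml
kubectl apply -f ${IGW_MANIFESTS}/inferencemodel.yaml
```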
@@ -17,7 +17,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway.
 ```bash
 kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Llama2
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/vllm/deployment.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/vllm/deployment.yaml
 ```

 1. **Install the Inference Extension CRDs:**
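Not part of the diff: after applying the pinned deployment manifest above, a quick sanity check (a sketch only; the exact deployment name comes from the tagged `deployment.yaml`, so `--all` is used here rather than assuming it):

```bash
# Wait for the sample vLLM deployment to report Available before moving on.
kubectl get deployments
kubectl wait --for=condition=Available deployment --all --timeout=10m
```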
@@ -31,22 +31,22 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 Deploy the sample InferenceModel which is configured to load balance traffic between the `tweet-summary-0` and `tweet-summary-1`
 [LoRA adapters](https://docs.vllm.ai/en/latest/features/lora.html) of the sample model server.
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/inferencemodel.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/inferencemodel.yaml
 ```

 1. **Update Envoy Gateway Config to enable Patch Policy**

 Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run:
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/enable_patch_policy.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/gateway/enable_patch_policy.yaml
 kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system
 ```
 Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again.

 1. **Deploy Gateway**

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/gateway/gateway.yaml
 ```
 > **_NOTE:_** This file couples together the gateway infra and the HTTPRoute infra for a convenient, quick startup. Creating additional/different InferencePools on the same gateway will require an additional set of: `Backend`, `HTTPRoute`, the resources included in the `./manifests/gateway/ext-proc.yaml` file, and an additional `./manifests/gateway/patch_policy.yaml` file. ***Should you choose to experiment, familiarity with xDS and Envoy are very useful.***
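For context (not shown in the diff), the `enable_patch_policy.yaml` manifest referenced above extends the Envoy Gateway config map so the `EnvoyPatchPolicy` extension API is accepted. A rough sketch of that enablement, using field names from the Envoy Gateway configuration API; treat it as an approximation and defer to the tagged manifest:

```yaml
# Sketch of enabling EnvoyPatchPolicy in the Envoy Gateway controller config.
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-gateway-config
  namespace: envoy-gateway-system
data:
  envoy-gateway.yaml: |
    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: EnvoyGateway
    gateway:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
    extensionApis:
      enableEnvoyPatchPolicy: true
```

The `kubectl rollout restart` in the diff is what makes the Envoy Gateway controller pick up the updated config map.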
@@ -60,14 +60,14 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 1. **Deploy the Inference Extension and InferencePool**

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/ext_proc.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/ext_proc.yaml
 ```

 1. **Deploy Envoy Gateway Custom Policies**

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/extension_policy.yaml
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/patch_policy.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/gateway/extension_policy.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/gateway/patch_policy.yaml
 ```
 > **_NOTE:_** This is also per InferencePool, and will need to be configured to support the new pool should you wish to experiment further.
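Not part of the diff: once the extension and patch policies are applied, their acceptance can be checked via the Envoy Gateway CRDs (the kind names below are the upstream Envoy Gateway APIs; adjust if the tagged manifests use different resources):

```bash
# Confirm the ext-proc wiring was accepted by Envoy Gateway.
kubectl get envoyextensionpolicies.gateway.envoyproxy.io -A
kubectl get envoypatchpolicies.gateway.envoyproxy.io -A
```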
@@ -76,7 +76,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 For high-traffic benchmarking you can apply this manifest to avoid any defaults that can cause timeouts/errors.

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/pkg/manifests/gateway/traffic_policy.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v0.1.0-rc.1/pkg/manifests/gateway/traffic_policy.yaml
 ```

 1. **Try it out**
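The "Try it out" step that this hunk ends on sends an OpenAI-style completion request through the gateway. A hedged sketch of such a request; the gateway name, port, and model name are assumptions drawn from the sample setup (the `tweet-summary` adapters appear earlier in the diff), so check the tagged README for the exact values:

```bash
# Resolve the gateway address, then request a completion from a LoRA adapter.
IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')
curl -i "${IP}:8081/v1/completions" \
  -H 'Content-Type: application/json' \
  -d '{"model": "tweet-summary", "prompt": "Write a short tweet summary.", "max_tokens": 100}'
```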
