
Commit 2aa3a8d

docs: markdown perfs

Signed-off-by: bitliu <[email protected]>
1 parent 947e44d

1 file changed (+10, −6 lines)

examples/poc/README.md
@@ -1,26 +1,29 @@
 # Envoy Ext Proc Gateway with LoRA Integration
 
-This project sets up an Envoy gateway with a custom external processor that implements advanced routing logic tailored for LoRA (Low-Rank Adaptation) adapters. The routing algorithm selects a backend based on the model specified (using the OpenAI API format) and ensures efficient load balancing based on model server metrics.
+This project sets up an Envoy gateway with a custom external processor that implements advanced routing logic tailored for LoRA (Low-Rank Adaptation) adapters. The routing algorithm selects a backend based on the model specified (using the OpenAI API format) and ensures efficient load balancing based on model server metrics.
 
-![alt text](./doc/envoy-gateway-bootstrap.png)
+![alt text](./envoy-gateway-bootstrap.png)
 
 ## Requirements
+
 - Kubernetes cluster
 - Envoy Gateway v1.1 installed on your cluster: https://gateway.envoyproxy.io/v1.1/tasks/quickstart/
 - `kubectl` command-line tool
 - Go (for local development)
-- A vLLM-based deployment using a custom fork, with LoRA adapters. ***This PoC uses a modified vLLM [fork](https://github.com/kaushikmitr/vllm); the public image of the fork is `ghcr.io/tomatillo-and-multiverse/vllm:demo`***. A sample deployment is provided under `./manifests/samples/vllm-lora-deployment.yaml`.
+- A vLLM-based deployment using a custom fork, with LoRA adapters. ***This PoC uses a modified vLLM [fork](https://github.com/kaushikmitr/vllm); the public image of the fork is `ghcr.io/tomatillo-and-multiverse/vllm:demo`***. A sample deployment is provided under `./manifests/samples/vllm-lora-deployment.yaml`.
 
 ## Quickstart
 
 ### Steps
+
 1. **Deploy Sample vLLM Application**
    NOTE: Create a HuggingFace API token and store it in a secret named `hf-token` with key `hf_api_token`. This is configured in the `HUGGING_FACE_HUB_TOKEN` and `HF_TOKEN` environment variables in `./manifests/samples/vllm-lora-deployment.yaml`.
 
    ```bash
   kubectl apply -f ./manifests/samples/vllm-lora-deployment.yaml
   kubectl apply -f ./manifests/samples/vllm-lora-service.yaml
   ```
+
 2. **Install GatewayClass with Ext Proc**
    A custom GatewayClass `llm-gateway`, configured with the LLM routing ext proc, will be installed into the `llm-gateway` namespace. It is configured to listen on port 8081 for traffic through ext-proc (in addition to the default 8080); see the `EnvoyProxy` configuration in `installation.yaml`. When you create Gateways, make sure the `llm-gateway` GatewayClass is used.

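The NOTE in step 1 above assumes the `hf-token` secret already exists. A minimal sketch of creating it with `kubectl` (the token value here is a placeholder, not a real credential; the `kubectl` call is shown commented out because it needs a live cluster):

```shell
# Placeholder token value; substitute your real HuggingFace API token.
HF_API_TOKEN="hf_xxxxxxxxxxxxxxxx"

# Create the secret the sample deployment reads
# (secret name `hf-token`, key `hf_api_token`, per the NOTE above):
# kubectl create secret generic hf-token --from-literal=hf_api_token="${HF_API_TOKEN}"

echo "secret: hf-token, key: hf_api_token"
```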
@@ -29,14 +32,16 @@
    ```bash
    kubectl apply -f ./manifests/installation.yaml
    ```
+
 3. **Deploy Gateway**
-
+
    ```bash
    kubectl apply -f ./manifests/samples/gateway.yaml
    ```
 
-4. **Try it out**
+4. **Try it out**
    Wait until the gateway is ready.
+
    ```bash
    IP=$(kubectl get gateway/llm-gateway -o jsonpath='{.status.addresses[0].value}')
    PORT=8081
@@ -49,7 +54,6 @@
    }'
    ```
 
-
 ## License
 
 This project is licensed under the MIT License.
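Step 4's "wait until the gateway is ready" and the request body elided from the diff above can be sketched roughly as follows. The readiness condition name, the adapter name `tweet-summary`, and the prompt are illustrative assumptions, not taken from this commit; the cluster-dependent commands are shown commented out:

```shell
# Block until the Gateway is programmed (condition name per the Gateway API spec;
# check `kubectl describe gateway/llm-gateway` if your version reports differently):
# kubectl wait gateway/llm-gateway --for=condition=Programmed --timeout=120s

# OpenAI completions-style body; the ext proc routes on the "model" field,
# which selects a LoRA adapter ("tweet-summary" is a hypothetical adapter name).
REQUEST_BODY='{
  "model": "tweet-summary",
  "prompt": "Write as if you were a critic: San Francisco",
  "max_tokens": 100
}'

# Send it through the ext-proc listener (port 8081, per the GatewayClass config),
# with IP and PORT set as in step 4:
# curl -i "http://${IP}:${PORT}/v1/completions" \
#   -H 'Content-Type: application/json' -d "$REQUEST_BODY"

echo "$REQUEST_BODY"
```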
