Commit f381644

Update the endpoint picker proposal
1 parent c86ea56 commit f381644

1 file changed, with 2 additions and 2 deletions:
  • docs/proposals/003-endpoint-picker-protocol

docs/proposals/003-endpoint-picker-protocol/README.md

@@ -12,7 +12,7 @@ The EPP MUST implement the Envoy
 [external processing service](https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/ext_proc/v3/external_processor)protocol.
 
 For each HTTP request, the EPP MUST communicate to the proxy the picked model server endpoint, via
-adding the `target-pod` HTTP header in the request, or otherwise return an error.
+adding the `x-gateway-destination-endpoint` HTTP header in the request and as an unstructured entry in the [dynamic_metadata](https://github.com/envoyproxy/go-control-plane/blob/c19bf63a811c90bf9e02f8e0dc1dcef94931ebb4/envoy/service/ext_proc/v3/external_processor.pb.go#L320) field of the ext-proc response, or otherwise return an error.
 
 ## Model Server Protocol
 
@@ -62,4 +62,4 @@ The model server MUST expose the following LoRA adapter metrics via the same Pro
 Requests will be queued if the model server has reached MaxActiveAdapter and canno load the
 requested adapter. Example: `"max_lora": "8"`.
 * `running_lora_adapters`: A comma separated list of adapters that are currently loaded in GPU
-memory and ready to serve requests. Example: `"running_lora_adapters": "adapter1, adapter2"`
+memory and ready to serve requests. Example: `"running_lora_adapters": "adapter1, adapter2"`
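
The first hunk requires the EPP to report the picked endpoint in two places, or fail. A minimal sketch of that rule follows, assuming plain Python dicts in place of the real ext-proc protobuf messages. `PickResponse`, `NoEndpointAvailable`, and `build_pick_response` are hypothetical names, the `ip:port` endpoint format is an assumption, and reusing the header name as the `dynamic_metadata` key is also an assumption, since the diff does not state the key.

```python
from dataclasses import dataclass, field

# Header name taken from the proposal text in the diff above.
DESTINATION_ENDPOINT_KEY = "x-gateway-destination-endpoint"


class NoEndpointAvailable(Exception):
    """Hypothetical error raised when no model server endpoint was picked."""


@dataclass
class PickResponse:
    """Hypothetical stand-in for an ext-proc response, not the Envoy type."""
    set_headers: dict = field(default_factory=dict)
    dynamic_metadata: dict = field(default_factory=dict)


def build_pick_response(endpoint):
    """Report the picked endpoint in the header and in dynamic metadata."""
    if not endpoint:
        # The proposal says the EPP must otherwise return an error.
        raise NoEndpointAvailable("no model server endpoint was picked")
    return PickResponse(
        set_headers={DESTINATION_ENDPOINT_KEY: endpoint},
        # Assumption: the metadata entry uses the same key as the header.
        dynamic_metadata={DESTINATION_ENDPOINT_KEY: endpoint},
    )
```

A real EPP would populate the corresponding fields of the Envoy `ProcessingResponse` message instead of these dicts.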

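The second hunk shows the LoRA metric labels as strings: `"max_lora": "8"` and a comma separated `running_lora_adapters` list. A consumer of these labels might parse them roughly as follows. This is a sketch only: `parse_lora_labels` is a hypothetical helper, and it assumes the labels have already been scraped into a dict.

```python
def parse_lora_labels(labels):
    """Parse LoRA metric labels into (max_lora, running_adapters)."""
    # max_lora arrives as a string label, e.g. "8".
    max_lora = int(labels["max_lora"])
    raw = labels.get("running_lora_adapters", "")
    # Split on commas and strip the optional space after each comma;
    # an empty label yields an empty list.
    running = [name.strip() for name in raw.split(",") if name.strip()]
    return max_lora, running
```

For the example values in the diff, this yields `8` and the two adapter names `adapter1` and `adapter2`.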