Envoy update #18
Changes from all commits
868a861
9fa80a9
c3571bd
c46a496
e98db98
234a0ac
cc9105f
32f050c
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
apiVersion: v1 | ||
kind: ConfigMap | ||
metadata: | ||
name: envoy-gateway-config | ||
namespace: envoy-gateway-system | ||
data: | ||
# This manifest's main purpose is to set `enableEnvoyPatchPolicy` to `true`. | ||
# Every field under `admin` is optional and only enables the admin endpoints for debugging. | ||
# Admin Interface: https://www.envoyproxy.io/docs/envoy/latest/operations/admin | ||
# PatchPolicy docs: https://gateway.envoyproxy.io/docs/tasks/extensibility/envoy-patch-policy/#enable-envoypatchpolicy | ||
envoy-gateway.yaml: | | ||
apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
kind: EnvoyGateway | ||
provider: | ||
type: Kubernetes | ||
gateway: | ||
controllerName: gateway.envoyproxy.io/gatewayclass-controller | ||
extensionApis: | ||
enableEnvoyPatchPolicy: true | ||
enableBackend: true | ||
# admin: | ||
# enablePprof: true | ||
# address: | ||
# host: 127.0.0.1 | ||
# port: 19000 | ||
# enableDumpConfig: true |
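
The diff view above has lost the YAML indentation. As a reading aid, here is a minimal sketch of the same ConfigMap with the nesting restored and the optional `admin` block uncommented. It assumes the dump-config field is spelled `enableDumpConfig`, so check the EnvoyGateway API reference for the version you deploy before relying on it.

```yaml
# Sketch only: same values as the manifest above, with the admin block enabled.
# Assumption: the field is `enableDumpConfig`; verify against your Envoy Gateway version.
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-gateway-config
  namespace: envoy-gateway-system
data:
  envoy-gateway.yaml: |
    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: EnvoyGateway
    provider:
      type: Kubernetes
    gateway:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
    extensionApis:
      enableEnvoyPatchPolicy: true   # required by the EnvoyPatchPolicy in this PR
      enableBackend: true            # required by the Backend resource in this PR
    admin:                           # optional, debugging only
      enablePprof: true
      address:
        host: 127.0.0.1
        port: 19000
      enableDumpConfig: true
```

Binding the admin address to `127.0.0.1` keeps the endpoints reachable only from inside the pod, which is probably what you want for a debugging aid.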
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
apiVersion: apps/v1 | ||
kind: Deployment | ||
metadata: | ||
name: instance-gateway-ext-proc | ||
namespace: default | ||
labels: | ||
app: instance-gateway-ext-proc | ||
spec: | ||
replicas: 1 | ||
selector: | ||
matchLabels: | ||
app: instance-gateway-ext-proc | ||
template: | ||
metadata: | ||
labels: | ||
app: instance-gateway-ext-proc | ||
spec: | ||
containers: | ||
- name: instance-gateway-ext-proc | ||
image: ghcr.io/tomatillo-and-multiverse/ext-proc:demo | ||
args: | ||
#TODO: specify label selector and dynamically update pods | ||
Review comment: We should actually pass the name of the LLMServerPool. Reply: Yeah, totally agree. I plan to have ext-proc pull the selection from the config on the LSP so there is always a single source of truth (once we have that set up). |
||
- -pods | ||
- "vllm-78665f78c4-h4kx4,vllm-78665f78c4-hnz84" | ||
- -podIPs | ||
- "10.24.11.6:8000,10.24.5.7:8000" | ||
- -enable-fairness | ||
- "false" | ||
ports: | ||
- containerPort: 9002 | ||
- name: curl | ||
image: curlimages/curl | ||
command: ["sleep", "3600"] | ||
--- | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
name: instance-gateway-ext-proc | ||
namespace: default | ||
spec: | ||
selector: | ||
app: instance-gateway-ext-proc | ||
ports: | ||
- protocol: TCP | ||
port: 9002 | ||
targetPort: 9002 | ||
type: ClusterIP | ||
--- | ||
apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
kind: EnvoyExtensionPolicy | ||
metadata: | ||
name: ext-proc-policy | ||
namespace: default | ||
spec: | ||
extProc: | ||
- backendRefs: | ||
- group: "" | ||
kind: Service | ||
name: instance-gateway-ext-proc | ||
port: 9002 | ||
processingMode: | ||
request: | ||
body: Buffered | ||
response: | ||
messageTimeout: 5s | ||
targetRef: | ||
group: gateway.networking.k8s.io | ||
kind: HTTPRoute | ||
name: llm-route |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
|
||
--- | ||
apiVersion: gateway.networking.k8s.io/v1 | ||
kind: Gateway | ||
metadata: | ||
name: <GATEWAY-NAME> | ||
spec: | ||
gatewayClassName: <GATEWAY-NAME> | ||
listeners: | ||
- name: http | ||
protocol: HTTP | ||
port: 8080 | ||
- name: llm-gw | ||
protocol: HTTP | ||
port: 8081 | ||
--- | ||
apiVersion: gateway.networking.k8s.io/v1 | ||
kind: GatewayClass | ||
metadata: | ||
name: <GATEWAY-NAME> | ||
spec: | ||
controllerName: gateway.envoyproxy.io/gatewayclass-controller | ||
--- | ||
apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
kind: Backend | ||
metadata: | ||
name: backend-dummy | ||
spec: | ||
endpoints: | ||
- fqdn: | ||
# Both these values are arbitrary and unused as the PatchPolicy redirects requests. | ||
hostname: 'foo.bar.com' | ||
port: 8080 | ||
--- | ||
apiVersion: gateway.networking.k8s.io/v1 | ||
kind: HTTPRoute | ||
metadata: | ||
name: llm-route | ||
spec: | ||
parentRefs: | ||
- name: inference-gateway | ||
sectionName: llm-gw | ||
rules: | ||
- backendRefs: | ||
- group: gateway.envoyproxy.io | ||
kind: Backend | ||
name: backend-dummy |
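
The manifest above leaves `<GATEWAY-NAME>` as a placeholder while the HTTPRoute's `parentRefs` already names `inference-gateway`. The sketch below shows one consistent way to fill it in, with the nesting restored. Using `inference-gateway` for the placeholder is an assumption drawn from that `parentRefs` entry, not something the PR states.

```yaml
# Sketch only: `inference-gateway` is an assumed value for <GATEWAY-NAME>,
# taken from the HTTPRoute's parentRefs entry.
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: inference-gateway
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-gateway
spec:
  gatewayClassName: inference-gateway
  listeners:
    - name: http
      protocol: HTTP
      port: 8080
    - name: llm-gw          # the listener the HTTPRoute attaches to
      protocol: HTTP
      port: 8081
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
    - name: inference-gateway
      sectionName: llm-gw   # must match a listener name on the Gateway
  rules:
    - backendRefs:
        - group: gateway.envoyproxy.io
          kind: Backend
          name: backend-dummy
```

Under this assumption, and if the Gateway is applied in the `default` namespace, the EnvoyPatchPolicy in this PR would target `inference-gateway` and patch the RouteConfiguration named `default/inference-gateway/llm-gw`.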
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
apiVersion: gateway.envoyproxy.io/v1alpha1 | ||
kind: EnvoyPatchPolicy | ||
metadata: | ||
name: custom-response-patch-policy | ||
namespace: default | ||
spec: | ||
targetRef: | ||
group: gateway.networking.k8s.io | ||
kind: Gateway | ||
name: <GATEWAY-NAME> | ||
type: JSONPatch | ||
jsonPatches: | ||
# Creates a cluster of type ORIGINAL_DST, which is necessary for direct | ||
# pod scheduling; our scheduling relies on it heavily. | ||
# Specifically, the `original_dst_lb_config` field lets us enable | ||
# `use_http_header` and set `http_header_name`. | ||
# Source: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto | ||
- type: "type.googleapis.com/envoy.config.cluster.v3.Cluster" | ||
Review comment: 🚀 |
||
name: original_destination_cluster | ||
operation: | ||
op: add | ||
path: "" | ||
value: | ||
name: original_destination_cluster | ||
type: ORIGINAL_DST | ||
original_dst_lb_config: | ||
use_http_header: true | ||
http_header_name: "target-pod" | ||
connect_timeout: 6s | ||
lb_policy: CLUSTER_PROVIDED | ||
dns_lookup_family: V4_ONLY | ||
|
||
- type: "type.googleapis.com/envoy.config.route.v3.RouteConfiguration" | ||
name: default/<GATEWAY-NAME>/llm-gw | ||
operation: | ||
op: replace | ||
path: "/virtual_hosts/1/routes/0/route/cluster" | ||
value: original_destination_cluster |
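
To make the second patch easier to read: the `path` is a JSON Pointer with zero-based indices, so `/virtual_hosts/1/routes/0/route/cluster` selects the first route of the second virtual host. Below is a hypothetical view of the RouteConfiguration after the `replace` operation. The virtual host names and the `match` block are placeholders invented for illustration; only the `cluster` value and the resource name come from the policy above.

```yaml
# Hypothetical result of the `replace` patch. Everything except `name` and the
# `cluster` value is a placeholder, not output captured from a real Envoy.
name: default/<GATEWAY-NAME>/llm-gw
virtual_hosts:
  - name: placeholder-virtual-host-0    # index 0, untouched by the patch
  - name: placeholder-virtual-host-1    # index 1, selected by /virtual_hosts/1
    routes:
      - match:                          # index 0, selected by /routes/0
          prefix: "/"
        route:
          cluster: original_destination_cluster   # value written by the patch
```

Because the path is index-based, the patch depends on the order in which Envoy Gateway generates virtual hosts and routes, so it is worth re-checking against a config dump whenever routes are added to the Gateway.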
This file was deleted.
Review comment: Do we have instructions for deploying the Envoy gateway controller?
Reply: Yup! The quickstart on line 10 points them there.