3 files changed, +37 -33 lines changed

@@ -30,7 +30,6 @@ The current manifests rely on Envoy Gateway [v1.2.1](https://gateway.envoyproxy.
 ```
 Additionally, if you would like to enable the admin interface, you can uncomment the admin lines and run this again.
 
-
 1. **Deploy Gateway**
 
 ```bash
@@ -41,6 +40,12 @@ The current manifests rely on Envoy Gateway [v1.2.1](https://gateway.envoyproxy.
 
 ```bash
 kubectl apply -f ./manifests/ext_proc.yaml
+```
+
+1. **Deploy Envoy Gateway Custom Policies**
+
+```bash
+kubectl apply -f ./manifests/extension_policy.yaml
 kubectl apply -f ./manifests/patch_policy.yaml
 ```
 
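The apply order visible in this hunk can be sketched as a small shell function. This is only an illustration, not a script from the repository: the function name and the `KUBECTL` and `MANIFEST_DIR` overrides are hypothetical, and only the three manifests named in the hunk are covered.

```shell
# Sketch only: apply the manifests in the order the updated README lists them.
# KUBECTL and MANIFEST_DIR are hypothetical overrides, not part of the repository.
apply_gateway_manifests() {
  kubectl_bin="${KUBECTL:-kubectl}"
  manifest_dir="${MANIFEST_DIR:-./manifests}"
  for manifest in ext_proc.yaml extension_policy.yaml patch_policy.yaml; do
    # Stop at the first failure so later manifests are not applied after a broken step.
    "$kubectl_bin" apply -f "$manifest_dir/$manifest" || return 1
  done
}
```

Running `KUBECTL=echo apply_gateway_manifests` prints the three `apply -f` invocations without contacting a cluster, which is a cheap way to check the order before running it for real.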
@@ -103,35 +103,3 @@ spec:
       port: 9002
       targetPort: 9002
   type: ClusterIP
----
-apiVersion: gateway.envoyproxy.io/v1alpha1
-kind: EnvoyExtensionPolicy
-metadata:
-  name: ext-proc-policy
-  namespace: default
-spec:
-  extProc:
-    - backendRefs:
-      - group: ""
-        kind: Service
-        name: inference-gateway-ext-proc
-        port: 9002
-      processingMode:
-        request:
-          body: Buffered
-        response:
-      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
-      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly.
-      messageTimeout: 1000s
-      backendSettings:
-        circuitBreaker:
-          maxConnections: 40000
-          maxPendingRequests: 40000
-          maxParallelRequests: 40000
-        timeout:
-          tcp:
-            connectTimeout: 24h
-  targetRef:
-    group: gateway.networking.k8s.io
-    kind: HTTPRoute
-    name: llm-route
@@ -0,0 +1,31 @@
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: EnvoyExtensionPolicy
+metadata:
+  name: ext-proc-policy
+  namespace: default
+spec:
+  extProc:
+    - backendRefs:
+      - group: ""
+        kind: Service
+        name: inference-gateway-ext-proc
+        port: 9002
+      processingMode:
+        request:
+          body: Buffered
+        response:
+      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
+      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly.
+      messageTimeout: 1000s
+      backendSettings:
+        circuitBreaker:
+          maxConnections: 40000
+          maxPendingRequests: 40000
+          maxParallelRequests: 40000
+        timeout:
+          tcp:
+            connectTimeout: 24h
+  targetRef:
+    group: gateway.networking.k8s.io
+    kind: HTTPRoute
+    name: llm-route
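
Because the policy now lives in its own manifest, it may help to read it back after applying and confirm it exists with the expected name and namespace. The function below is a minimal sketch under that assumption. The function name and the `KUBECTL` override are hypothetical, while the resource name and namespace are taken from the manifest above. Whether the `status` section of the output reports the attachment to `llm-route` as accepted depends on the Envoy Gateway version in use.

```shell
# Sketch only: read back the policy created by the new manifest.
# KUBECTL is a hypothetical override; name and namespace come from the manifest.
show_ext_proc_policy() {
  kubectl_bin="${KUBECTL:-kubectl}"
  "$kubectl_bin" get envoyextensionpolicy ext-proc-policy --namespace default --output yaml
}
```

Call `show_ext_proc_policy` after the apply step and compare the `circuitBreaker` values in the output with the ones in the manifest.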