Commit b77ad33

nicolexin authored and robscott committed
Adding getting started instructions for GKE, Istio, and Kgateway (kubernetes-sigs#577)
* Create resources.yaml for kgateway
* Update getting started guide for KGateway
* Replace Envoy Gateway user guide with GKE user guide
* Create resources.yaml for GKE Gateway
* Delete config/manifests/gateway/enable_patch_policy.yaml
* Delete config/manifests/gateway/gateway.yaml
* Delete config/manifests/gateway/patch_policy.yaml
* Delete config/manifests/gateway/traffic_policy.yaml
* Add http2 appProtocol to EPP service
* Add user guide for Istio
* Create resources.yaml for Istio
* Fix GKE gateway name to match the user guide
* Fix cleanup instructions to refer up-to-date YAMLs
* Allow Istio gateway to use HTTPRoute from all namespaces
* Update Kgateway port number to 80
* Update gateway port to 80
* Remove the sectionName from Kgateway HTTPRoute
* Create common httproute YAML
* Create healthcheck.yaml for GKE gateway
* Separate gateway.yaml for GKE gateway
* Separate gateway.yaml for Istio
* Separate gateway.yaml for Kgateway
* Update the user guide to use shared HTTPRoute YAML
* Add EPP DestinationRule for Istio
* Add instructions for bypassing TLS verification for Istio
* Update CRDs to the latest v0.2.0 release
  Co-authored-by: Rob Scott <[email protected]>
* Update gateway to use the v1 API
  Co-authored-by: Rob Scott <[email protected]>
* Remove weight from HTTPRoute
  Co-authored-by: Rob Scott <[email protected]>
* Update gateway.yaml
  Remove allowed routes from GKE gateway YAML
* Remove allowedRoutes from Istio gateway
* Remove allowedRoutes from Kgateway
* Update latest instructions for installing Istio and addressing some comments
* Fix indentation for installing CRDs
* Addressing code review comments
* Fix indentation
* Update Istio installation instructions
* Fix indentation
* Fix indentation
* Add more spacing to the CPU based model instructions
* Removing comments from kgateway
* Add clarification on the EPP secureServing default value.
  Co-authored-by: Rob Scott <[email protected]>
* Add instructions for configuring timeout
* Create httproute-with-timeout.yaml
* Create gcp-backend-policy.yaml
* Add cleanup for GCPBackendPolicy
* Remove namespace from destination-rule.yaml
* Rename inferencepool.yaml to inferencepool-resources.yaml
* Rename inferencepool.yaml to inferencepool-resources.yaml
* Rename inferencepool.yaml to inferencepool-resources.yaml

---------

Co-authored-by: Rob Scott <[email protected]>
1 parent 2835c38 commit b77ad33

15 files changed, 292 insertions(+), 274 deletions(-)

config/manifests/gateway/enable_patch_policy.yaml

-27
This file was deleted.

config/manifests/gateway/gateway.yaml

-50
This file was deleted.
+10
@@ -0,0 +1,10 @@
+kind: Gateway
+apiVersion: gateway.networking.k8s.io/v1
+metadata:
+  name: inference-gateway
+spec:
+  gatewayClassName: gke-l7-regional-external-managed
+  listeners:
+    - name: http
+      port: 80
+      protocol: HTTP
@@ -0,0 +1,11 @@
+apiVersion: networking.gke.io/v1
+kind: GCPBackendPolicy
+metadata:
+  name: inferencepool-backend-policy
+spec:
+  targetRef:
+    group: "inference.networking.x-k8s.io"
+    kind: InferencePool
+    name: vllm-llama3-8b-instruct
+  default:
+    timeoutSec: 300
@@ -0,0 +1,16 @@
+kind: HealthCheckPolicy
+apiVersion: networking.gke.io/v1
+metadata:
+  name: health-check-policy
+  namespace: default
+spec:
+  targetRef:
+    group: "inference.networking.x-k8s.io"
+    kind: InferencePool
+    name: vllm-llama2-7b
+  default:
+    config:
+      type: HTTP
+      httpHealthCheck:
+        requestPath: /health
+        port: 8000
@@ -0,0 +1,20 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  name: llm-route
+spec:
+  parentRefs:
+    - group: gateway.networking.k8s.io
+      kind: Gateway
+      name: inference-gateway
+  rules:
+    - backendRefs:
+        - group: inference.networking.x-k8s.io
+          kind: InferencePool
+          name: vllm-llama2-7b
+      matches:
+        - path:
+            type: PathPrefix
+            value: /
+      timeouts:
+        request: 300s
+18
@@ -0,0 +1,18 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  name: llm-route
+spec:
+  parentRefs:
+    - group: gateway.networking.k8s.io
+      kind: Gateway
+      name: inference-gateway
+  rules:
+    - backendRefs:
+        - group: inference.networking.x-k8s.io
+          kind: InferencePool
+          name: vllm-llama2-7b
+      matches:
+        - path:
+            type: PathPrefix
+            value: /
@@ -0,0 +1,10 @@
+apiVersion: networking.istio.io/v1
+kind: DestinationRule
+metadata:
+  name: epp-insecure-tls
+spec:
+  host: vllm-llama2-7b-epp
+  trafficPolicy:
+    tls:
+      mode: SIMPLE
+      insecureSkipVerify: true
@@ -0,0 +1,10 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: inference-gateway
+spec:
+  gatewayClassName: istio
+  listeners:
+    - name: http
+      port: 80
+      protocol: HTTP
@@ -0,0 +1,10 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: inference-gateway
+spec:
+  gatewayClassName: kgateway
+  listeners:
+    - name: http
+      port: 80
+      protocol: HTTP

config/manifests/gateway/patch_policy.yaml

-123
This file was deleted.

config/manifests/gateway/traffic_policy.yaml

-16
This file was deleted.

config/manifests/inferencepool.yaml renamed to config/manifests/inferencepool-resources.yaml

+1
@@ -22,6 +22,7 @@ spec:
     - protocol: TCP
       port: 9002
       targetPort: 9002
+      appProtocol: http2
   type: ClusterIP
 ---
 apiVersion: apps/v1
