diff --git a/examples/poc/README.md b/examples/poc/README.md
index b859fe50..0203e846 100644
--- a/examples/poc/README.md
+++ b/examples/poc/README.md
@@ -1,68 +1,55 @@
 # Envoy Ext Proc Gateway with LoRA Integration
-This project sets up an Envoy gateway to handle gRPC calls with integration of LoRA (Low-Rank Adaptation). The configuration aims to manage gRPC traffic through Envoy's external processing and custom routing based on headers and load balancing rules. The setup includes Kubernetes services and deployments for both the gRPC server and the vllm-lora application.
+This project sets up an Envoy gateway with a custom external processor that implements advanced routing logic tailored for LoRA (Low-Rank Adaptation) adapters. Requests are routed based on the model specified (using the OpenAI API format), and load is balanced efficiently using model server metrics.
+
+![alt text](./doc/envoy-gateway-bootstrap.png)
 ## Requirements
-- A vLLM based deployment (using the custom image provided below), with LoRA Adapters
 - Kubernetes cluster
 - Envoy Gateway v1.1 installed on your cluster: https://gateway.envoyproxy.io/v1.1/tasks/quickstart/
 - `kubectl` command-line tool
 - Go (for local development)
-
-## vLLM
-***This PoC uses a modified vLLM fork, the public image of the fork is here: `ghcr.io/tomatillo-and-multiverse/vllm:demo`***
-
-The fork is here: https://github.com/kaushikmitr/vllm.
-
-The summary of changes from standard vLLM are:
-- Active/Registered LoRA adapters are returned as a response header (used for lora-aware routing)
-- Queue size is returned as a response header
-- Active/Registered LoRA adapters are emitted as metrics (for out-of-band scraping during low traffic periods)
-
-
-## Overview
-
-This project contains the necessary configurations and code to set up and deploy a service using Kubernetes, Envoy, and Go. The service involves routing based on the model specified (using Open AI API format), collecting metrics, and ensuring efficient load balancing.
-
-![alt text](./envoy-gateway-bootstrap.png)
-
+- A vLLM-based deployment using a custom fork, with LoRA adapters. ***This PoC uses a modified vLLM [fork](https://github.com/kaushikmitr/vllm); the public image of the fork is `ghcr.io/tomatillo-and-multiverse/vllm:demo`***. A sample deployment is provided under `./manifests/samples/vllm-lora-deployment.yaml`.
 ## Quickstart
 ### Steps
+1. **Deploy Sample vLLM Application**
+   NOTE: Create a HuggingFace API token and store it in a secret named `hf-token` with key `hf_api_token`. This is configured in the `HUGGING_FACE_HUB_TOKEN` and `HF_TOKEN` environment variables in `./manifests/samples/vllm-lora-deployment.yaml`.
-1. **Apply Kubernetes Manifests**
    ```bash
-   cd manifests
-   kubectl apply -f ext_proc.yaml
-   kubectl apply -f vllm/vllm-lora-service.yaml
-   kubectl apply -f vllm/vllm-lora-deployment.yaml
+   kubectl apply -f ./manifests/samples/vllm-lora-deployment.yaml
+   kubectl apply -f ./manifests/samples/vllm-lora-service.yaml
    ```
+2. **Install GatewayClass with Ext Proc**
+   A custom GatewayClass `llm-gateway`, configured with the LLM routing ext proc, will be installed into the `llm-gateway` namespace. It is configured to listen on port 8081 for traffic through ext-proc (in addition to the default 8080); see the `EnvoyProxy` configuration in `installation.yaml`. When you create Gateways, make sure the `llm-gateway` GatewayClass is used.
-2. **Update `ext_proc.yaml`**
-   - Ensure the `ext_proc.yaml` is updated with the pod names and internal IP addresses of the vLLM replicas. This step is crucial for the correct routing of requests based on headers.
+   NOTE: Ensure the `llm-route-ext-proc` deployment is updated with the pod names and internal IP addresses of the vLLM replicas. This step is crucial for the correct routing of requests based on headers. This won't be needed once the ext proc reads the pods dynamically.
-2. **Update and apply `gateway.yaml`**
-   - Ensure the `gateway.yaml` is updated with the internal IP addresses of the ExtProc service. This step is also crucial for the correct routing of requests based on headers.
-   ```bash
-   cd manifests
-   kubectl apply -f gateway.yaml
+   ```bash
+   kubectl apply -f ./manifests/installation.yaml
+   ```
+3. **Deploy Gateway**
+
+   ```bash
+   kubectl apply -f ./manifests/samples/gateway.yaml
    ```
-### Monitoring and Metrics
-
-- The Go application collects metrics and saves the latest response headers in memory.
-- Ensure Envoy is configured to route based on the metrics collected from the `/metric` endpoint of different service pods.
-
-## Contributing
+4. **Try it out**
+   Wait until the gateway is ready.
+   ```bash
+   IP=$(kubectl get gateway/llm-gateway -o jsonpath='{.status.addresses[0].value}')
+   PORT=8081
+
+   curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
+   "model": "tweet-summary",
+   "prompt": "Write as if you were a critic: San Francisco",
+   "max_tokens": 100,
+   "temperature": 0
+   }'
+   ```
-1. Fork the repository.
-2. Create a new branch.
-3. Make your changes.
-4. Open a pull request.
 ## License
 This project is licensed under the MIT License.
-
----
\ No newline at end of file
diff --git a/examples/poc/manifests/ext-proc.yaml b/examples/poc/manifests/ext-proc.yaml
deleted file mode 100644
index a40576bb..00000000
--- a/examples/poc/manifests/ext-proc.yaml
+++ /dev/null
@@ -1,68 +0,0 @@
-apiVersion: apps/v1
-kind: Deployment
-metadata:
-  name: grpc-server-deployment
-  labels:
-    app: grpc-server
-spec:
-  replicas: 1
-  selector:
-    matchLabels:
-      app: grpc-server
-  template:
-    metadata:
-      labels:
-        app: grpc-server
-    spec:
-      containers:
-      - name: grpc-server
-        image: # Image built from the Dockerfile in ./ext-proc
-        args:
-        #TODO: specify label selector and dynamically update pods
-        - -pods
-        - "vllm-575d76dbfc-l4w5z"
-        - -podIPs
-        - "10.100.0.7:8000"
-        - -enable-fairness
-        - "true"
-        ports:
-        - containerPort: 9002
-      - name: curl
-        image: curlimages/curl
-        command: ["sleep", "3600"]
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: grpc-server-service
-spec:
-  selector:
-    app: grpc-server
-  ports:
-  - protocol: TCP
-    port: 9002
-    targetPort: 9002
-  type: ClusterIP
-
-#TODO: specify label selector and dynamically update pods
-# ---
-# kind: ClusterRole
-# apiVersion: rbac.authorization.k8s.io/v1
-# metadata:
-#   name: pod-read
-# rules:
-# - apiGroups: [""]
-#   resources: ["pods"]
-#   verbs: ["get", "watch", "list"]
-# ---
-# kind: ClusterRoleBinding
-# apiVersion: rbac.authorization.k8s.io/v1
-# metadata:
-#   name: pod-read-binding
-# subjects:
-# - kind: ServiceAccount
-#   name: default
-#   namespace: default
-# roleRef:
-#   kind: ClusterRole
-#   name: pod-read
diff --git a/examples/poc/manifests/gateway.yaml b/examples/poc/manifests/installation.yaml
similarity index 72%
rename from examples/poc/manifests/gateway.yaml
rename to examples/poc/manifests/installation.yaml
index f2136304..57ecd185 100644
--- a/examples/poc/manifests/gateway.yaml
+++ b/examples/poc/manifests/installation.yaml
@@ -1,16 +1,18 @@
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: llm-gateway
+
 ---
 apiVersion: gateway.envoyproxy.io/v1alpha1
 kind: EnvoyProxy
 metadata:
-  name: custom-proxy-config
-  namespace: envoy-gateway-system
+  name: llm-route-envoy-config
+  namespace: llm-gateway
 spec:
   provider:
     type: Kubernetes
     kubernetes:
-      envoyDeployment:
-        container:
-          image: envoyproxy/envoy:v1.31-latest
       envoyService:
         patch:
           type: StrategicMerge
@@ -78,7 +80,7 @@ spec:
            dns_lookup_family: V4_ONLY
          - name: ext_proc_cluster
            connect_timeout: 1000s
-           type: STATIC
+           type: LOGICAL_DNS
            http2_protocol_options: {}
            lb_policy: ROUND_ROBIN
            load_assignment:
@@ -88,28 +90,66 @@
            - endpoint:
                address:
                  socket_address:
-                   address: 34.118.231.147
+                   address: llm-route-ext-proc.llm-gateway.svc.cluster.local
                    port_value: 9002
 ---
 apiVersion: gateway.networking.k8s.io/v1
 kind: GatewayClass
 metadata:
-  name: inference-gateway
+  name: llm-gateway
 spec:
   controllerName: gateway.envoyproxy.io/gatewayclass-controller
   parametersRef:
     group: gateway.envoyproxy.io
     kind: EnvoyProxy
-    name: custom-proxy-config
-    namespace: envoy-gateway-system
+    name: llm-route-envoy-config
+    namespace: llm-gateway
+
 ---
-apiVersion: gateway.networking.k8s.io/v1
-kind: Gateway
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: llm-route-ext-proc
+  namespace: llm-gateway
+  labels:
+    app: llm-route-ext-proc
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: llm-route-ext-proc
+  template:
+    metadata:
+      labels:
+        app: llm-route-ext-proc
+    spec:
+      containers:
+      - name: llm-route-ext-proc
+        image: ghcr.io/tomatillo-and-multiverse/ext-proc:demo
+        args:
+        #TODO: specify label selector and dynamically update pods
+        - -pods
+        - "vllm-78665f78c4-h4kx4,vllm-78665f78c4-hnz84"
+        - -podIPs
+        - "10.24.11.6:8000,10.24.5.7:8000"
+        - -enable-fairness
+        - "false"
+        ports:
+        - containerPort: 9002
+      - name: curl
+        image: curlimages/curl
+        command: ["sleep", "3600"]
+---
+apiVersion: v1
+kind: Service
 metadata:
-  name: inference-gateway
+  name: llm-route-ext-proc
+  namespace: llm-gateway
 spec:
-  gatewayClassName: inference-gateway
-  listeners:
-  - name: http
-    protocol: HTTP
-    port: 8080
+  selector:
+    app: llm-route-ext-proc
+  ports:
+  - protocol: TCP
+    port: 9002
+    targetPort: 9002
+  type: ClusterIP
diff --git a/examples/poc/manifests/samples/gateway.yaml b/examples/poc/manifests/samples/gateway.yaml
new file mode 100644
index 00000000..0f3f1803
--- /dev/null
+++ b/examples/poc/manifests/samples/gateway.yaml
@@ -0,0 +1,12 @@
+
+---
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: llm-gateway
+spec:
+  gatewayClassName: llm-gateway
+  listeners:
+  - name: http
+    protocol: HTTP
+    port: 8080
diff --git a/examples/poc/manifests/vllm/vllm-lora-deployment.yaml b/examples/poc/manifests/samples/vllm-lora-deployment.yaml
similarity index 100%
rename from examples/poc/manifests/vllm/vllm-lora-deployment.yaml
rename to examples/poc/manifests/samples/vllm-lora-deployment.yaml
diff --git a/examples/poc/manifests/vllm/vllm-lora-service.yaml b/examples/poc/manifests/samples/vllm-lora-service.yaml
similarity index 90%
rename from examples/poc/manifests/vllm/vllm-lora-service.yaml
rename to examples/poc/manifests/samples/vllm-lora-service.yaml
index 9a529bae..ae55ec65 100644
--- a/examples/poc/manifests/vllm/vllm-lora-service.yaml
+++ b/examples/poc/manifests/samples/vllm-lora-service.yaml
@@ -4,7 +4,6 @@ metadata:
   name: vllm-lora
   namespace: default
 spec:
-  clusterIP: None
   selector:
     app: vllm
   ports:
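For the NOTE in step 1 of the new README, a minimal sketch of creating the `hf-token` secret the sample deployment reads; the placeholder token value and the default namespace are assumptions, not part of the PR:

```bash
# Hypothetical helper: store a HuggingFace API token in a secret named
# "hf-token" under the key "hf_api_token", as the sample deployment expects.
# Replace <your-hf-token> with a real token before running.
kubectl create secret generic hf-token \
  --from-literal=hf_api_token=<your-hf-token>
```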
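For the NOTE in step 2, a sketch of collecting the vLLM pod names and internal IPs to paste into the `llm-route-ext-proc` Deployment's `-pods`/`-podIPs` args; it assumes the vLLM pods carry the `app: vllm` label used by the `vllm-lora` Service selector:

```bash
# Print each vLLM pod name and its pod IP; the sample args pair each IP
# with port 8000, the vLLM serving port used in this PoC.
kubectl get pods -l app=vllm \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}'
```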
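For "Wait until the gateway is ready" in step 4, one possible check, assuming the Gateway reports the standard `Programmed` condition:

```bash
# Block until the llm-gateway Gateway is programmed, or give up after 5 minutes.
kubectl wait --for=condition=Programmed gateway/llm-gateway --timeout=5m
```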