
Commit ca7e02e

Update replacing-inference-pool.md
1 parent 53e16b8 commit ca7e02e

1 file changed: +3, -1 lines changed

site-src/guides/replacing-inference-pool.md

Lines changed: 3 additions & 1 deletion
@@ -11,14 +11,16 @@ Use Cases for Replacing an InferencePool:
 - Upgrading or replacing your base model
 - Transitioning to new hardware
 
+## How to replace an InferencePool
+
 To replace an InferencePool:
 
 1. **Deploy new infrastructure**: Create a new InferencePool configured with the new hardware / model server / base model that you chose.
 1. **Configure traffic splitting**: Use an HTTPRoute to split traffic between the existing InferencePool and the new InferencePool. The `backendRefs.weight` field controls the traffic percentage allocated to each pool.
 1. **Maintain InferenceModel integrity**: Keep your InferenceModel configuration unchanged. This ensures that the system applies the same LoRA adapters consistently across both base model versions.
 1. **Preserve rollback capability**: Retain the original nodes and InferencePool during the rollout to facilitate a rollback if necessary.
 
-## Example
+### Example
 
 You start with an existing InferencePool named `llm-pool-v1`. To replace the original InferencePool, you create a new InferencePool named `llm-pool-v2`. By configuring an **HTTPRoute**, as shown below, you can incrementally split traffic between the original `llm-pool-v1` and new `llm-pool-v2`.

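For reference, a minimal sketch of the kind of HTTPRoute the guide describes for splitting traffic between the two pools. The pool names `llm-pool-v1` / `llm-pool-v2` come from the guide's example; the route name, the Gateway name `inference-gateway`, the InferencePool API group, and the 90/10 weights are illustrative assumptions, not part of this commit:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route               # hypothetical route name
spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway   # assumed Gateway name
  rules:
    - backendRefs:
        # Original pool keeps most of the traffic during the rollout.
        - group: inference.networking.x-k8s.io   # assumed InferencePool API group
          kind: InferencePool
          name: llm-pool-v1
          weight: 90
        # New pool receives a small share; raise this weight incrementally.
        - group: inference.networking.x-k8s.io
          kind: InferencePool
          name: llm-pool-v2
          weight: 10
```

Gradually shifting the weights (for example 90/10, then 50/50, then 0/100) and finally removing the `llm-pool-v1` backendRef would complete the replacement while preserving the rollback path described in the steps above.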