
Commit ca7e02e

Update replacing-inference-pool.md
1 parent 53e16b8 commit ca7e02e

1 file changed: +3, -1 lines changed

site-src/guides/replacing-inference-pool.md

Lines changed: 3 additions & 1 deletion
@@ -11,14 +11,16 @@ Use Cases for Replacing an InferencePool:
 - Upgrading or replacing your base model
 - Transitioning to new hardware
 
+## How to replace an InferencePool
+
 To replace an InferencePool:
 
 1. **Deploy new infrastructure**: Create a new InferencePool configured with the new hardware / model server / base model that you chose.
 1. **Configure traffic splitting**: Use an HTTPRoute to split traffic between the existing InferencePool and the new InferencePool. The `backendRefs.weight` field controls the traffic percentage allocated to each pool.
 1. **Maintain InferenceModel integrity**: Keep your InferenceModel configuration unchanged. This ensures that the system applies the same LoRA adapters consistently across both base model versions.
 1. **Preserve rollback capability**: Retain the original nodes and InferencePool during the rollout to facilitate a rollback if necessary.
 
-## Example
+### Example
 
 You start with an existing InferencePool named `llm-pool-v1`. To replace the original InferencePool, you create a new InferencePool named `llm-pool-v2`. By configuring an **HTTPRoute**, as shown below, you can incrementally split traffic between the original `llm-pool-v1` and new `llm-pool-v2`.

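For reference, a minimal sketch of the kind of HTTPRoute the guide describes for splitting traffic between the two pools. The pool names `llm-pool-v1` / `llm-pool-v2` come from the guide's example; the route name, the Gateway name `inference-gateway`, the InferencePool API group, and the 90/10 weights are illustrative assumptions, not part of this commit:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route               # hypothetical route name
spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway   # assumed Gateway name
  rules:
    - backendRefs:
        # Original pool keeps most of the traffic during the rollout.
        - group: inference.networking.x-k8s.io   # assumed InferencePool API group
          kind: InferencePool
          name: llm-pool-v1
          weight: 90
        # New pool receives a small share; raise this weight incrementally.
        - group: inference.networking.x-k8s.io
          kind: InferencePool
          name: llm-pool-v2
          weight: 10
```

Gradually shifting the weights (for example 90/10, then 50/50, then 0/100) and finally removing the `llm-pool-v1` backendRef would complete the replacement while preserving the rollback path described in the steps above.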