Commit 39d626c

fixup: add extra note to explain the limitation of upgrade->downgrade->upgrade path
1 parent 6fc3a7d commit 39d626c

1 file changed: +28 -5 lines changed

keps/sig-scheduling/3521-pod-scheduling-readiness/README.md

@@ -328,6 +328,12 @@ the following parts:
 `SchedulingPaused` to the "phase" column of `kubectl get pod`. This new literal indicates whether it's
 scheduling-paused or not.

+- **Downgrade->Upgrade path:** For a Pod that has non-empty scheduling gates, if the cluster is downgraded to a
+version with this feature disabled and then upgraded to a version with this feature enabled, the Pod
+is still treated as scheduling-gated, but its condition may be misleading. See the
+"Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?" portion of the
+[Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) section for more details.
+
 ### Risks and Mitigations

 <!--
@@ -828,16 +834,33 @@ are missing a bunch of machinery and tooling and can't do that now.
 -->

 - Start a local Kubernetes 1.26 cluster (`PodSchedulingReadiness` defaulted to false)
-- Create a Pod `pause` with one scheduling gate
+- Create a Pod `test-pod` with one scheduling gate
 - The Pod's scheduling gate gets dropped as expected due to disabled feature gate
 - Delete the Pod
 - Re-start API Server and scheduler with version 1.27, and specify `PodSchedulingReadiness=true`
-- Create the same Pod `pause` with one scheduling gate
-- The Pod stays in `SchedulingGated` state, and its `.spec.schedulingGate` is persisted
+- Create the same Pod `test-pod` with one scheduling gate
+- The Pod stays in `SchedulingGated` state, and its `.spec.schedulingGates` is persisted
 - Re-start API Server and scheduler with version 1.26
-- The Pod `pause` enters `Pending` state, with old `.spec.schedulingGate` reserved
+- The Pod `test-pod` enters `Pending` state, with old `.spec.schedulingGates` reserved<sup>1</sup>
 - Re-start API Server and scheduler with version 1.27, and specify `PodSchedulingReadiness=true`
-- The Pod stays in `Pending` state, with old `.spec.schedulingGate` reserved
+- The Pod stays in `Pending` state, with old `.spec.schedulingGates` reserved<sup>2</sup>
+
+<sup>1</sup> It's pending because binding a node to a Pod with non-empty scheduling gates is not allowed:
+```yaml
+status:
+  conditions:
+  - message: 'running Bind plugin "DefaultBinder": Operation cannot be fulfilled on
+      pods/binding "pause": pod pause has non-empty .spec.schedulingGates'
+    reason: SchedulerError
+    status: "False"
+    type: PodScheduled
+  phase: Pending
+```
+
+<sup>2</sup> Although the Pod is still considered "gated" internally, its previously persisted
+`{PodScheduled, SchedulerError}` condition is not updated. Updating it is the API Server's responsibility,
+but that update is not triggered in this downgrade->upgrade path. See
+https://github.com/kubernetes/enhancements/pull/3871#discussion_r1102315845 for more details.

 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
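
The misleading state described in footnote 2 (a Pod that still carries scheduling gates while its persisted `PodScheduled` condition still reports `SchedulerError`) could be spotted with a small check like the one below. This is a minimal sketch and not part of the KEP: it assumes Pod objects are available as plain dicts shaped like the API's JSON output (for instance from `kubectl get pod -o json`), and the helper names, the returned labels, and the gate name `example.com/foo` are hypothetical.

```python
# Sketch: flag Pods whose PodScheduled condition looks stale after the
# upgrade->downgrade->upgrade path. Operates on plain dicts shaped like Pod JSON.

# "SchedulerError" is the reason shown in the condition quoted in footnote 1.
STALE_REASON = "SchedulerError"


def pod_scheduled_condition(pod):
    """Return the PodScheduled condition dict, or None if it is absent."""
    conditions = pod.get("status", {}).get("conditions") or []
    for cond in conditions:
        if cond.get("type") == "PodScheduled":
            return cond
    return None


def classify(pod):
    """Return a hypothetical label describing the Pod's gating state."""
    gates = pod.get("spec", {}).get("schedulingGates") or []
    if not gates:
        return "not-gated"
    cond = pod_scheduled_condition(pod)
    if cond is not None and cond.get("reason") == STALE_REASON:
        # Gates are still present, but the condition was written while the
        # feature was disabled and has not been refreshed since.
        return "gated-stale-condition"
    return "gated"


# Example input mirroring the final step of the walkthrough.
# The gate name is an assumption made for illustration only.
pod = {
    "metadata": {"name": "test-pod"},
    "spec": {"schedulingGates": [{"name": "example.com/foo"}]},
    "status": {
        "phase": "Pending",
        "conditions": [
            {"type": "PodScheduled", "status": "False", "reason": "SchedulerError"}
        ],
    },
}

print(classify(pod))  # prints "gated-stale-condition"
```

The check only reads `.spec.schedulingGates` and the `PodScheduled` condition, so it reports the mismatch without attempting to repair the condition.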