Commit ca36b50

Merge pull request #3871 from Huang-Wei/update-kep-3521
Update KEP 3521 with test result of upgrade->downgrade->upgrade path
2 parents b077d00 + 39d626c commit ca36b50

File tree

1 file changed

+33 -4 lines changed
  • keps/sig-scheduling/3521-pod-scheduling-readiness

keps/sig-scheduling/3521-pod-scheduling-readiness/README.md

@@ -328,6 +328,12 @@ the following parts:
 `SchedulingPaused` to the "phase" column of `kubectl get pod`. This new literal indicates whether it's
 scheduling-paused or not.

+- **Downgrade->Upgrade path:** For a Pod that has non-empty scheduling gates, if the cluster is downgraded to a
+version with this feature disabled, and then upgraded to a version with this feature enabled, the Pod
+is still treated as scheduling gated, but its condition may be misleading. See the
+"Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?" portion of the
+[Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) section for more details.
+
 ### Risks and Mitigations

 <!--
@@ -827,11 +833,34 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
 -->

-It will be tested manually prior to beta launch.
+- Start a local Kubernetes 1.26 cluster (`PodSchedulingReadiness` defaults to false)
+- Create a Pod `test-pod` with one scheduling gate
+- The Pod's scheduling gate is dropped as expected because the feature gate is disabled
+- Delete the Pod
+- Restart the API Server and scheduler with version 1.27, specifying `PodSchedulingReadiness=true`
+- Create the same Pod `test-pod` with one scheduling gate
+- The Pod stays in the `SchedulingGated` state, and its `.spec.schedulingGates` is persisted
+- Restart the API Server and scheduler with version 1.26
+- The Pod `test-pod` enters the `Pending` state, with the old `.spec.schedulingGates` preserved<sup>1</sup>
+- Restart the API Server and scheduler with version 1.27, specifying `PodSchedulingReadiness=true`
+- The Pod stays in the `Pending` state, with the old `.spec.schedulingGates` preserved<sup>2</sup>
+
+<sup>1</sup> It's pending because binding a Pod with non-empty scheduling gates to a node is not allowed:
+```yaml
+status:
+  conditions:
+  - message: 'running Bind plugin "DefaultBinder": Operation cannot be fulfilled on
+      pods/binding "pause": pod pause has non-empty .spec.schedulingGates'
+    reason: SchedulerError
+    status: "False"
+    type: PodScheduled
+  phase: Pending
+```

-<<UNRESOLVED>>
-Add detailed scenarios and result here, and cc @wojtek-t.
-<</UNRESOLVED>>
+<sup>2</sup> Although the Pod is still considered "gated" internally, its previously persisted `{PodScheduled, SchedulerError}`
+condition is not updated. Updating it is designed to be the API Server's responsibility, but that update is not
+triggered in this downgrade->upgrade path. See
+https://github.com/kubernetes/enhancements/pull/3871#discussion_r1102315845 for more details.

 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

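For readers who want to retrace the manual test steps in the diff above, a Pod with a single scheduling gate might look roughly like the manifest below. This is a minimal sketch, not the manifest used in the actual test: the Pod name `test-pod` comes from the steps, while the gate name `example.com/foo`, the container name, and the image are assumptions chosen for illustration.

```yaml
# Hypothetical manifest for the `test-pod` referenced in the test steps.
# The gate name, container name, and image are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  # A non-empty list keeps the Pod from being scheduled while the
  # PodSchedulingReadiness feature gate is enabled.
  schedulingGates:
  - name: example.com/foo
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.6
```

After each restart in the steps, inspecting `.spec.schedulingGates` and `.status.conditions` on the Pod (for example with `kubectl get pod test-pod -o yaml`) shows the states the steps and footnotes describe.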