Commit 0d9f0c4

Merge pull request #4260 from bart0sh/PR013-DRA-non-graceful-node-shutdowns
DRA: handle non graceful node shutdowns
2 parents 19975ac + aa42236 commit 0d9f0c4

1 file changed: +15 -0 lines changed

keps/sig-node/3063-dynamic-resource-allocation/README.md

@@ -98,6 +98,7 @@ SIG Architecture for cross-cutting KEPs).
 - [Coordinating resource allocation through the scheduler](#coordinating-resource-allocation-through-the-scheduler)
 - [Resource allocation and usage flow](#resource-allocation-and-usage-flow)
 - [Scheduled pods with unallocated or unreserved claims](#scheduled-pods-with-unallocated-or-unreserved-claims)
+- [Handling non graceful node shutdowns](#handling-non-graceful-node-shutdowns)
 - [API](#api)
 - [resource.k8s.io](#resourcek8sio)
 - [core](#core)
@@ -1162,6 +1163,20 @@ Once all of those steps are complete, kubelet will notice that the claims are
 ready and run the pod. Until then it will keep checking periodically, just as
 it does for other reasons that prevent a pod from running.
 
+### Handling non graceful node shutdowns
+
+When a node is shut down unexpectedly and is tainted with an `out-of-service`
+taint with NoExecute effect as explained in the [Non graceful node shutdown KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/2268-non-graceful-shutdown),
+all running pods on the node will be deleted by the GC controller and the
+resources used by the pods will be deallocated. However, they will not be
+un-prepared as the node is down and Kubelet is not running on it.
+
+Resource drivers should be able to handle this situation correctly and
+should not expect `UnprepareNodeResources` to be always called.
+If resources are unprepared when `Deallocate` is called, `Deallocate`
+might need to perform additional actions to correctly deallocate
+resources.
+
 ### API
 
 The PodSpec gets extended. To minimize the changes in core/v1, all new types
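
The requirement added in this commit, that a driver must not rely on `UnprepareNodeResources` running before `Deallocate`, can be illustrated with a small in-memory sketch. This is not the real DRA driver interface: the `Driver` type, its methods, and the `Cleanups` field are all hypothetical names chosen for illustration. The sketch also assumes the paragraph's last sentence refers to resources that were never un-prepared because the node went down. It only shows one possible way to track per-claim state so that `Deallocate` can detect a skipped un-prepare step and do the extra cleanup itself.

```go
package main

import (
	"fmt"
	"sync"
)

// claimState is a hypothetical per-claim record kept by the driver.
type claimState struct {
	node     string // node the claim was last prepared on
	prepared bool   // true between prepare and un-prepare
}

// Driver is a hypothetical in-memory stand-in for a resource driver.
type Driver struct {
	mu       sync.Mutex
	claims   map[string]*claimState
	Cleanups []string // records forced cleanups, for illustration only
}

func NewDriver() *Driver {
	return &Driver{claims: make(map[string]*claimState)}
}

// Allocate records a claim as allocated (idempotent).
func (d *Driver) Allocate(claimUID string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, ok := d.claims[claimUID]; !ok {
		d.claims[claimUID] = &claimState{}
	}
}

// Prepare stands in for the node-side prepare step.
func (d *Driver) Prepare(claimUID, node string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	st, ok := d.claims[claimUID]
	if !ok {
		return fmt.Errorf("claim %s is not allocated", claimUID)
	}
	st.node, st.prepared = node, true
	return nil
}

// Unprepare stands in for the node-side un-prepare step. After a
// non graceful node shutdown it may never be called.
func (d *Driver) Unprepare(claimUID string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if st, ok := d.claims[claimUID]; ok {
		st.prepared = false
	}
}

// Deallocate releases a claim. If the claim is still marked as
// prepared, the un-prepare step was skipped, so the extra cleanup
// happens here before the claim is forgotten.
func (d *Driver) Deallocate(claimUID string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	st, ok := d.claims[claimUID]
	if !ok {
		return // already deallocated; keep the call idempotent
	}
	if st.prepared {
		d.Cleanups = append(d.Cleanups, claimUID+"@"+st.node)
	}
	delete(d.claims, claimUID)
}
```

In this sketch, calling `Allocate`, `Prepare`, and then `Deallocate` without `Unprepare` appends one entry to `Cleanups`, while the normal sequence that includes `Unprepare` appends none. A real driver would replace the `Cleanups` bookkeeping with whatever device-specific teardown it needs.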