|
15 | 15 | - [Risks and Mitigations](#risks-and-mitigations)
|
16 | 16 | - [The Job object too big](#the-job-object-too-big)
|
17 | 17 | - [Exponential backoff delay issue](#expotential-backoff-delay-issue)
|
| 18 | + - [Too fast Job status updates](#too-fast-job-status-updates) |
18 | 19 | - [Design Details](#design-details)
|
19 | 20 | - [Job API](#job-api)
|
20 | 21 | - [Tracking the number of failures per index](#tracking-the-number-of-failures-per-index)
|
@@ -328,6 +329,33 @@ fallback to the pod's creation time.
|
328 | 329 | This fix can be considered a preparatory PR before the KEP, as to some extent
|
329 | 330 | it solves the preexisting issue.
|
330 | 331 |
|
| 332 | +### Too fast Job status updates |
| 333 | + |
| 334 | +In this KEP the Job controller needs to keep updating the new status field |
| 335 | +`.status.failedIndexes` to reflect the current state of the Job. This may raise |
| 336 | +concerns about overwhelming the API server with status updates. |
| 337 | + |
| 338 | +First, observe that the new field does not entail additional Job status updates. |
| 339 | +When a pod terminates (either with failure or success), it triggers a Job status |
| 340 | +update to increment the `.status.failed` or `.status.succeeded` counter fields. |
| 341 | +The same updates also maintain the pre-existing `.status.completedIndexes` |
| 342 | +field and the new `.status.failedIndexes` field. |
| 343 | + |
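
To illustrate how both pieces of information travel in one update, here is a
minimal sketch using simplified stand-in types (not the actual Job controller
code; the real fields are part of the Job API in `batch/v1`):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// jobStatus is a simplified stand-in for the relevant JobStatus fields.
type jobStatus struct {
	Failed           int32
	Succeeded        int32
	CompletedIndexes string // e.g. "1,3-5"
	FailedIndexes    string // the new field proposed by this KEP
}

// recordPodTermination folds a single pod termination into one status update:
// the counter increment and the index-list update travel together, so the new
// field does not require extra API calls.
func recordPodTermination(st *jobStatus, index int, failed bool) {
	if failed {
		st.Failed++
		st.FailedIndexes = appendIndex(st.FailedIndexes, index)
	} else {
		st.Succeeded++
		st.CompletedIndexes = appendIndex(st.CompletedIndexes, index)
	}
}

// appendIndex naively appends an index to a comma-separated list; the real
// controller keeps these lists in compressed interval notation.
func appendIndex(list string, index int) string {
	s := strconv.Itoa(index)
	if list == "" {
		return s
	}
	if strings.Contains(","+list+",", ","+s+",") {
		return list
	}
	return list + "," + s
}

func main() {
	st := &jobStatus{}
	recordPodTermination(st, 2, true)  // index 2 failed
	recordPodTermination(st, 0, false) // index 0 succeeded
	fmt.Printf("%+v\n", st)
}
```
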
| 344 | +Second, this risk is already mitigated by a mechanism present in the Job |
| 345 | +controller which batches Job status updates per Job. |
| 346 | + |
| 347 | +The mechanism works as follows: the Job controller maintains a queue of `syncJob` |
| 348 | +invocations, keyed per Job |
| 349 | +(see [the code](https://github.com/kubernetes/kubernetes/blob/72a3990728b2a8979effb37b9800beb3117349f6/pkg/controller/job/job_controller.go#L118)). |
| 350 | +New items are added to the queue with a delay (1s for pod events such as |
| 351 | +add, update, and delete). The delay allows deduplicating syncs for the same Job. |
| 352 | + |
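
The real queue is a rate-limited work queue from `client-go`; the toy sketch
below only illustrates the deduplication idea (all names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// delayedJobQueue collapses pod events for the same Job key that arrive within
// the delay window into a single delayed sync.
type delayedJobQueue struct {
	mu      sync.Mutex
	pending map[string]bool
	syncFn  func(jobKey string)
}

func newDelayedJobQueue(syncFn func(string)) *delayedJobQueue {
	return &delayedJobQueue{pending: map[string]bool{}, syncFn: syncFn}
}

// enqueueAfter schedules a sync for jobKey after delay, unless one is already
// pending, in which case the duplicate event is dropped.
func (q *delayedJobQueue) enqueueAfter(jobKey string, delay time.Duration) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[jobKey] {
		return // a sync is already scheduled for this Job
	}
	q.pending[jobKey] = true
	time.AfterFunc(delay, func() {
		q.mu.Lock()
		delete(q.pending, jobKey)
		q.mu.Unlock()
		q.syncFn(jobKey)
	})
}

func main() {
	q := newDelayedJobQueue(func(key string) { fmt.Println("syncJob", key) })
	// Three pod events for the same Job within the 1s window result in one sync.
	q.enqueueAfter("default/my-job", time.Second)
	q.enqueueAfter("default/my-job", time.Second)
	q.enqueueAfter("default/my-job", time.Second)
	time.Sleep(2 * time.Second)
}
```
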
| 353 | +One place where this KEP adds a new item to the queue is when the exponential |
| 354 | +backoff delay hasn't yet elapsed for any index awaiting pod recreation; in that |
| 355 | +case we requeue the next Job sync with a delay. The delay is computed as the |
| 356 | +minimum of the delays computed for all indexes requiring pod recreation, |
| 357 | +but not less than 1s. |
| 358 | + |
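
A minimal sketch of this delay computation (the function name and inputs are
illustrative, not the actual controller code):

```go
package main

import (
	"fmt"
	"time"
)

// requeueDelay returns the minimum of the remaining backoff delays of all
// indexes that still need pod recreation, but never less than 1s.
func requeueDelay(remainingBackoffPerIndex []time.Duration) time.Duration {
	if len(remainingBackoffPerIndex) == 0 {
		return 0 // nothing to requeue for
	}
	min := remainingBackoffPerIndex[0]
	for _, d := range remainingBackoffPerIndex[1:] {
		if d < min {
			min = d
		}
	}
	if min < time.Second {
		min = time.Second
	}
	return min
}

func main() {
	// Two indexes still wait for their exponential backoff to elapse.
	fmt.Println(requeueDelay([]time.Duration{4 * time.Second, 700 * time.Millisecond})) // 1s
	fmt.Println(requeueDelay([]time.Duration{4 * time.Second, 2 * time.Second}))        // 2s
}
```
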
331 | 359 | <!--
|
332 | 360 | What are the risks of this proposal, and how do we mitigate? Think broadly.
|
333 | 361 | For example, consider both security and how this will impact the larger
|
@@ -413,6 +441,9 @@ type JobStatus struct {
|
413 | 441 | }
|
414 | 442 | ```
|
415 | 443 |
|
| 444 | +Note that the `PodFailurePolicyAction` type is already defined in master with |
| 445 | +three possible enum values: `Ignore`, `FailJob`, and `Count` (see [here](https://github.com/kubernetes/kubernetes/blob/72a3990728b2a8979effb37b9800beb3117349f6/pkg/apis/batch/types.go#L113-L131)). |
| 446 | + |
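
For reference, a simplified excerpt of that definition (paraphrased; see the
link above for the authoritative version):

```go
// PodFailurePolicyAction specifies how a pod failure is handled.
type PodFailurePolicyAction string

const (
	// FailJob: the pod's Job is marked as Failed and all running pods are terminated.
	PodFailurePolicyActionFailJob PodFailurePolicyAction = "FailJob"

	// Ignore: the failure does not count towards the backoffLimit and a
	// replacement pod is created.
	PodFailurePolicyActionIgnore PodFailurePolicyAction = "Ignore"

	// Count: the failure is handled in the default way, i.e. it counts
	// towards the backoffLimit.
	PodFailurePolicyActionCount PodFailurePolicyAction = "Count"
)
```
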
416 | 447 | We allow specifying custom `.spec.backoffLimit` and `.spec.backoffLimitPerIndex`.
|
417 | 448 | This allows for a controlled downgrade. Also, when `.spec.backoffLimitPerIndex`
|
418 | 449 | is specified, we default `.spec.backoffLimit` to the max int32 value. This way
|
@@ -1103,7 +1134,7 @@ Describe them, providing:
|
1103 | 1134 |
|
1104 | 1135 | ###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
|
1105 | 1136 |
|
1106 | | -However, we don't expect this increase to be captured by existing |
| 1137 | +We don't expect this increase to be captured by existing |
1107 | 1138 | [SLO/SLIs](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md).
|
1108 | 1139 |
|
1109 | 1140 | <!--
|