-This KEP proposes introducing a complementary field `MatchLabelKeys` to `PodAffinityTerm`.
+This KEP proposes introducing a complementary field `MatchLabelSelectors` to `PodAffinityTerm`.
This enables users to finely control the scope where Pods are expected to co-exist (PodAffinity)
or not (PodAntiAffinity), on top of the existing `LabelSelector`.

@@ -207,7 +210,7 @@ The same issue applies to other scheduling directives as well. For example, Matc

### Goals

-- Introduce `MatchLabelKeys` in `PodAffinityTerm` to let users define the scope where Pods are evaluated in required and preferred Pod(Anti)Affinity.
+- Introduce `MatchLabelSelectors` in `PodAffinityTerm` to let users define the scope where Pods are evaluated in required and preferred Pod(Anti)Affinity.

### Non-Goals

@@ -216,7 +219,7 @@ What is out of scope for this KEP? Listing non-goals helps to focus discussion
and make progress.
-->

-- Apply additional internal labels when evaluating `MatchLabelKeys`
+- Apply additional internal labels when evaluating `MatchLabelSelectors`

## Proposal

@@ -245,7 +248,7 @@ and they want only replicas from the same replicaset to be evaluated.

The deployment controller adds [pod-template-hash](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#pod-template-hash-label) to the underlying ReplicaSet, and thus every Pod created from a Deployment carries the hash string.

-Therefore, users can use `pod-template-hash` in `MatchlabelKeys` to inform the scheduler to only evaluate Pods with the same `pod-template-hash` value.
+Therefore, users can use `pod-template-hash` in `MatchLabelSelectors.Key` to inform the scheduler to only evaluate Pods with the same `pod-template-hash` value.

```yaml
apiVersion: apps/v1
@@ -263,8 +266,36 @@ metadata:
                values:
                - database
            topologyKey: topology.kubernetes.io/zone
-           matchlabelKeys: # ADDED
-           - pod-template-hash
+           matchLabelSelectors: # ADDED
+           - key: pod-template-hash
+             operator: In
+```
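For context, here is a fuller sketch of what the Story 1 manifest might look like end to end; everything outside the diff excerpt above (the Deployment name, the `app`/`database` selector labels, and the surrounding `spec` fields) is an illustrative assumption, not text from the KEP.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-deployment          # hypothetical name, not from the KEP
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample                  # hypothetical labels, not from the KEP
  template:
    metadata:
      labels:
        app: sample
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app           # assumed label key; only `values: [database]` appears in the excerpt
                operator: In
                values:
                - database
            topologyKey: topology.kubernetes.io/zone
            matchLabelSelectors:
            - key: pod-template-hash
              operator: In
      containers:
      - name: app
        image: registry.k8s.io/pause:3.9
```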
+
+#### Story 2
+
+Let's say all Pods of each tenant get a `tenant` label via a controller or a manifest management tool like Helm.
+Although the value of the `tenant` label is unknown when composing the workload's manifest, the cluster admin still wants to achieve exclusive 1:1 tenant-to-domain placement.
+
+By applying the following affinity globally using a mutating webhook, the cluster admin can ensure that Pods from the same tenant land on the same domain exclusively, meaning Pods from other tenants won't land on the same domain.
+
+```yaml
+affinity:
+  podAffinity: # ensures the Pods of this tenant land on the same node pool
+    requiredDuringSchedulingIgnoredDuringExecution:
+    - matchLabelSelectors:
+      - key: tenant
+        operator: In
+      topologyKey: node-pool
+  podAntiAffinity: # ensures only Pods from this tenant land on the same node pool
+    requiredDuringSchedulingIgnoredDuringExecution:
+    - matchLabelSelectors:
+      - key: tenant
+        operator: NotIn
+      labelSelector:
+        matchExpressions:
+        - key: tenant
+          operator: Exists
+      topologyKey: node-pool
```

### Notes/Constraints/Caveats (Optional)
@@ -297,7 +328,7 @@ Consider including folks who also work outside the SIG or subproject.
-->

In addition to using `pod-template-hash` added by the Deployment controller,
-users can also provide the customized key in `MatchLabelKeys` to identify
+users can also provide a customized key in `MatchLabelSelectors.Key` to identify
which pods should be grouped. If so, the user needs to ensure that it is
correct and not duplicated with other unrelated workloads.

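As a sketch of that note, a workload could group Pods by its own label instead of `pod-template-hash`; the `app.example.com/group` key below is a hypothetical user-managed label, not one defined by the KEP.

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - matchLabelSelectors:
      - key: app.example.com/group   # hypothetical custom label maintained by the user's own tooling
        operator: In
      topologyKey: topology.kubernetes.io/zone
```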
@@ -310,30 +341,89 @@ required) or even code snippets. If there's any ambiguity about HOW your
proposal will be implemented, this is the place to discuss them.
-->

-A new optional field `MatchLabelKeys` is introduced to `PodAffinityTerm`.
+A new optional field `MatchLabelSelectors` is introduced to `PodAffinityTerm`.

```go
+type LabelSelectorOperator string
+
+type MatchLabelSelector struct {
+    // Key is used to look up a value from the incoming pod's labels,
+    // and that key-value label is merged with `LabelSelector`.
+    // A Key that doesn't exist in the incoming pod's labels will be ignored.
+    Key string
+    // Operator defines how the key-value label, fetched via the above `Key`, is merged into LabelSelector.
+    // If Operator is `In`, `key in (value)` is merged with LabelSelector.
+    // If Operator is `NotIn`, `key notin (value)` is merged with LabelSelector.
+    //
+    // +optional
+    Operator LabelSelectorOperator
+}
+
type PodAffinityTerm struct {
-    LabelSelector *metav1.LabelSelector
-    Namespaces []string
-    TopologyKey string
-    NamespaceSelector *metav1.LabelSelector
-
-    // MatchLabelKeys is a set of pod label keys to select which pods will
-    // be taken into consideration. The keys are used to lookup values from the
-    // incoming pod labels, those key-value labels are ANDed with `LabelSelector`
-    // to select the group of existing pods which pods will be taken into consideration
-    // for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming
-    // pod labels will be ignored. The default value is empty.
-    // +optional
-    MatchLabelKeys []string
+    LabelSelector *metav1.LabelSelector
+    Namespaces []string
+    TopologyKey string
+    NamespaceSelector *metav1.LabelSelector
+
+    // MatchLabelSelectors is a set of keys and operators used to select the group of existing pods
+    // that will be taken into consideration for the incoming pod's pod (anti) affinity.
+    // The default value is empty.
+    // +optional
+    MatchLabelSelectors []MatchLabelSelector
}
```

-The inter-Pod Affinity plugin will obtain the labels from the pod
-labels by the keys in `MatchLabelKeys`. The obtained labels will be merged
-to `LabelSelector` of `PodAffinityTerm` to filter and group pods.
-The pods belonging to the same group will be evaluated.
+When a Pod is created, kube-apiserver will look up the value of the key in `MatchLabelSelectors.Key`
+from the incoming pod's labels and merge it into the `LabelSelector` of `PodAffinityTerm`, depending on `Operator`:
+- If Operator is `In`, `key in (value)` is merged with LabelSelector.
+- If Operator is `NotIn`, `key notin (value)` is merged with LabelSelector.
+
+Only `In` and `NotIn` are supported in `Operator` of `MatchLabelSelectors`,
+and kube-apiserver rejects other operators (`Exists` and `DoesNotExist`).
+
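To make the merge concrete, here is a minimal Go sketch of what such a mutation could look like; the helper name and overall shape are illustrative assumptions, not the KEP's actual kube-apiserver implementation.

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// mergeMatchLabelSelector is a hypothetical helper: it looks up the given key in the
// incoming pod's labels and, if present, appends the corresponding requirement to the
// term's LabelSelector. Keys missing from the pod's labels are ignored, as described above.
func mergeMatchLabelSelector(sel *metav1.LabelSelector, podLabels map[string]string, key string, op metav1.LabelSelectorOperator) *metav1.LabelSelector {
	value, ok := podLabels[key]
	if !ok {
		return sel
	}
	if sel == nil {
		sel = &metav1.LabelSelector{}
	}
	sel.MatchExpressions = append(sel.MatchExpressions, metav1.LabelSelectorRequirement{
		Key:      key,
		Operator: op, // only metav1.LabelSelectorOpIn / metav1.LabelSelectorOpNotIn would be accepted
		Values:   []string{value},
	})
	return sel
}

func main() {
	podLabels := map[string]string{"tenant": "tenant-a"}
	sel := mergeMatchLabelSelector(nil, podLabels, "tenant", metav1.LabelSelectorOpNotIn)
	fmt.Printf("%+v\n", sel.MatchExpressions) // [{Key:tenant Operator:NotIn Values:[tenant-a]}]
}
```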
+For example, when this sample Pod is created,
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: sample
+  namespace: sample-namespace
+  labels:
+    tenant: tenant-a
+...
+  affinity:
+    podAntiAffinity:
+      requiredDuringSchedulingIgnoredDuringExecution:
+      - matchLabelSelectors:
+        - key: tenant
+          operator: NotIn
+        labelSelector:
+          matchExpressions:
+          - key: tenant
+            operator: Exists
+        topologyKey: node-pool
+```
+
+kube-apiserver modifies the labelSelector like the following:
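For the sample Pod above (labeled `tenant: tenant-a`), the mutated term would look roughly like the sketch below; it is reconstructed from the merge rules described earlier rather than quoted from the KEP.

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - matchLabelSelectors:
      - key: tenant
        operator: NotIn
      labelSelector:
        matchExpressions:
        - key: tenant
          operator: Exists
        - key: tenant            # merged by kube-apiserver: `tenant notin (tenant-a)`
          operator: NotIn
          values:
          - tenant-a
      topologyKey: node-pool
```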
  - Components depending on the feature gate: `kube-apiserver`
- [ ] Other

###### Does enabling the feature change any default behavior?
@@ -617,9 +710,9 @@ NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
-->

The feature can be disabled in Alpha and Beta versions
-by restarting kube-apiserver and kube-scheduler with the feature-gate off.
+by restarting kube-apiserver with the feature-gate off.
In terms of Stable versions, users can choose to opt-out by not setting the
-`MatchLabelKeys` field.
+`MatchLabelSelectors` field.

###### What happens if we reenable the feature if it was previously rolled back?
@@ -915,6 +1008,60 @@ not need to be as detailed as the proposal, but should include enough
information to express the idea and why it was not acceptable.
-->

+### Implement as a new enum in LabelSelector
+
+Implement new enum values `ExistsWithSameValue` and `ExistsWithDifferentValue` in LabelSelector:
+- `ExistsWithSameValue`: look up the label value keyed with the key specified in the labelSelector, and match Pods that have the same label value for that key.
+- `ExistsWithDifferentValue`: look up the label value keyed with the key specified in the labelSelector, and match Pods that have the same label key but a different label value for that key.
+
+But this idea was rejected because:
+- it's difficult to prepare all existing clients to handle the new enums.
+- labelSelector would need to know which object it belongs to in order to handle the new enums, and changing all the code that handles labelSelector is a tough road.
+
+#### Example
+
+A set of Pods (set A) doesn't want to co-exist with other sets of Pods, but wants the Pods within set A to be co-located with each other.
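A sketch of how that scenario might have been expressed with the rejected enums; the `group` label key and the exact YAML shape are illustrative assumptions.

```yaml
affinity:
  podAffinity: # co-locate Pods that carry the same value of the `group` label
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: group
          operator: ExistsWithSameValue      # rejected enum, shown for illustration only
      topologyKey: topology.kubernetes.io/zone
  podAntiAffinity: # keep away from Pods whose `group` label has a different value
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: group
          operator: ExistsWithDifferentValue # rejected enum, shown for illustration only
      topologyKey: topology.kubernetes.io/zone
```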