Commit 68f741d

Merge pull request #4001 from sanposhiho/matchlabelselector
fix(kep-3633): redesign matchLabelKeys as matchLabelSelectors to introduce Operator field
2 parents 19a6057 + d5d098d commit 68f741d

2 files changed

+193
-46
lines changed

keps/sig-scheduling/3633-matchlabelkeys-to-podaffinity/README.md renamed to keps/sig-scheduling/3633-matchlabelselectors-to-podaffinity/README.md

+186-39
@@ -58,7 +58,7 @@ If none of those approvers are still appropriate, then changes to that list
5858
should be approved by the remaining approvers and/or the owning SIG (or
5959
SIG Architecture for cross-cutting KEPs).
6060
-->
61-
# KEP-3633: Introduce MatchLabelKeys to PodAffinity and PodAntiAffinity
61+
# KEP-3633: Introduce MatchLabelSelectors to PodAffinity and PodAntiAffinity
6262

6363
<!--
6464
This is the title of your KEP. Keep it short, simple, and descriptive. A good
@@ -85,6 +85,7 @@ tags, and then generate with `hack/update-toc.sh`.
8585
- [Proposal](#proposal)
8686
- [User Stories (Optional)](#user-stories-optional)
8787
- [Story 1](#story-1)
88+
- [Story 2](#story-2)
8889
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
8990
- [Risks and Mitigations](#risks-and-mitigations)
9091
- [Design Details](#design-details)
@@ -109,6 +110,8 @@ tags, and then generate with `hack/update-toc.sh`.
109110
- [Implementation History](#implementation-history)
110111
- [Drawbacks](#drawbacks)
111112
- [Alternatives](#alternatives)
113+
- [implement as a new enum in LabelSelector](#implement-as-a-new-enum-in-labelselector)
114+
- [Example](#example)
112115
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
113116
<!-- /toc -->
114117

@@ -175,7 +178,7 @@ updates.
175178
[documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md
176179
-->
177180

178-
This KEP proposes introducing a complementary field `MatchLabelKeys` to `PodAffinityTerm`.
181+
This KEP proposes introducing a complementary field `MatchLabelSelectors` to `PodAffinityTerm`.
179182
This enables users to finely control the scope where Pods are expected to co-exist (PodAffinity)
180183
or not (PodAntiAffinity), on top of the existing `LabelSelector`.
181184

@@ -207,7 +210,7 @@ The same issue applies to other scheduling directives as well. For example, Matc
207210

208211
### Goals
209212

210-
- Introduce `MatchLabelKeys` in `PodAffinityTerm` to let users define the scope where Pods are evaluated in required and preferred Pod(Anti)Affinity.
213+
- Introduce `MatchLabelSelectors` in `PodAffinityTerm` to let users define the scope where Pods are evaluated in required and preferred Pod(Anti)Affinity.
211214

212215
### Non-Goals
213216

@@ -216,7 +219,7 @@ What is out of scope for this KEP? Listing non-goals helps to focus discussion
216219
and make progress.
217220
-->
218221

219-
- Apply additional internal labels when evaluating `MatchLabelKeys`
222+
- Apply additional internal labels when evaluating `MatchLabelSelectors`
220223

221224
## Proposal
222225

@@ -245,7 +248,7 @@ and they want only replicas from the same replicaset to be evaluated.
245248

246249
The deployment controller adds [pod-template-hash](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#pod-template-hash-label) to underlying ReplicaSet and thus every Pod created from Deployment carries the hash string.
247250

248-
Therefore, users can use `pod-template-hash` in `MatchlabelKeys` to inform the scheduler to only evaluate Pods with the same `pod-template-hash` value.
251+
Therefore, users can use `pod-template-hash` in `matchLabelSelector.Key` to inform the scheduler to only evaluate Pods with the same `pod-template-hash` value.
249252

250253
```yaml
251254
apiVersion: apps/v1
@@ -263,8 +266,36 @@ metadata:
263266
values:
264267
- database
265268
topologyKey: topology.kubernetes.io/zone
266-
matchlabelKeys: # ADDED
267-
- pod-template-hash
269+
matchLabelSelectors: # ADDED
270+
- key: pod-template-hash
271+
operator: In
272+
```
273+
274+
#### Story 2
275+
276+
Let's say all Pods of each tenant get a `tenant` label via a controller or a manifest management tool like Helm.
277+
Although the value of the `tenant` label is unknown when composing the workload's manifest, the cluster admin still wants to achieve exclusive 1:1 tenant-to-domain placement.
278+
279+
By applying the following affinity globally using a mutating webhook, the cluster admin can ensure that Pods from the same tenant land on the same domain exclusively, meaning Pods from other tenants won't land on that domain.
280+
281+
```yaml
282+
affinity:
283+
podAffinity: # ensures the pods of this tenant land on the same node pool
284+
requiredDuringSchedulingIgnoredDuringExecution:
285+
- matchLabelSelectors:
286+
- key: tenant
287+
operator: In
288+
topologyKey: node-pool
289+
podAntiAffinity: # ensures only Pods from this tenant land on the same node pool
290+
requiredDuringSchedulingIgnoredDuringExecution:
291+
- matchLabelSelectors:
292+
- key: tenant
293+
operator: NotIn
294+
labelSelector:
295+
matchExpressions:
296+
- key: tenant
297+
operator: Exists
298+
topologyKey: node-pool
268299
```
269300

270301
### Notes/Constraints/Caveats (Optional)
@@ -297,7 +328,7 @@ Consider including folks who also work outside the SIG or subproject.
297328
-->
298329

299330
In addition to using `pod-template-hash` added by the Deployment controller,
300-
users can also provide the customized key in `MatchLabelKeys` to identify
331+
users can also provide a custom key in `MatchLabelSelectors.Key` to identify
301332
which pods should be grouped. If so, the user needs to ensure that it is
302333
correct and not duplicated with other unrelated workloads.
303334

@@ -310,30 +341,89 @@ required) or even code snippets. If there's any ambiguity about HOW your
310341
proposal will be implemented, this is the place to discuss them.
311342
-->
312343

313-
A new optional field `MatchLabelKeys` is introduced to `PodAffinityTerm`.
344+
A new optional field `MatchLabelSelectors` is introduced to `PodAffinityTerm`.
314345

315346
```go
347+
type LabelSelectorOperator string
348+
349+
type MatchLabelSelector struct {
350+
// Key is used to look up the value from the incoming pod labels,
351+
// and that key-value label is merged with `LabelSelector`.
352+
// A key that doesn't exist in the incoming pod labels is ignored.
353+
Key string
354+
// Operator defines how the key-value pair, fetched via the above `Key`, is merged into LabelSelector.
355+
// If Operator is `In`, `key in (value)` is merged with LabelSelector.
356+
// If Operator is `NotIn`, `key notin (value)` is merged with LabelSelector.
357+
//
358+
// +optional
359+
Operator LabelSelectorOperator
360+
}
361+
316362
type PodAffinityTerm struct {
317-
LabelSelector *metav1.LabelSelector
318-
Namespaces []string
319-
TopologyKey string
320-
NamespaceSelector *metav1.LabelSelector
321-
322-
// MatchLabelKeys is a set of pod label keys to select which pods will
323-
// be taken into consideration. The keys are used to lookup values from the
324-
// incoming pod labels, those key-value labels are ANDed with `LabelSelector`
325-
// to select the group of existing pods which pods will be taken into consideration
326-
// for the incoming pod's pod (anti) affinity. Keys that don't exist in the incoming
327-
// pod labels will be ignored. The default value is empty.
328-
// +optional
329-
MatchLabelKeys []string
363+
LabelSelector *metav1.LabelSelector
364+
Namespaces []string
365+
TopologyKey string
366+
NamespaceSelector *metav1.LabelSelector
367+
368+
// MatchLabelSelectors is a set of pod label keys used to select the group of existing pods
369+
// that will be taken into consideration for the incoming pod's pod (anti) affinity.
370+
// The default value is empty.
371+
// +optional
372+
MatchLabelSelectors []MatchLabelSelector
330373
}
331374
```
332375

333-
The inter-Pod Affinity plugin will obtain the labels from the pod
334-
labels by the keys in `MatchLabelKeys`. The obtained labels will be merged
335-
to `LabelSelector` of `PodAffinityTerm` to filter and group pods.
336-
The pods belonging to the same group will be evaluated.
376+
When a Pod is created, kube-apiserver looks up the value of each key in `MatchLabelSelectors.Key` from the Pod's own
377+
labels and merges the resulting key-value requirement into the `LabelSelector` of `PodAffinityTerm`, depending on `Operator`:
378+
- If Operator is `In`, `key in (value)` is merged with LabelSelector.
379+
- If Operator is `NotIn`, `key notin (value)` is merged with LabelSelector.
380+
381+
Only `In` and `NotIn` are supported in `Operator` of `MatchLabelSelectors`,
382+
and kube-apiserver rejects other operators (`Exists` and `DoesNotExist`).
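
The operator restriction above could be checked with something like the following. This is a minimal sketch, not the actual kube-apiserver validation code: `validateOperator` is a hypothetical helper, and rejecting an empty operator is an assumption, since the text does not say how an unset `Operator` is treated.

```go
package main

import "fmt"

// validateOperator is a hypothetical helper that accepts only the two
// operators allowed in MatchLabelSelectors and rejects everything else.
// Assumption: an empty operator is also rejected; the KEP text does not
// specify a default.
func validateOperator(op string) error {
	switch op {
	case "In", "NotIn":
		return nil
	default:
		return fmt.Errorf("unsupported operator %q: only In and NotIn are allowed", op)
	}
}

func main() {
	for _, op := range []string{"In", "NotIn", "Exists", "DoesNotExist"} {
		if err := validateOperator(op); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", op)
	}
}
```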
383+
384+
For example, when this sample Pod is created,
385+
386+
```yaml
387+
apiVersion: v1
388+
kind: Pod
389+
metadata:
390+
name: sample
391+
namespace: sample-namespace
392+
labels:
393+
tenant: tenant-a
394+
...
395+
affinity:
396+
podAntiAffinity:
397+
requiredDuringSchedulingIgnoredDuringExecution:
398+
- matchLabelSelectors:
399+
- key: tenant
400+
operator: NotIn
401+
labelSelector:
402+
matchExpressions:
403+
- key: tenant
404+
operator: Exists
405+
topologyKey: node-pool
406+
```
407+
408+
kube-apiserver modifies the `labelSelector` as follows:
409+
410+
```diff
411+
affinity:
412+
podAntiAffinity:
413+
requiredDuringSchedulingIgnoredDuringExecution:
414+
- matchLabelSelectors:
415+
- key: tenant
416+
operator: NotIn
417+
labelSelector:
418+
matchExpressions:
419+
- key: tenant
420+
operator: Exists
421+
+ - key: tenant
422+
+ operator: NotIn
423+
+ values:
424+
+ - tenant-a
425+
topologyKey: node-pool
426+
```
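
The merge shown in the example can be sketched in Go as follows. This is an illustrative sketch under stated assumptions, not the kube-apiserver implementation: `Requirement` is a simplified stand-in for `metav1.LabelSelectorRequirement`, and `mergeMatchLabelSelectors` is a hypothetical helper name.

```go
package main

import "fmt"

// Operator mirrors the two operators allowed in MatchLabelSelectors.
type Operator string

const (
	OpIn    Operator = "In"
	OpNotIn Operator = "NotIn"
)

// MatchLabelSelector is a simplified copy of the proposed API type.
type MatchLabelSelector struct {
	Key      string
	Operator Operator
}

// Requirement is a simplified stand-in for metav1.LabelSelectorRequirement.
type Requirement struct {
	Key      string
	Operator string
	Values   []string
}

// mergeMatchLabelSelectors (hypothetical) appends one requirement for each
// selector whose key exists in the incoming pod's labels. Keys missing from
// the pod labels are ignored, as described in the Key field comment.
func mergeMatchLabelSelectors(podLabels map[string]string, selectors []MatchLabelSelector, existing []Requirement) []Requirement {
	merged := append([]Requirement{}, existing...)
	for _, s := range selectors {
		value, ok := podLabels[s.Key]
		if !ok {
			continue // key not present on the incoming pod
		}
		merged = append(merged, Requirement{
			Key:      s.Key,
			Operator: string(s.Operator),
			Values:   []string{value},
		})
	}
	return merged
}

func main() {
	podLabels := map[string]string{"tenant": "tenant-a"}
	selectors := []MatchLabelSelector{{Key: "tenant", Operator: OpNotIn}}
	existing := []Requirement{{Key: "tenant", Operator: "Exists"}}

	for _, r := range mergeMatchLabelSelectors(podLabels, selectors, existing) {
		fmt.Printf("key=%s operator=%s values=%v\n", r.Key, r.Operator, r.Values)
	}
}
```

With the sample Pod's labels, the result keeps the original `Exists` requirement and adds a `NotIn` requirement on `tenant` with the value `tenant-a`, matching the added lines in the diff above.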
337427
338428
### Test Plan
339429
@@ -394,9 +484,9 @@ https://storage.googleapis.com/k8s-triage/index.html
394484
-->
395485

396486
- These tests will be added.
397-
- `MatchLabelKeys` in `PodAffinity` (both in Filter and Score) works as expected.
398-
- `MatchLabelKeys` in `PodAntiAffinity` (both in Filter and Score) works as expected.
399-
- `MatchLabelKeys` with the feature gate enabled/disabled.
487+
- `MatchLabelSelectors` in `PodAffinity` (both in Filter and Score) works as expected.
488+
- `MatchLabelSelectors` in `PodAntiAffinity` (both in Filter and Score) works as expected.
489+
- `MatchLabelSelectors` with the feature gate enabled/disabled.
400490

401491
**Filter**
402492
- `k8s.io/kubernetes/test/integration/scheduler/filters/filters_test.go`: https://storage.googleapis.com/k8s-triage/index.html?test=TestPodTopologySpreadFilter
@@ -420,9 +510,12 @@ https://storage.googleapis.com/k8s-triage/index.html
420510
We expect no non-infra related flakes in the last month as a GA graduation criteria.
421511
-->
422512

423-
- These e2e tests will be added.
424-
- `MatchLabelKeys: ["pod-template-hash"]` in `PodAffinity` (both in Filter and Score) works as expected with a rolling upgrade scenario.
425-
- `MatchLabelKeys: ["pod-template-hash"]` in `PodAntiAffinity` (both in Filter and Score) works as expected with a rolling upgrade scenario.
513+
N/A
514+
515+
--
516+
517+
This feature doesn't introduce any new API endpoints and doesn't interact with other components.
518+
So, E2E tests don't add extra value beyond integration tests.
426519

427520
### Graduation Criteria
428521

@@ -491,7 +584,7 @@ in back-to-back releases.
491584
#### Alpha
492585

493586
- Feature implemented behind a feature flag
494-
- Unit tests and e2e tests are implemented
587+
- Unit tests and integration tests are implemented
495588
- No significant performance degradation is observed from the benchmark test
496589

497590
#### Beta
@@ -523,11 +616,11 @@ The previous PodAffinity/PodAntiAffinity behavior will not be broken. Users can
523616
their Pod specs as it is.
524617

525618
To use this enhancement, users need to enable the feature gate (while this feature is in alpha),
526-
and add `MatchLabelKeys` on their PodAffinity/PodAntiAffinity.
619+
and add `MatchLabelSelectors` on their PodAffinity/PodAntiAffinity.
527620

528621
**Downgrade**
529622

530-
kube-apiserver will ignore `MatchLabelKeys` in PodAffinity/PodAntiAffinity,
623+
kube-apiserver will ignore `MatchLabelSelectors` in PodAffinity/PodAntiAffinity,
531624
and thus, kube-scheduler will also do nothing with it.
532625

533626
### Version Skew Strategy
@@ -590,8 +683,8 @@ well as the [existing list] of feature gates.
590683
-->
591684

592685
- [x] Feature gate (also fill in values in `kep.yaml`)
593-
- Feature gate name: `MatchLabelKeysInPodAffinityAndPodAntiAffinity`
594-
- Components depending on the feature gate: `kube-scheduler`, `kube-apiserver`
686+
- Feature gate name: `MatchLabelSelectorsInPodAffinity`
687+
- Components depending on the feature gate: `kube-apiserver`
595688
- [ ] Other
596689

597690
###### Does enabling the feature change any default behavior?
@@ -617,9 +710,9 @@ NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
617710
-->
618711

619712
The feature can be disabled in Alpha and Beta versions
620-
by restarting kube-apiserver and kube-scheduler with the feature-gate off.
713+
by restarting kube-apiserver with the feature gate off.
621714
In terms of Stable versions, users can choose to opt-out by not setting the
622-
`MatchLabelKeys` field.
715+
`MatchLabelSelectors` field.
623716

624717

625718
###### What happens if we reenable the feature if it was previously rolled back?
@@ -915,6 +1008,60 @@ not need to be as detailed as the proposal, but should include enough
9151008
information to express the idea and why it was not acceptable.
9161009
-->
9171010

1011+
### implement as a new enum in LabelSelector
1012+
1013+
Implement new enum values `ExistsWithSameValue` and `ExistsWithDifferentValue` in LabelSelector.
1014+
- `ExistsWithSameValue`: look up the label value keyed with the key specified in the labelSelector, and match with Pods which have the same label value on the key.
1015+
- `ExistsWithDifferentValue`: look up the label value keyed with the key specified in the labelSelector, and match with Pods which have the same label key, but with the different label value on the key.
1016+
1017+
However, this idea was rejected because:
1018+
- it's difficult to prepare all existing clients to handle new enums.
1019+
- to handle these new enums, a labelSelector would need to know which object owns it, and changing all the code that handles labelSelector would be difficult.
1020+
1021+
#### Example
1022+
1023+
A set of Pods A doesn't want to co-exist with other sets of Pods, but wants the Pods in set A to be co-located:
1024+
1025+
```yaml
1026+
spec:
1027+
affinity:
1028+
podAffinity:
1029+
requiredDuringSchedulingIgnoredDuringExecution:
1030+
- labelSelector:
1031+
matchExpressions:
1032+
- key: pod-set
1033+
operator: ExistsWithSameValue
1034+
topologyKey: kubernetes.io/hostname
1035+
podAntiAffinity:
1036+
requiredDuringSchedulingIgnoredDuringExecution:
1037+
- labelSelector:
1038+
matchExpressions:
1039+
- key: pod-set
1040+
operator: ExistsWithDifferentValue
1041+
topologyKey: kubernetes.io/hostname
1042+
```
1043+
1044+
Smooth rolling upgrade for PodAntiAffinity:
1045+
1046+
```yaml
1047+
spec:
1048+
affinity:
1049+
podAntiAffinity:
1050+
requiredDuringSchedulingIgnoredDuringExecution:
1051+
- labelSelector:
1052+
matchExpressions:
1053+
- key: app
1054+
operator: In
1055+
values:
1056+
- pause
1057+
topologyKey: kubernetes.io/hostname
1058+
- labelSelector:
1059+
matchExpressions:
1060+
- key: pod-template-hash
1061+
operator: ExistsWithSameValue
1062+
topologyKey: kubernetes.io/hostname
1063+
```
1064+
9181065
## Infrastructure Needed (Optional)
9191066

9201067
<!--

keps/sig-scheduling/3633-matchlabelkeys-to-podaffinity/kep.yaml renamed to keps/sig-scheduling/3633-matchlabelselectors-to-podaffinity/kep.yaml

+7-7
@@ -1,4 +1,4 @@
1-
title: Introduce MatchLabelKeys to PodAffinity and PodAntiAffinity
1+
title: Introduce MatchLabelSelectors to PodAffinity and PodAntiAffinity
22
kep-number: 3633
33
authors:
44
- "@sanposhiho"
@@ -15,15 +15,15 @@ see-also:
1515

1616
stage: alpha
1717

18-
latest-milestone: "v1.27"
18+
latest-milestone: "v1.28"
1919

2020
milestone:
21-
alpha: "v1.27"
22-
beta: "v1.28"
23-
stable: "v1.30"
21+
alpha: "v1.28"
22+
beta: "v1.29"
23+
stable: "v1.31"
2424

2525
feature-gates:
26-
- name: MatchLabelKeysInPodAffinity
26+
- name: MatchLabelSelectorsInPodAffinity
2727
components:
28-
- kube-scheduler
28+
- kube-apiserver
2929
disable-supported: true
