Commit 5eea480

Update KEP 3157 (watch-list) for milestone 1.29 (#4207)
* Update to the latest KEP template
* Update KEP 3157 (watch-list) for milestone 1.29
1 parent e742c68 commit 5eea480

3 files changed

+209
-16
lines changed

Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
11
kep-number: 3157
22
alpha:
33
approver: "@deads2k"
4+
beta:
5+
approver: "@deads2k"

keps/sig-api-machinery/3157-watch-list/README.md

+201-14
Original file line numberDiff line numberDiff line change
@@ -99,6 +99,8 @@ tags, and then generate with `hack/update-toc.sh`.
9999
- [e2e tests](#e2e-tests)
100100
- [Graduation Criteria](#graduation-criteria)
101101
- [Alpha](#alpha)
102+
- [Beta](#beta)
103+
- [GA](#ga)
102104
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
103105
- [Version Skew Strategy](#version-skew-strategy)
104106
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
@@ -660,8 +662,22 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
660662
- The Feature is implemented behind `WatchList` feature flag
661663
- Initial e2e tests completed and enabled
662664
- Scalability/Performance tests confirm gains of this feature
665+
- Add support for watchlist to APF
666+
667+
#### Beta
663668
- Metrics are added to the kube-apiserver (see the [monitoring-requirements](#monitoring-requirements) section for more details)
664669
- Implement `SendInitialEvents` for `watch` requests in the etcd storage implementation
670+
- The feature is enabled for kube-apiserver and kube-controller-manager
671+
- The generic feature gate mechanism is implemented in client-go.
672+
It will be used to enable a new functionality for reflectors/informers.
673+
- Implement a consistency check detector that will compare data received through a new watchlist request
674+
with data obtained through a standard list request. The detector will be added to the reflector
675+
and activated when an environment variable is set. The environment variable will be set for all jobs run in the Kube CI.
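
The consistency check described in the last bullet can be illustrated with a minimal sketch. It is not the reflector's implementation: it assumes both snapshots were taken at the same resource version, represents objects as plain dictionaries, and uses a hypothetical environment variable name (`EXAMPLE_WATCHLIST_CONSISTENCY_CHECK`), since the text above does not name the variable.

```python
import os
from typing import Dict, List, Tuple

# Hypothetical name: the KEP text does not specify the environment variable.
ENV_VAR = "EXAMPLE_WATCHLIST_CONSISTENCY_CHECK"

Key = Tuple[str, str]  # (namespace, name)


def check_enabled() -> bool:
    """Return True when the (hypothetical) environment variable is set to 'true'."""
    return os.environ.get(ENV_VAR, "").lower() == "true"


def index_by_key(items: List[dict]) -> Dict[Key, str]:
    """Map each object's (namespace, name) to its resourceVersion."""
    return {
        (it["metadata"].get("namespace", ""), it["metadata"]["name"]):
            it["metadata"]["resourceVersion"]
        for it in items
    }


def find_inconsistencies(watchlist_items: List[dict], list_items: List[dict]) -> List[str]:
    """Compare a watch-list snapshot against a standard LIST snapshot."""
    wl = index_by_key(watchlist_items)
    ls = index_by_key(list_items)
    problems = []
    # Objects returned by LIST but absent from the watch-list stream.
    for key in sorted(ls.keys() - wl.keys()):
        problems.append(f"missing from watch-list: {key}")
    # Objects the watch-list stream delivered that LIST did not return.
    for key in sorted(wl.keys() - ls.keys()):
        problems.append(f"unexpected in watch-list: {key}")
    # Objects present in both but at different versions.
    for key in sorted(wl.keys() & ls.keys()):
        if wl[key] != ls[key]:
            problems.append(f"resourceVersion mismatch for {key}: {wl[key]} != {ls[key]}")
    return problems
```

An empty result means the two snapshots agree. A CI job would presumably fail loudly on any non-empty result, but how the detector reports a mismatch is not stated here.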
676+
677+
#### GA
678+
- Consider using WatchProgressRequester to request progress notifications directly from etcd.
679+
This mechanism was developed in [Consistent Reads from Cache KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/2340-Consistent-reads-from-cache#use-requestprogress-to-enable-automatic-watch-updates)
680+
and could reduce the overall latency for watchlist requests.
665681

666682
<!--
667683
**Note:** *Not required until targeted at a release.*
@@ -745,9 +761,9 @@ components? What are the guarantees? Make sure this is in the test plan.
745761
746762
Consider the following in developing a version skew strategy for this
747763
enhancement:
748-
- Does this enhancement involve coordinating behavior in the control plane and
749-
in the kubelet? How does an n-2 kubelet without this feature available behave
750-
when this feature is used?
764+
- Does this enhancement involve coordinating behavior in the control plane and nodes?
765+
- How does an n-3 kubelet or kube-proxy without this feature available behave when this feature is used?
766+
- How does an n-1 kube-controller-manager or kube-scheduler without this feature available behave when this feature is used?
751767
- Will any other components on the node change? For example, changes to CSI,
752768
CRI or CNI may require updating that component before the kubelet.
753769
-->
@@ -797,23 +813,33 @@ Pick one of these and delete the rest.
797813

798814
- [x] Feature gate (also fill in values in `kep.yaml`)
799815
- Feature gate name: WatchList
800-
- Components depending on the feature gate: the kube-apiserver
816+
- Components depending on the feature gate:
817+
- kube-apiserver
818+
- Feature gate name: WatchListClient (the actual name might be different because it hasn't been added yet)
819+
- Components depending on the feature gate:
820+
- kube-controller-manager via client-go library
801821
- [ ] Other
802822
- Describe the mechanism:
803823
- Will enabling / disabling the feature require downtime of the control
804-
plane?
824+
plane?
805825
- Will enabling / disabling the feature require downtime or reprovisioning
806826
of a node?
807827

808828
###### Does enabling the feature change any default behavior?
809-
No.
829+
No, because users must explicitly enable the feature on the client side (client-go).
810830
<!--
811831
Any change of default behavior may be surprising to users or break existing
812832
automations, so be extremely careful here.
813833
-->
814834

815835
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
816-
Yes, in that scenario the kube-apiserver will reject WATCH requests with the new query parameter forcing informers to fall back to the previous mode.
836+
Yes, by disabling `WatchList` FeatureGate for `kube-apiserver`.
837+
In this case `kube-apiserver` will reject WATCH requests with the new query parameter forcing informers to fall back to the previous mode.
838+
839+
Yes, by disabling `WatchListClient` FeatureGate for `kube-controller-manager`.
840+
In this case informers will follow standard LIST/WATCH semantics.
841+
842+
Note that, for safety reasons, reflectors/informers will always fall back to a regular LIST operation regardless of the error that occurred.
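
The fallback behavior can be sketched as follows. This is an illustration only, not client-go code: `watch_list` and `plain_list` are hypothetical callables standing in for the two ways of obtaining the initial state, and `client_gate_enabled` stands in for the client-side feature gate.

```python
from typing import Callable, List, Tuple


def initial_sync(
    watch_list: Callable[[], List[dict]],
    plain_list: Callable[[], List[dict]],
    client_gate_enabled: bool,
) -> Tuple[str, List[dict]]:
    """Return (mode, items), where mode is 'watch-list' or 'list'."""
    if client_gate_enabled:
        try:
            return "watch-list", watch_list()
        except Exception as err:
            # Any error, including the server rejecting the new query
            # parameter, triggers the fallback to the previous semantics.
            print(f"watch-list failed ({err}); falling back to LIST/WATCH")
    return "list", plain_list()
```

With the client-side gate disabled, the watch-list path is never attempted, which matches the second rollback option above.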
817843
<!--
818844
Describe the consequences on existing workloads (e.g., if this is a runtime
819845
feature, can it break the existing applications?).
@@ -825,7 +851,8 @@ NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
825851
The expected behavior of the feature will be restored.
826852

827853
###### Are there any tests for feature enablement/disablement?
828-
No.
854+
Yes. There is [an integration test](https://github.com/kubernetes/kubernetes/pull/120971) that verifies the fallback mechanism
855+
of the reflector when interacting with servers that have the `WatchList` feature enabled/disabled.
829856
<!--
830857
The e2e framework does not currently support enabling or disabling feature
831858
gates. However, unit tests in each component dealing with managing data, created
@@ -839,7 +866,13 @@ conversion tests if API types are being modified.
839866
This section must be completed when targeting beta to a release.
840867
-->
841868
###### How can a rollout or rollback fail? Can it impact already running workloads?
869+
The feature does not have a direct impact on rollout/rollback.
842870

871+
However, faulty behavior of the feature can result in incorrect functioning
872+
of components that rely on that feature. For the Beta version, we plan to enable it exclusively for kube-controller-manager.
873+
The main issues can arise during the initial informer synchronization, which may result in controller failures.
874+
875+
Furthermore, if data consistency issues arise, such as missing data, the controllers will simply not act on the missing data.
843876
<!--
844877
Try to be as paranoid as possible - e.g., what if some components will restart
845878
mid-rollout?
@@ -852,21 +885,154 @@ will rollout across nodes.
852885

853886
###### What specific metrics should inform a rollback?
854887

888+
`apiserver_terminated_watchers_total` - a large number of terminated watchers might indicate synchronization issues.
889+
For example, a client-side error may prevent data from being received from the server, or a server-side error may cause the watch buffer to fill up.
890+
891+
`apiserver_request_duration_seconds_bucket` - in general, a large number of "short" watch requests can indicate synchronization issues.
892+
893+
`apiserver_watch_list_duration_seconds` - the absence of this metric may indicate that the client did not receive a special bookmark.
894+
The issue here could be that the server never sent it due to an error or didn't even receive it from the database.
895+
896+
`apiserver_watch_list_duration_seconds` - long synchronization times may indicate that the server is lagging behind etcd.
897+
For example, the server may not be receiving progress notifications from the database frequently enough.
898+
899+
`apiserver_watch_cache_lag` - tells how far behind the server is compared to the database.
900+
Significant discrepancies affect the times for full data synchronization.
901+
902+
A good metric can also be the number of kube-controller-manager restarts.
903+
Frequent restarts may indicate issues with informer synchronization.
904+
855905
<!--
856906
What signals should users be paying attention to when the feature is young
857907
that might indicate a serious problem?
858908
-->
859909

860910
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
911+
Upgrade->downgrade->upgrade testing was done manually using the following steps:
912+
913+
Build and run Kubernetes from the master branch using Kind.
914+
```
915+
kind build node-image --arch "arm64"
916+
917+
kind create cluster --image kindest/node:latest
918+
919+
kubectl get no
920+
NAME STATUS ROLES AGE VERSION
921+
kind-control-plane Ready control-plane 26s v1.29.0-alpha.1.47+f8571dabf79717
922+
```
923+
924+
Check if the `kube-apiserver` (aka `kas`) has recorded the watchlist latency metric.
925+
```
926+
kubectl get --raw '/metrics' | grep "apiserver_watch_list_duration_seconds"
927+
# HELP apiserver_watch_list_duration_seconds [ALPHA] Response latency distribution in seconds for watch list requests broken by group, version, resource and scope.
928+
# TYPE apiserver_watch_list_duration_seconds histogram
929+
930+
apiserver_watch_list_duration_seconds_bucket{group="",resource="configmaps",scope="cluster",version="v1",le="6"} 1
931+
```
932+
933+
Disable the `WatchList` feature gate for the `kas` by editing the static pod manifest directly.
934+
```
935+
docker exec -ti kind-control-plane bash
936+
vim /etc/kubernetes/manifests/kube-apiserver.yaml
937+
```
938+
and pass `- --feature-gates=WatchList=false` to the `kas` container.
939+
940+
Check if the `kas` has not recorded the watchlist latency metric.
941+
```
942+
kubectl get --raw '/metrics' | grep "apiserver_watch_list_duration_seconds"
943+
```
944+
945+
Check if `kube-controller-manager` (aka `kcm`) is running.
946+
```
947+
kubectl get po -n kube-system
948+
NAME READY STATUS RESTARTS AGE
949+
950+
kube-controller-manager-kind-control-plane 1/1 Running 1 (44s ago) 3m28s
951+
```
952+
953+
Check if informers used by the `kcm` fell back to standard LIST/WATCH semantics.
954+
```
955+
kubectl logs -n kube-system kube-controller-manager-kind-control-plane | grep -e "watch-list"
956+
W1002 09:11:40.656641 1 reflector.go:340] The watch-list feature is not supported by the server, falling back to the previous LIST/WATCH semantics
957+
958+
```
959+
960+
Disable the `WatchList` feature gate for the `kcm` by editing the static pod manifest directly.
961+
```
962+
docker exec -ti kind-control-plane bash
963+
vim /etc/kubernetes/manifests/kube-controller-manager.yaml
964+
```
965+
and pass `- --feature-gates=WatchList=false` to the `kcm` container.
861966

967+
Check if `kcm` is running.
968+
```
969+
kubectl get po -n kube-system
970+
NAME READY STATUS RESTARTS AGE
971+
972+
kube-controller-manager-kind-control-plane 1/1 Running 0 12s
973+
```
974+
975+
Check if the `kas` has not recorded the watchlist latency metric.
976+
```
977+
kubectl get --raw '/metrics' | grep "apiserver_watch_list_duration_seconds"
978+
```
979+
980+
Check if there are no traces of informers for `kcm` falling back to standard LIST/WATCH semantics.
981+
```
982+
kubectl logs -n kube-system kube-controller-manager-kind-control-plane | grep -e "watch-list"
983+
```
984+
985+
Enable the `WatchList` feature gate for the `kas` by editing the static pod manifest directly.
986+
```
987+
docker exec -ti kind-control-plane bash
988+
vim /etc/kubernetes/manifests/kube-apiserver.yaml
989+
```
990+
and remove `- --feature-gates=WatchList=false` from the `kas` container.
991+
992+
Check if `kcm` is running.
993+
```
994+
kubectl get po -n kube-system
995+
NAME READY STATUS RESTARTS AGE
996+
997+
kube-controller-manager-kind-control-plane 1/1 Running 1 (22s ago) 86s
998+
```
999+
1000+
Check if the `kas` has not recorded the watchlist latency metric.
1001+
```
1002+
kubectl get --raw '/metrics' | grep "apiserver_watch_list_duration_seconds"
1003+
```
1004+
1005+
Enable the `WatchList` feature gate for the `kcm` by editing the static pod manifest directly.
1006+
```
1007+
docker exec -ti kind-control-plane bash
1008+
vim /etc/kubernetes/manifests/kube-controller-manager.yaml
1009+
```
1010+
and remove `- --feature-gates=WatchList=false` from the `kcm` container.
1011+
1012+
Check if `kcm` is running.
1013+
```
1014+
kubectl get po -n kube-system
1015+
NAME READY STATUS RESTARTS AGE
1016+
1017+
kube-controller-manager-kind-control-plane 1/1 Running 0 13s
1018+
```
1019+
1020+
Check if the `kas` has recorded the watchlist latency metric.
1021+
```
1022+
kubectl get --raw '/metrics' | grep "apiserver_watch_list_duration_seconds"
1023+
# HELP apiserver_watch_list_duration_seconds [ALPHA] Response latency distribution in seconds for watch list requests broken by group, version, resource and scope.
1024+
# TYPE apiserver_watch_list_duration_seconds histogram
1025+
1026+
apiserver_watch_list_duration_seconds_bucket{group="",resource="configmaps",scope="cluster",version="v1",le="6"} 1
1027+
```
8621028
<!--
8631029
Describe manual testing that was done and the outcomes.
8641030
Longer term, we may want to require automated upgrade/rollback tests, but we
8651031
are missing a bunch of machinery and tooling and can't do that now.
8661032
-->
8671033

8681034
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
869-
1035+
No.
8701036
<!--
8711037
Even if applying deprecation policies, they may still surprise some users.
8721038
-->
@@ -878,6 +1044,7 @@ This section must be completed when targeting beta to a release.
8781044
-->
8791045

8801046
###### How can an operator determine if the feature is in use by workloads?
1047+
If the `apiserver_watch_list_duration_seconds` metric has recorded any data, then this feature is in use.
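
As a rough sketch of this check, the snippet below scans the text returned by `kubectl get --raw '/metrics'` and sums the `_count` samples of the histogram. It assumes the standard Prometheus text exposition format with no embedded `}` in label values; the function names are illustrative and are not part of any Kubernetes tooling.

```python
METRIC = "apiserver_watch_list_duration_seconds"


def watch_list_request_count(metrics_text: str) -> float:
    """Sum the `_count` samples of the watch-list latency histogram."""
    prefix = METRIC + "_count"
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        # HELP/TYPE comments start with '#', so they never match the prefix.
        if not line.startswith(prefix):
            continue
        # Sample lines look like: name{labels} value
        rest = line[line.rfind("}") + 1:] if "}" in line else line[len(prefix):]
        total += float(rest.split()[0])
    return total


def feature_in_use(metrics_text: str) -> bool:
    """True when at least one watch-list request has been observed."""
    return watch_list_request_count(metrics_text) > 0
```

The metric only reflects requests handled by the kube-apiserver instance that was scraped, so in a multi-instance control plane each instance would need to be checked.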
8811048

8821049
<!--
8831050
Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
@@ -887,6 +1054,15 @@ logs or events for this purpose.
8871054

8881055
###### How can someone using this feature know that it is working for their instance?
8891056

1057+
Assuming that historical data is available, comparing the number of LIST and WATCH requests to the server will tell whether the feature was enabled.
1058+
When this feature is enabled, the number of LIST requests will be smaller.
1059+
The difference primarily arises from switching informers to a new mode of operation.
1060+
1061+
Alternatively, check whether the `WatchListClient` FeatureGate has been set for the given component.
1062+
1063+
Knowing the `username` for a component, the audit logs could be examined to see whether `sendInitialEvents=true` in the `requestURI` has been set for that user.
1064+
1065+
Scanning the component's logs for the phrase `Reflector WatchList`. For requests lasting more than 10 seconds, traces will be reported.
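
The audit-log approach mentioned above could look roughly like this. The sketch assumes the audit log is written as one JSON event per line and relies on the `user.username` and `requestURI` fields of audit events; the function name and the way lines are supplied are illustrative.

```python
import json
from typing import Iterable
from urllib.parse import parse_qs, urlsplit


def uses_watch_list(audit_lines: Iterable[str], username: str) -> bool:
    """Return True if `username` sent a request with sendInitialEvents=true."""
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("user", {}).get("username") != username:
            continue
        # requestURI is a relative URI such as /api/v1/pods?watch=true&...
        query = parse_qs(urlsplit(event.get("requestURI", "")).query)
        if "true" in query.get("sendInitialEvents", []):
            return True
    return False
```

For example, passing an open audit log file and the component's username returns `True` as soon as one matching request is found.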
8901066
<!--
8911067
For instance, if this is a pod-related feature, it should be possible to determine if the feature is functioning properly
8921068
for each individual pod.
@@ -905,6 +1081,7 @@ Recall that end users cannot usually observe component logs or access metrics.
9051081
- Details:
9061082

9071083
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
1084+
None have been defined yet.
9081085

9091086
<!--
9101087
This is your opportunity to define what "normal" quality of service looks like
@@ -928,16 +1105,15 @@ Pick one more of these and delete the rest.
9281105
-->
9291106

9301107
- [ ] Metrics
931-
- Metric name: apiserver_cache_watcher_buffer_length (histogram, what was the buffer size)
932-
- Metric name: apiserver_watch_cache_lag (histogram, for how far the cache is behind the expected RV)
9331108
- Metric name: apiserver_terminated_watchers_total (counter, already defined, needs to be updated (by an attribute) so that we count closed watch requests due to an overfull buffer in the new mode)
1109+
- Metric name: apiserver_watch_list_duration_seconds (histogram, measures latency of watch-list requests)
9341110
- [Optional] Aggregation method:
9351111
- Components exposing the metric:
9361112
- [ ] Other (treat as last resort)
9371113
- Details:
9381114

9391115
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
940-
1116+
No.
9411117
<!--
9421118
Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
9431119
implementation difficulties, etc.).
@@ -950,7 +1126,7 @@ This section must be completed when targeting beta to a release.
9501126
-->
9511127

9521128
###### Does this feature depend on any specific services running in the cluster?
953-
1129+
No.
9541130
<!--
9551131
Think about both cluster-level services (e.g. metrics-server) as well
9561132
as node-level agents (e.g. specific version of CRI). Focus on external or
@@ -1067,8 +1243,18 @@ details). For now, we leave it here.
10671243

10681244
###### How does this feature react if the API server and/or etcd is unavailable?
10691245

1070-
###### What are other known failure modes?
1246+
When the kube-apiserver is unavailable, this feature is also unavailable.
10711247

1248+
When etcd is unavailable, requests attempting to retrieve the most recent state of the cluster will fail.
1249+
1250+
###### What are other known failure modes?
1251+
- kube-controller-manager is unable to start.
1252+
- Detection: How can it be detected via metrics? Examine the prometheus `up` time series or examine the pod status or the number of restarts.
1253+
- Mitigations: What can be done to stop the bleeding, especially for already
1254+
running user workloads? Disable the feature. Pass `WatchList=false` to `feature-gates` command line flag.
1255+
- Diagnostics: What are the useful log messages and their required logging
1256+
levels that could help debug the issue? N/A
1257+
- Testing: Are there any tests for failure mode? If not, describe why. Yes, if kube-controller-manager is unable to start then a lot of existing e2e tests will fail.
10721258
<!--
10731259
For each of them, fill in the following information by copying the below template:
10741260
- [Failure mode brief description]
@@ -1083,6 +1269,7 @@ For each of them, fill in the following information by copying the below templat
10831269
-->
10841270

10851271
###### What steps should be taken if SLOs are not being met to determine the problem?
1272+
No SLOs have been defined for this feature yet.
10861273

10871274
## Implementation History
10881275
The KEP was proposed on 2022-01-14

keps/sig-api-machinery/3157-watch-list/kep.yaml

+6-2
Original file line numberDiff line numberDiff line change
@@ -15,23 +15,27 @@ approvers:
1515
- "@lavalamp"
1616

1717
# The target maturity stage in the current dev cycle for this KEP.
18-
stage: alpha
18+
stage: beta
1919

2020
# The most recent milestone for which work toward delivery of this KEP has been
2121
# done. This can be the current (upcoming) milestone, if it is being actively
2222
# worked on.
23-
latest-milestone: "v1.28"
23+
latest-milestone: "v1.29"
2424

2525
# The milestone at which this feature was, or is targeted to be, at each stage.
2626
milestone:
2727
alpha: "v1.27"
28+
beta: "v1.29"
2829

2930
# The following PRR answers are required at alpha release
3031
# List the feature gate name and the components for which it must be enabled
3132
feature-gates:
3233
- name: WatchList
3334
components:
3435
- kube-apiserver
36+
- name: WatchListClient (the actual name might be different because it hasn't been added yet)
37+
components:
38+
- kube-controller-manager via client-go library
3539
disable-supported: true
3640

3741
# The following PRR answers are required at beta release
