- Release Signoff Checklist
- Summary
- Motivation
- Proposal
- Design Details
- Production Readiness Review Questionnaire
- Implementation History
- Drawbacks
- Alternatives
- Infrastructure Needed (Optional)
Items marked with (R) are required prior to targeting to a milestone / release.
- (R) Enhancement issue in release milestone, which links to KEP dir in kubernetes/enhancements (not the initial KEP PR)
- (R) KEP approvers have approved the KEP status as implementable
- (R) Design details are appropriately documented
- (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- (R) Graduation criteria is in place
- (R) Production readiness review completed
- (R) Production readiness review approved
- "Implementation History" section is up-to-date for milestone
- User-facing documentation has been created in kubernetes/website, for publication to kubernetes.io
- Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
This KEP proposes actions to reduce the surface area of secret-based service account tokens.
Because BoundServiceAccountTokenVolume is GA in 1.22, pods' service account tokens are obtained via the TokenRequest API and stored in a projected volume. This obviates the need to auto-generate secret-based service account tokens, which are less secure than bound tokens.
- No auto-generation of secret-based service account token.
- Removal of unused auto-generated secret-based service account tokens
- Change the service account control loop in Token Controller to not auto-create secrets for service accounts. At the same time, warn on usage of auto-created secret-based service account tokens and encourage users to use the TokenRequest API or manually created secret-based service account tokens.
- Purge unused auto-generated secret-based service account tokens.
- A warning mechanism should be implemented to help users migrate.
- Auto-generated secret-based service account tokens are those requested by Token Controller.
- Only clean up auto-generated tokens which:
- are not referenced by pods
- have not been used to authenticate for some duration (time duration or number of releases)
- To consult active usage of secret-based tokens, the metric serviceaccount_legacy_tokens_total or the audit annotation authentication.k8s.io/legacy-token could be used.
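The clean-up conditions listed above can be sketched as a single predicate. This is a minimal illustration, not the Token Controller's code: the Pod and TokenSecret types, their fields, and is_cleanup_candidate are hypothetical names introduced here, and the last-used date is assumed to have been resolved by the caller.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List


@dataclass
class Pod:
    # Names of the secrets this pod references (hypothetical, simplified model).
    secret_names: List[str] = field(default_factory=list)


@dataclass
class TokenSecret:
    name: str
    auto_generated: bool  # True if the secret was requested by Token Controller
    last_used: date       # last date the token was used to authenticate


def is_cleanup_candidate(secret: TokenSecret, pods: List[Pod],
                         today: date, unused_period: timedelta) -> bool:
    """Return True only if every clean-up condition holds for the secret."""
    if not secret.auto_generated:
        return False  # manually created tokens are out of scope
    if any(secret.name in pod.secret_names for pod in pods):
        return False  # still referenced by a pod
    # Unused for longer than the configured duration.
    return today - secret.last_used > unused_period
```

For example, with a one-year period, an auto-generated secret that no pod references and that was last used two years ago qualifies, while the same secret referenced by a pod does not.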
- When the LegacyServiceAccountTokenNoAutoGeneration feature is Beta, consumers that depend directly on waiting for and reading tokens out of auto-generated secrets might stop working. To mitigate:
  - Emit warnings when using auto-generated token secrets.
  - Publish pointers to TokenRequest or the manual secret request flow.
- When LegacyServiceAccountTokenCleanUp is Beta, usage of auto-generated secret-based tokens might stop working. To mitigate:
  - When Alpha, announce that the cleanup starts at Beta.
  - Emit warnings when using auto-generated token secrets.
  - Add pointers to the TokenRequest API and manually created tokens in the validation result.
  - Mark the auto-generated tokens as invalid if they are not used for more than the duration configured by --legacy-service-account-token-clean-up-period (one year by default), and allow users to re-activate the invalid auto-generated tokens within the duration of --legacy-service-account-token-clean-up-period before the tokens are finally deleted.
Token Controller stops auto-creating secrets for service accounts. This feature would be enabled as soon as it is implemented, since no new code is added and this ensures that new clusters start in a good state.
To facilitate LegacyServiceAccountTokenCleanUp, we implement a simple controller in kube-apiserver that maintains a bool value configmap kube-apiserver-legacy-service-account-token-tracking in kube-system to indicate if tracking is enabled in the cluster. It is similar to the existing ClusterAuthenticationTrustController that maintains configmap/extension-apiserver-authentication.
- When LegacyServiceAccountTokenTracking is enabled in all apiservers,
  - the controller creates/updates the configmap kube-apiserver-legacy-service-account-token-tracking in the kube-system namespace that stores the current date as since.
  - when a legacy token is used, issue a warning, update the label kubernetes.io/legacy-token-last-used on the secret at date granularity, and record in a metric.
- When LegacyServiceAccountTokenTracking is disabled in any apiserver,
  - the controller ensures the configmap in the kube-system namespace is deleted in a periodic way.
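Updating the last-used label at date granularity means a token that is used many times in one day triggers at most one write to its secret. The sketch below illustrates that idea only; last_used_patch is a hypothetical helper, and both the YYYY-MM-DD label format and the use of UTC for the day boundary are assumptions rather than details stated in this KEP.

```python
from datetime import datetime, timezone
from typing import Dict, Optional

LAST_USED_LABEL = "kubernetes.io/legacy-token-last-used"


def last_used_patch(labels: Dict[str, str], now: datetime) -> Optional[Dict[str, str]]:
    """Return the label patch to write, or None if the label already holds today's date."""
    # Assumption: the date is taken in UTC and stored as YYYY-MM-DD.
    today = now.astimezone(timezone.utc).date().isoformat()
    if labels.get(LAST_USED_LABEL) == today:
        return None  # already recorded today, so skip the write
    return {LAST_USED_LABEL: today}
```

A caller would apply the returned patch to the secret and skip the API write whenever the function returns None.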
Token Controller starts to remove unused auto-generated secrets, that is, secrets that are bi-directionally referenced by the service account and not mounted by pods.
When this feature is Beta and enabled by default, a secret is marked as invalid if and only if a sufficient period of time (one year by default) has passed since it was last used. The period can be configured by cluster admins.
Determine the date that a given secret was last used:
- kubernetes.io/legacy-token-last-used, if it exists and is after since stored in the configmap kube-apiserver-legacy-service-account-token-tracking.
- defaults to since

If kube-apiserver-legacy-service-account-token-tracking is unavailable, no secret would be removed.
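The rule for determining the last-used date can be sketched as follows. This is an illustrative sketch under stated assumptions: effective_last_used is a hypothetical helper, the label value is assumed to be a YYYY-MM-DD date, and treating an unparsable label like a missing one is a choice made here, not something this KEP specifies.

```python
from datetime import date
from typing import Mapping, Optional

LAST_USED_LABEL = "kubernetes.io/legacy-token-last-used"


def effective_last_used(labels: Mapping[str, str], since: Optional[date]) -> Optional[date]:
    """Resolve a secret's last-used date, or None when it cannot be determined."""
    if since is None:
        # Tracking configmap unavailable: the caller must not remove any secret.
        return None
    raw = labels.get(LAST_USED_LABEL)
    if raw is None:
        return since  # no recorded use: default to since
    try:
        last_used = date.fromisoformat(raw)
    except ValueError:
        return since  # assumption: an unparsable label is treated like a missing one
    # The label only counts if it is later than since.
    return last_used if last_used > since else since
```

Returning None when since is unknown keeps the "no secret would be removed" rule explicit at the call site.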
Mark the secrets as invalid and recover:
- The label kubernetes.io/legacy-token-invalid-since is added to the secrets, with the date as its value.
- If users use an invalid token, the Validate() function in "kubernetes/pkg/serviceaccount/legacy.go" detects the usage and returns an error telling them to re-activate the token by updating the label value or to use TokenRequest. At the same time, the token is updated with the new kubernetes.io/legacy-token-last-used date.
- If users do not use the invalid tokens, the tokens are finally deleted once the duration configured through --legacy-service-account-token-clean-up-period (one year by default) has passed since they were marked as invalid.
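The two-phase flow above (mark as invalid first, delete only after a further clean-up period) can be sketched as a small decision function. This is a simplified illustration rather than the controller's implementation: Action, next_action, and apply_mark_invalid are hypothetical names, label values are assumed to be YYYY-MM-DD dates, and re-activation by the user is not modelled.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Dict

INVALID_SINCE_LABEL = "kubernetes.io/legacy-token-invalid-since"


class Action(Enum):
    KEEP = "keep"
    MARK_INVALID = "mark-invalid"
    DELETE = "delete"


def next_action(labels: Dict[str, str], last_used: date,
                today: date, period: timedelta) -> Action:
    """Decide what to do with an unused auto-generated token secret."""
    invalid_since = labels.get(INVALID_SINCE_LABEL)
    if invalid_since is not None:
        # Already invalid: delete only after a full clean-up period has passed.
        if today - date.fromisoformat(invalid_since) > period:
            return Action.DELETE
        return Action.KEEP
    if today - last_used > period:
        return Action.MARK_INVALID
    return Action.KEEP


def apply_mark_invalid(labels: Dict[str, str], today: date) -> Dict[str, str]:
    """Return a copy of the labels with the invalid-since date recorded."""
    updated = dict(labels)
    updated[INVALID_SINCE_LABEL] = today.isoformat()
    return updated
```

With the default one-year period, a token therefore survives for roughly two years after its last use before it is deleted, which leaves a window in which the error returned to the user can prompt a re-activation.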
[X] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
None
k8s.io/kubernetes/pkg/controller/serviceaccount: 2022-06-13 - 67.5%
- Previously auto-generated secret-based token that's used within the configurable cleanup duration will continue to work.
- Previously auto-generated secret-based token that's used after the configurable cleanup duration will be deleted.
- Secret-based tokens would not be auto-generated.
- Still able to explicitly request a secret-based token.
- The explicitly requested token would not be deleted.
Alpha | Beta | GA |
---|---|---|
- | 1.24 | 1.26 |
By 1.24, all pods should have been admitted in 1.22+ and should be using bound tokens. Enabling this feature one release ahead would help reduce the number of legacy tokens, in line with security practices.
- Approved by PRR and scalability
- Any known bugs fixed
- Tests passing
- Approved by PRR and scalability
- Any known bugs fixed
- Tests passing
- Document and communicate the available actions that consumers of auto-generated secret-based tokens should take. (migrate to either use tokenrequest or explicitly request secret-based tokens)
Alpha | Beta | GA |
---|---|---|
1.26 | 1.27 | 1.28 |
- In use by multiple distributions
- RedHat
- Approved by PRR and scalability
- Any known bugs fixed
- Tests passing
- Approved by PRR and scalability
- Any known bugs fixed
- Tests passing
Alpha | Beta | GA |
---|---|---|
1.28 | 1.29 | 1.30 |
- In use by multiple distributions
- Approved by PRR and scalability
- Any known bugs fixed
- Tests passing
- Approved by PRR and scalability
- Any known bugs fixed
- Tests passing
The features can be enabled/disabled via the feature gates in upgrade / downgrade. What would be changed is described in "Feature Enablement and Rollback" section.
This feature only touches the control plane, so a version skew strategy is not applicable.
- Feature gate (also fill in values in kep.yaml)
  - Feature gate name: LegacyServiceAccountTokenNoAutoGeneration
  - Components depending on the feature gate: kube-controller-manager
  - Feature gate name: LegacyServiceAccountTokenTracking
  - Components depending on the feature gate: kube-apiserver
  - Feature gate name: LegacyServiceAccountTokenCleanUp
  - Components depending on the feature gate: kube-controller-manager
- LegacyServiceAccountTokenNoAutoGeneration: no legacy tokens are auto-generated.
- LegacyServiceAccountTokenTracking: legacy tokens would have new label and a configmap would be created in kube-system.
- LegacyServiceAccountTokenCleanUp: unused auto-generated legacy tokens will be removed.
yes for all feature gates.
- LegacyServiceAccountTokenNoAutoGeneration: the same as enabling the feature. While the feature was off, before the reenablement, Token Controller would have created tokens for service accounts.
- LegacyServiceAccountTokenTracking: during this sequence of operations, only the label kubernetes.io/legacy-token-last-used is persisted, but there is no impact on the functionality of this feature.
- LegacyServiceAccountTokenCleanUp: the same as enabling the feature.
yes for all feature gates, covered by integration tests.
- LegacyServiceAccountTokenNoAutoGeneration: workloads that expect new auto-created secrets and extract tokens from them would fail.
- LegacyServiceAccountTokenTracking: no impact.
- LegacyServiceAccountTokenCleanUp: workloads that read auto-generated secrets after those secrets have been considered unused by this feature and removed would fail.
serviceaccount_legacy_tokens_total: cumulative stale service account tokens used.
this metric is only informational and cannot deterministically tell a rollback is needed. there is no good way for us to detect scrapers of auto-generated secrets.
no, since there is not much difference between an upgrade and upgrade->downgrade->upgrade.
see the section "What happens if we reenable the feature if it was previously rolled back".
Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
no
check if there is a configmap kube-apiserver-legacy-service-account-token-tracking in namespace kube-system.
What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
- Metrics
  - Metric name: serviceaccount_legacy_tokens_total
  - [Optional] Aggregation method:
  - Components exposing the metric: kube-apiserver
LegacyServiceAccountTokenNoAutoGeneration and LegacyServiceAccountTokenCleanUp might cause a few workloads to fail, but there is no way for us to inject a metric into workloads to detect this.
none. we expect the number recorded in the above metric to go down in the long term.
Are there any missing metrics that would be useful to have to improve observability of this feature?
none.
no.
up to one additional write request per day could be made to auto-generated secrets still in use.
no.
no.
no. instead, use of the feature reduces the number of API objects.
Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
no.
Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
no.
- kube-apiserver-legacy-service-account-token-tracking configmap cannot be created.
- unable to remove unused auto-generated secrets.

- failure to create kube-apiserver-legacy-service-account-token-tracking configmap
  - Detection: check if kube-apiserver-legacy-service-account-token-tracking exists in kube-system
  - Mitigations: there is no impact on existing systems.
  - Diagnostics: check kube-apiserver log.
  - Testing: TBD.
n/a.