Filter multiattach errors #1559

mattcary · 2024-01-02T20:51:43Z

/kind bug

What this PR does / why we need it:
Multiattach errors caused by user misconfiguration pollute our SLO.

Filter out multiattach errors that come from user misconfiguration so they are not counted against it.

/assign @msau42

msau42 · 2024-01-02T21:18:09Z

pkg/common/utils.go

@@ -374,6 +377,17 @@ func isContextError(err error) (codes.Code, error) {
 	return codes.Unknown, fmt.Errorf("Not a context error: %w", err)
 }

+// isUserMultiAttachError returns an InvalidArgument if the error is
+// multi-attach detected from the API server. If we get this error from the API
+// server, it means that the kubelet doesn't know about the multiattach so it is


I think it is possible there could be a race condition in K8s that also triggers this.

For example, with StatefulSet, the replacement Pod is created with the same name when the old Pod is deleted. Pod deletion is blocked on pod-volume unmounting, but not node-level unmount or detach. So a replacement Pod can be created before we have successfully detached.

Yes, but in that case the kubelet knows the volume is still attached, so the controller will figure out not to attach? https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/reconciler/reconciler.go#L341

(I think that the only time I've seen this error from GCP is when the user has made two static PVs that refer to the same disk --- at least that's the case in the current SLOs that are firing).

Ah I see, in the race condition I am thinking of, ADC prevents the attach call from getting down to the CSI driver. So filtering the error at the CSI driver level is fine.

msau42 · 2024-01-02T23:53:06Z

/lgtm
/approve

k8s-ci-robot · 2024-01-02T23:53:16Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattcary, msau42

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [mattcary,msau42]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mattcary · 2024-01-03T00:10:28Z

/cherry-pick release-1.12

k8s-infra-cherrypick-robot · 2024-01-03T00:11:19Z

@mattcary: new pull request created: #1560

In response to this:

/cherry-pick release-1.12

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot assigned msau42 Jan 2, 2024

k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 2, 2024

k8s-ci-robot requested review from amacaskill and pwschuurman January 2, 2024 20:51

k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 2, 2024

Filter multiattach errors

81773a0

mattcary force-pushed the multiattach branch from 3d3c8f0 to 81773a0 Compare January 2, 2024 20:58

msau42 reviewed Jan 2, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 2, 2024

k8s-ci-robot merged commit bce4256 into kubernetes-sigs:master Jan 2, 2024

k8s-infra-cherrypick-robot mentioned this pull request Jan 3, 2024

[release-1.12] Filter multiattach errors #1560

Merged
