DisableDevice not working as expected #1146
Ah, good catch.
/remove-lifecycle stale
Oops, dropped this. I have a quick PR ready.
We are seeing the same issue (we are on 1.9.5 at the moment):
It seems we also don't have these kinds of directories.
That's expected; we have not cherry-picked #1225 and have not cut a new release since it was merged. The error message is benign. We had thought that disabling the device would prevent some race conditions, but on further investigation they don't come up, so the fact that the disabling isn't succeeding shouldn't cause any problems. But I agree it's annoying --- k8s errors have enough noise that we shouldn't be adding to them :-) I'll cherry-pick this back to 1.9, and it will get into releases in the next week or two (we're in the middle of pushing out patch releases, and I don't know if the CPs will get into this cycle or the next one --- there are some CVEs being fixed by updating golang versions / image bases that I don't want to delay).
Ah, so you mean it might not be the cause of what we observe? Um, I haven't actually described what we observe :)
Yeah, I think what you're observing is unrelated to this error message. Sorry for the noise & confusion! What is in node Y's volumesAttached at the time you see this error? (That's in the node status; you can see it with kubectl describe node.) If there's no VolumeAttachment on Y, then the attacher should try to reconcile the attachment away, but only if it thinks that the disk is attached to the node. So there is a dance happening between the PV controller and the attacher. Another thing to try: if this situation comes up again, kill the csi-attacher container and let it restart. We've seen some cases with the provisioner where it looks like its informer cache gets stale --- although that seems unlikely and we don't have a consistent reproduction. But if a new csi-attacher instance starts working correctly, a stale cache becomes the more plausible explanation.
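For reference, a minimal client-go sketch of the same inspection (the node name "node-y" and the kubeconfig location are assumptions for illustration, not anything from the driver):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig (assumed location).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// "node-y" is a placeholder for the affected node.
	node, err := client.CoreV1().Nodes().Get(context.TODO(), "node-y", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// node.Status.VolumesAttached is the same list that
	// `kubectl describe node` shows under "VolumesAttached".
	for _, av := range node.Status.VolumesAttached {
		fmt.Printf("attached: %s (devicePath=%s)\n", av.Name, av.DevicePath)
	}
}
```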
Happened again today:
Again, no VolumeAttachment. So I guess the important question is how the "old" volumesAttached entry gets cleaned up.
Just tried it, and I guess as expected it didn't change anything, as there is no hint that this volume should be detached from Y. I'll try to dig up more logs; I'd be happy if you had some ideas on how the old volumesAttached entry could have been cleared.
I believe the flow is: delete the VolumeAttachment; csi-attacher notices a node.volumesAttached volume without a VolumeAttachment and then detaches it. So I think the weird part is how volumesAttached got cleared without the volume actually being detached?
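If it helps while digging, here's a rough sketch of that cross-check. It assumes volumesAttached entries can be matched to VolumeAttachment objects via the PV name embedded in the volume handle; the real csi-attacher matches on volume handles, so all names and the matching heuristic below are illustrative only:

```go
package attachcheck

import (
	"context"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// orphanedAttachments returns the volumes a node still reports as attached
// even though no VolumeAttachment object references them on that node.
func orphanedAttachments(ctx context.Context, client kubernetes.Interface, nodeName string) ([]string, error) {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	vaList, err := client.StorageV1().VolumeAttachments().List(ctx, metav1.ListOptions{})
	if err != nil {
		return nil, err
	}

	// Index the PVs that still have a VolumeAttachment on this node.
	referenced := map[string]bool{}
	for _, va := range vaList.Items {
		if va.Spec.NodeName == nodeName && va.Spec.Source.PersistentVolumeName != nil {
			referenced[*va.Spec.Source.PersistentVolumeName] = true
		}
	}

	// node.Status.VolumesAttached names look like
	// "kubernetes.io/csi/<driver>^<volume-handle>"; report any entry
	// that mentions none of the referenced PVs.
	var orphans []string
	for _, av := range node.Status.VolumesAttached {
		name := string(av.Name)
		match := false
		for pv := range referenced {
			if strings.Contains(name, pv) {
				match = true
				break
			}
		}
		if !match {
			orphans = append(orphans, name)
		}
	}
	return orphans, nil
}
```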
Sorry for the initial confusion; of course you were right that the error message I showed was not related (after all, it came from the
DisableDevice still is not working as expected. I'm seeing the following error in the logs:
It seems that disabling the device prevents subsequent NodeUnstage/NodeStage commands from working; the device may need to be re-enabled first. Re-enabling the device by writing "running" to its sysfs state file works. However, there is a problem: we use
(3) is the only approach if (1) and (2) aren't possible.
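For reference, a minimal sketch of the re-enable step described above, assuming the kernel device name is already known (e.g. "sdb"); this is not the driver's actual code:

```go
package main

import (
	"fmt"
	"os"
)

// reenableDevice writes "running" to a SCSI device's sysfs state file,
// undoing a prior "offline" write. devName must be the resolved kernel
// block device name, e.g. "sdb", not the /dev/disk/by-id alias.
func reenableDevice(devName string) error {
	statePath := fmt.Sprintf("/sys/block/%s/device/state", devName)
	// Sysfs attributes are plain write targets; no truncate flag needed.
	f, err := os.OpenFile(statePath, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString("running")
	return err
}

func main() {
	if err := reenableDevice("sdb"); err != nil {
		fmt.Fprintln(os.Stderr, "re-enable failed:", err)
	}
}
```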
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
The target path
/sys/block/google-pvc-843cdc45-7cf7-43d6-801b-84d69722ebdd/device/state
that the DisableDevice function is attempting to write to is missing. I printed the path in my local setup PR:
Logging into the node, the actual target path is
/sys/block/sdb/device/state
for a SCSI device, for example. What is of interest is the /dev/sd* name derived in deviceFsPath here
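In other words, the sysfs state file lives under the kernel device name (sdb), not under the by-id alias. A minimal sketch of deriving it from the by-id path (the function name and error handling are assumptions for illustration, not the driver's actual code):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// sysfsStatePath resolves a /dev/disk/by-id alias (e.g.
// /dev/disk/by-id/google-pvc-843cdc45-...) to its kernel device node
// (e.g. /dev/sdb) and builds the matching sysfs state path.
func sysfsStatePath(deviceFsPath string) (string, error) {
	// EvalSymlinks follows the by-id symlink to the real device node.
	resolved, err := filepath.EvalSymlinks(deviceFsPath)
	if err != nil {
		return "", err
	}
	devName := filepath.Base(resolved) // e.g. "sdb"
	return fmt.Sprintf("/sys/block/%s/device/state", devName), nil
}

func main() {
	// Hypothetical by-id path from the issue; on a real node this
	// resolves to something like /dev/sdb.
	p, err := sysfsStatePath("/dev/disk/by-id/google-pvc-843cdc45-7cf7-43d6-801b-84d69722ebdd")
	if err != nil {
		fmt.Println("resolve failed:", err)
		return
	}
	fmt.Println(p) // e.g. /sys/block/sdb/device/state
}
```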