
read only volumes should not be expanded and we need OSS ROX testing #1088


Closed
leiyiz opened this issue Nov 21, 2022 · 10 comments · Fixed by kubernetes/kubernetes#114628 or #1215
Assignees
judemars
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@leiyiz
Contributor

leiyiz commented Nov 21, 2022

#972 added an unconditional filesystem resize, which should not happen for read-only volumes. We are not seeing mount failures, since resizing an RO volume only produces a warning, but this could have broken ROX functionality.

  1. We should not try to expand RO volumes (a minimal sketch of the guard follows this list).
  2. We should write an OSS ROX test ASAP.
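
To make the intent concrete, here is a minimal Go sketch of that guard. This is not the driver's actual code: maybeResizeFs, needResize, and resizeFs are hypothetical names standing in for the real NodeStageVolume resize path.

```go
package node

import (
	"k8s.io/klog/v2"
)

// Placeholder stand-ins for the driver's real resize helpers.
func needResize(devicePath, stagingPath string) (bool, error) { return false, nil }
func resizeFs(devicePath, stagingPath string) error           { return nil }

// maybeResizeFs guards the unconditional resize added in #972: if the
// volume is staged read-only, skip the filesystem resize entirely
// instead of attempting it and surfacing a warning.
func maybeResizeFs(devicePath, stagingPath string, readOnly bool) error {
	if readOnly {
		klog.V(4).InfoS("Volume is read-only, skipping filesystem resize", "path", stagingPath)
		return nil
	}
	need, err := needResize(devicePath, stagingPath)
	if err != nil {
		return err
	}
	if need {
		return resizeFs(devicePath, stagingPath)
	}
	return nil
}
```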
@mattcary
Contributor

To clarify, it is the filesystem resize that's the problem, not the volume resize?

@judemars
Contributor

I'll pick this up, can someone who's a contributor please assign it to me :)

@mattcary
Contributor

/assign @judemars

@leiyiz
Contributor Author

leiyiz commented Nov 23, 2022

To clarify, it is the filesystem resize that's the problem, not the volume resize?

yes, I'll update the issue description

@judemars
Contributor

judemars commented Dec 5, 2022

Just a progress update. I think this work will have a few parts:

  1. Update mount-utils' resizefs_linux to check the filesystem's read-only bit in the NeedResize method and return false if it is set (I made the change directly in the pd repo, so you'll have to expand the diff); a hedged sketch of the idea follows this list.
  2. Update the mount-utils package version in the pd repo to consume the fix from 1), and update the NodeStageVolume resize UTs to verify that resize is skipped if the volume is read-only.
  3. Add ROX k8s integration tests for the snapshot, clone, and restore cases. The snapshot and clone tests will also verify that resize is skipped in ROX mode.
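
A hedged sketch of the read-only check step 1 describes. The actual mount-utils change may detect the read-only state differently; `blockdev --getro` (which prints "1" for a read-only block device) is just one way to read the kernel's flag.

```go
package resizefs

import (
	"fmt"
	"os/exec"
	"strings"
)

// deviceIsReadOnly reports whether the block device carries the kernel's
// read-only flag, using `blockdev --getro`.
func deviceIsReadOnly(devicePath string) (bool, error) {
	out, err := exec.Command("blockdev", "--getro", devicePath).CombinedOutput()
	if err != nil {
		return false, fmt.Errorf("blockdev --getro %s failed: %v: %s", devicePath, err, out)
	}
	return strings.TrimSpace(string(out)) == "1", nil
}

// needResize mirrors the shape of mount-utils' NeedResize: bail out early
// for read-only volumes so no resize is ever attempted on them.
func needResize(devicePath, deviceMountPath string) (bool, error) {
	ro, err := deviceIsReadOnly(devicePath)
	if err != nil {
		return false, err
	}
	if ro {
		return false, nil
	}
	// ... fall through to the usual filesystem-vs-device size comparison ...
	return false, nil
}
```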

I have a couple of outstanding questions I'm working through, so I'm posting them here in case folks have any insight:

  1. Updating resizefs in mount_utils will mean the call to resize in NodeExpandVolume is also skipped, since it calls the same method. Is this okay? It seems like it would have to be, since the resize would fail on a read-only filesystem anyway. As a follow-up: what happens if the request to NodeExpandVolume has VolumeCapability set to *READER_ONLY? Should we also skip the resize in that case? (A sketch of that check follows this list.)

  2. Do we need to update the resize code for Windows as well as Linux? According to #972 (comment), Windows goes through CSI proxy for the resize instead of doing it directly, but I currently can't find the implementation to check.
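
For question 1, here is a sketch of what a *READER_ONLY check on the request's VolumeCapability could look like, using the CSI spec's Go bindings. Whether the driver should actually skip the resize in this case is the open question above; this only shows the mode check itself.

```go
package expand

import (
	csi "github.com/container-storage-interface/spec/lib/go/csi"
)

// isReaderOnly reports whether the VolumeCapability on a NodeExpandVolume
// request carries a reader-only access mode, in which case the node-side
// filesystem resize could arguably be skipped as well.
func isReaderOnly(vc *csi.VolumeCapability) bool {
	if vc == nil || vc.GetAccessMode() == nil {
		return false
	}
	switch vc.GetAccessMode().GetMode() {
	case csi.VolumeCapability_AccessMode_MULTI_NODE_READER_ONLY,
		csi.VolumeCapability_AccessMode_SINGLE_NODE_READER_ONLY:
		return true
	}
	return false
}
```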

@judemars
Contributor

judemars commented Dec 21, 2022

Update:
For 1): the mount-utils change is in; I need to figure out how/when it gets incorporated into the pd-driver repo.

For 3) (adding ROX k8s integration tests): go/k8spr/114628 (kubernetes/kubernetes#114628) is out for review.

@judemars
Contributor

For 2) it looks like I need to make a change like #1025 to update the version of k8s to 1.27. @leiyiz (since you did that CL): are we scheduled to upgrade it on some cadence, or should I manually do it now?

@mattcary
Contributor

For (2) I'd also talk with @mauriciopoppe who has been coordinating windows support.

For (1) we need to update our mount-utils in vendor/. It's a go.mod update; I'll start a discussion on our internal chat, as I've been a little fuzzy on this process.
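
For reference, that kind of vendored dependency bump usually looks roughly like the following, assuming the repo's standard Go modules plus vendoring setup (the version shown is illustrative, not the actual target):

```sh
# Bump mount-utils and refresh vendor/ (version is illustrative).
go get k8s.io/mount-utils@v0.26.0
go mod tidy
go mod vendor
```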

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 21, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 20, 2023