btrfs reclaim on kernel v5.19+: use bg_reclaim_threshold #2091


Merged: 1 commit merged into kubernetes-sigs:master from the public_btrfs-reclaim branch on May 16, 2025

Conversation

motiejus
Contributor

Add a sysfs knob btrfs-allocation-{,meta}data-bg_reclaim_threshold, which will do the equivalent of:

echo VALUE > /sys/fs/btrfs/FS-UUID/allocation/data/bg_reclaim_threshold

Or, in case of metadata, equivalently:

echo VALUE > /sys/fs/btrfs/FS-UUID/allocation/metadata/bg_reclaim_threshold

Where VALUE is a number between 0 and 99 inclusive.
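
As a rough sketch of what the option amounts to, the shell function below validates VALUE and writes it to the knob. The function name `set_bg_reclaim_threshold` and the `SYSFS_ROOT` override are illustrative assumptions and are not part of the driver; the 0 to 99 range is the one stated above.

```sh
# Hypothetical helper, not part of the driver: validate VALUE and write it
# to the bg_reclaim_threshold knob of a mounted btrfs filesystem.
# SYSFS_ROOT is an assumption that exists only so the sketch can be tried
# against a scratch directory instead of the real /sys.
set_bg_reclaim_threshold() {
    uuid=$1
    kind=$2
    value=$3
    root=${SYSFS_ROOT:-/sys}

    case $kind in
        data|metadata) ;;
        *) echo "kind must be data or metadata" >&2; return 1 ;;
    esac

    # Reject empty or non-numeric input, then enforce the 0..99 range.
    case $value in
        ''|*[!0-9]*) echo "VALUE must be an integer" >&2; return 1 ;;
    esac
    if [ "$value" -gt 99 ]; then
        echo "VALUE must be between 0 and 99 inclusive" >&2
        return 1
    fi

    echo "$value" > "$root/fs/btrfs/$uuid/allocation/$kind/bg_reclaim_threshold"
}
```

On a real system this would be called as `set_bg_reclaim_threshold "$FS_UUID" data 90`, as root and with the filesystem mounted; the write fails if the sysfs file does not exist.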

Adding it as a "special" mount option, similarly to read_ahead_kb, as that's quite convenient.

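Some resources about `bg_reclaim_threshold` and, more broadly, balancing of the btrfs filesystem:

- https://btrfs.readthedocs.io/en/latest/Administration.html#uuid-allocations-data-metadata-system
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18bb8bbf13c1839b43c9e09e76d397b753989af2
- https://lwn.net/Articles/978826/ (Linux v6.11+; this may make `bg_reclaim_threshold` obsolete)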

Author's interpretation

The higher the reclaim threshold, the more accurately btrfs reports unused space (the Device Unallocated row of btrfs filesystem usage), at the expense of sometimes moving data around needlessly. The lower the threshold, the less rebalancing happens and the less accurate the remaining-space metrics become.

The author of this commit prefers more IO in order to see more accurate Device Unallocated metrics, and therefore sets `btrfs-allocation-data-bg_reclaim_threshold=90`.

What type of PR is this?
/kind feature

What this PR does / why we need it:

Gives the operator more control over btrfs free-space tracking.

Which issue(s) this PR fixes:

Fixes #2088

Does this PR introduce a user-facing change?:

`btrfs` filesystems gain special mount options `btrfs-allocation-(data|metadata)-bg_reclaim_threshold`. The driver will write the provided value to `/sys/fs/btrfs/FS-UUID/allocation/(data|metadata)/bg_reclaim_threshold` respectively.
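
The mapping from option name to sysfs path described in this note can be sketched as follows. This is only an illustration of the naming scheme: the function `bg_reclaim_option_to_sysfs` is hypothetical and does not reflect how the driver implements the translation.

```sh
# Hypothetical sketch, not the driver's code: translate one special mount
# option into the sysfs path and value described above.
# Prints "<path> <value>" on success; returns 1 for unrelated options.
bg_reclaim_option_to_sysfs() {
    opt=$1
    uuid=$2
    name=${opt%%=*}
    value=${opt#*=}

    # An option without "=" carries no value, so it cannot be one of ours.
    [ "$name" != "$opt" ] || return 1

    case $name in
        btrfs-allocation-data-bg_reclaim_threshold)     kind=data ;;
        btrfs-allocation-metadata-bg_reclaim_threshold) kind=metadata ;;
        *) return 1 ;;
    esac

    echo "/sys/fs/btrfs/$uuid/allocation/$kind/bg_reclaim_threshold $value"
}
```

Ordinary mount options such as `noatime` fall through the `case` and are left for the regular mount call; only the two special names are turned into sysfs writes.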

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 14, 2025
@k8s-ci-robot
Contributor

Hi @motiejus. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot requested review from leiyiz and tonyzhc May 14, 2025 10:20
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 14, 2025
@motiejus motiejus mentioned this pull request May 14, 2025
@mattcary
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 14, 2025
Contributor

@mattcary mattcary left a comment

Looks good to me, just a minor nit.

@motiejus motiejus force-pushed the public_btrfs-reclaim branch from edc5a0a to df6a569 Compare May 14, 2025 18:32
@motiejus motiejus requested a review from mattcary May 14, 2025 18:32
@mattcary
Contributor

/retest

Looks like VM failed to come up for e2e test, this is probably a flake.

@k8s-ci-robot
Contributor

@motiejus: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2022 | df6a569 | link | false | `/test pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2022` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

@motiejus
Contributor Author

I noticed the commit has Fixes #2088, which the linter is not overly happy about.

Before I fix that and force-push, is there anything more that needs to be addressed?

@mattcary
Contributor

Hmm, it seems the linter is unhappy with the commit message, not the PR, but I can't figure out why.

Anyway

/lgtm
/approve

thank you!

I'll try to keep watch & re-lgtm if you have to force push something. Ping me if I'm slow.

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 16, 2025
@motiejus
Contributor Author

There is a failing Windows check. Should I force-push and hope it's a flake that resolves itself, or will you do something on your end?

Just checking about the next steps.

@mattcary
Contributor

I can force merge if necessary. Can you try to clear the invalid commit message first?

Add a sysfs knob `btrfs-allocation-{,meta}data-bg_reclaim_threshold`, which will do
the equivalent of:

```
echo VALUE > /sys/fs/btrfs/FS-UUID/allocation/data/bg_reclaim_threshold
```

Or, in case of metadata, equivalently:

```
echo VALUE > /sys/fs/btrfs/FS-UUID/allocation/metadata/bg_reclaim_threshold
```

Where VALUE is a number between `0` and `99` inclusive.

Adding it as a "special" mount option, similarly to `read_ahead_kb`, as
that's quite convenient.

Some resources about `bg_reclaim_threshold` and more broadly balancing
of the btrfs filesystem:

- https://btrfs.readthedocs.io/en/latest/Administration.html#uuid-allocations-data-metadata-system
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18bb8bbf13c1839b43c9e09e76d397b753989af2
- https://lwn.net/Articles/978826/ (Linux v6.11+; this may make `bg_reclaim_threshold` obsolete)

Author's interpretation
-----------------------

The higher the reclaim threshold, the more accurately btrfs will show
unused space (`Device Unallocated` row of `btrfs filesystem usage`) at
the expense of sometimes needlessly moving data around. The lower the
threshold, the less rebalancing, the less accurate metrics of the
remaining space.

The author of this commit prefers more IO in order to see more accurate
`Device Unallocated` metrics, and therefore sets
`btrfs-allocation-data-bg_reclaim_threshold=90`.
@motiejus motiejus force-pushed the public_btrfs-reclaim branch from df6a569 to 7f44bae Compare May 16, 2025 17:13
@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. labels May 16, 2025
@motiejus
Contributor Author

> I can force merge if necessary. Can you try to clear the invalid commit message first?

Ah, sure, done. Please kick off the tests, hopefully all will pass this time 🤞

> New changes are detected. LGTM label has been removed.

This one's weird, I only changed the commit message (removed Fixes line from the bottom):

$ git diff HEAD df6a5696fcc6
$

@mattcary
Contributor

/ok-to-test

@mattcary
Contributor

> > I can force merge if necessary. Can you try to clear the invalid commit message first?
>
> Ah, sure, done. Please kick off the tests, hopefully all will pass this time 🤞
>
> > New changes are detected. LGTM label has been removed.
>
> This one's weird, I only changed the commit message (removed Fixes line from the bottom):
>
> $ git diff HEAD df6a5696fcc6
> $

Who can question the wisdom of github AI overlords?

@mattcary
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 16, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattcary, motiejus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 8e820cd into kubernetes-sigs:master May 16, 2025
6 of 9 checks passed
@motiejus motiejus deleted the public_btrfs-reclaim branch May 16, 2025 19:02
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

btrfs: next steps
3 participants