-
Notifications
You must be signed in to change notification settings - Fork 268
⚠️ Allow explicitly empty volume AZ #2008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Skipping CI for Draft Pull Request. |
✅ Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
api/v1beta1/types.go
Outdated
// From specifies where we will obtain the availability zone for the | ||
// volume. The options are "Name" and "Machine". If "Name" is specified | ||
// then the Name field must also be specified. If "Machine" is specified | ||
// the volume will use the value of FailureDomain, if any, from the | ||
// associated Machine. | ||
// +kubebuilder:default:=Name | ||
// +optional | ||
AvailabilityZone string `json:"availabilityZone,omitempty"` | ||
From VolumeAZSource `json:"from,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks a bit like an union to me? We probably want validation that if from=Machine
you cannot set Name
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it's absolutely a discriminated union, although one with a hopefully ergonomic optional default.
The latest version has validation and a complete suite of tests.
69ab5e1
to
03999c3
Compare
That test failure confirms that the new default behaviour works :) I've updated the e2e test to specify the |
/test pull-cluster-api-provider-openstack-e2e-full-test |
Running the e2e tests again because there's some non-determinism in there: if we weren't correctly setting the target AZ, they might land in the right place anyway by accident. /test pull-cluster-api-provider-openstack-test |
The merge conflict is only in |
/hold cancel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question.
/* FIXME: These tests are failing | ||
It("should not allow additional volume with empty name", func() { | ||
machine.Spec.AdditionalBlockDevices = []infrav1.AdditionalBlockDevice{ | ||
{ | ||
Name: "", | ||
SizeGiB: 1, | ||
}, | ||
} | ||
Expect(k8sClient.Create(ctx, machine)).NotTo(Succeed(), "Creating a machine with an empty name additional block device should fail") | ||
}) | ||
|
||
It("should not allow additional volume with name root", func() { | ||
machine.Spec.AdditionalBlockDevices = []infrav1.AdditionalBlockDevice{ | ||
{ | ||
Name: "root", | ||
SizeGiB: 1, | ||
}, | ||
} | ||
Expect(k8sClient.Create(ctx, machine)).NotTo(Succeed(), "Creating a machine with a root named additional block device should fail") | ||
}) | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We keep this on FIXME
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, yeah. Should probably create an issue to track it with a pointer to the FIXME code, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: #2013
Replaces an AvailabilityZone string for volumes with a VolumeAvailabilityZone struct which allows more flexibility in defaulting behaviour. Specifically it enables us to express both the current default behaviour where we take the volume AZ from the Machine, and a new default behaviour where to don't specify a volume AZ at all. In making this change to both RootVolume and AdditionalBlockDevices we use common code for both APIs. This has the result of updating RootVolume to be consistent with AdditionalBlockDevices.
/test pull-cluster-api-provider-openstack-e2e-full-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only commenting on the API (though the additional test coverage looks good and is appreciated). As noted inline, I don't love the idea of perpetuating this myth that Compute and Block Storage AZs can be considered equivalent. However, it's a widely spread myth that clearly some users rely on so 🤷
/lgtm
// if From is "Name". The volume availability zone name may not contain | ||
// spaces. | ||
// +optional | ||
Name *VolumeAZName `json:"name,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't put my finger on why, but this feels clunky. I had envisioned the "inherit from compute AZ" behaviour to be either (a) an anti-feature or (b) if actually required something that would be configured either on a deployment- or machine-wide basis. I've no real reason for this though, other than a general dislike of the whole thing, nor do I have an alternate proposal. As such I'm mainly leaving this note here so that I said something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't preclude adding a machine or cluster-wide default override in the future. However, like you I'm not expecting the from: Machine
behaviour to be widely used. This is also why I made the discriminator (the From field) optional with an explicit default. This means that the anticipated use case goes from:
availabilityZone: foo
to
availabilityZone:
name: foo
which isn't awful, and leaves us with a get-out-of-jail-free card to fix it later without breaking backwards compatibility when we understand it better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO the situation gets better now, when the default behavior becomes set no AZ at all.
/lgtm Awesome that size on discs is specified in the same way for root and extra disks :) |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mdbooth The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@mdbooth: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@lentzi90 This is worrying. This failure was a flake, and I'm reasonably confident we haven't broken it. However... I'm surprised it would merge a PR with a test failure. I'm assuming this additional test run is against the proposed merge commit, so I get why it runs. But why run it if it's going to merge anyway on failure? |
Oh yes this looks bad. I have no explanation for it. Could there be some higher level config in test-infra that runs it as a post-submit? I remember we removed our post-submits from here, but perhaps there is some config telling prow to rerun after merge? 🤔 |
https://prow.k8s.io/pr-history/?org=kubernetes-sigs&repo=cluster-api-provider-openstack&pr=2008 shows a pass of that test at that commit. It got tested and merged in a batch merge |
Replaces an AvailabilityZone string for volumes with a VolumeAvailabilityZone struct which allows more flexibility in defaulting behaviour. Specifically it enables us to express both the current default behaviour where we take the volume AZ from the Machine, and a new default behaviour where to don't specify a volume AZ at all.
In making this change to both RootVolume and AdditionalBlockDevices we use common code for both APIs. This has the result of updating RootVolume to be consistent with a volume specified via AdditionalBlockDevices.
Fixes #1649
/hold