-
Notifications
You must be signed in to change notification settings - Fork 159
Topology support #77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Topology support #77
Conversation
@davidz627: GitHub didn't allow me to assign the following users: verult. Note that only kubernetes-sigs members and repo collaborators can be assigned. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: davidz627 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
pkg/gce-pd-csi-driver/controller.go
Outdated
// TODO: Support preferred toplogies. | ||
if req.GetAccessibilityRequirements() != nil { | ||
reqTop := req.GetAccessibilityRequirements().GetRequisite() | ||
if reqTop == nil || len(reqTop) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think requisite is strictly required if preferences is specified
pkg/gce-pd-csi-driver/controller.go
Outdated
if reqTop[0].GetSegments() == nil { | ||
return nil, status.Error(codes.InvalidArgument, fmt.Sprintf("CreateVolume requisite toplogies specified specified but no segments")) | ||
} | ||
// Currently just pick the 0th requisite toplogy. This is a valid strategy but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if no preferences are specified, then yeah cycling or random may be better
pkg/gce-pd-csi-driver/controller.go
Outdated
} | ||
// Currently just pick the 0th requisite toplogy. This is a valid strategy but | ||
// we may want to randomly cycle in the future (?) | ||
zoneSeg, ok := reqTop[0].GetSegments()["zone"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the topologyKey should be a constant that is also well-defined in the documentation (because users will have to set it in the storageclass). I think the current thought was something like "com.google.topology/zone"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this key could be used across GCP products, we should get more review on this constant, but that could be a separate discussion
pkg/gce-pd-csi-driver/controller.go
Outdated
configuredZone = zoneSeg | ||
} else { | ||
return nil, status.Error(codes.InvalidArgument, | ||
fmt.Sprintf("CreateVolume requisite topology specified but could not find zone in segment: %v", reqTop[0].GetSegments())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to have even stricter validation and error if other topology keys that we don't understand are specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I proposed to add this restriction to CSI in this PR: container-storage-interface/spec#257 .
}, | ||
}, | ||
}, | ||
expVol: &csi.Volume{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned volume should also have topology set right?
}, | ||
}, | ||
{ | ||
name: "success with topology overwrite zone in params", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we actually wanted to error in this case because we don't just want to drop/ignore user config
@msau42 comments addressed |
@msau42 sorry this PR really balooned but everything was kind of intertwined :( The good news is the commits are split "decently" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the core logic, mostly looks good!
pkg/gce-pd-csi-driver/controller.go
Outdated
} | ||
// Currently just pick the 0th requisite toplogy. This is a valid strategy but | ||
// we may want to randomly cycle in the future (?) | ||
zoneSeg, ok := reqTop[0].GetSegments()["zone"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this key could be used across GCP products, we should get more review on this constant, but that could be a separate discussion
pkg/gce-pd-csi-driver/controller.go
Outdated
configuredZone = zoneSeg | ||
} else { | ||
return nil, status.Error(codes.InvalidArgument, | ||
fmt.Sprintf("CreateVolume requisite topology specified but could not find zone in segment: %v", reqTop[0].GetSegments())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I proposed to add this restriction to CSI in this PR: container-storage-interface/spec#257 .
for k, v := range req.GetParameters() { | ||
if k == "csiProvisionerSecretName" || k == "csiProvisionerSecretNamespace" { | ||
// These are hardcoded secrets keys required to function but not needed by GCE PD | ||
continue | ||
} | ||
switch strings.ToLower(k) { | ||
case "type": | ||
case common.ParameterKeyType: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are zone & type parameters kept here because of in-tree migration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we want to continue to support Parameters
@@ -341,12 +368,23 @@ func (gceCS *GCEControllerServer) ValidateVolumeCapabilities(ctx context.Context | |||
// TODO: Factor out the volume capability functionality and use as validation in all other functions as well | |||
glog.V(5).Infof("Using default ValidateVolumeCapabilities") | |||
// Validate Arguments | |||
if req.GetVolumeCapabilities() == nil || len(req.GetVolumeCapabilities()) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about AccessibleTopology
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pkg/gce-pd-csi-driver/controller.go
Outdated
// Pick the preferred toplogy in order | ||
if len(prefTop) != 0 { | ||
if prefTop[0].GetSegments() == nil { | ||
return "", fmt.Errorf("preferred toplogies specified specified but no segments") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"specified" duplicated, and a couple of "toplogy" and "toplogies" in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
return zone, nil | ||
} else if len(reqTop) != 0 { | ||
r := rand.Intn(len(reqTop)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The in-tree plugin has a round-robin mechanism to choose zones (here). Could we do something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are not guaranteed to have a monotonically increasing id/name for the CSI Driver. Using a hash strategy could have unwanted behavior. We would have to actually keep track of which we've done or not to get round-robin going. I think a random pick is actually our best option here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a backwards compatibility standpoint for in-tree migration, do we need to support this though? cc @saad-ali
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not affect migration. There are two potential cases for the migration path:
- The whole PD lifecycle is migrated to CSI, then CreateVolume is called and we pick a zone in whatever method we want (random here), all future references to this volume just continue to use the zone picked
- PD is precreated in some zone, this logic is never used, we just use whatever zone was used in the creation (the round robin method)
I don't think theres any point after creation where we make assumptions on "how the zone was picked". so therefore it can't create any issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user created a statefulset:
- Before the migration, the PDs would be evenly spread across zones
- After the migration, the PDs would be randomly spread across zones
The question is, do we want to retain the old spreading behavior for backwards compatibility? Or should we just declare multi-zone in-tree provisioning broken and require users to used delayed provisioning going forward?
|
||
if len(configuredZone) == 0 { | ||
// Default to zone that the driver is in | ||
configuredZone = gceCS.CloudProvider.GetZone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The zone-picking strategy should be consistent, regardless of whether zones are specified through parameters or AccessibilityRequirement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean by this? consistent with what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what @verult means is that once you enable the topology capability, then you should always get an accessibility requirement. So if you run into a case where there is no zone specified, then you should error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe accessibility requirements can be unset:
// If this field is not specified and the SP has the
// ACCESSIBILITY_CONSTRAINTS plugin capability, the SP MAY choose
// where the provisioned volume is accessible from.
The current logic is:
- Only one of
Parameters
orAccessibilityRequirements
may be set. Error if both are set. - If neither are set (valid according to CSI Spec) default to zone that driver is in
This logic makes sense to me according to the Spec. Please let me know if you think it should be changed and we can discuss
"cloud.google.com/go/compute/metadata" | ||
) | ||
|
||
type MetadataService interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments, especially around the intended purpose of the interface in order to limit its unwanted expansion. I'm worried about future developers adding arbitrary methods here.
nit: "Service" gives me an impression of a long running process, and it's already an overloaded term in the container world
8ce2bdf
to
b3cd61d
Compare
addressed comments. |
c2c0f0b
to
6934ae8
Compare
pkg/common/constants.go
Outdated
ParameterKeyType = "type" | ||
|
||
// Keys for Topology | ||
TopologyKeyZone = "com.google.gcp-compute-persistent-disk-csi-driver.topology/zone" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not put the driver name in the key. Filestore driver should be able to share this key too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we didn't want collisions between topology keys? I may be misunderstanding what this is used for exactly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to be able to share topology keys amongst drivers from the same vendor.
- To avoid proliferation of labels on Node objects
- To allow users to use a common set of topology labels in their Pods.
pkg/gce-pd-csi-driver/controller.go
Outdated
configuredZone = v | ||
default: | ||
return nil, status.Error(codes.InvalidArgument, fmt.Sprintf("invalid option %q", k)) | ||
} | ||
} | ||
|
||
// Topology overwrites Storage Class Params if specified |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not allow both zone/zones and AllowedTopologies to be specified at the same time. But I'm not sure if we can actually detect that because csi provisioner will always fill in the requisite topologies, even when AllowedTopologies is not specified. cc @verult
So I think the next best option would be to use the zone/zones parameters because that's what the user specified, so we shouldn't just ignore it.
|
||
if len(configuredZone) == 0 { | ||
// Default to zone that the driver is in | ||
configuredZone = gceCS.CloudProvider.GetZone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what @verult means is that once you enable the topology capability, then you should always get an accessibility requirement. So if you run into a case where there is no zone specified, then you should error.
} | ||
return zone, nil | ||
} else if len(reqTop) != 0 { | ||
r := rand.Intn(len(reqTop)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a backwards compatibility standpoint for in-tree migration, do we need to support this though? cc @saad-ali
|
||
// Pick the preferred topology in order | ||
if len(prefTop) != 0 { | ||
if prefTop[0].GetSegments() == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could have a common function to pick extract a zone from a topology, to simplify the preference vs required logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
picking a zone from topology is different between preference vs required so it doesn't make sense to me to abstract this out, I may be misunderstanding your suggestion
e814f05
to
583c125
Compare
@msau42 hoping to get this in by today if it looks good to you, I think all open questions have been resolved. Let me squash commits before your lgtm though, there are currently too many |
lgtm, do you want to squash? |
Refactor common utils and added metadata service. Refactor of e2e testing framework to accomidate parallel tests.
583c125
to
f11eada
Compare
@msau42 rebased, thanks! |
/lgtm |
The driver will return the node toplogy on
GetNodeInfo
and supports simpleRequisite
topology specification on the create volume call.Also the testing infrastructure was changed to support topology.
Fixes: #13
/assign @msau42
/assign @verult