Set MaxVolumesPerNode on NodeGetInfo call based on Node Type #19

Closed
davidz627 opened this issue Jun 21, 2018 · 5 comments
Labels: good first issue, help wanted, kind/feature

Comments

davidz627 (Contributor) commented Jun 21, 2018

Currently NodeGetInfoResponse returns the default of 0 for MaxVolumesPerNode so the CO will decide how many volumes can be published on a node.

For GCE we need to return a different number based on node type, as the maximum number of attachable volumes depends on the number of vCPUs the instance has.

For the actual limits, see the "persistent disk limits" section of https://cloud.google.com/compute/docs/disks/.

You should be able to GET the instance from the cloud and pull the number of vCPUs from that.
Bonus: we seem to need information from the node object a lot; caching the relevant information somewhere, maybe in the GCENodeServer object, would be nice.
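
Roughly, the shape of the change could look something like the sketch below (illustrative only, assuming the CSI v1 Go bindings; the struct fields are placeholders rather than existing driver code):

```go
// Sketch only: assumes the CSI v1 Go bindings; the struct fields are
// illustrative placeholders, not existing driver code.
package node

import (
	"context"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
)

// GCENodeServer caches per-node information (per the "bonus" note above) so
// repeated calls do not have to re-query the cloud.
type GCENodeServer struct {
	nodeID            string
	maxVolumesPerNode int64 // derived once from the instance's machine type
}

// NodeGetInfo reports a machine-type-dependent limit instead of the default 0
// (which would leave the decision entirely to the CO).
func (ns *GCENodeServer) NodeGetInfo(ctx context.Context, req *csi.NodeGetInfoRequest) (*csi.NodeGetInfoResponse, error) {
	return &csi.NodeGetInfoResponse{
		NodeId:            ns.nodeID,
		MaxVolumesPerNode: ns.maxVolumesPerNode,
	}, nil
}
```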

davidz627 added the help wanted, good first issue, and kind/enhancement labels on Jun 21, 2018
davidz627 added this to the Beta milestone on Jun 21, 2018
davidz627 added the kind/feature label and removed the kind/enhancement label on Jun 21, 2018
davidz627 modified the milestones: Beta, GA on Sep 26, 2018
Mockery-Li (Contributor) commented

Hi, I'd like to take this as my first issue, and I want to make sure I haven't misunderstood the problem.
Is it about modifying GetMaxVolumesPerNode() in csi.pb.go (both V0 and V1) to return the number of vCPUs as the default value when it knows it's running in a GCE environment?

davidz627 (Contributor, Author) commented

Hi @Mockery-Li! Thank you for your interest :)

To implement MaxVolumesPerNode you will not have to modify csi.pb.go. That file is the CSI specification protobuf and is generated from the CSI specification.

What this issue entails is modifying NodeGetInfo in pkg/gce-pd-csi-driver/node.go to return a value for MaxVolumesPerNode.

The max number of volumes can be determined from the machine type: see the Persistent Disk Limits section. You should be able to get the machine type through the metadata service in pkg/gce-cloud-provider/metadata/metadata.go (I am not 100% sure about this part).

This should just be a matter of piping the metadata service through to the node service, making the machine-type check in the NodeGetInfo function, and casing on the machine type.
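
Very roughly, the wiring could look like this (a sketch with hypothetical names; the real metadata service may expose this differently):

```go
// Continuing the sketch above: wiring a metadata service into the node server.
// The MachineTypeGetter interface and the helper names are hypothetical; the
// real metadata package in pkg/gce-cloud-provider/metadata may differ.
package node

// MachineTypeGetter is a hypothetical view of the metadata service; on GCE the
// machine type is available from the instance metadata server.
type MachineTypeGetter interface {
	GetMachineType() (string, error)
}

// NewGCENodeServer resolves the machine type once at startup and caches the
// resulting limit so NodeGetInfo can simply return it.
func NewGCENodeServer(nodeID string, meta MachineTypeGetter) (*GCENodeServer, error) {
	machineType, err := meta.GetMachineType()
	if err != nil {
		return nil, err
	}
	return &GCENodeServer{
		nodeID:            nodeID,
		maxVolumesPerNode: limitForMachineType(machineType),
	}, nil
}

// limitForMachineType would map the machine type to the documented PD attach
// limit (see the Persistent Disk Limits section); stubbed here to keep the
// snippet short.
func limitForMachineType(machineType string) int64 {
	_ = machineType
	return 128 // placeholder
}
```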

Hope that is enough information to get you started. Feel free to reach out to me on the Kubernetes Slack (I am davidz627 there as well) or post questions on this issue if you want guidance. Looking forward to your pull request!

msau42 (Contributor) commented Mar 18, 2019

You can probably also copy a lot of the functionality from here

davidz627 (Contributor, Author) commented Mar 18, 2019

That's true, but I think there are a few differences between the current state (and the driver) and that code.

  1. I think the metadata server might be an "easier" source of the machine type information, as we won't need GCP credentials on the node deployment: see machine-type (sketched below).
  2. I think GCE has changed their supported attach limits for disks since that code was written (?). From what I see in the documentation:
    "For most instances with custom machine types or predefined machine types, you can attach up to 128 persistent disks.
    Instances with shared-core machine types are limited to a maximum of 16 persistent disks."
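
For illustration, a minimal standalone sketch of both points together: read the machine type from the metadata server and map it to those documented limits (the shared-core prefix list here is an assumption, and the snippet is not the driver's actual code):

```go
// Standalone sketch combining both points: read the machine type from the GCE
// metadata server (no GCP credentials needed on the node) and apply the
// documented limits. The shared-core prefix list is an assumption.
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// fetchMachineType queries the instance metadata server. The response looks
// like "projects/<num>/machineTypes/n1-standard-4", so keep only the last part.
func fetchMachineType() (string, error) {
	req, err := http.NewRequest("GET",
		"http://metadata.google.internal/computeMetadata/v1/instance/machine-type", nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Metadata-Flavor", "Google")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	parts := strings.Split(strings.TrimSpace(string(body)), "/")
	return parts[len(parts)-1], nil
}

// attachLimit applies the limits quoted above: up to 128 persistent disks for
// most machine types, 16 for shared-core machine types.
func attachLimit(machineType string) int64 {
	sharedCorePrefixes := []string{"f1-", "g1-"} // assumption: shared-core families
	for _, p := range sharedCorePrefixes {
		if strings.HasPrefix(machineType, p) {
			return 16
		}
	}
	return 128
}

func main() {
	mt, err := fetchMachineType()
	if err != nil {
		fmt.Println("metadata server unavailable (not running on GCE?):", err)
		return
	}
	fmt.Printf("machine type %s -> max %d persistent disks\n", mt, attachLimit(mt))
}
```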

Mockery-Li (Contributor) commented

Thank you. I'm trying both methods.
To adapt the function @msau42 mentioned, we may need to get the CloudProvider inside the NodeService? It seems a little strange to access it directly with cloud := ns.Driver.cs.CloudProvider, but that is the only way I've come up with.
As for using machine-type as @davidz627 suggested, I have not figured out how to use the metadata (projectID/zone/name). I've sent the detailed question through Slack; I need some further help, thanks.

jsafrane pushed a commit to jsafrane/gcp-compute-persistent-disk-csi-driver that referenced this issue Dec 14, 2021
…ncy-openshift-4.10-ose-gcp-pd-csi-driver

Updating ose-gcp-pd-csi-driver images to be consistent with ART