Skip to content

Limit ListVolumesResponse to be within CSI response limitations #998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pwschuurman opened this issue May 20, 2022 · 5 comments · Fixed by #999
Closed

Limit ListVolumesResponse to be within CSI response limitations #998

pwschuurman opened this issue May 20, 2022 · 5 comments · Fixed by #999

Comments

@pwschuurman
Copy link
Contributor

gRPC sets an implicit response size limit of 4MB: https://github.com/grpc/grpc/blob/master/include/grpc/impl/codegen/grpc_types.h#L503

This can affect callers of the PD CSI Driver (eg: csi-attacher). Ideally the client would set an appropriate gRPC limit. In the absence of this limit however, the GCP CSI driver should respect 4MB implicitly.

Ideally the PD CSI driver would limits responses to an actual size limit (eg: trim Entry in ListVolumesResponse to under 4MB), however an alternate solution could be to set a default limit of 1000 Entry in the array, and use the built-in pagination in the CSI spec to return additional responses.

@mattcary
Copy link
Contributor

Ah, interesting. We used to just pass through the oneplatform pagination, but stopped doing that when we had to filter things in the driver. Adding pagination is a little complicated because different instances of the driver my get called between pages. The spec says that paging must not be assumed to be consistent.

@pwschuurman
Copy link
Contributor Author

Adding pagination is a little complicated because different instances of the driver my get called between pages. The spec says that paging must not be assumed to be consistent.

Fair, worst case this would just cause us to not list some volumes, which would increase the forceSync churn (and therefore ControllerPublishVolume) for VA objects: https://github.com/kubernetes-csi/external-attacher/blob/master/pkg/controller/csi_handler.go#L188

@mattcary
Copy link
Contributor

@pwschuurman Yeah I agree

@leiyiz
Copy link
Contributor

leiyiz commented May 20, 2022

How is "which driver's ListVolume get called" decided? is it based on a leader election system or just random?

@mattcary
Copy link
Contributor

mattcary commented Jun 6, 2022

How is "which driver's ListVolume get called" decided? is it based on a leader election system or just random?

I'm not sure what I said was correct. A particular attacher, say, is always going to be calling the same instance of ListVolume since they're in the same pod. That driver may go away and a different leader get elected as part of the attacher-controller doing its thing, but I think in that case the listing will have restarted from the beginning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants