Behaviour of controller.CreateVolume when a Snapshot is Not Ready? #694

Closed
arianitu opened this issue Jan 14, 2021 · 6 comments

arianitu commented Jan 14, 2021

I'm having a lot of trouble discovering the behaviour of CSI when it comes to Snapshots that have readyToUse set to false.

Specifically, I have two basic questions:

  1. In the case where a PVC is being created from a VolumeSnapshot and that VolumeSnapshot is currently not readyToUse, what is the behaviour of the CSI driver? From what I have looked at, it seems like https://github.com/kubernetes-csi/external-provisioner#csi-error-and-timeout-handling defines the timeout behaviour: by default it retries starting at --retry-interval-start, doubling every time until it hits --retry-interval-max.
  2. Since the CSI sidecar will time out after 5 minutes, how does CSI handle the case where a VolumeSnapshot is very large, say terabytes, where readyToUse may take an hour?

Is the expected behaviour that the user would manually check the VolumeSnapshot and wait until its readyToUse flag is set to true? So, for example, the user would wait 20 minutes, then apply the PVC pointing to the VolumeSnapshot to the cluster?

  3. Is there any way to let controller.CreateVolume wait longer for big snapshots, so I don't have to wait manually? I understand you can change the configuration values of the sidecar container, but that applies globally to all CSI drivers. Is there perhaps a way to get waiting behaviour with exponential backoff that doesn't time out so quickly (5 minutes)? (A sketch of that backoff schedule follows below.)
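To make question 3 concrete, here is a minimal sketch of the doubling schedule described in that README section. The 1s start and 5m cap are values I'm assuming for illustration (meant to mirror the --retry-interval-start and --retry-interval-max defaults), not something read out of a running sidecar:

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        // Illustrative values only: assumed to match the sidecar's
        // --retry-interval-start and --retry-interval-max defaults.
        retryIntervalStart := 1 * time.Second
        retryIntervalMax := 5 * time.Minute

        // The README describes the schedule as: start at retry-interval-start
        // and double after every failed attempt, capped at retry-interval-max.
        interval := retryIntervalStart
        for attempt := 1; attempt <= 12; attempt++ {
            fmt.Printf("attempt %2d: wait %v before retrying the CSI call\n", attempt, interval)
            interval *= 2
            if interval > retryIntervalMax {
                interval = retryIntervalMax
            }
        }
    }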
@arianitu (Author)

Diving into the code here.

We see snapshots are assigned like so:

	snapshotID := ""
	content := req.GetVolumeContentSource()
	if content != nil {
		if content.GetSnapshot() != nil {
			// TODO(#161): Add support for Volume Source (cloning) introduced in CSI v1.0.0
			snapshotID = content.GetSnapshot().GetSnapshotId()

			// Verify that snapshot exists
			sl, err := gceCS.getSnapshotByID(ctx, snapshotID)
			if err != nil {
				return nil, status.Errorf(codes.Internal, "CreateVolume failed to get snapshot %s: %v", snapshotID, err)
			} else if len(sl.Entries) == 0 {
				return nil, status.Errorf(codes.NotFound, "CreateVolume source snapshot %s does not exist", snapshotID)
			}
		}
	}

This snapshotID eventually gets passed to:

disk, err = createSingleZoneDisk(ctx, gceCS.CloudProvider, name, zones, params, capacityRange, capBytes, snapshotID, multiWriter)

which calls:

	err := cloudProvider.InsertDisk(ctx, meta.ZonalKey(name, diskZone), params, capBytes, capacityRange, nil, snapshotID, multiWriter)

and eventually calls:

func (cloud *CloudProvider) InsertDisk(ctx context.Context, volKey *meta.Key, params common.DiskParameters, capBytes int64, capacityRange *csi.CapacityRange, replicaZones []string, snapshotID string, multiWriter bool) error {
...
	diskToCreate := &computev1.Disk{
		Name:        volKey.Name,
		SizeGb:      common.BytesToGbRoundUp(capBytes),
		Description: description,
		Type:        cloud.GetDiskTypeURI(volKey, params.DiskType),
	}

	if snapshotID != "" {
		diskToCreate.SourceSnapshot = snapshotID
	}

	if params.DiskEncryptionKMSKey != "" {
		diskToCreate.DiskEncryptionKey = &computev1.CustomerEncryptionKey{
			KmsKeyName: params.DiskEncryptionKMSKey,
		}
	}

	if gceAPIVersion == GCEAPIVersionBeta {
		var insertOp *computebeta.Operation
		betaDiskToCreate := convertV1DiskToBetaDisk(diskToCreate)
		betaDiskToCreate.MultiWriter = multiWriter
		insertOp, err = cloud.betaService.Disks.Insert(cloud.project, volKey.Zone, betaDiskToCreate).Context(ctx).Do()
		if insertOp != nil {
			opName = insertOp.Name
		}
	} else {
		var insertOp *computev1.Operation
		insertOp, err = cloud.service.Disks.Insert(cloud.project, volKey.Zone, diskToCreate).Context(ctx).Do()
		if insertOp != nil {
			opName = insertOp.Name
		}
	}
        ...
        err = cloud.waitForZonalOp(ctx, opName, volKey.Zone)

So at the end of the day, the snapshotID gets passed to a Google Cloud Disk create call, setting SourceSnapshot. cloud.waitForZonalOp waits for the operation to complete (up to a maximum of 5 minutes). What happens when you call a Google Cloud Disk create with a source snapshot that is not ready: does it fail, or does it succeed and eventually create the volume once the snapshot is ready?
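I don't know offhand what Disks.Insert does with a snapshot that is still uploading, but as a manual probe (not something the driver does) you can read the snapshot status straight from the compute v1 API. The project and snapshot names below are placeholders, and since the driver's snapshotID is a full resource path, the bare name would have to be parsed out of it first:

    package main

    import (
        "context"
        "fmt"
        "log"

        compute "google.golang.org/api/compute/v1"
    )

    func main() {
        ctx := context.Background()

        // Uses Application Default Credentials.
        svc, err := compute.NewService(ctx)
        if err != nil {
            log.Fatal(err)
        }

        // Placeholder project/snapshot names, for illustration only.
        snap, err := svc.Snapshots.Get("my-project", "my-snapshot").Context(ctx).Do()
        if err != nil {
            log.Fatal(err)
        }

        // Snapshot.Status is one of CREATING, UPLOADING, READY, FAILED, DELETING.
        if snap.Status != "READY" {
            fmt.Printf("snapshot %s is still %s\n", snap.Name, snap.Status)
            return
        }
        fmt.Printf("snapshot %s is READY (%d GB)\n", snap.Name, snap.DiskSizeGb)
    }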

Understanding this would be nice, but the core question I am trying to ask is basically:

  • Can I create a VolumeSnapshot (of a big PVC, multiple TB) and immediately create a PVC that points to that VolumeSnapshot while readyToUse is still false?

@arianitu (Author)

Looking at #482, #541 and #527, the general behaviour seems to be that the CSI sidecar, as I discussed in 1), has a default timeout, and controller.CreateVolume errors out when a snapshot is not ready to be used.

I am unsure on which line it returns the error: is it in cloud.waitForZonalOp, or is it afterwards, when the disk is returned from the Google API and the code does a diskIsReady check?

We want to set some timeouts for bigger PVCs; is the best place to do that in the sidecar, leaving the timeouts inside controller.CreateVolume alone? @saikat-royc you seem like the best person to ask; comments would really be appreciated.


annapendleton commented Jan 15, 2021

The CSI external provisioner has a check for the snapshot content being ready before calling the PD-specific CSI driver, so the call does not reach the pd controller.CreateVolume logic linked above.

For the code path, see controller.go lines:

  • the "getSnapshotSource" call at line 1053 returns an error that the snapshot is not ready, which is plumbed up through line 912 to the "prepareProvision" call at line 600
  • "prepareProvision" returns a "controller.ProvisioningNoChange" error to the "Provision" call at line 720

From "Provision" call, state "controller.ProvisioningNoChange" is returned to a separate work item queue handler controller.go. Walking through the handler code, this results in retries as its a non-finalizing error and gets re-queued:

  • "ProvisionClaimHandler" call 404 passes the "controller.ProvisioningNoChange" upward to "syncClaim" call 1090 where it returns an error - resulting in the claim being requeued in "syncClaimHandler" call 1042
  • "processNextClaimHandler" call 959 is where the retry magic happens and those values are considered.

The number of retries is defined by the external provisioner's "DefaultFailedProvisionThreshold". The current default in the code is 15; setting this to 0 makes reconciliation indefinite.

The external provisioner skips this default setting by instantiating the controller.Provisioner separately from that construction call, implicitly setting those values to 0 by not setting them there.
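A minimal stand-in for the gate described above, just to show its shape (this paraphrases the behaviour; it is not the external-provisioner's actual code, and the struct here is a local stand-in for the real VolumeSnapshot status type):

    package main

    import "fmt"

    // volumeSnapshotStatus is a local stand-in for the snapshot.storage.k8s.io
    // status; only the field relevant to this discussion is modelled.
    type volumeSnapshotStatus struct {
        ReadyToUse *bool
    }

    // checkSnapshotReady paraphrases getSnapshotSource's gate: provisioning is
    // refused until status.readyToUse is true, and because the error is
    // non-final the claim is simply requeued and retried later.
    func checkSnapshotReady(name string, status *volumeSnapshotStatus) error {
        if status == nil || status.ReadyToUse == nil || !*status.ReadyToUse {
            return fmt.Errorf("snapshot %s is not ready to use yet; claim will be requeued", name)
        }
        return nil
    }

    func main() {
        ready := false
        fmt.Println(checkSnapshotReady("my-snap", &volumeSnapshotStatus{ReadyToUse: &ready}))
    }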

@annapendleton

TL;DR: you should not have to do anything extra to get your desired behavior; the CSI driver should cover your use case.

msau42 (Contributor) commented Jan 15, 2021

Here's the line in csi-provisioner that sets the threshold to 0: https://github.com/kubernetes-csi/external-provisioner/blob/41af8a3920bf305cfa2da19f87d62d954c37fc98/cmd/csi-provisioner/csi-provisioner.go#L320

So it retries indefinitely.
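To spell out what that 0 means in practice, here is a tiny paraphrase of the retry-cap decision (the function and its name are my own illustration; the real logic lives in the sig-storage-lib-external-provisioner work-queue handler referenced above):

    package main

    import "fmt"

    // shouldRetry paraphrases the drop-or-requeue decision: with a positive
    // threshold the claim is dropped after that many failed provision attempts,
    // with a threshold of 0 it is requeued indefinitely.
    func shouldRetry(failures, failedProvisionThreshold int) bool {
        if failedProvisionThreshold == 0 {
            return true // 0 disables the cap, as csi-provisioner configures it
        }
        return failures < failedProvisionThreshold
    }

    func main() {
        fmt.Println(shouldRetry(100, 15)) // false: the library default of 15 would give up
        fmt.Println(shouldRetry(100, 0))  // true: csi-provisioner's setting retries forever
    }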

@arianitu (Author)

@annapendleton @msau42 Thank you so much for the detailed response, that's great news!
