CMEK Disk Creation Sometimes Fails with "disk already exists with same name" #558

Closed
saad-ali opened this issue Jul 17, 2020 · 0 comments · Fixed by #563

Problem

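Provisioning of GCE PDs with CMEK enabled sometimes fails with a "disk already exists with same name" error. In the events below, the first CreateVolume attempt times out (DeadlineExceeded). The retry then finds the disk that was already created and rejects it as incompatible, because the actual disk KMS key name ends in /cryptoKeyVersions/8 while the StorageClass parameter names only the key: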

  Type     Reason                Age                From                                                                                                 Message
  ----     ------                ----               ----                                                                                                 -------
  Warning  ProvisioningFailed    14s (x2 over 15s)  pd.csi.storage.gke.io_gke-cluster-1-default-pool-4cede575-43h6_de91f0bc-68b9-451d-826a-43e526adc6a1  failed to provision volume with StorageClass "csi-gce-pd-cmek": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Normal   ExternalProvisioning  8s (x3 over 16s)   persistentvolume-controller                                                                          waiting for a volume to be created, either by external provisioner "pd.csi.storage.gke.io" or manually created by system administrator
  Normal   Provisioning          4s (x5 over 16s)   pd.csi.storage.gke.io_gke-cluster-1-default-pool-4cede575-43h6_de91f0bc-68b9-451d-826a-43e526adc6a1  External provisioner is provisioning volume for claim "default/pvc-demo"
  Warning  ProvisioningFailed    4s (x3 over 14s)   pd.csi.storage.gke.io_gke-cluster-1-default-pool-4cede575-43h6_de91f0bc-68b9-451d-826a-43e526adc6a1  failed to provision volume with StorageClass "csi-gce-pd-cmek": rpc error: code = AlreadyExists desc = CreateVolume disk already exists with same name and is incompatible: actual disk KMS key name projects/test-project/locations/us-central1/keyRings/TestKeyRing/cryptoKeys/test-key/cryptoKeyVersions/8 did not match expected param projects/test-project/locations/us-central1/keyRings/TestKeyRing/cryptoKeys/test-key

Repro Steps

  1. Deploy the GCE PD CSI Driver with the csi-provisioner sidecar parameter --timeout=1s
    • This makes it easier to simulate a timeout.
  2. Create a StorageClass enabling CMEK encryption:
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: csi-gce-pd-cmek
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: pd.csi.storage.gke.io
    parameters:
      type: pd-standard
      disk-encryption-kms-key: projects/test-project/locations/us-central1/keyRings/TestKeyRing/cryptoKeys/test-key
    
  3. Provision a PVC using the StorageClass above.
    • It may take multiple tries to hit the timeout (but I was able to hit it on my first try once I reduced the timeout to 1s).

Proposed Fixes

There are two options, one a mitigation and one an actual fix:

  1. Increase the timeout for the external-provisioner sidecar. This won't fix the issue, but it will reduce the likelihood of hitting it.
  2. Make sure the GCE PD CSI Driver CreateVolume call does not fail for CMEK when the operation is retried.
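
One possible shape for the second option, as a minimal sketch only: compare the KMS key names after dropping any trailing /cryptoKeyVersions/<n> segment, so a retried CreateVolume treats the existing disk as compatible. The helper names kmsKeyEqual and stripKeyVersion are hypothetical and are not taken from the driver source; the key strings are the ones from the error message above.

```go
package main

import (
	"fmt"
	"strings"
)

// stripKeyVersion drops a trailing "/cryptoKeyVersions/<n>" segment, if any.
// Hypothetical helper for illustration only.
func stripKeyVersion(name string) string {
	const marker = "/cryptoKeyVersions/"
	if i := strings.LastIndex(name, marker); i >= 0 {
		return name[:i]
	}
	return name
}

// kmsKeyEqual reports whether two KMS key names refer to the same key,
// ignoring the key version that appears on the existing disk.
// Hypothetical helper; the real driver may handle this differently.
func kmsKeyEqual(actual, expected string) bool {
	return stripKeyVersion(actual) == stripKeyVersion(expected)
}

func main() {
	expected := "projects/test-project/locations/us-central1/keyRings/TestKeyRing/cryptoKeys/test-key"
	actual := expected + "/cryptoKeyVersions/8" // as reported for the existing disk

	fmt.Println(actual == expected)            // false: the strict comparison that fails on retry
	fmt.Println(kmsKeyEqual(actual, expected)) // true: version suffix ignored
}
```

A comparison like this still rejects a disk encrypted with a different key, so the idempotency check keeps its purpose while tolerating the version suffix.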

/assign
