Errors creating PersistentVolumeClaim over 63 characters #1160

Closed
ssapra opened this issue Feb 28, 2023 · 12 comments · Fixed by #1173

@ssapra

ssapra commented Feb 28, 2023

Hello,

I noticed the label change in 1.9.0 has started causing failures when I request a PVC with a long name.

For example, I have a storage class like this:

➜ kubectl get storageclass
NAME                                      PROVISIONER            RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
standard                                  kubernetes.io/gce-pd   Delete          Immediate              false                  23h

I have installed the 1.9.0 CSI driver

➜ kubectl get daemonset -n gce-pd-csi-driver
NAME                  DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR              AGE
csi-gce-pd-node       4         4         4       4            4           kubernetes.io/os=linux     23h
csi-gce-pd-node-win   0         0         0       0            0           kubernetes.io/os=windows   23h
➜ kubectl get daemonset csi-gce-pd-node -n gce-pd-csi-driver -o jsonpath='{.spec.template.spec.containers[1].image}'
k8s.gcr.io/cloud-provider-gcp/gcp-compute-persistent-disk-csi-driver:v1.9.0

When I create a PersistentVolumeClaim with a long name:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: abcdefghijklmnop-this-is-my-super-long-pvc-name-with-extra-characters
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1G
  storageClassName: standard
  volumeMode: Filesystem
EOF

I see the PVC is stuck in Pending and there is an error about label length:

➜ kubectl describe pvc abcdefghijklmnop-this-is-my-super-long-pvc-name-with-extra-characters

External provisioner is provisioning volume for claim "default/abcdefghijklmnop-this-is-my-super-long-pvc-name-with-extra-characters"
Warning ProvisioningFailed 3s (x3 over 6s) pd.csi.storage.gke.io_18afad10-3745-4049-9341-5535b464a7ed_57c19df9-4cb0-4b3b-9728-142c9fd9764f failed to provision volume with StorageClass "standard": rpc error: code = InvalidArgument desc = CreateVolume failed to create single zonal disk pvc-b5617215-a4a3-4d9a-805c-05de0d434617: failed to insert zonal disk: unknown Insert disk error: googleapi: Error 400: Invalid value for field 'resource.labels': ''. Label value 'abcdefghijklmnop-this-is-my-super-long-pvc-name-with-extra-characters' violates format constraints. The value can only contain lowercase letters, numeric characters, underscores and dashes. The value can be at most 63 characters long. International characters are allowed., invalid

I see in the change from #1090 that the PVC, PV, and namespace name that were used in the disk description are now being added as labels to the disk.

But this poses a problem because Kubernetes resource names can be up to 253 characters, while GCP label values can be at most 63 characters.
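The constraint quoted in the error message can be checked before the disk insert is attempted. The following is a minimal Go sketch, assuming the rules are exactly those in the error text above. `validLabelValue` is a hypothetical helper, not part of the driver, and it is deliberately stricter than GCP because it accepts ASCII only and ignores the international characters GCP also allows.

```go
package main

import "fmt"

// maxLabelValueLen is the limit quoted in the GCP error message.
const maxLabelValueLen = 63

// validLabelValue is a hypothetical helper (not part of the driver).
// Assumption: it accepts only ASCII lowercase letters, digits, underscores
// and dashes, which is stricter than what GCP actually allows.
func validLabelValue(v string) bool {
	if len(v) > maxLabelValueLen {
		return false
	}
	for _, r := range v {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == '_', r == '-':
			// allowed character
		default:
			return false
		}
	}
	return true
}

func main() {
	name := "abcdefghijklmnop-this-is-my-super-long-pvc-name-with-extra-characters"
	fmt.Println(len(name), validLabelValue(name))
}
```

A check like this would reject the PVC name from this report on length alone, which matches the failure seen in the events.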

For now we will pin our CSI driver to 1.8.2 to unblock ourselves but we would appreciate a fix or workaround so we can continue to use the stable release.

@ssapra ssapra changed the title from "Errors creating Persistent Volume over 63 characters" to "Errors creating PersistentVolumeClaim over 63 characters" Feb 28, 2023
@itspngu

itspngu commented Mar 15, 2023

I can confirm this issue. For us it surfaces when using the prometheus operator:

E0315 11:48:01.784429      74 utils.go:250] CreateVolume failed to create single zonal disk pvc-69fbf4ae-ae52-4e0b-976f-27ec984949ba: failed to insert zonal disk: unknown Insert disk error: googleapi: Error 400: Invalid value for field 'resource.labels': ''. Label value 'prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0' violates format constraints. The value can only contain lowercase letters, numeric characters, underscores and dashes. The value can be at most 63 characters long. International characters are allowed., invalid
E0315 11:48:01.784480      74 utils.go:74] /csi.v1.Controller/CreateVolume returned with error: rpc error: code = InvalidArgument desc = CreateVolume failed to create single zonal disk pvc-69fbf4ae-ae52-4e0b-976f-27ec984949ba: failed to insert zonal disk: unknown Insert disk error: googleapi: Error 400: Invalid value for field 'resource.labels': ''. Label value 'prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0' violates format constraints. The value can only contain lowercase letters, numeric characters, underscores and dashes. The value can be at most 63 characters long. International characters are allowed., invalid

@sunnylovestiramisu
Contributor

sunnylovestiramisu commented Mar 17, 2023

In our code:

case ParameterKeyPVCName:
    p.Tags[tagKeyCreatedForClaimName] = v
    p.Labels[labelKeyCreatedForClaimName] = v

And for tags there is also a restriction of value length 63: https://cloud.google.com/resource-manager/docs/tags/tags-creating-and-managing

This should get fixed too

@sunnylovestiramisu
Contributor

The tag change was already in 1.8.x. If you have a long PVC name, shouldn't the tag creation already fail there?

@ssapra
Author

ssapra commented Mar 17, 2023

There are no issues with tags because they are not applied as tags in GCP; instead they are encoded and set as the disk description. Also, my GCP project has not enabled the Tags feature.

For example, one of my disks in GCP shows this for description:

{
  "kubernetes.io/created-for/pv/name": "pvc-0c30ce13-d330-4fab-b930-7409cab4090a",
  "kubernetes.io/created-for/pvc/name": "mysql-data-restore-mysql-sample-8027-to-8028-2",
  "kubernetes.io/created-for/pvc/namespace": "instance-upgrade-tests",
  "storage.gke.io/created-by": "pd.csi.storage.gke.io"
}
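The description above looks like a JSON-encoded string map. The following is a minimal Go sketch of that encoding, assuming the driver serializes its "tags" roughly this way; `describeDisk` is a hypothetical name and the real driver code may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describeDisk is a hypothetical helper: it renders the "tags" map as the
// JSON string that ends up in the disk's description field.
func describeDisk(tags map[string]string) (string, error) {
	b, err := json.Marshal(tags) // encoding/json emits map keys in sorted order
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	desc, err := describeDisk(map[string]string{
		"kubernetes.io/created-for/pvc/name":      "mysql-data-restore-mysql-sample-8027-to-8028-2",
		"kubernetes.io/created-for/pvc/namespace": "instance-upgrade-tests",
		"storage.gke.io/created-by":               "pd.csi.storage.gke.io",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(desc)
}
```

Because the values live inside one description string rather than in individual labels, the 63-character label limit does not apply to them.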

@sunnylovestiramisu
Contributor

sunnylovestiramisu commented Mar 17, 2023

These strings are all less than 63 characters?

{
  "kubernetes.io/created-for/pv/name": "pvc-0c30ce13-d330-4fab-b930-7409cab4090a",
  "kubernetes.io/created-for/pvc/name": "mysql-data-restore-mysql-sample-8027-to-8028-2",
  "kubernetes.io/created-for/pvc/namespace": "instance-upgrade-tests",
  "storage.gke.io/created-by": "pd.csi.storage.gke.io"
}

@itspngu

itspngu commented Mar 17, 2023

That's an example from a working disk, meant to illustrate what the driver was already doing pre-1.9 - I think.

The problem described in the original issue prevents creation of PDs from PVCs with long (>63 chars) names because they violate the label length restrictions on the GCP side. The PD is never created and consuming pods stay pending.

For reference, this is a PD created by driver version 1.9.1. The PVC name isn't longer than 63 characters, so everything works.

$ gcloud compute disks describe pvc-0beb8bd0-7239-40de-8d7e-c4d0c8c42693

creationTimestamp: '2023-03-14T14:53:51.529-07:00'
description: '{"kubernetes.io/created-for/pv/name":"pvc-0beb8bd0-7239-40de-8d7e-c4d0c8c42693","kubernetes.io/created-for/pvc/name":"persistence-broker-server-0","kubernetes.io/created-for/pvc/namespace":"broker","storage.gke.io/created-by":"pd.csi.storage.gke.io"}'
id: '<redacted>'
kind: compute#disk
labelFingerprint: FpBLrJHftQg=
labels:
  cluster-environment: some-internal-thing-we-do
  cluster-name: some-internal-thing-we-do
  cluster-project: some-internal-thing-we-do
  kubernetes_io_created-for_pv_name: pvc-0beb8bd0-7239-40de-8d7e-c4d0c8c42693
  kubernetes_io_created-for_pvc_name: persistence-broker-server-0
  kubernetes_io_created-for_pvc_namespace: broker
lastAttachTimestamp: '2023-03-16T13:31:28.445-07:00'
lastDetachTimestamp: '2023-03-16T13:31:23.313-07:00'
name: pvc-0beb8bd0-7239-40de-8d7e-c4d0c8c42693
physicalBlockSizeBytes: '4096'
selfLink: https://www.googleapis.com/compute/v1/projects/<redacted>/zones/europe-west3-a/disks/pvc-0beb8bd0-7239-40de-8d7e-c4d0c8c42693
sizeGb: '10'
status: READY
type: https://www.googleapis.com/compute/v1/projects/<redacted>/zones/europe-west3-a/diskTypes/pd-ssd
users:
- https://www.googleapis.com/compute/v1/projects/<redacted>/zones/europe-west3-a/instances/some-instance
zone: https://www.googleapis.com/compute/v1/projects/<redacted>/zones/europe-west3-a

When a PVC has a name longer than 63 characters, the driver throws an error, and no disk is created.

Specifically, the label kubernetes_io_created-for_pvc_name is problematic, because Kubernetes allows names of up to 253 characters, and some configurations make use of that "space", for better or worse. One example is the wonderful (I'm being sarcastic :D) naming scheme kube-prometheus-stack uses for some of the PVCs it creates, such as prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0.

@itspngu

itspngu commented Mar 17, 2023

If I may offer my take on a solution to this problem which doesn't involve having (GKE, or roll-your-own) downstream customers rewrite their configuration to stay within the 63 character limit, it would be great if this new auto-labelling mechanism had a feature toggle so it could be opted out of. Silently dropping the labels if they are too long, or truncating them to 63 characters, might surprise people who decide to use them in automation or somesuch.
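To make the options above concrete, here is a minimal Go sketch that combines an opt-out toggle with dropping over-long labels, while reporting the dropped keys so the caller can log them instead of dropping them silently. Everything here (`labelOptions`, `buildLabels`, the `Enabled` field) is hypothetical and only illustrates the idea; the driver exposes no such option in this thread.

```go
package main

import (
	"fmt"
	"sort"
)

// maxLabelValueLen is the limit quoted in the GCP error message.
const maxLabelValueLen = 63

// labelOptions is hypothetical: an opt-out toggle for the automatic labels.
type labelOptions struct {
	Enabled bool
}

// buildLabels returns the labels that are safe to attach, plus the sorted
// keys of any candidates skipped because their values exceed the limit.
func buildLabels(opts labelOptions, candidates map[string]string) (map[string]string, []string) {
	labels := map[string]string{}
	if !opts.Enabled {
		return labels, nil
	}
	var skipped []string
	for k, v := range candidates {
		if len(v) > maxLabelValueLen {
			skipped = append(skipped, k)
			continue
		}
		labels[k] = v
	}
	sort.Strings(skipped)
	return labels, skipped
}

func main() {
	labels, skipped := buildLabels(labelOptions{Enabled: true}, map[string]string{
		"kubernetes_io_created-for_pvc_namespace": "default",
		"kubernetes_io_created-for_pvc_name":      "abcdefghijklmnop-this-is-my-super-long-pvc-name-with-extra-characters",
	})
	fmt.Println(labels, skipped)
}
```

Returning the skipped keys keeps the behavior visible, which addresses the concern that silently dropped labels might surprise people who rely on them.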

Either way, leaving this as is will lead to a lot of time spent figuring out why pods stay pending, because the first error message only appears in the pod status 5 minutes after it being scheduled and is a rather vague hint about a timeout while waiting for a volume to be provisioned. To find the actual root cause, you have to check the PD driver logs.

@ssapra
Author

ssapra commented Mar 17, 2023

Yes, my example above was to show how the tags were used to build the description and not actually used as tags on the GCP resource.

Before 1.9.0, I could create a PVC with a name longer than 63 characters.

$ gcloud compute disks describe pvc-eb7f829e-aa3d-4a27-aa71-4ddf692a3acc --format='(name,description)'

description: '{"kubernetes.io/created-for/pv/name":"pvc-eb7f829e-aa3d-4a27-aa71-4ddf692a3acc","kubernetes.io/created-for/pvc/name":"postgres-sample-with-a-very-long-name-monitor-postgres-sample-with-a-very-long-name-monitor-0","kubernetes.io/created-for/pvc/namespace":"single-instance-tests","storage.gke.io/created-by":"pd.csi.storage.gke.io"}'
name: pvc-eb7f829e-aa3d-4a27-aa71-4ddf692a3acc

However, one caveat is that I actually have 1.7.2 deployed because stable-master/image.yaml is not pointing at the right tag in the v1.8.2 tagged release.

@sunnylovestiramisu
Contributor

Discussed offline, we will revert this PR and create a new 1.9.2 release with the revert. @Sneha-at will take a look when back from vacation.

@pia-drcash

@sunnylovestiramisu @Sneha-at Adding "namespace" and "pv/pvc name" labels to GCP disks is an extremely useful feature. Please add a fix that truncates values to 63 characters instead of disabling these labels entirely!

@Sneha-at
Contributor

Sneha-at commented Apr 10, 2023

@pia-drcash we would be able to keep the "namespace name" labels but are facing an issue with the "PV/PVC name" labels. Do you need the PV/PVC name labels too? If we truncate names at 63 characters, we will not be able to uniquely identify the PVs/PVCs. Could you share your use case for both the "PV" and "PVC" name labels?
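One way to soften the uniqueness problem with plain truncation is to replace the tail of an over-long value with a short hash of the full name. The following is a minimal Go sketch under that assumption; `shortenLabelValue` is hypothetical, it handles length only, and it assumes ASCII names (it does not rewrite characters such as `.`, which a label value may not accept).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// maxLabelValueLen is the limit quoted in the GCP error message.
const maxLabelValueLen = 63

// shortenLabelValue is a hypothetical helper. Values within the limit are
// returned unchanged; longer ones are cut and suffixed with the first 8 hex
// digits of their SHA-256, so two long names that share a prefix still map
// to different label values (barring a hash collision).
func shortenLabelValue(v string) string {
	if len(v) <= maxLabelValueLen {
		return v
	}
	sum := sha256.Sum256([]byte(v))
	suffix := "-" + hex.EncodeToString(sum[:4])
	return v[:maxLabelValueLen-len(suffix)] + suffix
}

func main() {
	long := "prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0"
	short := shortenLabelValue(long)
	fmt.Println(short, len(short))
}
```

The shortened value no longer equals the PVC name, so it can distinguish disks but cannot be used directly to look up the claim.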

@itspngu

itspngu commented Apr 11, 2023

@sunnylovestiramisu @Sneha-at Adding "namespace" and "pv/pvc name" labels to GCP disks is an extremely useful feature. Please add a fix that truncates values to 63 characters instead of disabling these labels entirely!

It is going to be a lot more reliable to parse this data from the JSON in the disks' descriptions if you want to use this info in automation. See this post above for info.
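As a rough illustration of that approach, the following Go sketch parses a description string like the ones shown earlier in this thread and extracts the claim's namespace and name. `pvcFromDescription` is a hypothetical helper; it assumes the description is a flat JSON object of strings with the keys seen in the examples above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pvcFromDescription is a hypothetical helper. It assumes the disk
// description is a flat JSON object of string values, as in the examples
// in this thread, and returns the claim's namespace and name.
func pvcFromDescription(desc string) (string, string, error) {
	var tags map[string]string
	if err := json.Unmarshal([]byte(desc), &tags); err != nil {
		return "", "", err
	}
	return tags["kubernetes.io/created-for/pvc/namespace"],
		tags["kubernetes.io/created-for/pvc/name"], nil
}

func main() {
	desc := `{"kubernetes.io/created-for/pv/name":"pvc-0beb8bd0-7239-40de-8d7e-c4d0c8c42693","kubernetes.io/created-for/pvc/name":"persistence-broker-server-0","kubernetes.io/created-for/pvc/namespace":"broker","storage.gke.io/created-by":"pd.csi.storage.gke.io"}`
	ns, name, err := pvcFromDescription(desc)
	if err != nil {
		panic(err)
	}
	fmt.Println(ns, name)
}
```

Missing keys come back as empty strings, so callers that need strict behavior would have to check for that themselves.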
