Pods can't mount PVC as the disk doesn't seem to be recreated from the restore, despite the PVC being up. #739


Closed
nsteinmetz opened this issue Apr 8, 2021 · 5 comments

nsteinmetz commented Apr 8, 2021

On a GKE cluster (v1.18.16-gke.2100) with the new CSI driver enabled, I'm evaluating Velero (v1.5, with the latest CSI plugin for Velero) and volume snapshots.

Snapshot creation works well:

  • I can see the VolumeSnapshot objects on the Kubernetes side
  • I can see the snapshots in the GCP console (checks sketched below)
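
For reference, roughly the checks behind the two points above (commands are approximate; the snapshot and project names are the ones from my setup):

# VolumeSnapshot objects created by the Velero CSI plugin
kubectl get volumesnapshot -n default
kubectl describe volumesnapshot velero-pvc-mysql-k8s-nst3-datataskio-vm66z -n default

# Corresponding snapshots on the GCP side
gcloud compute snapshots list --project datataskio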

When trying to restore the content:

  • I first delete the Deployments, PVs and PVCs from Kubernetes
  • then I delete the disks from GCP
  • then I launch the velero restore command (roughly as sketched below)
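
Roughly the commands involved (the Velero flags are from memory and may not be exact):

# Backup taken earlier with the CSI features enabled on the Velero client/server
velero backup create naive-backup9 --include-namespaces default

# Clean up before restoring
kubectl delete deployment mysql -n default
kubectl delete pvc pvc-mysql-k8s-nst3-datataskio -n default
gcloud compute disks delete <old-disk-name> --zone europe-west1-b

# Restore everything from the backup
velero restore create --from-backup naive-backup9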

On the PV/PVC side, it seems to work as expected: everything is Bound, BUT I don't see any newly created disks on the GCP side for the impacted PVs/PVCs.

For a PVC, I can see that it's restored from a snapshot:

Name:          pvc-mysql-k8s-nst3-datataskio
Namespace:     default
StorageClass:  datatask-sc
Status:        Bound
Volume:        pvc-f3bf0015-1c15-430e-bfe8-1b222d161a5d
Labels:        app=git
               dt-volume=mysql
               velero.io/backup-name=naive-backup9
               velero.io/restore-name=naive-backup9-20210408120114
               velero.io/volume-snapshot-name=velero-pvc-mysql-k8s-nst3-datataskio-vm66z
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               velero.io/backup-name: naive-backup9
               velero.io/volume-snapshot-name: velero-pvc-mysql-k8s-nst3-datataskio-vm66z
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      50Gi
Access Modes:  RWO
VolumeMode:    Filesystem
DataSource:
  APIGroup:  snapshot.storage.k8s.io
  Kind:      VolumeSnapshot
  Name:      velero-pvc-mysql-k8s-nst3-datataskio-vm66z
Used By:     mysql-6f996b96b6-9pmvg
Events:      <none>
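
In manifest form, the restored claim is roughly the following (reconstructed from the describe output above):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-mysql-k8s-nst3-datataskio
  namespace: default
spec:
  storageClassName: datatask-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: velero-pvc-mysql-k8s-nst3-datataskio-vm66z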

but the pod cannot mount the PVC:

Name:           mysql-6f996b96b6-9pmvg
Namespace:      default
Priority:       0
Node:           gke-k8s-nst3-datataskio-default-pool-114e9218-12g8/10.1.0.3
Start Time:     Thu, 08 Apr 2021 12:13:34 +0200
Labels:         app=mysql
                pod-template-hash=6f996b96b6
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/mysql-6f996b96b6
Containers:
  mysql:
    Container ID:  
    Image:         mysql:5.7
    Image ID:      
    Port:          3306/TCP
    Host Port:     0/TCP
    Args:
      --ignore-db-dir=lost+found
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:  50m
    Requests:
      cpu:  50m
    Environment:
      MYSQL_ROOT_PASSWORD:  <set to the key 'MYSQL_ROOT_PASSWORD' in secret 'mysql-secret'>  Optional: false
      MYSQL_DATABASE:       <set to the key 'MYSQL_DATABASE' in secret 'mysql-secret'>       Optional: false
      MYSQL_USER:           <set to the key 'MYSQL_USER' in secret 'mysql-secret'>           Optional: false
      MYSQL_PASSWORD:       <set to the key 'MYSQL_PASSWORD' in secret 'mysql-secret'>       Optional: false
    Mounts:
      /var/lib/mysql from gce-pd-mysql (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pkdbr (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  gce-pd-mysql:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-mysql-k8s-nst3-datataskio
    ReadOnly:   false
  default-token-pkdbr:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pkdbr
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason              Age                From                     Message
  ----     ------              ----               ----                     -------
  Normal   Scheduled           13m                default-scheduler        Successfully assigned default/mysql-6f996b96b6-9pmvg to gke-k8s-nst3-datataskio-default-pool-114e9218-12g8
  Warning  FailedAttachVolume  54s (x7 over 13m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-f3bf0015-1c15-430e-bfe8-1b222d161a5d" : rpc error: code = NotFound desc = Could not find disk Key{"pvc-f3bf0015-1c15-430e-bfe8-1b222d161a5d", zone: "europe-west1-b"}: googleapi: Error 404: The resource 'projects/datataskio/zones/europe-west1-b/disks/pvc-f3bf0015-1c15-430e-bfe8-1b222d161a5d' was not found, notFound
  Warning  FailedMount         20s (x6 over 11m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[gce-pd-mysql], unattached volumes=[gce-pd-mysql default-token-pkdbr]: timed out waiting for the condition

What should I do?
Is this use case supported yet?
Or should I just recreate the disks from the snapshots on the GCP console side instead?

From the docs, it should be supported as far as I understand it.
https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/volume-snapshots

The GKE cluster uses this version of the driver:

gke.gcr.io/gcp-compute-persistent-disk-csi-driver:v1.0.1-gke.0
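
For completeness, the classes involved look roughly like this (my names; the StorageClass parameters are assumptions, and the label on the VolumeSnapshotClass is what I believe the Velero CSI plugin uses to select it):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: datatask-sc
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-standard
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: pd-snapshot-class
  labels:
    velero.io/csi-volumesnapshot-class: "true"
driver: pd.csi.storage.gke.io
deletionPolicy: Retain
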
@mattcary
Contributor

I think this may be because Velero is trying to restore the disk via cloning, which is not supported yet: #161

@nsteinmetz
Author

Thanks, I'll recheck this later; this was a quick POC, and I'll do the final implementation and a more complete set of tests in the coming days.

I also noticed that some disks named restore-* appeared, but I don't know whether that was with this driver or the in-tree one. I think the latter.

@mattcary
Contributor

mattcary commented Apr 13, 2021 via email

@nsteinmetz
Author

Just tried with CSI again:

  • no disk is restored, as mentioned previously; this should be due to the missing cloning support
  • if I manually recreate the disk from the snapshot and restore the PV/PVC via Velero (rough sketch below), it seems I missed something, as some content was not restored. But at least the Deployments are up and running.
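
Roughly what I mean by recreating the disk manually (names and zone are from my setup; the volumeHandle format is what I understand the PD CSI driver expects):

# Create a new PD from the GCP snapshot backing the VolumeSnapshot
gcloud compute disks create restored-mysql-disk \
  --source-snapshot <gcp-snapshot-name> \
  --zone europe-west1-b

# Pre-provision a PV pointing at that disk so the restored PVC can bind to it
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: restored-mysql-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: datatask-sc
  csi:
    driver: pd.csi.storage.gke.io
    volumeHandle: projects/datataskio/zones/europe-west1-b/disks/restored-mysql-disk
    fsType: ext4
EOF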

I think the restore-* disks were from Velero with gcePersistentDisk (the in-tree driver); I'll confirm tomorrow ;-)

=> closing in favor of #161

@nsteinmetz
Author

Just for your info: the restore-* disks are created by Velero when the PV/PVC uses the in-tree driver.
