What steps did you take and what happened:
[A clear and concise description of what the bug is.]
A VM instance (corresponding to an OpenStackMachine CR) failed to be created for some reason, so no VIF is bound to that instance. The OpenStackMachine is marked as unhealthy and the capo-controller tries to replace it. During the deletion reconcile loop, there is a task to remove the ports bound to the instance. That port deletion task fails with the following errors in the capo-controller:
I1218 12:40:46.844270 1 controller.go:114] "Observed a panic in reconciler: runtime error: index out of range [0] with length 0" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="magnum-system/kube-a1d9n-default-worker-infra-t4kdt-f9xfs" namespace="magnum-system" name="kube-a1d9n-default-worker-infra-t4kdt-f9xfs" reconcileID=67e5c4eb-0d10-4d4f-ad5d-acd383736303
panic: runtime error: index out of range [0] with length 0 [recovered]
panic: runtime error: index out of range [0] with length 0
goroutine 460 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:115 +0x1fa
panic({0x19480c0, 0xc0014870f8})
/usr/local/go/src/runtime/panic.go:884 +0x212
sigs.k8s.io/cluster-api-provider-openstack/pkg/cloud/services/networking.(*Service).GarbageCollectErrorInstancesPort(0xc008e98f40, {0x1d09a90, 0xc00b825900}, {0xc000b7ecc0, 0x2b}, {0xc001cdc200, 0x1, 0x0?})
/workspace/pkg/cloud/services/networking/port.go:318 +0x249
sigs.k8s.io/cluster-api-provider-openstack/pkg/cloud/services/compute.(*Service).DeleteInstance(0xc000c99800, 0xc00af36280?, {0x1d09a90, 0xc00b825900}, 0xc008e988e0, 0xc00067e690)
/workspace/pkg/cloud/services/compute/instance.go:620 +0x60c
sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).reconcileDelete(0x1d1cfa0?, {0x1d1de00, 0xc001f53a40}, 0xc007fa9520, 0xc0039b8000, 0xc00af36280, 0xc00b825900)
/workspace/controllers/openstackmachine_controller.go:278 +0x61d
sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).Reconcile(0xc00048c360, {0x1d19838, 0xc00aa251d0}, {{{0xc000c00e10?, 0x10?}, {0xc000b7ecc0?, 0x40dc07?}}})
/workspace/controllers/openstackmachine_controller.go:150 +0xa8d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1d19838?, {0x1d19838?, 0xc00aa251d0?}, {{{0xc000c00e10?, 0x17a1c00?}, {0xc000b7ecc0?, 0x10?}}})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:118 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0009780a0, {0x1d19790, 0xc00053e080}, {0x18bd520?, 0xc000bd2960?})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:314 +0x3a5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0009780a0, {0x1d19790, 0xc00053e080})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:265 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:226 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222 +0x333
A single failed OpenStackMachine now breaks the whole capo-controller. The controller ends up in CrashLoopBackOff, which also causes validation webhook failures for operations against OpenStack-related CRs.
What did you expect to happen:
Failed instances with no ports should be deleted without any issue.
Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
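From the trace, the panic in GarbageCollectErrorInstancesPort looks consistent with indexing the first element of the instance's port list without first checking that the list is non-empty, which is exactly the situation for an instance whose creation failed before any VIF was bound. Below is a minimal, self-contained Go sketch of that suspected pattern and the obvious guard. The names (listInstancePorts, Port, etc.) are illustrative stand-ins, not the actual CAPO or gophercloud API.

```go
// Hypothetical sketch of the suspected failure mode: listing the ports bound
// to a failed instance and indexing ports[0] unconditionally, which panics
// with "index out of range [0] with length 0" when no VIF was ever bound.
package main

import (
	"errors"
	"fmt"
)

// Port is a stand-in for a Neutron port as returned by the networking service.
type Port struct {
	ID string
}

// listInstancePorts is a hypothetical stand-in for a "list ports by device ID"
// call. For an instance whose creation failed before any VIF was bound, it
// returns an empty slice and no error.
func listInstancePorts(instanceID string) ([]Port, error) {
	if instanceID == "" {
		return nil, errors.New("instance ID must not be empty")
	}
	return nil, nil // nothing was ever bound to this instance
}

// garbageCollectPortsUnsafe mimics the pattern that would produce the panic.
func garbageCollectPortsUnsafe(instanceID string) error {
	ports, err := listInstancePorts(instanceID)
	if err != nil {
		return err
	}
	// Panics when the instance has no ports at all.
	fmt.Printf("deleting port %s\n", ports[0].ID)
	return nil
}

// garbageCollectPortsSafe shows the guard: treat an empty port list as
// "nothing to clean up" and return early instead of panicking.
func garbageCollectPortsSafe(instanceID string) error {
	ports, err := listInstancePorts(instanceID)
	if err != nil {
		return err
	}
	if len(ports) == 0 {
		return nil // no VIF was bound, nothing to garbage-collect
	}
	for _, p := range ports {
		fmt.Printf("deleting port %s\n", p.ID)
	}
	return nil
}

func main() {
	// The safe variant handles the "no VIF bound" case gracefully.
	if err := garbageCollectPortsSafe("failed-instance-id"); err != nil {
		fmt.Println("cleanup failed:", err)
	}
}
```

If the real code follows this shape, a length check (or ranging over the port list) before deleting ports would let the deletion reconcile complete instead of crashing the controller.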
Environment:
Cluster API Provider OpenStack version (or git rev-parse HEAD if manually built): >= 0.9.0
Cluster-API version: 1.5.x
OpenStack version: stable/zed
Minikube/KIND version:
Kubernetes version (use kubectl version): 1.27
OS (e.g. from /etc/os-release): Ubuntu 22.04
/kind bug