additionalPorts and allowedCidrs loops and never completes LB provisioning #1687

Closed as not planned
@huxcrux


/kind bug

What steps did you take and what happened:
When creating a cluster and defining both one or more additionalPorts and at least one CIDR under allowedCidrs, the LB never becomes fully functional and no cluster is created.

It seems a check is missing that the LB has left PENDING_UPDATE before the next listener is created, which causes the API to respond with:

capo-system/capo-controller-manager-7c78d95c95-7qphw[manager]: I0926 13:49:42.720981 1 recorder.go:104] "events: Failed to create listener k8s-clusterapi-cluster-default-hux-lab1-kubeapi-31991: Expected HTTP response code [201 202] when accessing [POST https://:9876/v2.0/lbaas/listeners], but got 409 instead\n{"faultcode": "Client", "faultstring": "Load Balancer 7d690fcb-ea67-4462-a49e-dda806c8f792 is immutable and cannot be updated.", "debuginfo": null}" type="Warning" object={Kind:OpenStackCluster Namespace:default Name:hux-lab1 UID:d973908d-ccd0-4f5b-9f9f-d628143d021a APIVersion:infrastructure.cluster.x-k8s.io/v1alpha7 ResourceVersion:33567788 FieldPath:} reason="Failedcreatelistener"

I have also tested additionalPorts and allowedCidrs individually and each works as intended; the problem occurs only when both are used at the same time.

Also worth noting: the allowedCidrs option is set on the LB, and I have verified that the security groups for the LB contain the correct rules, meaning this is just a race condition during provisioning.

What did you expect to happen:
The flow seems to be:

  1. Create LB
  2. Create Listener
  3. If allowedCidrs is defined, update the listener with the allowedCidrs list.
  4. Create the 2nd listener. This is done without checking that the LB is no longer in the PENDING_UPDATE state, which causes an API failure. CAPO then allocates a new FIP for the LB and tries reconciling the listeners again; however, since the LB already has a FIP connected, it continues to loop in this state.

Anything else you would like to add:
A gist with the complete output: https://gist.github.com/huxcrux/7c288f1c0b045de67eac1beaf3211e6e
I redacted IPs and OpenStack API URIs. Every time the FIP attachment fails, a new FIP has been allocated.

Environment:

  • Cluster API Provider OpenStack version (Or git rev-parse HEAD if manually built): 0.8.0
  • Cluster-API version: 1.5.2
  • OpenStack version: Ussuri
  • Minikube/KIND version: Kind 0.20.0
  • Kubernetes version (use kubectl version): 1.28.1
  • OS (e.g. from /etc/os-release): Ubuntu 20.04

Metadata

    Labels

    kind/bug: Categorizes issue or PR as related to a bug.
    lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.