Description of problem:
The Kuryr controller on OCP 3.11.452 is in CrashLoopBackOff because of a missing load balancer.

Version-Release number of selected component (if applicable):
OCP 3.11.452 on OSP13z13

How reproducible:
Install OCP 3.11.452 on OSP13z13, create a namespace with multiple services and pods, and leave it running. After a while (3 days in my case) the Kuryr controller started crashing because the load balancers no longer exist.

Expected results:
Load balancers should not disappear or, at the very least, should be re-created by Kuryr if possible.

Additional info:
Attaching logs of the Kuryr controller, plus the service and endpoints for a service that no longer has a corresponding Octavia load balancer.
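One way to spot the condition described above is to compare the services Kuryr is expected to handle against the load balancers Octavia actually has. The sketch below is not part of the original report. It assumes Kuryr names each load balancer `<namespace>/<service>` (as with `default/demo` later in this bug); the helper name `missing_lbs` and both input files are hypothetical.

```shell
# Hypothetical helper: print every expected name (file $1, one per line)
# that has no exact match among the existing load balancer names (file $2).
missing_lbs() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
    # grep exits 1 when it prints nothing; treat that case as success.
    grep -vxFf "$2" "$1" || [ "$?" -eq 1 ]
}

# Possible usage (assumes the oc and openstack clients are already configured):
#   oc get svc --all-namespaces --no-headers \
#       -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name \
#       | awk '{print $1 "/" $2}' > services.txt
#   openstack loadbalancer list -f value -c name > lbs.txt
#   missing_lbs services.txt lbs.txt
```

Services that Kuryr does not back with a load balancer (headless services, for example) may show up as false positives, so the output is a starting point for investigation rather than a definitive list.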
- Made sure the Kuryr controller code had the following patch: https://github.com/openshift/kuryr-kubernetes/pull/572
- All tempest tests passed
- Simulated the failure by setting an LB to the ERROR state, restarting the Kuryr controller, and making sure it became ready

Created a service:

apiVersion: v1
kind: Service
metadata:
  name: demo
  labels:
    app: demo
spec:
  selector:
    app: demo
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080

Created a deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
  labels:
    app: demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: kuryr/demo
        ports:
        - containerPort: 8080

Set the LB state to ERROR:

source ~/stackrc && ssh heat-admin@$(openstack server list -f value -c Name -c Networks | grep controller-0 | awk -F= '{print $2}')
sudo docker exec -uroot -it galera-bundle-docker-0 mysql
MariaDB [(none)]> use octavia;
MariaDB [(none)]> UPDATE load_balancer SET provisioning_status='ERROR' WHERE name='default/demo';

Restarted the Kuryr controller and made sure it became ready.

Version:
OSP13 2021-09-20.1
OCP v3.11.524
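To confirm from the API side that the database update took effect, the load balancers in ERROR can be listed before restarting the controller. This is a minimal sketch that was not part of the verification above; the helper name `lbs_in_error` is hypothetical, and it assumes load balancer names contain no spaces.

```shell
# Hypothetical helper: read "name provisioning_status" pairs on stdin
# and print the names whose provisioning status is ERROR.
lbs_in_error() {
    awk '$NF == "ERROR" { print $1 }'
}

# Possible usage (assumes the openstack client with the Octavia plugin is configured):
#   openstack loadbalancer list -f value -c name -c provisioning_status | lbs_in_error
```

With the UPDATE statement above applied, `default/demo` would be expected in the output.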
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 3.11.542 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3915