Description of problem:
Listener timeout data can be included in the CRD status even though there is no timeout annotation on the Service. The Listener was likely not present on the CRD but was created by Octavia; the controller then finds it and ends up including its timeout values in the status. The timeouts recorded in the CRD status then differ from the spec values, causing a constant update of the Listener, and the load balancer remains in PENDING_UPDATE state.

$ openstack loadbalancer list | grep cluster-baremetal-operator-service
| aba05c65-2628-406b-b71f-b12d36a9215b | openshift-machine-api/cluster-baremetal-operator-service | 3583506d9c92457b9971b468f85fa720 | 172.30.244.252 | PENDING_UPDATE | ovn |

$ oc get klb cluster-baremetal-operator-service -n openshift-machine-api -o yaml
apiVersion: openstack.org/v1
kind: KuryrLoadBalancer
metadata:
  creationTimestamp: "2021-05-28T12:00:47Z"
  finalizers:
  - kuryr.openstack.org/kuryrloadbalancer-finalizers
  generation: 6577
  name: cluster-baremetal-operator-service
  namespace: openshift-machine-api
  resourceVersion: "113259"
  uid: bb7c616d-35a3-45e5-b053-88663a3b016d
spec:
  endpointSlices:
  - endpoints:
    - addresses:
      - 10.128.16.101
      conditions:
        ready: true
      targetRef:
        kind: Pod
        name: cluster-baremetal-operator-856b58cc6c-hvdqp
        namespace: openshift-machine-api
        resourceVersion: "30913"
        uid: 3a5818b1-8138-4216-96d6-4fd569bdaacf
    ports:
    - name: https
      port: 8443
      protocol: TCP
  ip: 172.30.244.252
  ports:
  - name: https
    port: 8443
    protocol: TCP
    targetPort: https
  project_id: 3583506d9c92457b9971b468f85fa720
  provider: ovn
  security_groups_ids:
  - 475e073a-c0bc-47a7-97f0-6955258b0bde
  subnet_id: b853adda-2f75-4cf8-99c3-21568a9b8d0c
  timeout_client_data: 0
  timeout_member_data: 0
  type: ClusterIP
status:
  listeners:
  - id: 443151ce-a7da-406b-90d1-37c7edd745e0
    loadbalancer_id: aba05c65-2628-406b-b71f-b12d36a9215b
    name: openshift-machine-api/cluster-baremetal-operator-service:TCP:8443
    port: 8443
    project_id: 3583506d9c92457b9971b468f85fa720
    protocol: TCP
    timeout_client_data: 50000
    timeout_member_data: 50000

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
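The update loop described above can be sketched in a few lines. This is a hedged, simplified illustration, not Kuryr's actual code: the function name `needs_update` and the dict-based listener representation are hypothetical, but the mechanism matches the report — the spec carries no timeout values while the status holds Octavia's 50000 ms defaults, so the diff never converges.

```python
# Illustrative sketch of the reconciliation diff (names are hypothetical,
# not taken from the Kuryr codebase).

OCTAVIA_DEFAULT_TIMEOUT = 50000  # Octavia's default data timeouts, in ms


def needs_update(spec_listener: dict, status_listener: dict) -> bool:
    """Return True when the desired spec differs from the recorded status."""
    for key in ("timeout_client_data", "timeout_member_data"):
        if spec_listener.get(key, 0) != status_listener.get(key, 0):
            return True
    return False


# The Service has no timeout annotation, so the spec holds 0 ("unset"),
# but the controller copied Octavia's defaults into the status. Every
# reconcile pass therefore sees a diff and re-issues a Listener update,
# keeping the load balancer in PENDING_UPDATE.
spec = {"port": 8443, "timeout_client_data": 0, "timeout_member_data": 0}
status = {"port": 8443,
          "timeout_client_data": OCTAVIA_DEFAULT_TIMEOUT,
          "timeout_member_data": OCTAVIA_DEFAULT_TIMEOUT}

assert needs_update(spec, status)  # the diff never converges
```

The fix direction implied by the report is to stop copying Octavia-created timeout values into the status when the Service carries no timeout annotation, so spec and status agree and the loop terminates.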
Checked with:
OCP 4.8.0-0.nightly-2021-06-14-145150
OSP RHOS-16.1-RHEL-8-20210323.n.0

Followed these steps to verify:
1. Run an OCP installation.
2. Create a deployment: oc create deployment demo --image=quay.io/kuryr/demo
3. Expose the deployment: oc expose deploy/demo --port=80 --target-port=8080
4. Edit the kuryrloadbalancer CRD and remove the listeners, pools and members from it: oc edit klb demo
5. Check that the CRD again contains all the information about the LB resources, repopulated by kuryr.
6. Check that none of the kuryrloadbalancer CRDs on the cluster have timeouts specified in the status section (in other words, no timeout with value 50000): oc get klb -A | grep "timeout_"
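The check in step 6 can be exercised against captured output without a live cluster. This is a hedged sketch: the sample YAML below is a stand-in for what `oc get klb -A -o yaml` would print on a fixed cluster, where listener entries in .status carry no timeout fields.

```shell
#!/bin/sh
# Sample stands in for `oc get klb -A -o yaml` output on a fixed cluster:
# the status listeners carry no timeout_client_data/timeout_member_data.
sample='status:
  listeners:
  - port: 8443
    protocol: TCP'

# grep exits non-zero on no match, which here means the cluster is clean.
printf '%s\n' "$sample" | grep "timeout_" || echo "no stray timeouts"
```

On an affected cluster the grep would instead print lines such as `timeout_client_data: 50000`, matching the broken status shown in the bug description.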
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438