Description of problem:
LB type service is not working (not responding) when the OVN octavia provider is used.

E0412 13:10:21.522328 1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:21.522551 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""

Version-Release number of selected component (if applicable):
OCP 4.11.0-0.nightly-2022-04-08-205307
OSP 16.1.7

How reproducible: always

Steps to Reproduce:

1. Install 4.11 with ExternalCloudProvider

$ openshift-install create manifests --log-level=debug --dir=/home/stack/ostest/
$ cd ostest/
$ cat <<EOF >manifests/manifest_feature_gate.yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  name: cluster
spec:
  customNoUpgrade:
    enabled:
    - ExternalCloudProvider
  featureSet: CustomNoUpgrade
EOF
$ openshift-install create cluster --log-level=debug --dir=/home/stack/ostest/

2. Change the cloud provider Octavia config in order to use the OVN Octavia driver

$ oc get cm cloud-provider-config -n openshift-config -o yaml
[...]
  config: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    use-octavia = True
    lb-provider = ovn <-------
    lb-method = SOURCE_IP_PORT <----------
kind: ConfigMap
[...]

The nodes are going to unschedulable and then back to ready, so the change is supposedly applied.

$ oc get nodes
NAME                          STATUS   ROLES    AGE     VERSION
ostest-ffjcv-master-0         Ready    master   4h46m   v1.23.3+54654d2
ostest-ffjcv-master-1         Ready    master   4h45m   v1.23.3+54654d2
ostest-ffjcv-master-2         Ready    master   4h37m   v1.23.3+54654d2
ostest-ffjcv-worker-0-7bwf8   Ready    worker   4h26m   v1.23.3+54654d2
ostest-ffjcv-worker-0-dbj9s   Ready    worker   4h26m   v1.23.3+54654d2
ostest-ffjcv-worker-0-gzjf2   Ready    worker   4h26m   v1.23.3+54654d2

3. Create the loadbalancer type svc with below manifest:

cat <<EOF | oc apply -f -
---
apiVersion: project.openshift.io/v1
kind: Project
metadata:
  name: lb-test-ns
  labels:
    kubernetes.io/metadata.name: lb-test-ns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lb-test-dep
  namespace: lb-test-ns
  labels:
    app: lb-test-dep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: lb-test-dep
  template:
    metadata:
      labels:
        app: lb-test-dep
    spec:
      containers:
      - image: quay.io/kuryr/demo
        name: demo
---
apiVersion: v1
kind: Service
metadata:
  name: lb-test-svc
  namespace: lb-test-ns
  labels:
    app: lb-test-dep
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: lb-test-dep
  type: LoadBalancer
EOF

4. Check LB, pod and svc creation

LB
--
| 8d001d70-e891-4379-a850-2335819aa7cd | kube_service_kubernetes_lb-test-ns_lb-test-svc | a64676dfa4b24cc9adfb620fef7b6506 | 10.196.3.144 | ACTIVE | ovn |

Pods
----
lb-test-ns   lb-test-dep-68d6754b4d-mjkkh   1/1   Running   0   153m
lb-test-ns   lb-test-dep-68d6754b4d-x47fh   1/1   Running   0   153m

svc
---
lb-test-ns   lb-test-svc   LoadBalancer   172.30.131.186   10.46.22.227   80:32383/TCP   154m

5. Check connectivity to the svc

$ curl 10.46.22.227

Actual results:
$ curl 10.46.22.227
(no reply)

Expected results: reply from the svc pods

Additional info:

$ oc -n lb-test-ns describe svc lb-test-svc
Name:                     lb-test-svc
Namespace:                lb-test-ns
Labels:                   app=lb-test-dep
Annotations:              <none>
Selector:                 app=lb-test-dep
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.30.131.186
IPs:                      172.30.131.186
LoadBalancer Ingress:     10.46.22.227
Port:                     <unset>  80/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  32383/TCP
Endpoints:                10.128.2.13:8080,10.129.2.9:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type     Reason                  Age                   From                Message
  ----     ------                  ----                  ----                -------
  Warning  SyncLoadBalancerFailed  159m                  service-controller  Error syncing load balancer: failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE
  Warning  SyncLoadBalancerFailed  158m (x2 over 158m)   service-controller  Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
  Normal   EnsuringLoadBalancer    158m (x5 over 159m)   service-controller  Ensuring load balancer
  Normal   EnsuredLoadBalancer     158m (x2 over 158m)   service-controller  Ensured load balancer

CCM logs (oc -n openshift-cloud-controller-manager logs openstack-cloud-controller-manager-5d6b64cc45-vjckq)
--------
I0412 13:09:59.295060 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
I0412 13:09:59.412977 1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
W0412 13:09:59.728918 1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:09:59.798352 1 loadbalancer.go:1836] "Creating fully populated loadbalancer" lbName="kube_service_kubernetes_lb-test-ns_lb-test-svc" service="lb-test-ns/lb-test-svc"
I0412 13:10:01.808313 1 loadbalancer.go:151] "Waiting for load balancer ACTIVE" lbID="8d001d70-e891-4379-a850-2335819aa7cd"
I0412 13:10:02.936338 1 loadbalancer.go:165] "Load balancer ACTIVE" lbID="8d001d70-e891-4379-a850-2335819aa7cd"
E0412 13:10:02.936665 1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE
I0412 13:10:02.937889 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE"
I0412 13:10:07.937329 1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:07.937893 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:08.055965 1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
E0412 13:10:10.457513 1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:10.458023 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
I0412 13:10:20.459134 1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:20.460278 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:20.586724 1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:10:21.015741 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" message="Ensured load balancer"
I0412 13:10:21.043077 1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:21.044079 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:21.156610 1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
E0412 13:10:21.522328 1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:21.522551 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
I0412 13:10:26.523345 1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:26.524912 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:26.636587 1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:10:27.051463 1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" message="Ensured load balancer"
I can reproduce this using the Octavia backend. This is an RBAC issue, not an LB backend issue.
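One way to double-check the RBAC diagnosis is to turn each "forbidden" line from the CCM log into an `oc auth can-i` query for the same user, verb, resource and namespace. The following is only a sketch: `denied_to_can_i` is a hypothetical helper name (not part of `oc`), and it assumes the log lines follow the message format quoted in this report. It prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read CCM log text on stdin and print one
# `oc auth can-i` command per distinct RBAC denial found in it.
# Assumes the "User ... cannot VERB resource ... in the namespace ..."
# wording seen in this report; quotes may be plain (") or escaped (\").
denied_to_can_i() {
  sed -nE 's/.*User \\?"([^"\\]+)\\?" cannot ([a-z]+) resource \\?"([^"\\]+)\\?" in API group \\?"([^"\\]*)\\?" in the namespace \\?"([^"\\]+)\\?".*/oc auth can-i \2 \3 --as=\1 -n \5/p' \
    | sort -u   # the same denial is logged many times; keep one copy
}
```

For example, `oc -n openshift-cloud-controller-manager logs openstack-cloud-controller-manager-5d6b64cc45-vjckq | denied_to_can_i` should print a single `oc auth can-i patch services ...` command for the log above. Running the printed command would be expected to answer `no` on an affected cluster and `yes` once the service account is allowed to patch services.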
Removing the Triaged keyword because: * the QE automation assessment (flag qe_test_coverage) is missing
I've created 2 PRs to address this bug. The actual fix is a permission fix in CCCMO: https://github.com/openshift/cluster-cloud-controller-manager-operator/pull/184. This adds the permissions CPO needs to annotate the service object. We also noticed a secondary issue: CPO would succeed on the second attempt, but without adding the expected annotation. This should be fixed by https://github.com/kubernetes/kubernetes/pull/109601.
Verified in 4.11.0-0.nightly-2022-05-05-015322 on top of OSP 16.1.7. I followed the reproducer steps to verify this BZ: the RBAC issues are fixed, but the svc is still not reachable due to bugs in the OSP Octavia OVN component, which are tracked in bug 2082496.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069