Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2074606

Summary: occm does not have permissions to annotate SVC objects
Product: OpenShift Container Platform
Reporter: Jon Uriarte <juriarte>
Component: Cloud Compute
Cloud Compute sub component: OpenStack Provider
Assignee: Matthew Booth <mbooth>
QA Contact: Jon Uriarte <juriarte>
Status: CLOSED ERRATA
Severity: high
Priority: urgent
CC: aos-bugs, m.andre, mfedosin, pprinett, stephenfin
Version: 4.11
Keywords: Triaged
Target Release: 4.11.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Last Closed: 2022-08-10 11:06:30 UTC
Type: Bug

Description Jon Uriarte 2022-04-12 15:55:33 UTC
Description of problem:

An LB type service is not working (not responding) when the OVN Octavia provider is used.

E0412 13:10:21.522328       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:21.522551       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
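
A quick way to confirm the missing RBAC grant (a diagnostic sketch, not part of the original report) is to ask the API server while impersonating the CCM service account:

$ oc auth can-i patch services -n lb-test-ns \
    --as=system:serviceaccount:kube-system:cloud-controller-manager

On an affected cluster this is expected to answer "no".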

Version-Release number of selected component (if applicable):
OCP 4.11.0-0.nightly-2022-04-08-205307
OSP 16.1.7

How reproducible: always


Steps to Reproduce:
1. Install 4.11 with ExternalCloudProvider

      $ openshift-install create manifests --log-level=debug --dir=/home/stack/ostest/
      $ cd ostest/
      $ cat <<EOF >manifests/manifest_feature_gate.yaml
      apiVersion: config.openshift.io/v1
      kind: FeatureGate
      metadata:
        annotations:
          include.release.openshift.io/self-managed-high-availability: "true"
          include.release.openshift.io/single-node-developer: "true"
          release.openshift.io/create-only: "true"
        name: cluster
      spec:
        customNoUpgrade:
          enabled:
          - ExternalCloudProvider
        featureSet: CustomNoUpgrade
      EOF

      $ openshift-install create cluster --log-level=debug --dir=/home/stack/ostest/
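
   The feature gate can be checked on the running cluster (a sanity check that is not part of the original report):

      $ oc get featuregate cluster -o yaml

   The spec should show featureSet: CustomNoUpgrade with ExternalCloudProvider in the enabled list.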

2. Change the cloud provider Octavia config in order to use the OVN Octavia driver

$ oc get cm cloud-provider-config -n openshift-config -o yaml                                                                                                                                             
[...]
  config: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    use-octavia = True
    lb-provider = ovn <-------
    lb-method = SOURCE_IP_PORT <----------
kind: ConfigMap
[...]
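
The report does not show how the ConfigMap was edited; one way to make the change (an assumption, not the recorded procedure) is to edit it in place:

$ oc -n openshift-config edit cm cloud-provider-config
  # under [LoadBalancer], set:
  #   lb-provider = ovn
  #   lb-method = SOURCE_IP_PORT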

The nodes go unschedulable and then come back to Ready, so the change appears to have been applied.

$ oc get nodes
NAME                          STATUS   ROLES    AGE     VERSION
ostest-ffjcv-master-0         Ready    master   4h46m   v1.23.3+54654d2
ostest-ffjcv-master-1         Ready    master   4h45m   v1.23.3+54654d2
ostest-ffjcv-master-2         Ready    master   4h37m   v1.23.3+54654d2
ostest-ffjcv-worker-0-7bwf8   Ready    worker   4h26m   v1.23.3+54654d2
ostest-ffjcv-worker-0-dbj9s   Ready    worker   4h26m   v1.23.3+54654d2
ostest-ffjcv-worker-0-gzjf2   Ready    worker   4h26m   v1.23.3+54654d2
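
One way to confirm the cloud controller manager picked up the new settings (an assumption, not shown in the report) is to check that its pods restarted after the ConfigMap change:

$ oc -n openshift-cloud-controller-manager get pods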


3. Create the LoadBalancer type svc with the manifest below:

cat <<EOF | oc apply -f -
---
apiVersion: project.openshift.io/v1
kind: Project
metadata:
  name: lb-test-ns
  labels:
    kubernetes.io/metadata.name: lb-test-ns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lb-test-dep
  namespace: lb-test-ns
  labels:
    app: lb-test-dep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: lb-test-dep
  template:
    metadata:
      labels:
        app: lb-test-dep
    spec:
      containers:
      - image: quay.io/kuryr/demo
        name: demo
---
apiVersion: v1
kind: Service
metadata:
  name: lb-test-svc
  namespace: lb-test-ns
  labels:
    app: lb-test-dep
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: lb-test-dep
  type: LoadBalancer
EOF

4. Check LB, pod and svc creation
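
The outputs below were presumably collected with commands along these lines (the exact commands are not recorded in the report):

$ openstack loadbalancer list
$ oc get pods -A | grep lb-test-ns
$ oc get svc -A | grep lb-test-ns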

LB
--
| 8d001d70-e891-4379-a850-2335819aa7cd | kube_service_kubernetes_lb-test-ns_lb-test-svc | a64676dfa4b24cc9adfb620fef7b6506 | 10.196.3.144 | ACTIVE              | ovn      |                                                                 


Pods
----
lb-test-ns                                         lb-test-dep-68d6754b4d-mjkkh                                1/1     Running     0               153m
lb-test-ns                                         lb-test-dep-68d6754b4d-x47fh                                1/1     Running     0               153m

svc
---
lb-test-ns    lb-test-svc            LoadBalancer   172.30.131.186   10.46.22.227      80:32383/TCP          154m                        


5. Check connectivity to the svc
$ curl 10.46.22.227


Actual results:
$ curl 10.46.22.227
(no reply)

Expected results: a reply from the svc pods


Additional info:

$ oc -n lb-test-ns describe svc lb-test-svc
Name:                     lb-test-svc
Namespace:                lb-test-ns
Labels:                   app=lb-test-dep
Annotations:              <none>
Selector:                 app=lb-test-dep
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.30.131.186
IPs:                      172.30.131.186
LoadBalancer Ingress:     10.46.22.227
Port:                     <unset>  80/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  32383/TCP
Endpoints:                10.128.2.13:8080,10.129.2.9:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type     Reason                  Age                  From                Message
  ----     ------                  ----                 ----                -------
  Warning  SyncLoadBalancerFailed  159m                 service-controller  Error syncing load balancer: failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE
  Warning  SyncLoadBalancerFailed  158m (x2 over 158m)  service-controller  Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"                                                                                                      
  Normal   EnsuringLoadBalancer    158m (x5 over 159m)  service-controller  Ensuring load balancer
  Normal   EnsuredLoadBalancer     158m (x2 over 158m)  service-controller  Ensured load balancer

CCM logs (oc -n openshift-cloud-controller-manager logs openstack-cloud-controller-manager-5d6b64cc45-vjckq)
--------
I0412 13:09:59.295060       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
I0412 13:09:59.412977       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
W0412 13:09:59.728918       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:09:59.798352       1 loadbalancer.go:1836] "Creating fully populated loadbalancer" lbName="kube_service_kubernetes_lb-test-ns_lb-test-svc" service="lb-test-ns/lb-test-svc"
I0412 13:10:01.808313       1 loadbalancer.go:151] "Waiting for load balancer ACTIVE" lbID="8d001d70-e891-4379-a850-2335819aa7cd"
I0412 13:10:02.936338       1 loadbalancer.go:165] "Load balancer ACTIVE" lbID="8d001d70-e891-4379-a850-2335819aa7cd"
E0412 13:10:02.936665       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE
I0412 13:10:02.937889       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE"
I0412 13:10:07.937329       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:07.937893       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:08.055965       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
E0412 13:10:10.457513       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:10.458023       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
I0412 13:10:20.459134       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:20.460278       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:20.586724       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:10:21.015741       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" message="Ensured load balancer"
I0412 13:10:21.043077       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:21.044079       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:21.156610       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
E0412 13:10:21.522328       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:21.522551       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
I0412 13:10:26.523345       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:26.524912       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:26.636587       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:10:27.051463       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" message="Ensured load balancer"

Comment 2 Matthew Booth 2022-04-19 16:10:30 UTC
I can reproduce this using the Octavia backend. This is an RBAC issue, not an LB backend issue.

Comment 3 ShiftStack Bugwatcher 2022-04-20 07:04:24 UTC
Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing

Comment 4 Matthew Booth 2022-04-22 10:24:01 UTC
I've created 2 PRs to address this bug:

The actual fix is a permission fix in CCCMO: https://github.com/openshift/cluster-cloud-controller-manager-operator/pull/184. This adds the required permissions for CPO to annotate the service object.

A secondary issue we noticed is that CPO would succeed on the second attempt, but without adding the expected annotation. This should be fixed by https://github.com/kubernetes/kubernetes/pull/109601.
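
For illustration only, the missing permission amounts to an RBAC rule of roughly this shape for the cloud-controller-manager service account (the real rule is defined in the CCCMO PR above; the name below is made up):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-ccm-services-patch   # hypothetical name, for illustration only
rules:
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list", "watch", "patch", "update"]
- apiGroups: [""]
  resources: ["services/status"]
  verbs: ["patch", "update"]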

Comment 7 Jon Uriarte 2022-05-06 09:55:42 UTC
Verified in 4.11.0-0.nightly-2022-05-05-015322 on top of OSP 16.1.7.

Followed the reproducer steps to verify this BZ. The RBAC issues are fixed, but the svc is still
not reachable due to bugs in the OSP Octavia OVN component; that is tracked in bug 2082496.

Comment 9 errata-xmlrpc 2022-08-10 11:06:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069