Bug 2074606 - occm does not have permissions to annotate SVC objects
Summary: occm does not have permissions to annotate SVC objects
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Matthew Booth
QA Contact: Jon Uriarte
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-04-12 15:55 UTC by Jon Uriarte
Modified: 2022-08-10 11:06 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:06:30 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubernetes/kubernetes pull 109601 | 0 | None | open | Prevent dirty service object leaking between reconciles | 2022-04-22 10:21:00 UTC
Github | openshift/cluster-cloud-controller-manager-operator pull 184 | 0 | None | open | Bug 2074606: Allow OpenStack CCM to annotate Service objects | 2022-04-21 13:36:41 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:06:50 UTC

Description Jon Uriarte 2022-04-12 15:55:33 UTC
Description of problem:

A Service of type LoadBalancer does not respond when the OVN Octavia provider is used. The CCM log reports:

E0412 13:10:21.522328       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:21.522551       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
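
The denial in the log above names everything needed for an RBAC diagnosis: the service account, the verb, the resource, the API group, and the namespace. As a minimal sketch (not part of the original report), these fields can be extracted with a regular expression. The pattern and the `parse_forbidden` helper are illustrative assumptions, and they only handle the unescaped form of the message, as printed on the `controller.go` line.

```python
import re

# Assumed pattern for the "forbidden" message as it appears in the log above.
# It does not handle the backslash-escaped variant on the event.go line.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def parse_forbidden(line):
    """Return the denied user, verb, resource, group and namespace, or None."""
    match = FORBIDDEN_RE.search(line)
    return match.groupdict() if match else None


# Log line copied from the report.
LOG_LINE = (
    'E0412 13:10:21.522328       1 controller.go:310] error processing service '
    'lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: '
    'failed to patch service object lb-test-ns/lb-test-svc: '
    'services "lb-test-svc" is forbidden: '
    'User "system:serviceaccount:kube-system:cloud-controller-manager" '
    'cannot patch resource "services" in API group "" '
    'in the namespace "lb-test-ns"'
)

denied = parse_forbidden(LOG_LINE)
for field, value in denied.items():
    print(f"{field}: {value!r}")
```

The empty API group is the core group, which is where `services` lives, so the missing permission is `patch` on core `services` for the `cloud-controller-manager` service account.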

Version-Release number of selected component (if applicable):
OCP 4.11.0-0.nightly-2022-04-08-205307
OSP 16.1.7

How reproducible: always


Steps to Reproduce:
1. Install 4.11 with ExternalCloudProvider

      $ openshift-install create manifests --log-level=debug --dir=/home/stack/ostest/
      $ cd ostest/
      $ cat <<EOF >manifests/manifest_feature_gate.yaml
      apiVersion: config.openshift.io/v1
      kind: FeatureGate
      metadata:
        annotations:
          include.release.openshift.io/self-managed-high-availability: "true"
          include.release.openshift.io/single-node-developer: "true"
          release.openshift.io/create-only: "true"
        name: cluster
      spec:
        customNoUpgrade:
          enabled:
          - ExternalCloudProvider
        featureSet: CustomNoUpgrade
      EOF

      $ openshift-install create cluster --log-level=debug --dir=/home/stack/ostest/

2. Change the cloud provider Octavia config in order to use the OVN Octavia driver

$ oc get cm cloud-provider-config -n openshift-config -o yaml                                                                                                                                             
[...]
  config: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    use-octavia = True
    lb-provider = ovn <-------
    lb-method = SOURCE_IP_PORT <----------
kind: ConfigMap
[...]
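
To confirm that the edited config carries the two marked settings, the `config` value can be parsed programmatically. The following is a rough sketch, assuming Python's `configparser` is a close enough approximation of the INI-like format used by the cloud provider config (the real parser may differ in details). The `load_balancer_settings` helper is a hypothetical name, and the text is the ConfigMap value from above without the `<---` markers.

```python
import configparser

# The "config" value from the ConfigMap above, without the reporter's markers.
CLOUD_CONFIG = """\
[Global]
secret-name = openstack-credentials
secret-namespace = kube-system
ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
[LoadBalancer]
use-octavia = True
lb-provider = ovn
lb-method = SOURCE_IP_PORT
"""


def load_balancer_settings(text):
    """Return the [LoadBalancer] section as a plain dict of strings."""
    # interpolation=None keeps any literal '%' in values from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["LoadBalancer"])


settings = load_balancer_settings(CLOUD_CONFIG)
print(settings["lb-provider"], settings["lb-method"])
```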

The nodes become unschedulable and then return to Ready, so the change has presumably been applied.

$ oc get nodes
NAME                          STATUS   ROLES    AGE     VERSION
ostest-ffjcv-master-0         Ready    master   4h46m   v1.23.3+54654d2
ostest-ffjcv-master-1         Ready    master   4h45m   v1.23.3+54654d2
ostest-ffjcv-master-2         Ready    master   4h37m   v1.23.3+54654d2
ostest-ffjcv-worker-0-7bwf8   Ready    worker   4h26m   v1.23.3+54654d2
ostest-ffjcv-worker-0-dbj9s   Ready    worker   4h26m   v1.23.3+54654d2
ostest-ffjcv-worker-0-gzjf2   Ready    worker   4h26m   v1.23.3+54654d2


3. Create the LoadBalancer-type svc with the manifest below:

cat <<EOF | oc apply -f -
---
apiVersion: project.openshift.io/v1
kind: Project
metadata:
  name: lb-test-ns
  labels:
    kubernetes.io/metadata.name: lb-test-ns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lb-test-dep
  namespace: lb-test-ns
  labels:
    app: lb-test-dep
spec:
  replicas: 2
  selector:
    matchLabels:
      app: lb-test-dep
  template:
    metadata:
      labels:
        app: lb-test-dep
    spec:
      containers:
      - image: quay.io/kuryr/demo
        name: demo
---
apiVersion: v1
kind: Service
metadata:
  name: lb-test-svc
  namespace: lb-test-ns
  labels:
    app: lb-test-dep
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: lb-test-dep
  type: LoadBalancer
EOF

4. Check LB, pod and svc creation

LB
--
| 8d001d70-e891-4379-a850-2335819aa7cd | kube_service_kubernetes_lb-test-ns_lb-test-svc | a64676dfa4b24cc9adfb620fef7b6506 | 10.196.3.144 | ACTIVE              | ovn      |                                                                 


Pods
----
lb-test-ns                                         lb-test-dep-68d6754b4d-mjkkh                                1/1     Running     0               153m
lb-test-ns                                         lb-test-dep-68d6754b4d-x47fh                                1/1     Running     0               153m

svc
---
lb-test-ns    lb-test-svc            LoadBalancer   172.30.131.186   10.46.22.227      80:32383/TCP          154m                        


5. Check connectivity to the svc
$ curl 10.46.22.227


Actual results:
$ curl 10.46.22.227
(no reply)

Expected results: a reply from the svc pods


Additional info:

$ oc -n lb-test-ns describe svc lb-test-svc
Name:                     lb-test-svc
Namespace:                lb-test-ns
Labels:                   app=lb-test-dep
Annotations:              <none>
Selector:                 app=lb-test-dep
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.30.131.186
IPs:                      172.30.131.186
LoadBalancer Ingress:     10.46.22.227
Port:                     <unset>  80/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  32383/TCP
Endpoints:                10.128.2.13:8080,10.129.2.9:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type     Reason                  Age                  From                Message
  ----     ------                  ----                 ----                -------
  Warning  SyncLoadBalancerFailed  159m                 service-controller  Error syncing load balancer: failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE
  Warning  SyncLoadBalancerFailed  158m (x2 over 158m)  service-controller  Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"                                                                                                      
  Normal   EnsuringLoadBalancer    158m (x5 over 159m)  service-controller  Ensuring load balancer
  Normal   EnsuredLoadBalancer     158m (x2 over 158m)  service-controller  Ensured load balancer

CCM logs (oc -n openshift-cloud-controller-manager logs openstack-cloud-controller-manager-5d6b64cc45-vjckq)
--------
I0412 13:09:59.295060       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
I0412 13:09:59.412977       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
W0412 13:09:59.728918       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:09:59.798352       1 loadbalancer.go:1836] "Creating fully populated loadbalancer" lbName="kube_service_kubernetes_lb-test-ns_lb-test-svc" service="lb-test-ns/lb-test-svc"
I0412 13:10:01.808313       1 loadbalancer.go:151] "Waiting for load balancer ACTIVE" lbID="8d001d70-e891-4379-a850-2335819aa7cd"
I0412 13:10:02.936338       1 loadbalancer.go:165] "Load balancer ACTIVE" lbID="8d001d70-e891-4379-a850-2335819aa7cd"
E0412 13:10:02.936665       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE
I0412 13:10:02.937889       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, current provisioning status: PENDING_CREATE"
I0412 13:10:07.937329       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:07.937893       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:08.055965       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
E0412 13:10:10.457513       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:10.458023       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
I0412 13:10:20.459134       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:20.460278       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:20.586724       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:10:21.015741       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" message="Ensured load balancer"
I0412 13:10:21.043077       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:21.044079       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:21.156610       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
E0412 13:10:21.522328       1 controller.go:310] error processing service lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services "lb-test-svc" is forbidden: User "system:serviceaccount:kube-system:cloud-controller-manager" cannot patch resource "services" in API group "" in the namespace "lb-test-ns"
I0412 13:10:21.522551       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to ensure load balancer: failed to patch service object lb-test-ns/lb-test-svc: services \"lb-test-svc\" is forbidden: User \"system:serviceaccount:kube-system:cloud-controller-manager\" cannot patch resource \"services\" in API group \"\" in the namespace \"lb-test-ns\""
I0412 13:10:26.523345       1 loadbalancer.go:1947] "EnsureLoadBalancer" cluster="kubernetes" service="lb-test-ns/lb-test-svc"
I0412 13:10:26.524912       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
W0412 13:10:26.636587       1 loadbalancer.go:1708] LoadBalancerSourceRanges is ignored
I0412 13:10:27.051463       1 event.go:294] "Event occurred" object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" message="Ensured load balancer"
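
The alternation between failed and successful syncs is easier to see as a timeline of event reasons. The sketch below is illustrative only: it assumes the klog line layout visible above (severity letter, date, time, process ID, source location, message) and uses a hypothetical `event_timeline` helper. The sample lines are copied from the log above.

```python
import re

# Assumed klog header layout, as seen in the log above:
#   <severity><mmdd> <hh:mm:ss.uuuuuu> <pid> <file>:<line>] <message>
KLOG_RE = re.compile(
    r'^(?P<severity>[IWEF])\d{4} (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ '
    r'(?P<source>[^\]]+)\] (?P<message>.*)$'
)
REASON_RE = re.compile(r'reason="(?P<reason>\w+)"')


def event_timeline(lines):
    """Return (time, severity, reason) for every line that records an event."""
    timeline = []
    for line in lines:
        header = KLOG_RE.match(line)
        if header is None:
            continue  # not a klog line
        reason = REASON_RE.search(header["message"])
        if reason:
            timeline.append(
                (header["time"], header["severity"], reason["reason"])
            )
    return timeline


# Three lines copied from the CCM log above.
SAMPLE = [
    'E0412 13:10:02.936665       1 controller.go:310] error processing service '
    'lb-test-ns/lb-test-svc (will retry): failed to ensure load balancer: '
    'load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, '
    'current provisioning status: PENDING_CREATE',
    'I0412 13:10:02.937889       1 event.go:294] "Event occurred" '
    'object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" '
    'apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" '
    'message="Error syncing load balancer: failed to ensure load balancer: '
    'load balancer 8d001d70-e891-4379-a850-2335819aa7cd is not ACTIVE, '
    'current provisioning status: PENDING_CREATE"',
    'I0412 13:10:21.015741       1 event.go:294] "Event occurred" '
    'object="lb-test-ns/lb-test-svc" fieldPath="" kind="Service" '
    'apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" '
    'message="Ensured load balancer"',
]

timeline = event_timeline(SAMPLE)
for entry in timeline:
    print(*entry)
```

The `controller.go` line is skipped because it carries no `reason=` field; only the `event.go` lines contribute to the timeline.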

Comment 2 Matthew Booth 2022-04-19 16:10:30 UTC
I can reproduce this using the Octavia backend. This is an RBAC issue, not an LB backend issue.

Comment 3 ShiftStack Bugwatcher 2022-04-20 07:04:24 UTC
Removing the Triaged keyword because:
* the QE automation assessment (flag qe_test_coverage) is missing

Comment 4 Matthew Booth 2022-04-22 10:24:01 UTC
I've created 2 PRs to address this bug:

The actual fix is a permission fix in CCCMO: https://github.com/openshift/cluster-cloud-controller-manager-operator/pull/184. This adds the required permissions for CPO to annotate the service object.
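
To illustrate why a missing verb produces the "forbidden" error, here is a simplified model of how RBAC rules are matched. RBAC is purely additive: a request is allowed only if some rule covers its API group, resource, and verb. The rule contents below are hypothetical and are not taken from PR 184; `PolicyRule` and `allows` are illustrative names, and the model ignores namespaces, resource names, and subresources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    """Simplified stand-in for an RBAC rule (apiGroups, resources, verbs)."""
    api_groups: tuple
    resources: tuple
    verbs: tuple


def _covers(values, wanted):
    # "*" is the RBAC wildcard; otherwise the value must be listed explicitly.
    return "*" in values or wanted in values


def allows(rules, api_group, resource, verb):
    """Return True if any rule grants the verb on the resource."""
    return any(
        _covers(rule.api_groups, api_group)
        and _covers(rule.resources, resource)
        and _covers(rule.verbs, verb)
        for rule in rules
    )


# Hypothetical rule sets: the real ClusterRole contents are not in this report.
before = [PolicyRule(("",), ("services",), ("get", "list", "watch"))]
after = before + [PolicyRule(("",), ("services",), ("patch",))]

print("before fix, patch allowed:", allows(before, "", "services", "patch"))
print("after fix, patch allowed:", allows(after, "", "services", "patch"))
```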

A secondary issue we noticed is that CPO would succeed on the second attempt, but without adding the expected annotation. This should be fixed by https://github.com/kubernetes/kubernetes/pull/109601.
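
One plausible reading of the secondary issue, going by the title of the upstream PR ("Prevent dirty service object leaking between reconciles"), is that a failed reconcile leaves its in-memory changes on a shared object, so the retry sees nothing left to patch. The following is a toy model of that pattern, not the actual Kubernetes or CPO code. `FakeAPI`, `reconcile`, `run_twice`, and the annotation key are all hypothetical.

```python
import copy


class Forbidden(Exception):
    """Raised by the fake API when a patch is rejected."""


class FakeAPI:
    """Hypothetical stand-in for the API server; it rejects every patch."""

    def __init__(self):
        self.patch_attempts = 0

    def patch(self, changes):
        self.patch_attempts += 1
        raise Forbidden('cannot patch resource "services"')


def reconcile(cached, api, isolate):
    # With isolate=True the cached object is never mutated directly.
    svc = copy.deepcopy(cached) if isolate else cached
    before = dict(svc["annotations"])
    # Hypothetical annotation key; the value is the LB ID from the report.
    svc["annotations"]["example.test/load-balancer-id"] = (
        "8d001d70-e891-4379-a850-2335819aa7cd"
    )
    changes = {
        key: value
        for key, value in svc["annotations"].items()
        if before.get(key) != value
    }
    if changes:
        api.patch(changes)  # raises Forbidden in this model
    return "ensured"


def run_twice(isolate):
    """Reconcile the same cached object twice and record each outcome."""
    cached = {"annotations": {}}
    api = FakeAPI()
    results = []
    for _ in range(2):
        try:
            results.append(reconcile(cached, api, isolate))
        except Forbidden:
            results.append("forbidden")
    return results, api.patch_attempts


print("shared object:", run_twice(isolate=False))
print("deep copy:    ", run_twice(isolate=True))
```

In the shared-object case the first attempt fails but leaves the annotation on the cached object, so the second attempt computes an empty diff, skips the patch, and reports success without the annotation ever reaching the API. Working on a copy makes every attempt fail visibly until the permission is granted.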

Comment 7 Jon Uriarte 2022-05-06 09:55:42 UTC
Verified in 4.11.0-0.nightly-2022-05-05-015322 on top of OSP 16.1.7.

Followed the reproducer steps to verify this BZ. The RBAC issues are fixed, but the svc is still
not reachable due to bugs in the OSP Octavia OVN component; this is tracked in bug 2082496.

Comment 9 errata-xmlrpc 2022-08-10 11:06:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

