Bug 2077933

Summary: Kube controller manager does not handle new configurations available in the cloud provider OpenStack
Product: OpenShift Container Platform
Reporter: Maysa Macedo <mdemaced>
Component: Cloud Compute
Assignee: Matthew Booth <mbooth>
Cloud Compute sub component: OpenStack Provider
QA Contact: rlobillo
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: high
CC: aos-bugs, emacchi, m.andre, mfedosin, pprinett, rlobillo, stephenfin
Version: 4.11
Keywords: Triaged
Target Milestone: ---
Target Release: 4.12.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-01-17 19:48:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Maysa Macedo 2022-04-22 15:49:00 UTC
Description of problem:

When a configuration option recently added to cloud-provider-openstack [1], e.g. "enabled = true" in the [LoadBalancer] section, is included in the user-managed config map, the kube-controller-manager pods restart repeatedly because they do not recognize the new option and fail with the following traceback:

I0422 15:39:28.989335       1 leaderelection.go:258] successfully acquired lease kube-system/kube-controller-manager               
I0422 15:39:28.989780       1 event.go:294] "Event occurred" object="kube-system/kube-controller-manager" kind="ConfigMap" apiVersion="v1" type="Normal" reason="LeaderElection" message="ostest-28rzc-master-0_1b6b3575-ceca-452c-84cb-bccfaabde936 became leader"
I0422 15:39:28.989881       1 event.go:294] "Event occurred" object="kube-system/kube-controller-manager" kind="Lease" apiVersion="coordination.k8s.io/v1" type="Normal" reason="LeaderElection" message="ostest-28rzc-master-0_1b6b3575-ceca-452c-84cb-bccfaabde936 became leader"
F0422 15:39:34.692089       1 controllermanager.go:258] error building controller context: cloud provider could not be initialized: could not init cloud provider "openstack": warning:
can't store data at section "LoadBalancer", variable "enabled"                                                                                                                                                                                
goroutine 295 [running]:                                                                                                                                                                                                                     
k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0x1)                                                                                                                                                                                          
        /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1038 +0x8a                                                                                                                            
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x7de36e0, 0x3, 0x0, 0xc000259500, 0x0, {0x63defce, 0x1}, 0xc0013ce090, 0x0)           
        /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:987 +0x5fd                                                                                                                            
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).printf(0x0, 0x0, 0x0, {0x0, 0x0}, {0x4ad7b94, 0x25}, {0xc0013ce090, 0x1, 0x1})                                                                                                           
        /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:753 +0x1c5                                                                                                                            
k8s.io/kubernetes/vendor/k8s.io/klog/v2.Fatalf(...)                                                                        
        /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1532                                                         
k8s.io/kubernetes/cmd/kube-controller-manager/app.Run.func2({0x520dbb0, 0xc000ec89c0}, 0xc00037eab0, 0x4c57fd0)                                                                                                                              
        /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:258 +0x1c8                                                                                                     
k8s.io/kubernetes/cmd/kube-controller-manager/app.Run.func4({0x520dbb0, 0xc000ec89c0})                                                    
        /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:320 +0xe3                                                                                                       
created by k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run                   
        /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:211 +0x154


[1] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/openstack-cloud-controller-manager/using-openstack-cloud-controller-manager.md#load-balancer
Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Install a 4.11 cluster
2. Add a new option to the user-facing config map with "oc edit cm cloud-provider-config -n openshift-config"
3.

Actual results:


Expected results:


Additional info:

Comment 2 Martin André 2022-04-26 06:49:13 UTC
*** Bug 2078567 has been marked as a duplicate of this bug. ***

Comment 3 Martin André 2022-04-26 06:51:28 UTC
Adding comment from https://bugzilla.redhat.com/show_bug.cgi?id=2078567 here for additional context:

As the OpenStack platform is migrating to the external cloud provider, the user-managed cloud.conf in `openshift-config/cloud-provider-config` is now the source for both the in-tree cloud.conf (in `openshift-config-managed/kube-cloud-config`) and the CCM cloud.conf (in `openshift-cloud-controller-manager/cloud-conf`).

However, even after switching to CCM, we need to keep initializing the in-tree cloud provider in order to be able to detach volumes created with the legacy cloud provider.

Moreover, the legacy cloud provider errors out when it finds an option it does not understand, causing KCM to enter CrashLoopBackOff.

We can't legitimately ask customers to limit themselves to the intersection of in-tree and external cloud provider options. We should make the legacy cloud provider option parsing more tolerant and ignore unknown options.
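
To illustrate the idea of tolerant parsing, here is a minimal sketch in Python. It is not the legacy provider's actual code (which is written in Go); the `KNOWN_OPTIONS` table, the `parse_tolerant` function, and the `SAMPLE` config are all hypothetical names chosen for this example, and the list of known options is an assumed subset. The sketch keeps the options it knows about and downgrades everything else to a warning instead of failing:

```python
import configparser

# Assumption: a small, illustrative subset of options a legacy parser
# might understand. The real provider's option set is not reproduced here.
KNOWN_OPTIONS = {
    "Global": {"secret-name", "secret-namespace", "region", "ca-file"},
    "LoadBalancer": {"floating-network-id", "lb-provider", "lb-method"},
}

# Hypothetical input mixing known options with ones only a newer
# provider would understand.
SAMPLE = """\
[Global]
region = regionOne
[LoadBalancer]
lb-provider = ovn
enabled = true
[LoadBalancerClass "class1"]
floating-network-id = 17521769-5c47-4869-8296-4c6668d4e2c8
"""


def parse_tolerant(text):
    """Return (config, warnings); unknown sections and options never raise."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(text)

    config, warnings = {}, []
    for section in parser.sections():
        # 'LoadBalancerClass "class1"' -> 'LoadBalancerClass'
        base = section.split()[0]
        known = KNOWN_OPTIONS.get(base)
        if known is None:
            # Unknown section: record one warning and skip its options.
            warnings.append(f"can't store data at section \"{base}\"")
            continue
        for name, value in parser.items(section):
            if name in known:
                config.setdefault(base, {})[name] = value
            else:
                # Unknown option in a known section: warn, do not fail.
                warnings.append(
                    f"can't store data at section \"{base}\", variable \"{name}\""
                )
    return config, warnings


if __name__ == "__main__":
    parsed, notes = parse_tolerant(SAMPLE)
    print(parsed)
    for note in notes:
        print("warning:", note)
```

The caller can then log the collected warnings at a non-fatal level and continue initializing with the options that were understood, which is the behavior proposed above.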

Comment 6 rlobillo 2022-10-25 15:40:45 UTC
Verified on 4.12.0-0.nightly-2022-10-05-053337 on top of RHOS-17.0-RHEL-9-20220909.n.0

$ oc get cm/cloud-provider-config -n openshift-config -o yaml
apiVersion: v1
data:
  ca-bundle.pem: <hidden>
  config: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    region = regionOne
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    floating-network-id=2e03985d-1eb4-4478-83b5-fdba6bda8ee0
    lb-provider=ovn
    lb-method=SOURCE_IP_PORT
    [LoadBalancerClass "class1"]
    floating-network-id=17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet-id=3535d323-bb67-4e9a-92fe-fe85e8489245
    [LoadBalancerClass "class2"]
    floating-network-id=17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet-tags="class2tag"
    [LoadBalancerClass "class3"]
    floating-network-id=17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet="external_subnet3"
kind: ConfigMap
metadata:
  creationTimestamp: "2022-10-06T14:16:55Z"
  name: cloud-provider-config
  namespace: openshift-config
  resourceVersion: "10815728"
  uid: 21a208be-0925-4816-88ff-e013f3a99c9d


$ oc get cm/kube-cloud-config -n openshift-config-managed -o yaml
apiVersion: v1
data:
  ca-bundle.pem: <hidden>
  cloud.conf: |
    [Global]
    secret-name = openstack-credentials
    secret-namespace = kube-system
    region = regionOne
    ca-file = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    [LoadBalancer]
    floating-network-id=2e03985d-1eb4-4478-83b5-fdba6bda8ee0
    lb-provider=ovn
    lb-method=SOURCE_IP_PORT
    [LoadBalancerClass "class1"]
    floating-network-id=17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet-id=3535d323-bb67-4e9a-92fe-fe85e8489245
    [LoadBalancerClass "class2"]
    floating-network-id=17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet-tags="class2tag"
    [LoadBalancerClass "class3"]
    floating-network-id=17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet="external_subnet3"
kind: ConfigMap
metadata:
  creationTimestamp: "2022-10-06T14:22:22Z"
  name: kube-cloud-config
  namespace: openshift-config-managed
  resourceVersion: "10815730"
  uid: 7a691240-c910-442d-9829-8875b1a0a01d


$ oc get cm/cloud-conf -n openshift-cloud-controller-manager -o yaml
apiVersion: v1
data:
  ca-bundle.pem: <hidden>
  cloud.conf: |
    [Global]
    region      = regionOne
    ca-file     = /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    use-clouds  = true
    clouds-file = /etc/openstack/secret/clouds.yaml
    cloud       = openstack

    [LoadBalancer]
    floating-network-id = 2e03985d-1eb4-4478-83b5-fdba6bda8ee0
    lb-provider         = ovn
    lb-method           = SOURCE_IP_PORT
    use-octavia         = true

    [LoadBalancerClass "class1"]
    floating-network-id = 17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet-id  = 3535d323-bb67-4e9a-92fe-fe85e8489245

    [LoadBalancerClass "class2"]
    floating-network-id  = 17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet-tags = class2tag

    [LoadBalancerClass "class3"]
    floating-network-id = 17521769-5c47-4869-8296-4c6668d4e2c8
    floating-subnet     = external_subnet3
kind: ConfigMap
metadata:
  creationTimestamp: "2022-10-06T14:20:24Z"
  name: cloud-conf
  namespace: openshift-cloud-controller-manager
  resourceVersion: "10815729"
  uid: efe71473-5362-4e61-bf4e-554929be1e60



$ oc get pods -n openshift-kube-controller-manager -l app=kube-controller-manager
NAME                                            READY   STATUS    RESTARTS      AGE
kube-controller-manager-ostest-qbr7x-master-0   4/4     Running   4             34m
kube-controller-manager-ostest-qbr7x-master-1   4/4     Running   4             36m
kube-controller-manager-ostest-qbr7x-master-2   4/4     Running   1 (23m ago)   24m


$  oc logs kube-controller-manager-ostest-qbr7x-master-2 -n openshift-kube-controller-manager 
[...]
W1025 15:14:30.440858       1 openstack.go:364] Non-fatal error parsing OpenStack cloud config. This may happen when passing config directives exclusive to OpenStack CCM to the legacy cloud provider. Legacy cloud provider has correctly parsed all directives it knows about
: warnings:
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"
can't store data at section "LoadBalancerClass"

Comment 9 errata-xmlrpc 2023-01-17 19:48:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399