Bug 1934798

Summary: machineset-controller stuck in CrashLoopBackOff after upgrade to 4.7.0
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Cloud Compute Assignee: Michael Gugino <mgugino>
Cloud Compute sub component: Other Providers QA Contact: Milind Yadav <miyadav>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: lmohanty, mgugino, scuppett, sreber, tmicheli, wking
Version: 4.7   
Target Milestone: ---   
Target Release: 4.7.z   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-03-16 08:42:49 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On: 1934216    
Bug Blocks:    

Comment 3 Milind Yadav 2021-03-08 12:31:33 UTC
Validated.

Steps:

Nightly without the fix:

[miyadav@miyadav debug]$ oc logs  machine-api-controllers-7fcbdc6dc9-2gqvn -c machineset-controller | grep "Registering Components"
2021/03/08 11:06:53 Registering Components.
2021/03/08 11:07:10 Registering Components.


Upgraded the nightly to an image that contains the fix:

[miyadav@miyadav debug]$ oc get pods
NAME                                           READY   STATUS    RESTARTS   AGE
cluster-autoscaler-operator-77d84d5b48-5jbsp   2/2     Running   0          72s
cluster-baremetal-operator-677f489878-qxc77    1/1     Running   0          71s
machine-api-controllers-64f7444c86-gg2hv       7/7     Running   0          18m
machine-api-operator-6b5df9fcf-fm848           2/2     Running   0          20m

[miyadav@miyadav debug]$ oc logs -f machine-api-controllers-64f7444c86-gg2hv | grep "Registering Components"
error: a container name must be specified for pod machine-api-controllers-64f7444c86-gg2hv, choose one of: [machineset-controller machine-controller nodelink-controller machine-healthcheck-controller kube-rbac-proxy-machineset-mtrc kube-rbac-proxy-machine-mtrc kube-rbac-proxy-mhc-mtrc]
[miyadav@miyadav debug]$ oc logs -f machine-api-controllers-64f7444c86-gg2hv -c machineset-controller | grep "Registering Components"
2021/03/08 12:04:02 Registering Components.
2021/03/08 12:04:04 Registering Components.
^C
[miyadav@miyadav debug]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-03-06-183610   True        False         8m28s   Cluster version is 4.7.0-0.nightly-2021-03-06-183610
[miyadav@miyadav debug]$ 


Additional info:

With the fix, the two "Registering Components" log entries are less than 10 seconds apart (2 seconds, versus 17 seconds on the nightly without the fix).
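
The gap between consecutive "Registering Components" entries can also be computed rather than read off by eye. The following is a minimal Python sketch and was not part of the original verification. The helper name `registration_gaps` is hypothetical, and it assumes every matching log line starts with a `YYYY/MM/DD HH:MM:SS` timestamp, as in the output above.

```python
from datetime import datetime


def registration_gaps(log_lines, marker="Registering Components"):
    """Return the seconds between consecutive log lines that contain marker.

    Assumes each matching line starts with a 19-character
    'YYYY/MM/DD HH:MM:SS' timestamp (hypothetical helper, not part of oc).
    """
    stamps = [
        datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")
        for line in log_lines
        if marker in line
    ]
    # Pair each timestamp with the next one and take the difference.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Log lines copied from the two runs in this comment.
without_fix = [
    "2021/03/08 11:06:53 Registering Components.",
    "2021/03/08 11:07:10 Registering Components.",
]
with_fix = [
    "2021/03/08 12:04:02 Registering Components.",
    "2021/03/08 12:04:04 Registering Components.",
]

print(registration_gaps(without_fix))  # [17.0]
print(registration_gaps(with_fix))     # [2.0]
```

Saved output from `oc logs <pod> -c machineset-controller` can be passed in as a list of lines; every returned gap should be below 10 on a build that contains the fix.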

Moved to VERIFIED.

Comment 5 errata-xmlrpc 2021-03-16 08:42:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.2 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0749