Bug 1831780

Summary: Libvirt provider does not set phase
Product: OpenShift Container Platform
Reporter: David Benoit <dbenoit>
Component: Cloud Compute
Assignee: Prashanth Sundararaman <psundara>
Cloud Compute sub component: Other Providers
QA Contact: Jianwei Hou <jhou>
Status: CLOSED ERRATA
Docs Contact:
Severity: low
Priority: unspecified
CC: agarcial, mgugino, psundara, wking
Version: 4.3.z   
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-10-27 15:58:53 UTC
Type: Bug
Attachments:
must-gather (flags: none)

Description David Benoit 2020-05-05 15:41:08 UTC
Description of problem:
When performing a libvirt IPI install of OpenShift 4.3, the libvirt provider does not set a phase on the resulting machines.


Version-Release number of selected component (if applicable):
4.3.z

How reproducible:
Always

Steps to Reproduce:
1. Perform a libvirt IPI install
2. Run: oc get machines -n openshift-machine-api


Actual results:

NAME                                PHASE   TYPE   REGION   ZONE   AGE
test-ocp-9nngh-master-0                                            2d20h
test-ocp-9nngh-master-1                                            2d20h
test-ocp-9nngh-master-2                                            2d20h
test-ocp-9nngh-worker-0-g2vjg                                      2d20h
test-ocp-9nngh-worker-0-jt779                                      2d20h


Expected results:
Phase is set by the machine controller.

Additional info:
Upon scaling machineset replicas, the cluster begins firing MachineWithNoRunningPhase alerts which never resolve.  These alerts do not seem to trigger without scaling machinesets.
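
For anyone who wants to script the check from step 2, here is a minimal Python sketch that parses the column-aligned output of `oc get machines` and lists machines whose PHASE is anything other than "Running" (including empty). It assumes the alert keys on a non-Running phase, as its name suggests. The sample rows and the helper names (parse_machines, not_running) are hypothetical and only illustrate the shape of the output shown above.

```python
import re

# Hypothetical sample shaped like `oc get machines` output; the padding
# mimics the CLI's aligned columns. Names and values are illustrative only.
ROW = "{:<12}{:<10}{}"
SAMPLE = "\n".join(
    ROW.format(*cells)
    for cells in [
        ("NAME", "PHASE", "AGE"),
        ("master-0", "", "2d20h"),
        ("worker-0", "Running", "5m"),
    ]
)


def parse_machines(table):
    """Parse fixed-width table text into one dict per data row."""
    lines = [ln for ln in table.splitlines() if ln.strip()]
    header = lines[0]
    # Each column starts where its header word starts.
    cols = [(m.group(), m.start()) for m in re.finditer(r"\S+", header)]
    rows = []
    for line in lines[1:]:
        row = {}
        for i, (name, start) in enumerate(cols):
            # A column ends where the next one begins; the last runs to EOL.
            end = cols[i + 1][1] if i + 1 < len(cols) else None
            row[name] = line[start:end].strip()
        rows.append(row)
    return rows


def not_running(rows):
    """Return names of machines whose phase is missing or not Running."""
    return [r["NAME"] for r in rows if r.get("PHASE", "") != "Running"]


print(not_running(parse_machines(SAMPLE)))  # prints ['master-0']
```

Run against the output in "Actual results", every machine would be reported, since the PHASE column is empty for all five.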

Comment 1 Prashanth Sundararaman 2020-05-05 18:49:22 UTC
Created attachment 1685385 [details]
must-gather

Comment 2 Prashanth Sundararaman 2020-05-05 18:57:08 UTC
Based on a Slack conversation with the cloud team, it looks like the libvirt actuator is not using the latest version of the machine controller, which could be causing this.
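
To illustrate what "setting a phase" means here, below is a rough Python sketch of how a machine controller might derive the phase from a machine's observed state. This is an assumption-laden simplification, not the controller's actual code: the Machine class and its fields are hypothetical stand-ins, and the order of the checks is my guess. The phase names (Provisioning, Provisioned, Running, Deleting, Failed) are the ones I understand the OpenShift machine API to use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    """Hypothetical, stripped-down stand-in for a Machine object."""
    name: str
    deleting: bool = False           # a deletion timestamp is set
    instance_exists: bool = False    # the provider reports a backing VM
    node_ref: Optional[str] = None   # name of the linked Node, if any
    error: Optional[str] = None      # terminal error, if any


def derive_phase(m: Machine) -> str:
    """Map observed state to a phase string (assumed ordering of checks)."""
    if m.deleting:
        return "Deleting"
    if m.error:
        return "Failed"
    if not m.instance_exists:
        return "Provisioning"
    if m.node_ref is None:
        # The VM exists but has not yet joined the cluster as a Node.
        return "Provisioned"
    return "Running"


print(derive_phase(Machine("worker-0", instance_exists=True, node_ref="node-a")))
```

If the actuator is built against a controller version that predates this kind of logic, the phase would simply never be written, which would match the empty PHASE column in the report.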

Comment 3 Alberto 2020-07-01 10:13:37 UTC
Hi zeenix, I'm tagging this with upcomingSprint. Please feel free to drop it if you're planning to tackle this sooner.

Comment 4 Prashanth Sundararaman 2020-07-08 12:49:55 UTC
Should be fixed by https://github.com/openshift/cluster-api-provider-libvirt/pull/198. I was able to test the changes, and I see the phase set.

Comment 7 Jianwei Hou 2020-08-20 02:09:14 UTC
Thanks, I think this is verified based on the above comments.

Comment 9 errata-xmlrpc 2020-10-27 15:58:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196