Bug 1990425

Summary: [RHV-IPI] Unable to check if machine exists post successful installation
Product: OpenShift Container Platform Reporter: Apoorva Jagtap <apjagtap>
Component: Cloud Compute Assignee: OCP on RHV Team <ocprhvteam>
Cloud Compute sub component: oVirt Provider QA Contact: Michael Burman <mburman>
Status: CLOSED DUPLICATE Docs Contact:
Severity: high    
Priority: unspecified CC: eslutsky
Version: 4.8   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-09 11:08:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Apoorva Jagtap 2021-08-05 11:27:01 UTC
Description of problem:

[*] After a successful IPI installation on RHV, the machines in the `openshift-machine-api` namespace do not reflect their expected state, even though all the nodes report a Ready state.

Version-Release number of selected component (if applicable):
v4.8.2

Actual results:

[*] The machines either show no phase at all (the master machines) or remain in the `Provisioned` phase (two of the three worker machines):
~~~
$ oc get machine -n openshift-machine-api -owide
NAME                          PHASE         TYPE   REGION   ZONE   AGE     NODE                          PROVIDERID                                     STATE
tocpomni-sczzb-master-0                                            3d22h                                                                                
tocpomni-sczzb-master-1                                            3d22h                                                                                
tocpomni-sczzb-master-2                                            3d22h                                                                                
tocpomni-sczzb-worker-8rxkf   Provisioned                          3d22h   tocpomni-sczzb-worker-8rxkf   ovirt://xxxx   up
tocpomni-sczzb-worker-c6m75   Running                              3d22h   tocpomni-sczzb-worker-c6m75   ovirt://xxxx   up
tocpomni-sczzb-worker-gps26   Provisioned                          3d22h   tocpomni-sczzb-worker-gps26   ovirt://xxx   up
~~~
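
A check like the one above can be scripted. The following is a minimal sketch, assuming the input is the parsed output of `oc get machine -n openshift-machine-api -o json`; the function name `machines_not_running` is hypothetical and not part of any OpenShift tooling.

```python
from typing import List, Tuple


def machines_not_running(machine_list: dict) -> List[Tuple[str, str]]:
    """Return (name, phase) for every Machine whose phase is not 'Running'.

    `machine_list` is assumed to be the parsed JSON of
    `oc get machine -n openshift-machine-api -o json`.
    A missing or empty phase is reported as an empty string.
    """
    result = []
    for item in machine_list.get("items", []):
        name = item["metadata"]["name"]
        # status or status.phase may be absent or empty, as seen on the masters
        phase = (item.get("status") or {}).get("phase") or ""
        if phase != "Running":
            result.append((name, phase))
    return result


# Usage (hypothetical file name):
#   oc get machine -n openshift-machine-api -o json > machines.json
#   import json; print(machines_not_running(json.load(open("machines.json"))))
```

For the listing above, such a check would be expected to flag the three master machines (empty phase) and the two worker machines that are still `Provisioned`.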

[*] All the nodes report Ready and have IPs assigned:
~~~
$ omg get nodes 
NAME                         STATUS  ROLES   AGE  VERSION
tocpomni-sczzb-master-0      Ready   master  36m  v1.21.1+051ac4f
tocpomni-sczzb-master-1      Ready   master  36m  v1.21.1+051ac4f
tocpomni-sczzb-master-2      Ready   master  36m  v1.21.1+051ac4f
tocpomni-sczzb-worker-8rxkf  Ready   worker  24m  v1.21.1+051ac4f
tocpomni-sczzb-worker-c6m75  Ready   worker  26m  v1.21.1+051ac4f
tocpomni-sczzb-worker-gps26  Ready   worker  27m  v1.21.1+051ac4f
~~~

[*] The latest YAML definition of the master machines reports the following status:
~~~
status:
  conditions:
  - lastTransitionTime: "2021-08-02T19:53:05Z"
    message: |
      Failed to check if machine exists: Failed to parse non-array sso with response <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
      <html><head>
      <title>503 Service Unavailable</title>
      </head><body>
      <h1>Service Unavailable</h1>
      <p>The server is temporarily unable to service your
      request due to maintenance downtime or capacity
      problems. Please try again later.</p>
      </body></html>
    reason: ErrorCheckingProvider
    status: Unknown
    type: InstanceExists
  lastUpdated: "2021-08-02T19:53:05Z"
  phase: ""
~~~
- The RHV console is up and operational as expected.
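
The status message above suggests that the provider expected a structured SSO response but received an HTML 503 page instead. The following is a minimal sketch of that failure mode, not the oVirt provider's actual code; the names `parse_sso_response` and `SSOResponseError` are hypothetical, and it assumes the SSO endpoint normally returns a JSON object.

```python
import json


class SSOResponseError(Exception):
    """Raised when an SSO response cannot be used (hypothetical helper type)."""


def parse_sso_response(status_code: int, body: str) -> dict:
    """Parse an SSO response body, failing clearly on non-JSON content.

    Assumption: a healthy SSO endpoint answers HTTP 200 with a JSON object.
    An HTML error page, such as the 503 page in the status above, is rejected.
    """
    if status_code != 200:
        raise SSOResponseError("SSO endpoint returned HTTP %d" % status_code)
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise SSOResponseError("SSO endpoint returned a non-JSON body") from exc
    if not isinstance(data, dict):
        raise SSOResponseError("SSO response is not a JSON object")
    return data
```

Checking the HTTP status before parsing keeps the error short, instead of embedding the whole HTML page in the machine condition as seen in this report.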

Expected results:

[*] After a successful installation, all the machines should report the expected status.


Additional info:

[*] Initially, we observed some "connection refused" messages in the machine's YAML. However, the connection from the machine-api-controller pod appears to be working as expected.

Comment 6 Evgeny Slutsky 2021-08-09 11:08:13 UTC

*** This bug has been marked as a duplicate of bug 1989676 ***