Bug 1840018

Summary: oVirt IPI leaves worker machines in "image_locked" state
Product: OpenShift Container Platform
Reporter: Ramon Gordillo <ramon.gordillo>
Component: Installer
Assignee: Douglas Schilling Landgraf <dougsland>
Installer sub component: OpenShift on RHV
QA Contact: Lucie Leistnerova <lleistne>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: medium
Priority: medium
CC: dougsland, hpopal, rgolan, rgordill
Version: 4.5
Keywords: UpcomingSprint
Target Milestone: ---
Target Release: 4.5.z
Hardware: x86_64
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1841386
Environment:
Last Closed: 2020-07-09 12:41:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1841386
Bug Blocks:
Attachments:
Description                                               Flags
Machine screen                                            none
master.yaml                                               none
worker.yaml                                               none
machine-api-operator logs (with prefix per container)     none
machine-api-controller logs (with prefix per container)   none

Description Ramon Gordillo 2020-05-26 09:03:31 UTC
Created attachment 1692186
Machine screen

Description of problem:

After a fresh installation, the worker machines stay in the Provisioned phase even though the corresponding nodes are Ready. We also have to approve CSRs manually; otherwise running oc debug node is not allowed.
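
For reference, a minimal sketch of the manual CSR approval mentioned above. It assumes that `oc get csr` prints the condition in the last column of its table output, which is not confirmed for this exact build. The helper name `pending_csrs` is hypothetical; `oc adm certificate approve` is the standard approval command.

```shell
# Print the names of CSRs whose condition (assumed to be the last column) is "Pending".
# Reads `oc get csr` table output on stdin and skips the header row.
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage against a live cluster (needs cluster-admin rights):
#   oc get csr | pending_csrs | xargs --no-run-if-empty oc adm certificate approve
```

Review the list before approving in bulk, since this approves every pending request without checking who issued it.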

When a new machineset is created, the new machines show the same behaviour.

The web console does not show the node corresponding to each machine (screenshot attached), nor the number of pods in the node summary.

The master and worker machine YAMLs from this environment are attached.

Version-Release number of the following components:

OCP: 4.5.0-0.nightly-2020-05-25-043808
RHV: 4.3.9.4-11.el7

How reproducible: Always

Steps to Reproduce:
1. Run an RHV IPI installation (it completes successfully)

Actual results:

oc get nodes
NAME                         STATUS   ROLES    AGE   VERSION
ocp44-9gcdd-master-0         Ready    master   13h   v1.18.2
ocp44-9gcdd-master-1         Ready    master   13h   v1.18.2
ocp44-9gcdd-master-2         Ready    master   13h   v1.18.2
ocp44-9gcdd-worker-0-7mpnk   Ready    worker   13h   v1.18.2
ocp44-9gcdd-worker-0-8p2mp   Ready    worker   13h   v1.18.2
ocp44-9gcdd-worker-0-dwjsq   Ready    worker   13h   v1.18.2


oc get machines -n openshift-machine-api
NAME                         PHASE         TYPE   REGION   ZONE   AGE
ocp44-9gcdd-master-0         Running                              13h
ocp44-9gcdd-master-1         Running                              13h
ocp44-9gcdd-master-2         Running                              13h
ocp44-9gcdd-worker-0-7mpnk   Provisioned                          13h
ocp44-9gcdd-worker-0-8p2mp   Provisioned                          13h
ocp44-9gcdd-worker-0-dwjsq   Provisioned                          13h
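
A small filter for spotting the affected machines in output like the table above. It is a sketch that assumes PHASE is always the second column; the function name `non_running_machines` is hypothetical.

```shell
# Print "<machine> <phase>" for every machine whose PHASE is not "Running".
# Reads `oc get machines` table output on stdin and skips the header row.
non_running_machines() {
  awk 'NR > 1 && $2 != "Running" { print $1, $2 }'
}

# Usage against a live cluster:
#   oc get machines -n openshift-machine-api | non_running_machines
```

Run against the listing above, this would print the three worker machines with the phase Provisioned.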


Expected results:

Machines in the Running phase, with the corresponding node information shown in the console.

Comment 1 Ramon Gordillo 2020-05-26 09:04:30 UTC
Created attachment 1692187
master.yaml

Comment 2 Ramon Gordillo 2020-05-26 09:04:54 UTC
Created attachment 1692188
worker.yaml

Comment 4 Roy Golan 2020-05-28 09:58:24 UTC
Please provide the logs of all the containers in the machine-api-controllers and machine-api-operator pods.

Comment 5 Ramon Gordillo 2020-05-28 10:30:38 UTC
Created attachment 1692996
machine-api-operator logs (with prefix per container)

Comment 6 Ramon Gordillo 2020-05-28 10:31:16 UTC
Created attachment 1692997
machine-api-controller logs (with prefix per container)

Comment 10 Ramon Gordillo 2020-06-16 07:52:43 UTC
The same happens with 4.5-rc1.

Comment 11 Sandro Bonazzola 2020-06-18 06:48:55 UTC
Due to capacity constraints, we will revisit this bug in the upcoming sprint.

Comment 12 Douglas Schilling Landgraf 2020-07-09 12:07:55 UTC
Due to capacity constraints, we will revisit this bug in the upcoming sprint.

Comment 13 Roy Golan 2020-07-09 12:41:45 UTC

*** This bug has been marked as a duplicate of bug 1854787 ***