Bug 2005647

Summary: UPI-on-Alicloud got the master-0 node NotReady, so that the installation failed

Product: OpenShift Container Platform
Reporter: Jianli Wei <jiwei>
Component: Installer
Assignee: aos-install
Installer sub component: openshift-installer
QA Contact: Gaoyun Pei <gpei>
Status: CLOSED DUPLICATE
Docs Contact:
Severity: medium
Priority: unspecified
CC: anbhat, aos-bugs, bpickard, gpei, harpatil, mfojtik, mstaeble, rphillips, xxia
Version: 4.9
Keywords: Reopened
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-01-17 16:34:34 UTC
Type: Bug
Attachments:
pod 'kube-controller-manager-jiwei-ii-master-0' (no flags)

Description Jianli Wei 2021-09-19 04:54:55 UTC
Created attachment 1824328 [details]
pod 'kube-controller-manager-jiwei-ii-master-0'

Version:

[root@jiwei-ii-rhel7-bastion working-dir]# openshift-install version
openshift-install 4.9.0-rc.1
built from commit 6b4296b0df51096b4ff03e4ec4aeedeead3425ab
release image quay.io/openshift-release-dev/ocp-release@sha256:2cce76f4dc2400d3c374f76ac0aa4e481579fce293e732f0b27775b7218f2c8d
release architecture amd64
[root@jiwei-ii-rhel7-bastion working-dir]# 

Platform: alicloud

Please specify:
* UPI (semi-manual installation on customized infrastructure)

What happened?
The master-0 node stays NotReady, so the installation fails.

Anything else we need to know?

Sorry, we cannot leave the cluster running for days, so please refer to the attached logs first. If you have time within the coming day and want a running environment, let me know and I'll deploy another cluster to reproduce the issue for you. Thanks! (See the diagnostic sketch after the node listing below.)

[root@jiwei-ii-rhel7-bastion working-dir]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          20h     Unable to apply 4.9.0-rc.1: some cluster operators have not yet rolled out
[root@jiwei-ii-rhel7-bastion working-dir]# oc get nodes -o wide
NAME                STATUS     ROLES    AGE   VERSION                INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
jiwei-ii-master-0   NotReady   master   20h   v1.22.0-rc.0+75ee307   172.16.1.223   <none>        Red Hat Enterprise Linux CoreOS 49.84.202109071840-0 (Ootpa)   4.18.0-305.17.1.el8_4.x86_64   cri-o://1.22.0-68.rhaos4.9.git011c10a.el8
jiwei-ii-master-1   Ready      master   20h   v1.22.0-rc.0+75ee307   172.16.1.224   <none>        Red Hat Enterprise Linux CoreOS 49.84.202109071840-0 (Ootpa)   4.18.0-305.17.1.el8_4.x86_64   cri-o://1.22.0-68.rhaos4.9.git011c10a.el8
jiwei-ii-master-2   Ready      master   20h   v1.22.0-rc.0+75ee307   172.16.1.225   <none>        Red Hat Enterprise Linux CoreOS 49.84.202109071840-0 (Ootpa)   4.18.0-305.17.1.el8_4.x86_64   cri-o://1.22.0-68.rhaos4.9.git011c10a.el8
jiwei-ii-worker-0   NotReady   worker   20h   v1.22.0-rc.0+75ee307   172.16.1.226   <none>        Red Hat Enterprise Linux CoreOS 49.84.202109071840-0 (Ootpa)   4.18.0-305.17.1.el8_4.x86_64   cri-o://1.22.0-68.rhaos4.9.git011c10a.el8
jiwei-ii-worker-1   NotReady   worker   20h   v1.22.0-rc.0+75ee307   172.16.1.227   <none>        Red Hat Enterprise Linux CoreOS 49.84.202109071840-0 (Ootpa)   4.18.0-305.17.1.el8_4.x86_64   cri-o://1.22.0-68.rhaos4.9.git011c10a.el8
[root@jiwei-ii-rhel7-bastion working-dir]# 
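For a future occurrence, a minimal sketch of the node-level diagnostics that could be captured before tearing the cluster down (the node name is taken from the listing above; these are standard oc commands, not the exact steps used for the attached logs):

# Node conditions and recent events for the NotReady master
oc describe node jiwei-ii-master-0

# Kubelet journal from the affected node
oc adm node-logs jiwei-ii-master-0 -u kubelet

# Full diagnostics bundle to attach to the bug
oc adm must-gather --dest-dir=./must-gather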

What did you expect to happen?
The UPI installation should succeed and all nodes should be Ready.

How to reproduce it (as minimally and precisely as possible)?
Not quite sure for now, although we've hit the issue twice so far.

Comment 2 Ryan Phillips 2021-10-12 19:27:45 UTC
Should be fixed in rc7 or later.
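
A minimal way to verify would be to retry the UPI flow with an rc.7-or-later payload; a sketch (the release tag below is an assumption, substitute the actual candidate build):

# Inspect the newer release payload (tag assumed)
oc adm release info quay.io/openshift-release-dev/ocp-release:4.9.0-rc.7-x86_64

# Extract the matching installer binary and confirm its version before re-running the install
oc adm release extract --command=openshift-install quay.io/openshift-release-dev/ocp-release:4.9.0-rc.7-x86_64
./openshift-install version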

*** This bug has been marked as a duplicate of bug 2011513 ***

Comment 18 Matthew Staebler 2022-01-17 16:34:34 UTC

*** This bug has been marked as a duplicate of bug 2035757 ***