Bug 1754939
Summary: [upi] [baremetal] Installer doesn't validate dns requirements

Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-installer
Version: 4.2.0
Target Release: 4.3.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Keywords: Reopened
Reporter: Sam Yangsao <syangsao>
Assignee: Abhinav Dahiya <adahiya>
QA Contact: sheng.lao <shlao>
CC: adahiya, aos-bugs, bbennett, danw, deads, gblomqui, mfojtik, nagrawal, wking
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2020-01-23 11:06:54 UTC
Bug Blocks: 1755111 (view as bug list)
Description
Sam Yangsao
2019-09-24 12:04:30 UTC
Scott Dodson (comment 3):

Please gather the output of `oc adm must-gather` and attach that to this bug. Likely root cause is either ingress or auth.

Abhinav Dahiya (comment 4):

Did you add the compute nodes? https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#machine-requirements_installing-bare-metal

Sam Yangsao:

(In reply to Abhinav Dahiya from comment #4)
> Did you add the compute nodes?
> https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#machine-requirements_installing-bare-metal

I did not yet. I can PXE boot them up with the current configs. Since we're at this state, should I start over and also ensure the compute nodes are up (along with the bootstrap/control nodes)? Or should I start over while setting this option to `0` instead?

compute:
- hyperthreading: Disabled
  name: worker
  replicas: 3   <--- Change to `0` and restart a fresh install

Sam Yangsao:

(In reply to Scott Dodson from comment #3)
> Please gather the output of `oc adm must-gather` and attach that to this bug.
>
> Likely root cause is either ingress or auth.

It fails to run; I'm assuming because the cluster is not fully up.

# oc adm must-gather
[must-gather      ] OUT Using must-gather plugin-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9ecf8fce3bc1cf67073b09cbf95006b35bd8715d141ddd9f961dcebf74719b43
[must-gather      ] OUT namespace/openshift-must-gather-tx6rj created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-rvbll created
[must-gather      ] OUT pod for plug-in image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9ecf8fce3bc1cf67073b09cbf95006b35bd8715d141ddd9f961dcebf74719b43 created
[must-gather-m72n4] POD 2019/09/24 16:06:39 Finished successfully with no errors.
[must-gather-m72n4] POD 2019/09/24 16:06:40 Gathering data for ns/openshift-cluster-version...
[must-gather-m72n4] POD 2019/09/24 16:06:40 Collecting resources for namespace "openshift-cluster-version"...
[must-gather-m72n4] POD 2019/09/24 16:06:40 Gathering pod data for namespace "openshift-cluster-version"...
[must-gather-m72n4] POD 2019/09/24 16:06:40 Gathering data for pod "cluster-version-operator-85b98666c8-zgl9x"
[must-gather-m72n4] POD 2019/09/24 16:06:49 Skipping container endpoint collection for pod "cluster-version-operator-85b98666c8-zgl9x" container "cluster-version-operator": No ports
[must-gather-m72n4] POD 2019/09/24 16:07:34 Finished successfully with no errors.
[must-gather-m72n4] POD 2019/09/24 16:07:35 Gathering config.openshift.io resource data...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Gathering kubeapiserver.operator.openshift.io resource data...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Gathering cluster operator resource data...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Gathering related object reference information for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "authentications.operator.openshift.io/cluster" for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "authentications.config.openshift.io/cluster" for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "infrastructures.config.openshift.io/cluster" for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "oauths.config.openshift.io/cluster" for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "namespaces/openshift-config" for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "namespaces/openshift-config-managed" for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "namespaces/openshift-authentication" for ClusterOperator "authentication"...
[must-gather-m72n4] POD 2019/09/24 16:07:37 Found related object "namespaces/openshift-authentication-operator" for ClusterOperator "authentication"...
[must-gather-m72n4] OUT gather logs unavailable: unexpected EOF
[must-gather-m72n4] OUT waiting for gather to complete
[must-gather-m72n4] OUT gather never finished: timed out waiting for the condition
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-rvbll deleted
[must-gather      ] OUT namespace/openshift-must-gather-tx6rj deleted
error: gather never finished for pod must-gather-m72n4: timed out waiting for the condition

Greg Blomquist (comment 8):

Must-gather says CVO reports no ports. Sending to networking.

Reply:

(In reply to Greg Blomquist from comment #8)
> Must gather says CVO reports no ports. Sending to networking.

That error appears to mean "the container does not declare any ports, so I'm not going to test if I can reach those ports". It does not indicate a networking error.

David Eads (comment 10):

Providing some concrete next steps:

1. "FATAL failed to initialize the cluster: Some cluster operators are still updating: authentication, console, image-registry, ingress, monitoring" suggests network-edge; you could copy Dan Mace.
2. We need some data to start digging in. `oc get clusteroperators -oyaml` will be helpful in cases where must-gather failed. `oc get clusteroperators` may also help you determine who to engage with.
3. In cases where you can't collect must-gather, we're going to want a separate bug against `oc` so we can always provide *something*.
4. In cases without must-gather, you're likely to want to give dev access to the cluster to poke around and see what's available. An email to the owner or @neelesh will keep creds out of bugzilla.

Reply:

(In reply to David Eads from comment #10)
> 1. FATAL failed to initialize the cluster: Some cluster operators are still
> updating: authentication, console, image-registry, ingress, monitoring
> suggests network-edge, you could copy dan mace

Bouncing to Routing.
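Stepping back to the bug summary (the installer not validating DNS requirements): a minimal pre-flight probe of the required wildcard ingress record could look like the sketch below. This is illustrative Python only, not installer code (the installer itself is written in Go), and `check_wildcard_dns` is a hypothetical helper, not an existing installer function.

```python
import socket
import uuid

def check_wildcard_dns(cluster_name, base_domain):
    """Return True if the wildcard *.apps.<cluster>.<base> record resolves.

    A random label is probed so that only a true wildcard record (not a
    cached or explicitly created host entry) can satisfy the check.
    """
    probe = "%s.apps.%s.%s" % (uuid.uuid4().hex, cluster_name, base_domain)
    try:
        socket.gethostbyname(probe)
        return True
    except socket.gaierror:
        # NXDOMAIN or resolver failure: the wildcard entry is missing
        # or unreachable, which is exactly the failure mode in this bug.
        return False
```

For the cluster in this report (name `lab`, base domain `msp.redhat.com`), the probe would be a random label under `apps.lab.msp.redhat.com`; running this before bootstrap would have flagged the missing wildcard entry up front instead of letting multiple operators degrade later.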
To do anything with this bug, we need either concrete, portable reproducer steps, or at least the information David asked for in https://bugzilla.redhat.com/show_bug.cgi?id=1754939#c10.

This has failed because the wildcard DNS entry isn't present. We should validate that, either by having the installer emit a warning or by making the bootstrap process fail with a clear message indicating the root cause:

message: 'RouteHealthDegraded: failed to GET route: dial tcp: lookup oauth-openshift.apps.lab.msp.redhat.com on 172.30.0.10:53: no such host'

An installer warning is likely to be ignored, and we'll end up chasing this failure through multiple operators. If the bootstrap process hard-fails, the user will get a clear message and any bug report will be directed to a team with the knowledge to help the user through the failure. Deferring the failure and debugging until the bootstrap has "succeeded" makes distinguishing types of failures harder, increases the time to resolution, and produces output without any real value.

Just an update: the install is still going. Odd that it shows 97% initially, then jumps back to 49%.

[root@tatooine ocp42]# openshift-install --dir=sam/ wait-for install-complete --log-level debug
DEBUG OpenShift Installer v4.2.0
DEBUG Built from commit f96afb99f1ce4f8976ce62f7df44acb24d2062d6
INFO Waiting up to 30m0s for the cluster at https://api.lab.msp.redhat.com:6443 to initialize...
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 97% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 10% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 16% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 26% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 38% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 40% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 49% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 49% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 49% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 49% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 49% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 24% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 41% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 50% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 53% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 55% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 60% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 65% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 69% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 71% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 76% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 85% complete
DEBUG Still waiting for the cluster to initialize: Working towards 4.2.0-0.nightly-2019-09-23-154647: 98% complete

[root@etcd-0 ~]# crictl pods
POD ID          CREATED             STATE     NAME                                                             NAMESPACE                                       ATTEMPT
c345ccd93f74c   16 minutes ago      NotReady  revision-pruner-8-etcd-0                                         openshift-kube-apiserver                        0
913189c5f154b   21 minutes ago      Ready     kube-apiserver-etcd-0                                            openshift-kube-apiserver                        0
4519ea8a316ae   22 minutes ago      NotReady  installer-8-etcd-0                                               openshift-kube-apiserver                        0
28ffcaa78440c   About an hour ago   Ready     packageserver-7b58f6fc-ql4sp                                     openshift-operator-lifecycle-manager            0
e14688578ab67   2 hours ago         Ready     node-exporter-svzdg                                              openshift-monitoring                            0
652c44af0cf88   3 hours ago         Ready     oauth-openshift-7d6b8b465f-jvpw8                                 openshift-authentication                        0
a1e3b9c4b23dc   3 hours ago         Ready     console-b6fbc547d-sd4tb                                          openshift-console                               0
278ac32e1c531   3 hours ago         Ready     cluster-autoscaler-operator-87d7696b9-kzdjk                      openshift-machine-api                           0
92e0e826b1a8e   3 hours ago         NotReady  revision-pruner-7-etcd-0                                         openshift-kube-apiserver                        0
72ccfe3835808   3 hours ago         NotReady  installer-7-etcd-0                                               openshift-kube-apiserver                        0
d74486afb9ae4   4 hours ago         NotReady  revision-pruner-6-etcd-0                                         openshift-kube-scheduler                        0
0a1880ba2f76d   4 hours ago         Ready     prometheus-operator-668f98845c-6g99l                             openshift-monitoring                            0
90f52417a1bf2   4 hours ago         NotReady  revision-pruner-7-etcd-0                                         openshift-kube-controller-manager               0
8fbd4cb25bb3b   4 hours ago         Ready     openshift-kube-scheduler-etcd-0                                  openshift-kube-scheduler                        0
11a0457f32c06   4 hours ago         Ready     kube-controller-manager-etcd-0                                   openshift-kube-controller-manager               0
c2dd6f7f154fd   4 hours ago         NotReady  installer-6-etcd-0                                               openshift-kube-scheduler                        0
9e583fd107371   4 hours ago         NotReady  installer-7-etcd-0                                               openshift-kube-controller-manager               0
eff3a6e151fef   5 hours ago         Ready     console-operator-6f984ccd78-j96qk                                openshift-console-operator                      0
75a1e71330e5d   5 hours ago         Ready     cluster-image-registry-operator-fcf6564b8-zgfnc                  openshift-image-registry                        0
0623776fd46a0   5 hours ago         NotReady  revision-pruner-5-etcd-0                                         openshift-kube-apiserver                        0
0000a2c618bf5   5 hours ago         NotReady  revision-pruner-6-etcd-0                                         openshift-kube-controller-manager               0
befcb7df525a2   5 hours ago         NotReady  installer-6-etcd-0                                               openshift-kube-controller-manager               0
445f89f69f504   5 hours ago         NotReady  installer-5-etcd-0                                               openshift-kube-apiserver                        0
783f064d23f39   5 hours ago         Ready     controller-manager-k5smz                                         openshift-controller-manager                    0
2dfa70d011930   5 hours ago         NotReady  revision-pruner-4-etcd-0                                         openshift-kube-apiserver                        0
d4e56a6ab2ffb   5 hours ago         NotReady  revision-pruner-2-etcd-0                                         openshift-kube-apiserver                        0
b0b0e16e8a6e6   5 hours ago         NotReady  installer-4-etcd-0                                               openshift-kube-apiserver                        0
bbf3821e69aeb   21 hours ago        Ready     etcd-quorum-guard-6c5d5869bf-5n9cq                               openshift-machine-config-operator               0
922317e327bcb   21 hours ago        Ready     machine-config-server-v2bdw                                      openshift-machine-config-operator               0
44d2c4afd3dd5   22 hours ago        Ready     tuned-fm9mb                                                      openshift-cluster-node-tuning-operator          0
727ea4b9adf1f   22 hours ago        NotReady  revision-pruner-5-etcd-0                                         openshift-kube-controller-manager               0
16ad938ec3479   22 hours ago        Ready     openshift-service-catalog-apiserver-operator-9766c9d48-qrvcg     openshift-service-catalog-apiserver-operator    0
fe6b1b518c4ca   22 hours ago        Ready     openshift-service-catalog-controller-manager-operator-7b76tvfgm  openshift-service-catalog-controller-manager-operator  0
aeb8a9c230338   22 hours ago        NotReady  installer-5-etcd-0                                               openshift-kube-controller-manager               0
128e5ead83e30   22 hours ago        NotReady  revision-pruner-5-etcd-0                                         openshift-kube-scheduler                        0
55df920fc0a5c   22 hours ago        NotReady  revision-pruner-4-etcd-0                                         openshift-kube-controller-manager               0
0d8c3170f20ab   22 hours ago        NotReady  installer-5-etcd-0                                               openshift-kube-scheduler                        0
258b76cfee50b   22 hours ago        NotReady  revision-pruner-3-etcd-0                                         openshift-kube-scheduler                        0
5ab412aa94bdb   22 hours ago        NotReady  installer-4-etcd-0                                               openshift-kube-controller-manager               0
8b93ea33a2dec   22 hours ago        NotReady  installer-3-etcd-0                                               openshift-kube-scheduler                        0
7864345a4c382   22 hours ago        Ready     multus-admission-controller-h8jkx                                openshift-multus                                0
1037f39df3846   22 hours ago        Ready     apiserver-gk2kz                                                  openshift-apiserver                             0
944682a513e71   22 hours ago        Ready     catalog-operator-65857c7d75-cgw6n                                openshift-operator-lifecycle-manager            0
9c13d00f124db   22 hours ago        NotReady  revision-pruner-3-etcd-0                                         openshift-kube-controller-manager               0
7b78936881253   22 hours ago        Ready     machine-config-daemon-9qwn6                                      openshift-machine-config-operator               0
0e6bf9c6b139e   22 hours ago        Ready     dns-default-kv5zk                                                openshift-dns                                   0
f3313296533af   22 hours ago        NotReady  installer-3-etcd-0                                               openshift-kube-controller-manager               0
967aaa2983ca8   22 hours ago        NotReady  revision-pruner-2-etcd-0                                         openshift-kube-scheduler                        0
b43d1931516b2   22 hours ago        NotReady  revision-pruner-2-etcd-0                                         openshift-kube-controller-manager               0
8f28976c60584   22 hours ago        NotReady  installer-2-etcd-0                                               openshift-kube-apiserver                        0
269224e6b9fee   22 hours ago        NotReady  installer-2-etcd-0                                               openshift-kube-scheduler                        0
df5dbafd4048b   22 hours ago        NotReady  installer-2-etcd-0                                               openshift-kube-controller-manager               0
58d71fd3b607e   22 hours ago        Ready     cloud-credential-operator-58bbb76884-z7llh                       openshift-cloud-credential-operator             0
cd7f7179274b5   22 hours ago        Ready     machine-config-operator-76bfc97464-dfsld                         openshift-machine-config-operator               0
333fd3fced044   22 hours ago        Ready     openshift-apiserver-operator-6b479c8f9f-lqspx                    openshift-apiserver-operator                    0
67c2ce11821a0   22 hours ago        Ready     cluster-version-operator-85b98666c8-8m4rq                        openshift-cluster-version                       0
3a89c83600810   22 hours ago        Ready     sdn-d7ffx                                                        openshift-sdn                                   0
024a862aec6d1   23 hours ago        Ready     sdn-controller-vxgt8                                             openshift-sdn                                   0
6af7554b88fb4   23 hours ago        Ready     ovs-ndbbz                                                        openshift-sdn                                   0
a7af937bea208   23 hours ago        Ready     multus-nbl7h                                                     openshift-multus                                0
b98740725942e   23 hours ago        Ready     network-operator-56668895ff-rpp6g                                openshift-network-operator                      0
d24e348db8189   24 hours ago        Ready     etcd-member-etcd-0                                               openshift-etcd                                  0

[root@tatooine ocp42]# oc get csr
NAME        AGE    REQUESTOR                                                                   CONDITION
csr-2t72z   4h23m  system:node:worker2                                                         Approved,Issued
csr-5nhzl   6h6m   system:node:etcd-1                                                          Approved,Issued
csr-7246v   23h    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-74wcx   4h38m  system:node:worker2                                                         Approved,Issued
csr-7smv8   23h    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-8cpt2   23h    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-8xjs4   3h41m  system:node:worker1                                                         Approved,Issued
csr-9gm94   23h    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-9qqqb   23h    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-bbtxq   4h21m  system:node:etcd-1                                                          Approved,Issued
csr-bckk8   3h38m  system:node:etcd-0                                                          Approved,Issued
csr-blmq7   143m   system:node:etcd-2                                                          Approved,Issued
csr-d7p6p   3h26m  system:node:worker1                                                         Approved,Issued
csr-dscsk   23h    system:node:etcd-2                                                          Approved,Issued
csr-fbbv6   4h52m  system:node:etcd-0                                                          Approved,Issued
csr-fnwn9   84m    system:node:worker2                                                         Approved,Issued
csr-fzdqk   4h59m  system:node:worker1                                                         Approved,Issued
csr-gkzff   23h    system:node:worker1                                                         Approved,Issued
csr-h9wrq   23h    system:node:worker2                                                         Approved,Issued
csr-hq24b   4h51m  system:node:etcd-0                                                          Approved,Issued
csr-hrtdv   4h51m  system:node:etcd-0                                                          Approved,Issued
csr-hv7zq   81m    system:node:etcd-1                                                          Approved,Issued
csr-k87z2   7h27m  system:node:etcd-2                                                          Approved,Issued
csr-ksr74   23h    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-mg8dj   3h56m  system:node:worker1                                                         Approved,Issued
csr-mh5m7   79m    system:node:etcd-2                                                          Approved,Issued
csr-mqppd   3h22m  system:node:etcd-0                                                          Approved,Issued
csr-n8hqd   3h11m  system:node:worker1                                                         Approved,Issued
csr-nfpps   4h38m  system:node:worker2                                                         Approved,Issued
csr-nssxq   3h22m  system:node:etcd-0                                                          Approved,Issued
csr-nvxhm   23h    system:node:etcd-1                                                          Approved,Issued
csr-pkxdw   79m    system:node:etcd-2                                                          Approved,Issued
csr-q2pgf   23h    system:node:etcd-0                                                          Approved,Issued
csr-rff48   4h38m  system:node:worker2                                                         Approved,Issued
csr-sc7cc   3h53m  system:node:etcd-0                                                          Approved,Issued
csr-vnpwp   5h22m  system:node:etcd-2                                                          Approved,Issued
csr-wsctc   23h    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-zlgs7   4h36m  system:node:etcd-1                                                          Approved,Issued
csr-zq7rf   6h42m  system:node:worker2                                                         Approved,Issued

The difference between this install and the original install for this BZ: I recreated the ignition configuration as follows:

<snip>
# cat install-config.yaml
apiVersion: v1
baseDomain: msp.redhat.com
compute:
- hyperthreading: Disabled
  name: worker
  replicas: 2   <<< Changed from '3'
controlPlane:
  hyperthreading: Disabled
  name: master
  replicas: 3
metadata:
  name: lab
networking:
  clusterNetworks:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
pullSecret: '{"auths": #######}'
sshKey: 'ssh-ed25519 ######'
</snip>

PXE booted the following VMs for installation:

bootstrap
etcd-0
etcd-1
etcd-2
worker1 * did not exist in the other install
worker2 * did not exist in the other install

Installation has been going for ~24 hours, which I'm assuming is due mainly to the slower disk backend, but at least it's still running.
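The difference between the failing and working installs above is that the configured compute nodes were never booted. A pre-flight comparison of the configured replica count against the workers the cluster actually registers could surface that directly. The sketch below is illustrative Python under stated assumptions, not installer code: `check_compute_replicas` is a hypothetical helper, with the configured count taken from install-config.yaml (`compute[0].replicas`) and the worker list from something like `oc get nodes -l node-role.kubernetes.io/worker`.

```python
def check_compute_replicas(configured_replicas, observed_workers):
    """Return a list of human-readable problems; an empty list means OK.

    configured_replicas: the replicas value from install-config.yaml.
    observed_workers: names of worker nodes that have registered.
    """
    problems = []
    if configured_replicas > 0 and not observed_workers:
        # The failure mode in this bug: replicas: 3 configured, no
        # workers ever PXE booted, install hangs with a vague message.
        problems.append(
            "install-config requests %d compute nodes but none have "
            "registered; the install cannot complete" % configured_replicas
        )
    elif len(observed_workers) < configured_replicas:
        problems.append(
            "only %d of %d requested compute nodes have registered"
            % (len(observed_workers), configured_replicas)
        )
    return problems
```

For the original install in this report (`replicas: 3`, no workers booted), this check would return a single clear problem string instead of leaving the user to diagnose degraded authentication, console, and ingress operators hours later.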
I'll write up another bug, but I was thinking: if the install-config has compute nodes configured and the install does not see them, we should have some error reporting stating that it cannot find the compute nodes it needs to continue the installation.

> odd that it shows 97% initially, then jumps back to 49%

This is explained in bug 1690816.

You'll need a bug in order to backport your change to 4.2.z. I think this one works well and clearly demonstrates the problem you're solving and the value you're bringing. I've reopened it for you.

This PR provides more readable information when jobs fail.

I launched a cluster on baremetal with 4.3.0-0.nightly-2019-10-22-165241, and it works well, aside from openshift-monitoring.

Also linking the follow-up fix, which was in 4.3.0-0.nightly-2019-10-22-165241 and so got VERIFIED as well in comment 23.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062