Bug 1625911 - Could not find csr for nodes - Bootstrapping task
Summary: Could not find csr for nodes - Bootstrapping task
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.10.z
Assignee: Michael Gugino
QA Contact: Johnny Liu
URL:
Whiteboard:
Duplicates: 1627623
Depends On:
Blocks: 1641207
 
Reported: 2018-09-06 08:32 UTC by Serena Cortopassi
Modified: 2018-12-10 08:51 UTC
CC List: 22 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1641207
Environment:
Last Closed: 2018-11-11 16:39:10 UTC
Target Upstream Version:
Embargoed:


Attachments
oc get csr -oyaml (27.68 KB, text/plain), 2018-09-06 08:32 UTC, Serena Cortopassi
inventory file (2.95 KB, text/plain), 2018-09-06 08:34 UTC, Serena Cortopassi
oc get nodes (296 bytes, text/plain), 2018-09-06 08:35 UTC, Serena Cortopassi
oc get csr (2.62 KB, text/plain), 2018-09-06 08:38 UTC, Serena Cortopassi
ansible execution -vvv (52.94 KB, text/plain), 2018-09-14 12:07 UTC, Serena Cortopassi


Links
Red Hat Knowledge Base (Solution) 3600441, last updated 2018-09-25 20:46:24 UTC
Red Hat Product Errata RHSA-2018:2709, last updated 2018-11-11 16:40:08 UTC

Description Serena Cortopassi 2018-09-06 08:32:04 UTC
Created attachment 1481221 [details]
oc get csr -oyaml

Description of problem:

I am opening this bug as a follow-up to https://bugzilla.redhat.com/show_bug.cgi?id=1623204#c8.
The installer fails on the task [Approve node certificates when bootstrapping] with the error "Could not find csr for nodes".

Version-Release number of the following components:

Reproduced with both openshift-ansible-3.10.41-1.git.0.fd15dd7.el7.noarch (RPM) and the openshift-ansible-openshift-ansible-3.10.43-1 git release.

ansible-2.6.3-1.el7.noarch

How reproducible: always

Steps to Reproduce:

3 masters + 3 infra + 1 node + 1 lb

ansible-playbook -i /etc/ansible/hosts playbooks/deploy_cluster.yml

Actual results:

PLAY [Approve any pending CSR requests from inventory nodes] ******************************************************************************************************************************

TASK [Dump all candidate bootstrap hostnames] *********************************************************************************************************************************************
Thursday 06 September 2018  09:55:21 +0200 (0:00:00.398)       0:27:56.128 ****
ok: [red-ose-master01.internal.ose.extrasys.it] => {
    "msg": [
        "red-ose-master01.internal.ose.extrasys.it", 
        "red-ose-master02.internal.ose.extrasys.it", 
        "red-ose-master03.internal.ose.extrasys.it", 
        "red-ose-node01.internal.ose.extrasys.it", 
        "red-ose-infra01.internal.ose.extrasys.it", 
        "red-ose-infra02.internal.ose.extrasys.it", 
        "red-ose-infra03.internal.ose.extrasys.it"
    ]
}

TASK [Find all hostnames for bootstrapping] ***********************************************************************************************************************************************
Thursday 06 September 2018  09:55:21 +0200 (0:00:00.084)       0:27:56.212 ****
ok: [red-ose-master01.internal.ose.extrasys.it]

TASK [Dump the bootstrap hostnames] *******************************************************************************************************************************************************
Thursday 06 September 2018  09:55:22 +0200 (0:00:00.240)       0:27:56.453 ****
ok: [red-ose-master01.internal.ose.extrasys.it] => {
    "msg": [
        "red-ose-master01", 
        "red-ose-master02", 
        "red-ose-master03", 
        "red-ose-node01", 
        "red-ose-infra01", 
        "red-ose-infra02", 
        "red-ose-infra03"
    ]
}

TASK [Approve node certificates when bootstrapping] ***************************************************************************************************************************************
Thursday 06 September 2018  09:55:22 +0200 (0:00:00.085)       0:27:56.539 ****
FAILED - RETRYING: Approve node certificates when bootstrapping (30 retries left).
FAILED - RETRYING: Approve node certificates when bootstrapping (29 retries left).
FAILED - RETRYING: Approve node certificates when bootstrapping (1 retries left).
fatal: [red-ose-master01.internal.ose.extrasys.it]: FAILED! => {"attempts": 30, "changed": false, "msg": "Cound not find csr for nodes: red-ose-master01, red-ose-master02, red-ose-master03", "state": "unknown"}

PLAY RECAP ********************************************************************************************************************************************************************************
localhost                  : ok=13   changed=0    unreachable=0    failed=0
red-ose-infra01.internal.ose.extrasys.it : ok=119  changed=65   unreachable=0    failed=0
red-ose-infra02.internal.ose.extrasys.it : ok=119  changed=65   unreachable=0    failed=0
red-ose-infra03.internal.ose.extrasys.it : ok=119  changed=65   unreachable=0    failed=0
red-ose-int-haproxy01.internal.ose.extrasys.it : ok=33   changed=4    unreachable=0    failed=0
red-ose-master01.internal.ose.extrasys.it : ok=432  changed=205  unreachable=0    failed=1   
red-ose-master02.internal.ose.extrasys.it : ok=293  changed=147  unreachable=0    failed=0
red-ose-master03.internal.ose.extrasys.it : ok=293  changed=147  unreachable=0    failed=0
red-ose-node01.internal.ose.extrasys.it : ok=119  changed=65   unreachable=0    failed=0


INSTALLER STATUS **************************************************************************************************************************************************************************
Initialization              : Complete (0:00:49)
Health Check                : Complete (0:03:02)
Node Bootstrap Preparation  : Complete (0:14:59)
etcd Install                : Complete (0:01:00)
Load Balancer Install       : Complete (0:00:16)
Master Install              : Complete (0:06:07)
Master Additional Install   : Complete (0:01:32)
Node Join                   : In Progress (0:02:55)

Failure summary:


  1. Hosts:    red-ose-master01.internal.ose.extrasys.it
     Play:     Approve any pending CSR requests from inventory nodes 
     Task:     Approve node certificates when bootstrapping
     Message:  Cound not find csr for nodes: red-ose-master01, red-ose-master02, red-ose-master03


Expected results: Installer completes successfully.

Additional info: see the attachments.


I suspect this could also be related to the names generated in the CSRs and how they are expected to match the node hostnames and DNS entries.

- On the DNS server I have nodename.internal.ose.extrasys.it entries. All nodes involved correctly resolve each other's hostnames (both FQDN and short).
- hostname -f on the nodes returns "nodename" (e.g. red-ose-master01). Should it be nodename.internal.ose.extrasys.it?
- In /etc/hosts I have the entry "IP nodename" (with no FQDN).

Is that configuration correct?
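
For reference, a few read-only checks along these lines can help compare the names the installer is waiting on with what the kubelets and CSRs actually report (illustrative commands; <csr-name> is a placeholder):

hostname -f                  # the name the node reports for itself
oc get nodes -o wide         # the names the kubelets actually registered with
oc get csr                   # pending/approved CSRs and the requesting identity
oc describe csr <csr-name>   # subject and requestor details for a single CSR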

Comment 1 Serena Cortopassi 2018-09-06 08:34:52 UTC
Created attachment 1481222 [details]
inventory file

Comment 2 Serena Cortopassi 2018-09-06 08:35:13 UTC
Created attachment 1481223 [details]
oc get nodes

Comment 3 Serena Cortopassi 2018-09-06 08:38:01 UTC
Created attachment 1481224 [details]
oc get csr

Comment 5 Martin Adler 2018-09-10 05:02:28 UTC
I am also experiencing this, even when falling back to a single-node setup. As this is targeted to be fixed in 3.11, is there any chance I can bring up a cluster with 3.10?

Comment 6 Michael Gugino 2018-09-10 16:20:35 UTC
Latest code in master and 3.10 branches has added additional debug output.

Please recreate problem using ansible-playbook -vvv (3 v's, not 2) and attach output.
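
For example (the inventory path is taken from this report; capturing the output with tee is just a convenience):

ansible-playbook -vvv -i /etc/ansible/hosts playbooks/deploy_cluster.yml 2>&1 | tee deploy-vvv.log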

Comment 10 Greg Rodriguez II 2018-09-11 23:26:46 UTC
Added a customer ticket to this BZ, as they are experiencing the same issue. Please let me know if any attachments or further information are needed.

Comment 14 Jaspreet Kaur 2018-09-13 05:45:08 UTC
*** Bug 1627623 has been marked as a duplicate of this bug. ***

Comment 15 Michal Karnik 2018-09-13 11:20:53 UTC
I had the same problem and this PR from Michael Gugino fixed it:

https://github.com/openshift/openshift-ansible/pull/10033

thanks (mgugino)

Comment 17 Scott Dodson 2018-09-13 13:44:55 UTC
The comments on this bug describe two distinct problems.

The original reporter is observing a difference between the node names and the bootstrap names.

The later comments, some of which are private, are more similar to https://bugzilla.redhat.com/show_bug.cgi?id=1625873

Comment 18 Serena Cortopassi 2018-09-14 12:06:39 UTC
(In reply to Michael Gugino from comment #6)
> Latest code in master and 3.10 branches has added additional debug output.
> 
> Please recreate problem using ansible-playbook -vvv (3 v's, not 2) and
> attach output.

I recreated the problem with the openshift-ansible-openshift-ansible-3.10.47-1 git release, with 1 master + 1 infra + 1 node.

Please see the attached -vvv execution and failure.

The installation is performed on RHEL images that were customized with cloud-init. I'll try on a brand-new qcow image to rule out misconfiguration of the node names/hostnames/bootstrap names.

Comment 19 Serena Cortopassi 2018-09-14 12:07:06 UTC
Created attachment 1483324 [details]
ansible execution -vvv

Comment 20 Michael Gugino 2018-09-14 12:48:48 UTC
(In reply to Serena Cortopassi from comment #18)
> (In reply to Michael Gugino from comment #6)
> > Latest code in master and 3.10 branches has added additional debug output.
> > 
> > Please recreate problem using ansible-playbook -vvv (3 v's, not 2) and
> > attach output.
> 
> I recreated the problem with the openshift-ansible-openshift-ansible-3.10.47-1
> git release, with 1 master + 1 infra + 1 node.
> 
> Please see the attached -vvv execution and failure.
> 
> The installation is performed on RHEL images that were customized with
> cloud-init. I'll try on a brand-new qcow image to rule out misconfiguration
> of the node names/hostnames/bootstrap names.

This run looks like it's the issue that master is not ready yet: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

We have another patch out for this problem, already shipped in master, waiting on QE to verify before we backport to 3.10.

Comment 21 Serena Cortopassi 2018-09-14 13:41:24 UTC
(In reply to Michael Gugino from comment #20)

> This run looks like it's the issue that master is not ready yet: runtime
> network not ready: NetworkReady=false reason:NetworkPluginNotReady
> message:docker: network plugin is not ready: cni config uninitialized
> 
> We have another patch out for this problem, already shipped in master,
> waiting on QE to verify before we backport to 3.10.

I can confirm what you say:
[awx@red-ose-master01 ~]$ oc get nodes
NAME               STATUS     ROLES     AGE       VERSION
red-ose-master01   NotReady   master    2h        v1.10.0+b81c8f8

[awx@red-ose-master01 ~]$ sudo journalctl -xe -u atomic-openshift-node
...
Unable to update cni config: No networks found in /etc/cni/net.d
...

The folder /etc/cni/net.d is empty indeed.
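
For reference, checks along these lines can show whether the SDN pods ever came up (a sketch, assuming the default 3.10 layout where the SDN daemonset pods in the openshift-sdn namespace write the CNI config):

ls -l /etc/cni/net.d                    # normally contains 80-openshift-network.conf once the SDN pod is up
oc get pods -n openshift-sdn -o wide    # the sdn/ovs daemonset pods should be Running on every node
oc describe node red-ose-master01 | grep -i ready    # shows the Ready condition and its reason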

This behaviour does not happen to everyone; I am on RH OpenStack 10, and I see that a lot of people are affected in that case.
Do you know whether it is related to the node names, the OS version, or something else?

Could you please link the patch that was shipped?

Thanks

Comment 22 Michael Gugino 2018-09-14 15:48:35 UTC
(In reply to Serena Cortopassi from comment #21)
> (In reply to Michael Gugino from comment #20)
> 
> > This run looks like it's the issue that master is not ready yet: runtime
> > network not ready: NetworkReady=false reason:NetworkPluginNotReady
> > message:docker: network plugin is not ready: cni config uninitialized
> > 
> > We have another patch out for this problem, already shipped in master,
> > waiting on QE to verify before we backport to 3.10.
> 
> I can confirm what you say:
> [awx@red-ose-master01 ~]$ oc get nodes
> NAME               STATUS     ROLES     AGE       VERSION
> red-ose-master01   NotReady   master    2h        v1.10.0+b81c8f8
> 
> [awx@red-ose-master01 ~]$ sudo journalctl -xe -u atomic-openshift-node
> ...
> Unable to update cni config: No networks found in /etc/cni/net.d
> ...
> 
> The folder /etc/cni/net.d is empty indeed.
> 
> This behaviour does not happen to everyone; I am on RH OpenStack 10, and I
> see that a lot of people are affected in that case. Do you know whether it
> is related to the node names, the OS version, or something else?
> 
> Could you please link the patch that was shipped?
> 
> Thanks

Here is the patch for 3.10, not yet shipped: https://github.com/openshift/openshift-ansible/pull/10055

That patch links to the original in master/3.11.
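
For anyone who wants to try the fix before the RPM ships, a rough sketch (assuming the change has merged into the release-3.10 branch of openshift-ansible):

git clone https://github.com/openshift/openshift-ansible.git
cd openshift-ansible && git checkout release-3.10
ansible-playbook -i /etc/ansible/hosts playbooks/deploy_cluster.yml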

Comment 38 Johnny Liu 2018-10-10 07:19:11 UTC
The comments on this bug contain more than one issue, as analyzed in comment 17.

Only one fix PR is provided here (in comment 22); it is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1628964, which QE already reproduced and used to verify the fix PR. For more detail, refer to https://bugzilla.redhat.com/show_bug.cgi?id=1628964#c9.

If the PR does not fix your issue, I suggest opening a separate bug for tracking.

Comment 39 Omer SEN 2018-10-10 17:56:30 UTC
Is there a workaround for this until rpm is released?

Comment 40 Omer SEN 2018-10-10 18:04:45 UTC
This same issue (empty /etc/cni/net.d) is also causing failures at the "openshift_web_console : Apply the web console template file" task. I have upgraded to openshift-ansible-playbooks-3.10.51-1.git.0.44a646c.el7.noarch, which was released today. We now get past the "Approve node certificates when bootstrapping" task, but:


===============================================================================

TASK [openshift_web_console : Apply the web console template file] *************
changed: [master.os.serra.local]

TASK [openshift_web_console : Remove temp directory] ***************************
ok: [master.os.serra.local]

TASK [openshift_web_console : Pause for the web console deployment to start] ***
skipping: [master.os.serra.local]

TASK [openshift_web_console : include_tasks] ***********************************
included: /usr/share/ansible/openshift-ansible/roles/openshift_web_console/tasks/start.yml for master.os.serra.local

TASK [openshift_web_console : Verify that the console is running] **************
FAILED - RETRYING: Verify that the console is running (60 retries left).
FAILED - RETRYING: Verify that the console is running (59 retries left).
FAILED - RETRYING: Verify that the console is running (58 retries left).
FAILED - RETRYING: Verify that the console is running (57 retries left).
FAILED - RETRYING: Verify that the console is running (56 retries left).
FAILED - RETRYING: Verify that the console is running (55 retries left).
FAILED - RETRYING: Verify that the console is running (54 retries left).
FAILED - RETRYING: Verify that the console is running (53 retries left).
FAILED - RETRYING: Verify that the console is running (52 retries left).
FAILED - RETRYING: Verify that the console is running (51 retries left).
FAILED - RETRYING: Verify that the console is running (50 retries left).
FAILED - RETRYING: Verify that the console is running (49 retries left).
FAILED - RETRYING: Verify that the console is running (48 retries left).
FAILED - RETRYING: Verify that the console is running (47 retries left).
FAILED - RETRYING: Verify that the console is running (46 retries left).
FAILED - RETRYING: Verify that the console is running (45 retries left).
FAILED - RETRYING: Verify that the console is running (44 retries left).
FAILED - RETRYING: Verify that the console is running (43 retries left).
FAILED - RETRYING: Verify that the console is running (42 retries left).
FAILED - RETRYING: Verify that the console is running (41 retries left).
FAILED - RETRYING: Verify that the console is running (40 retries left).
FAILED - RETRYING: Verify that the console is running (39 retries left).
FAILED - RETRYING: Verify that the console is running (38 retries left).
FAILED - RETRYING: Verify that the console is running (37 retries left).
FAILED - RETRYING: Verify that the console is running (36 retries left).
FAILED - RETRYING: Verify that the console is running (35 retries left).


FAILED - RETRYING: Verify that the console is running (34 retries left).
FAILED - RETRYING: Verify that the console is running (33 retries left).
FAILED - RETRYING: Verify that the console is running (32 retries left).
FAILED - RETRYING: Verify that the console is running (31 retries left).
FAILED - RETRYING: Verify that the console is running (30 retries left).
FAILED - RETRYING: Verify that the console is running (29 retries left).
FAILED - RETRYING: Verify that the console is running (28 retries left).
FAILED - RETRYING: Verify that the console is running (27 retries left).
FAILED - RETRYING: Verify that the console is running (26 retries left).
FAILED - RETRYING: Verify that the console is running (25 retries left).
FAILED - RETRYING: Verify that the console is running (24 retries left).
FAILED - RETRYING: Verify that the console is running (23 retries left).
FAILED - RETRYING: Verify that the console is running (22 retries left).
FAILED - RETRYING: Verify that the console is running (21 retries left).
FAILED - RETRYING: Verify that the console is running (20 retries left).
FAILED - RETRYING: Verify that the console is running (19 retries left).
FAILED - RETRYING: Verify that the console is running (18 retries left).
FAILED - RETRYING: Verify that the console is running (17 retries left).
FAILED - RETRYING: Verify that the console is running (16 retries left).
FAILED - RETRYING: Verify that the console is running (15 retries left).
FAILED - RETRYING: Verify that the console is running (14 retries left).
FAILED - RETRYING: Verify that the console is running (13 retries left).
FAILED - RETRYING: Verify that the console is running (12 retries left).
FAILED - RETRYING: Verify that the console is running (11 retries left).
FAILED - RETRYING: Verify that the console is running (10 retries left).
FAILED - RETRYING: Verify that the console is running (9 retries left).
FAILED - RETRYING: Verify that the console is running (8 retries left).
FAILED - RETRYING: Verify that the console is running (7 retries left).
FAILED - RETRYING: Verify that the console is running (6 retries left).
FAILED - RETRYING: Verify that the console is running (5 retries left).
FAILED - RETRYING: Verify that the console is running (4 retries left).
FAILED - RETRYING: Verify that the console is running (3 retries left).
FAILED - RETRYING: Verify that the console is running (2 retries left).
FAILED - RETRYING: Verify that the console is running (1 retries left).
fatal: [master.os.serra.local]: FAILED! => {"attempts": 60, "changed": false, "failed": true, "results": {"cmd": "/usr/bin/oc get deployment webconsole -o json -n openshift-web-console", "results": [{"apiVersion": "extensions/v1beta1", "kind": "Deployment", "metadata": {"annotations": {"deployment.kubernetes.io/revision": "1", "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"apps/v1beta1\",\"kind\":\"Deployment\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"openshift-web-console\",\"webconsole\":\"true\"},\"name\":\"webconsole\",\"namespace\":\"openshift-web-console\"},\"spec\":{\"replicas\":1,\"strategy\":{\"rollingUpdate\":{\"maxUnavailable\":\"100%\"},\"type\":\"RollingUpdate\"},\"template\":{\"metadata\":{\"labels\":{\"app\":\"openshift-web-console\",\"webconsole\":\"true\"},\"name\":\"webconsole\"},\"spec\":{\"containers\":[{\"command\":[\"/usr/bin/origin-web-console\",\"--audit-log-path=-\",\"-v=0\",\"--config=/var/webconsole-config/webconsole-config.yaml\"],\"image\":\"docker.io/openshift/origin-web-console:v3.10.0\",\"imagePullPolicy\":\"IfNotPresent\",\"livenessProbe\":{\"exec\":{\"command\":[\"/bin/sh\",\"-c\",\"if [[ ! -f /tmp/webconsole-config.hash ]]; then \\\\\\n  md5sum /var/webconsole-config/webconsole-config.yaml \\u003e /tmp/webconsole-config.hash; \\\\\\nelif [[ $(md5sum /var/webconsole-config/webconsole-config.yaml) != $(cat /tmp/webconsole-config.hash) ]]; then \\\\\\n  echo 'webconsole-config.yaml has changed.'; \\\\\\n  exit 1; \\\\\\nfi \\u0026\\u0026 curl -k -f https://0.0.0.0:8443/console/\"]}},\"name\":\"webconsole\",\"ports\":[{\"containerPort\":8443}],\"readinessProbe\":{\"httpGet\":{\"path\":\"/healthz\",\"port\":8443,\"scheme\":\"HTTPS\"}},\"resources\":{\"requests\":{\"cpu\":\"100m\",\"memory\":\"100Mi\"}},\"volumeMounts\":[{\"mountPath\":\"/var/serving-cert\",\"name\":\"serving-cert\"},{\"mountPath\":\"/var/webconsole-config\",\"name\":\"webconsole-config\"}]}],\"nodeSelector\":{\"node-role.kubernetes.io/master\":\"true\"},\"serviceAccountName\":\"webconsole\",\"volumes\":[{\"name\":\"serving-cert\",\"secret\":{\"defaultMode\":288,\"secretName\":\"webconsole-serving-cert\"}},{\"configMap\":{\"defaultMode\":288,\"name\":\"webconsole-config\"},\"name\":\"webconsole-config\"}]}}}}\n"}, "creationTimestamp": "2018-10-10T17:47:02Z", "generation": 1, "labels": {"app": "openshift-web-console", "webconsole": "true"}, "name": "webconsole", "namespace": "openshift-web-console", "resourceVersion": "5834", "selfLink": "/apis/extensions/v1beta1/namespaces/openshift-web-console/deployments/webconsole", "uid": "7e6039f6-ccb4-11e8-a3b9-525400380b0e"}, "spec": {"progressDeadlineSeconds": 600, "replicas": 1, "revisionHistoryLimit": 2, "selector": {"matchLabels": {"app": "openshift-web-console", "webconsole": "true"}}, "strategy": {"rollingUpdate": {"maxSurge": "25%", "maxUnavailable": "100%"}, "type": "RollingUpdate"}, "template": {"metadata": {"creationTimestamp": null, "labels": {"app": "openshift-web-console", "webconsole": "true"}, "name": "webconsole"}, "spec": {"containers": [{"command": ["/usr/bin/origin-web-console", "--audit-log-path=-", "-v=0", "--config=/var/webconsole-config/webconsole-config.yaml"], "image": "docker.io/openshift/origin-web-console:v3.10.0", "imagePullPolicy": "IfNotPresent", "livenessProbe": {"exec": {"command": ["/bin/sh", "-c", "if [[ ! 
-f /tmp/webconsole-config.hash ]]; then \\\n  md5sum /var/webconsole-config/webconsole-config.yaml > /tmp/webconsole-config.hash; \\\nelif [[ $(md5sum /var/webconsole-config/webconsole-config.yaml) != $(cat /tmp/webconsole-config.hash) ]]; then \\\n  echo 'webconsole-config.yaml has changed.'; \\\n  exit 1; \\\nfi && curl -k -f https://0.0.0.0:8443/console/"]}, "failureThreshold": 3, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 1}, "name": "webconsole", "ports": [{"containerPort": 8443, "protocol": "TCP"}], "readinessProbe": {"failureThreshold": 3, "httpGet": {"path": "/healthz", "port": 8443, "scheme": "HTTPS"}, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 1}, "resources": {"requests": {"cpu": "100m", "memory": "100Mi"}}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/serving-cert", "name": "serving-cert"}, {"mountPath": "/var/webconsole-config", "name": "webconsole-config"}]}], "dnsPolicy": "ClusterFirst", "nodeSelector": {"node-role.kubernetes.io/master": "true"}, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "webconsole", "serviceAccountName": "webconsole", "terminationGracePeriodSeconds": 30, "volumes": [{"name": "serving-cert", "secret": {"defaultMode": 288, "secretName": "webconsole-serving-cert"}}, {"configMap": {"defaultMode": 288, "name": "webconsole-config"}, "name": "webconsole-config"}]}}}, "status": {"conditions": [{"lastTransitionTime": "2018-10-10T17:47:02Z", "lastUpdateTime": "2018-10-10T17:47:02Z", "message": "Deployment has minimum availability.", "reason": "MinimumReplicasAvailable", "status": "True", "type": "Available"}, {"lastTransitionTime": "2018-10-10T17:57:03Z", "lastUpdateTime": "2018-10-10T17:57:03Z", "message": "ReplicaSet \"webconsole-55c4d867f\" has timed out progressing.", "reason": "ProgressDeadlineExceeded", "status": "False", "type": "Progressing"}], "observedGeneration": 1, "replicas": 1, "unavailableReplicas": 1, "updatedReplicas": 1}}], "returncode": 0}, "state": "list"}
...ignoring

TASK [openshift_web_console : Check status in the openshift-web-console namespace] ***
changed: [master.os.serra.local]

TASK [openshift_web_console : debug] *******************************************
ok: [master.os.serra.local] => {
    "msg": [
        "In project openshift-web-console on server https://master.os.serra.local:8443", 
        "", 
        "svc/webconsole - 172.30.50.83:443 -> 8443", 
        "  deployment/webconsole deploys docker.io/openshift/origin-web-console:v3.10.0", 
        "    deployment #1 running for 10 minutes - 0/1 pods", 
        "", 
        "View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'."
    ]
}

TASK [openshift_web_console : Get pods in the openshift-web-console namespace] ***
changed: [master.os.serra.local]

TASK [openshift_web_console : debug] *******************************************
ok: [master.os.serra.local] => {
    "msg": [
        "NAME                         READY     STATUS    RESTARTS   AGE       IP        NODE", 
        "webconsole-55c4d867f-t9qw6   0/1       Pending   0          10m       <none>    <none>"
    ]
}

TASK [openshift_web_console : Get events in the openshift-web-console namespace] ***
changed: [master.os.serra.local]

TASK [openshift_web_console : debug] *******************************************
ok: [master.os.serra.local] => {
    "msg": [
        "LAST SEEN   FIRST SEEN   COUNT     NAME                                          KIND         SUBOBJECT   TYPE      REASON              SOURCE                  MESSAGE", 
        "18s         10m          37        webconsole-55c4d867f-t9qw6.155c50721e9f61cd   Pod                      Warning   FailedScheduling    default-scheduler       0/2 nodes are available: 2 node(s) were not ready.", 
        "10m         10m          1         webconsole-55c4d867f.155c50721ea9bff6         ReplicaSet               Normal    SuccessfulCreate    replicaset-controller   Created pod: webconsole-55c4d867f-t9qw6", 
        "10m         10m          1         webconsole.155c5071d5c31064                   Deployment               Normal    ScalingReplicaSet   deployment-controller   Scaled up replica set webconsole-55c4d867f to 1"
    ]
}

TASK [openshift_web_console : Get console pod logs] ****************************
changed: [master.os.serra.local]

TASK [openshift_web_console : debug] *******************************************
ok: [master.os.serra.local] => {
    "msg": []
}

TASK [openshift_web_console : Report console errors] ***************************
fatal: [master.os.serra.local]: FAILED! => {"changed": false, "failed": true, "msg": "Console install failed."}
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.retry

PLAY RECAP *********************************************************************
localhost                  : ok=13   changed=0    unreachable=0    failed=0   
master.os.serra.local      : ok=491  changed=126  unreachable=0    failed=1   
node1.os.serra.local       : ok=35   changed=2    unreachable=0    failed=0   


INSTALLER STATUS ***************************************************************
Initialization              : Complete (0:00:25)
Health Check                : Complete (0:01:14)
etcd Install                : Complete (0:00:49)
Node Bootstrap Preparation  : Complete (0:00:00)
Master Install              : Complete (0:04:07)
Master Additional Install   : Complete (0:01:45)
Node Join                   : Complete (0:00:14)
Hosted Install              : Complete (0:00:49)
Web Console Install         : In Progress (0:10:44)
        This phase can be restarted by running: playbooks/openshift-web-console/config.yml


Failure summary:


  1. Hosts:    master.os.serra.local
     Play:     Web Console
     Task:     Report console errors
     Message:  Console install failed.




LATEST PACKAGE JUST INSTALLED:
=============================


rpm -qi  openshift-ansible-playbooks        
Name        : openshift-ansible-playbooks
Version     : 3.10.51
Release     : 1.git.0.44a646c.el7
Architecture: noarch
Install Date: Wed 10 Oct 2018 06:36:43 PM BST
Group       : Unspecified
Size        : 442696
License     : ASL 2.0
Signature   : RSA/SHA1, Thu 27 Sep 2018 10:01:36 AM BST, Key ID c34c5bd42f297ecc
Source RPM  : openshift-ansible-3.10.51-1.git.0.44a646c.el7.src.rpm
Build Date  : Wed 26 Sep 2018 05:56:24 PM BST
Build Host  : c1be.rdu2.centos.org
Relocations : (not relocatable)
Packager    : CBS <cbs>
Vendor      : CentOS
URL         : https://github.com/openshift/openshift-ansible
Summary     : Openshift and Atomic Enterprise Ansible Playbooks
Description :
Openshift and Atomic Enterprise Ansible Playbooks.

Comment 41 Michael Gugino 2018-10-10 18:27:41 UTC
(In reply to Omer SEN from comment #39)
> Is there a workaround for this until rpm is released?

Omer, please file a new case or bugzilla for this issue.

Comment 42 Omer SEN 2018-10-10 18:38:14 UTC
(In reply to Michael Gugino from comment #41)
> (In reply to Omer SEN from comment #39)
> > Is there a workaround for this until rpm is released?
> 
> Omer, please file a new case or bugzilla for this issue.

https://bugzilla.redhat.com/show_bug.cgi?id=1638120 was filed for this new issue.

Comment 43 Nicholas Schuetz 2018-10-16 16:27:28 UTC
You can work around this issue by manually setting the hostname on each system to the FQDN, i.e. hostnamectl set-hostname your.full.hostname.com. Also, make sure to use the FQDNs in /etc/ansible/hosts.

Take note that if the hostname is being set from DHCP, you'll have to set PEERDNS=no on the interface to keep NetworkManager from reverting your changes.
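
A rough sketch of those two steps, repeated on every node (the FQDN and the interface name eth0 here are placeholders for whatever the environment actually uses):

hostnamectl set-hostname red-ose-master01.internal.ose.extrasys.it   # example FQDN; use each node's own name
hostname -f                                                          # verify the new hostname sticks
echo 'PEERDNS=no' >> /etc/sysconfig/network-scripts/ifcfg-eth0       # eth0 is an assumed interface name
systemctl restart NetworkManager                                     # pick up the ifcfg change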

-Nick

Comment 45 errata-xmlrpc 2018-11-11 16:39:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2709
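
Once the advisory content is available, picking up the fixed installer packages is roughly (a sketch; assumes the relevant OCP repo from the advisory is already enabled on the host running the playbooks):

yum update 'openshift-ansible*'        # pull the updated installer packages
rpm -q openshift-ansible-playbooks     # confirm the new version is installed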

