Bug 1619886 - OCP installer provisioning of instances playbook always fails first time running
Summary: OCP installer provisioning of instances playbook always fails first time running
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.10.z
Assignee: Tzu-Mainn Chen
QA Contact: rlopez
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-08-22 02:21 UTC by rlopez
Modified: 2018-11-11 16:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-11 16:39:10 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2709 0 None None None 2018-11-11 16:40:08 UTC

Description rlopez 2018-08-22 02:21:51 UTC
Description of problem:

Running the provision.yml playbook to create the OpenStack instances for OpenShift always fails on the first run. Despite the reported failure, the Heat stack eventually completes successfully.

The workaround:

- Run the provision.yml playbook a second time so that everything gets set up properly.

The error seen:

TASK [openshift_openstack : fail] ***********************************************************************************************************************************
Tuesday 21 August 2018  22:19:34 -0400 (0:00:00.057)       0:01:16.840 ******** 
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Stack creation failed"}

However, at the time of the failure, openstack stack list shows the stack still in progress:

$ openstack stack list
+--------------------------------------+-------------------+--------------------+----------------------+--------------+
| ID                                   | Stack Name        | Stack Status       | Creation Time        | Updated Time |
+--------------------------------------+-------------------+--------------------+----------------------+--------------+
| 4caf48cd-c223-4cc0-a506-163e6eb887d8 | openshift-cluster | CREATE_IN_PROGRESS | 2018-08-22T02:18:44Z | None         |
+--------------------------------------+-------------------+--------------------+----------------------+--------------+

After a while it shows:

$ openstack stack list
+--------------------------------------+-------------------+-----------------+----------------------+--------------+
| ID                                   | Stack Name        | Stack Status    | Creation Time        | Updated Time |
+--------------------------------------+-------------------+-----------------+----------------------+--------------+
| 4caf48cd-c223-4cc0-a506-163e6eb887d8 | openshift-cluster | CREATE_COMPLETE | 2018-08-22T02:18:44Z | None         |
+--------------------------------------+-------------------+-----------------+----------------------+--------------+




What openstack stack event list shows (everything looks green):

$ openstack stack event list openshift-cluster
2018-08-21 22:25:09Z [openshift-cluster]: CREATE_IN_PROGRESS  Stack CREATE started
2018-08-21 22:25:11Z [openshift-cluster.lb-secgrp]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:12Z [openshift-cluster.etcd-secgrp]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:13Z [openshift-cluster.router]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:13Z [openshift-cluster.lb-secgrp]: CREATE_COMPLETE  state changed
2018-08-21 22:25:14Z [openshift-cluster.net]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:14Z [openshift-cluster.etcd-secgrp]: CREATE_COMPLETE  state changed
2018-08-21 22:25:15Z [openshift-cluster.net]: CREATE_COMPLETE  state changed
2018-08-21 22:25:15Z [openshift-cluster.common-secgrp]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:15Z [openshift-cluster.subnet]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:16Z [openshift-cluster.node-secgrp]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:16Z [openshift-cluster.common-secgrp]: CREATE_COMPLETE  state changed
2018-08-21 22:25:17Z [openshift-cluster.subnet]: CREATE_COMPLETE  state changed
2018-08-21 22:25:17Z [openshift-cluster.master-secgrp]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:17Z [openshift-cluster.api_lb]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:18Z [openshift-cluster.router_lb]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:18Z [openshift-cluster.infra-secgrp]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:19Z [openshift-cluster.cns-secgrp]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:22Z [openshift-cluster.infra-secgrp]: CREATE_COMPLETE  state changed
2018-08-21 22:25:22Z [openshift-cluster.node-secgrp]: CREATE_COMPLETE  state changed
2018-08-21 22:25:23Z [openshift-cluster.router]: CREATE_COMPLETE  state changed
2018-08-21 22:25:25Z [openshift-cluster.interface]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:25Z [openshift-cluster.master-secgrp]: CREATE_COMPLETE  state changed
2018-08-21 22:25:25Z [openshift-cluster.cns-secgrp]: CREATE_COMPLETE  state changed
2018-08-21 22:25:31Z [openshift-cluster.interface]: CREATE_COMPLETE  state changed
2018-08-21 22:25:31Z [openshift-cluster.compute_nodes]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:31Z [openshift-cluster.cns]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:31Z [openshift-cluster.etcd]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:25:33Z [openshift-cluster.cns]: CREATE_COMPLETE  state changed
2018-08-21 22:25:33Z [openshift-cluster.etcd]: CREATE_COMPLETE  state changed
2018-08-21 22:26:10Z [openshift-cluster.compute_nodes]: CREATE_COMPLETE  state changed
2018-08-21 22:26:39Z [openshift-cluster.router_lb]: CREATE_COMPLETE  state changed
2018-08-21 22:26:39Z [openshift-cluster.router_lb_listener_https]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:39Z [openshift-cluster.router_lb_listener_http]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:41Z [openshift-cluster.router_lb_listener_https]: CREATE_COMPLETE  state changed
2018-08-21 22:26:42Z [openshift-cluster.router_lb_pool_https]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:45Z [openshift-cluster.router_lb_listener_http]: CREATE_COMPLETE  state changed
2018-08-21 22:26:45Z [openshift-cluster.router_lb_pool_http]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:46Z [openshift-cluster.api_lb]: CREATE_COMPLETE  state changed
2018-08-21 22:26:46Z [openshift-cluster.router_lb_pool_https]: CREATE_COMPLETE  state changed
2018-08-21 22:26:47Z [openshift-cluster.api_lb_pool]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:48Z [openshift-cluster.router_lb_pool_http]: CREATE_COMPLETE  state changed
2018-08-21 22:26:48Z [openshift-cluster.infra_nodes]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:48Z [openshift-cluster.router_lb_floating_ip]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:48Z [openshift-cluster.api_lb_pool]: CREATE_COMPLETE  state changed
2018-08-21 22:26:48Z [openshift-cluster.api_lb_listener]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:48Z [openshift-cluster.masters]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:51Z [openshift-cluster.api_lb_listener]: CREATE_COMPLETE  state changed
2018-08-21 22:26:52Z [openshift-cluster.api_lb_floating_ip]: CREATE_IN_PROGRESS  state changed
2018-08-21 22:26:52Z [openshift-cluster.router_lb_floating_ip]: CREATE_COMPLETE  state changed
2018-08-21 22:26:58Z [openshift-cluster.api_lb_floating_ip]: CREATE_COMPLETE  state changed
2018-08-21 22:27:29Z [openshift-cluster.masters]: CREATE_COMPLETE  state changed
2018-08-21 22:27:31Z [openshift-cluster.infra_nodes]: CREATE_COMPLETE  state changed
2018-08-21 22:27:31Z [openshift-cluster]: CREATE_COMPLETE  Stack CREATE completed successfully

Comment 1 rlopez 2018-08-22 02:31:24 UTC
As an FYI, running provision.yml with the -vvv flag doesn't show any more output for the error.

Comment 2 Tzu-Mainn Chen 2018-08-23 16:46:23 UTC
It looks like in Roger's HA environment, the 'stack get' call used to check the current status of the deploying stack will occasionally just fail. The following pull request is a bit of a workaround, but I can't think of anything better to fix the issue from the openshift-ansible side.

https://github.com/openshift/openshift-ansible/pull/9734
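
For readers following along, here is a rough sketch of the general idea of tolerating an occasional failed status lookup while waiting for a stack to finish. It is not taken from the pull request above and may differ from what it does. The names wait_for_stack, get_status and TransientStackError are hypothetical, and get_status stands in for whatever performs the 'stack get'.

```python
import time


class TransientStackError(Exception):
    """Hypothetical error raised when a single status lookup fails."""


def wait_for_stack(get_status, retries=3, timeout=600, interval=5,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the stack reaches a terminal state.

    get_status is a hypothetical callable that returns a Heat status
    string such as "CREATE_IN_PROGRESS", or raises TransientStackError
    when the lookup itself fails.
    """
    deadline = clock() + timeout
    failures = 0  # consecutive failed lookups
    while clock() < deadline:
        try:
            status = get_status()
        except TransientStackError:
            # A failed lookup says nothing about the stack itself, so
            # retry a few times before giving up.
            failures += 1
            if failures > retries:
                raise
            sleep(interval)
            continue
        failures = 0
        if status.endswith("_COMPLETE"):
            return status
        if status.endswith("_FAILED"):
            raise RuntimeError("Stack creation failed: " + status)
        # Still in progress: wait and poll again.
        sleep(interval)
    raise TimeoutError("stack did not finish within %s seconds" % timeout)
```

The point of the sketch is that only an explicit *_FAILED status, repeated lookup failures, or a timeout are treated as a failed deployment. A single failed lookup while the stack is still CREATE_IN_PROGRESS is not.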

Comment 3 rlopez 2018-08-24 15:52:07 UTC
Confirmed, the fix works.

Comment 11 errata-xmlrpc 2018-11-11 16:39:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2709

