Bug 1349456
Summary: | During overcloud deployment the ceph monitor gets started only on one of the controllers which causes the deployment to get stuck for some time | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
Component: | openstack-tripleo-heat-templates | Assignee: | Giulio Fidente <gfidente> |
Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 9.0 (Mitaka) | CC: | athomas, dbecker, jjoyce, mburns, morazi, ohochman, rhel-osp-director-maint, tvignaud, yrabl |
Target Milestone: | ga | Keywords: | Regression |
Target Release: | 9.0 (Mitaka) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-2.0.0-22.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | 1348489 | Environment: | |
Last Closed: | 2016-08-11 11:33:31 UTC | Type: | Bug |
Bug Depends On: | 1348489 | ||
Bug Blocks: |
Description
Marius Cornea
2016-06-23 13:31:46 UTC
Just an update with current findings: puppet seems to take much longer to complete step 2 on non-bootstrap nodes, up to 980 seconds. This blocks ceph-mon on the bootstrap node too, because it waits for the other two nodes to make up the cluster. Still investigating what causes the delay/timeout on the non-bootstrap nodes. The deployment eventually completes with ceph-mon running on all nodes once puppet completes step 2.

During the puppet run, the corosync logs show a 15-minute gap with no lines logged, occurring while pacemaker is bringing up the VIPs. This *could* be due to a timeout or a networking problem; continuing investigation.

I tested the patch and step 2 completed in approx. 2 minutes:

2016-07-20 08:52:15 [overcloud-ControllerNodesPostDeployment-ovygbbvoegp7-ControllerLoadBalancerDeployment_Step1-qzmfbd6kirmq]: CREATE_COMPLETE Stack CREATE completed successfully
2016-07-20 08:52:16 [overcloud-ControllerNodesPostDeployment-ovygbbvoegp7-ControllerServicesBaseDeployment_Step2-y7mugjbxvuvl]: CREATE_IN_PROGRESS Stack CREATE started
2016-07-20 08:52:16 [1]: CREATE_IN_PROGRESS state changed
2016-07-20 08:52:17 [0]: CREATE_IN_PROGRESS state changed
2016-07-20 08:52:18 [2]: CREATE_IN_PROGRESS state changed
2016-07-20 08:53:05 [1]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-07-20 08:53:06 [1]: CREATE_COMPLETE state changed
2016-07-20 08:53:06 [2]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-07-20 08:53:08 [2]: CREATE_COMPLETE state changed
2016-07-20 08:54:03 [0]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-07-20 08:54:04 [ControllerServicesBaseDeployment_Step2]: CREATE_COMPLETE state changed
2016-07-20 08:54:04 [ControllerRingbuilderDeployment_Step3]: CREATE_IN_PROGRESS state changed
2016-07-20 08:54:04 [0]: CREATE_COMPLETE state changed
2016-07-20 08:54:04 [overcloud-ControllerNodesPostDeployment-ovygbbvoegp7-ControllerServicesBaseDeployment_Step2-y7mugjbxvuvl]: CREATE_COMPLETE Stack CREATE completed successfully
2016-07-20 08:54:05 [overcloud-ControllerNodesPostDeployment-ovygbbvoegp7-ControllerRingbuilderDeployment_Step3-gjazamyx7ktj]: CREATE_IN_PROGRESS Stack CREATE started

openstack-tripleo-heat-templates-2.0.0-24.el7ost.noarch

Step 2 completed in approx. 2 minutes:

2016-07-29 07:50:57 [overcloud-ControllerNodesPostDeployment-ys5skqitpsta-ControllerServicesBaseDeployment_Step2-pvvopijeq3ry]: CREATE_IN_PROGRESS Stack CREATE started
2016-07-29 07:50:57 [1]: CREATE_IN_PROGRESS state changed
2016-07-29 07:50:57 [0]: CREATE_IN_PROGRESS state changed
2016-07-29 07:50:58 [2]: CREATE_IN_PROGRESS state changed
2016-07-29 07:51:51 [2]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-07-29 07:51:51 [1]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-07-29 07:51:52 [1]: CREATE_COMPLETE state changed
2016-07-29 07:51:52 [2]: CREATE_COMPLETE state changed
2016-07-29 07:52:43 [0]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-07-29 07:52:44 [0]: CREATE_COMPLETE state changed
2016-07-29 07:52:44 [overcloud-ControllerNodesPostDeployment-ys5skqitpsta-ControllerServicesBaseDeployment_Step2-pvvopijeq3ry]: CREATE_COMPLETE Stack CREATE completed successfully

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1599.html
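For anyone re-verifying the "approx. 2 minutes" figure on their own deployment, the elapsed time of step 2 can be read off the event timestamps quoted in the comments above. The following is a minimal sketch, not part of the fix: the function name `step_duration_seconds`, the regular expression, and the `step_marker` default are assumptions made for illustration, and the parser only expects the line layout seen in the quoted output (timestamp, bracketed resource name, status, reason).

```python
import re
from datetime import datetime

# Assumed line layout, taken from the event output quoted in this report:
#   <YYYY-MM-DD HH:MM:SS> [<resource>]: <STATUS> <reason text>
EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[([^\]]+)\]: (\S+)\s+(.*)$"
)


def step_duration_seconds(log_text, step_marker="ServicesBaseDeployment_Step2"):
    """Return seconds between 'Stack CREATE started' and
    'Stack CREATE completed successfully' for the nested stack whose
    name contains step_marker, or None if either event is missing."""
    started = finished = None
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not an event line
        stamp, resource, status, reason = match.groups()
        if step_marker not in resource:
            continue
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if status == "CREATE_IN_PROGRESS" and reason == "Stack CREATE started":
            started = when
        elif (status == "CREATE_COMPLETE"
              and reason == "Stack CREATE completed successfully"):
            finished = when
    if started is None or finished is None:
        return None
    return (finished - started).total_seconds()


# Three event lines copied from the 2016-07-20 run quoted in this report.
SAMPLE = """\
2016-07-20 08:52:16 [overcloud-ControllerNodesPostDeployment-ovygbbvoegp7-ControllerServicesBaseDeployment_Step2-y7mugjbxvuvl]: CREATE_IN_PROGRESS Stack CREATE started
2016-07-20 08:54:03 [0]: SIGNAL_IN_PROGRESS Signal: deployment succeeded
2016-07-20 08:54:04 [overcloud-ControllerNodesPostDeployment-ovygbbvoegp7-ControllerServicesBaseDeployment_Step2-y7mugjbxvuvl]: CREATE_COMPLETE Stack CREATE completed successfully
"""

print(step_duration_seconds(SAMPLE))  # 108.0
```

For the 2016-07-20 run this gives 108 seconds (08:52:16 to 08:54:04), consistent with the roughly two minutes reported and well below the up-to-980-second runs seen before the patch.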