Bug 1230966 - Overcloud post deployment fails with Pacemaker enabled - nodes active, CREATE_FAILED
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: Director
Assignee: Dan Sneddon
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-06-11 21:49 UTC by Ronelle Landy
Modified: 2015-08-05 13:53 UTC

Fixed In Version: openstack-tripleo-heat-templates-0.8.6-6.el7ost
Doc Type: Known Issue
Doc Text:
Redis needs to use a separate VIP. When deploying with network isolation, the director automatically places the Redis VIP on the Internal API network by default. Operators can move Redis to another network using the ServiceNetMap parameter.
Clone Of:
Clones: 1231184
Environment:
Last Closed: 2015-08-05 13:53:28 UTC
Target Upstream Version:


Links
| System                 | ID             | Priority | Status       | Summary                                                      | Last Updated            |
| OpenStack gerrit       | 190853         | None     | None         | None                                                         | Never                   |
| OpenStack gerrit       | 190994         | None     | None         | None                                                         | Never                   |
| OpenStack gerrit       | 191026         | None     | None         | None                                                         | Never                   |
| Red Hat Product Errata | RHEA-2015:1549 | normal   | SHIPPED_LIVE | Red Hat Enterprise Linux OpenStack Platform director Release | 2015-08-05 17:49:10 UTC |

Description Ronelle Landy 2015-06-11 21:49:12 UTC
Description of problem:

The virt environment is installed with bits from the latest poodle, where Pacemaker is used by default for the overcloud.
instack-deploy-overcloud --tuskar fails (CREATE_FAILED) with the following errors:

ERROR heat.engine.resources.openstack.heat.software_deployment [-] Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1", "warnings": []}

| logical_resource_id    | ControllerNodesPostDeployment |
| physical_resource_id   | e108159e-b556-4d4c-be41-41009e41087c |
| required_by            | BlockStorageNodesPostDeployment |
|                        | CephStorageNodesPostDeployment |
| resource_name          | ControllerNodesPostDeployment |
| resource_status        | CREATE_FAILED |
| resource_status_reason | ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "Resource CREATE failed: ResourceUnknownStatus: Resource failed - Unknown status FAILED due to "Resource CREATE failed: Error: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1"" |
| resource_type          | OS::TripleO::ControllerPostDeployment |
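
When triaging output like the resource table above, it can help to turn the pipe-delimited rows into a dictionary so that the status and dependent resources are easy to inspect. The following is only a minimal sketch: the function name parse_heat_table and the shortened SAMPLE text are illustrative assumptions, not part of the heat client. It tolerates a console timestamp before the first pipe and a missing trailing pipe.

```python
def parse_heat_table(text):
    """Parse `| key | value |` rows into a dict.

    Rows whose key cell is empty are treated as continuations of the
    previous key, so multi-line values become lists.
    """
    fields = {}
    last_key = None
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip borders such as +-----+ and blank lines
        # Discard anything before the first pipe (e.g. a console timestamp).
        cells = [cell.strip() for cell in line.split("|")[1:]]
        if len(cells) < 2:
            continue
        key, value = cells[0], cells[1]
        if key:
            fields[key] = [value]
            last_key = key
        elif last_key is not None:
            fields[last_key].append(value)
    # Collapse single-item lists to plain strings.
    return {k: v[0] if len(v) == 1 else v for k, v in fields.items()}


# Shortened, illustrative copy of the rows shown in this report.
SAMPLE = """\
| logical_resource_id    | ControllerNodesPostDeployment |
17:09:20 | required_by            | BlockStorageNodesPostDeployment |
17:09:20 |                        | CephStorageNodesPostDeployment |
17:09:20 | resource_status        | CREATE_FAILED |
17:09:20 | resource_type          | OS::TripleO::ControllerPostDeployment
"""

resource = parse_heat_table(SAMPLE)
if resource["resource_status"].endswith("_FAILED"):
    print(resource["logical_resource_id"], "failed; blocks:", resource["required_by"])
```

Here the failed ControllerNodesPostDeployment resource is listed as required by the block storage and Ceph storage post-deployment resources, which is why the whole stack ends up in CREATE_FAILED.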


Error: Could not find data item redis_vip in any Hiera data file and no default supplied at /var/lib/heat-config/heat-config-puppet/7d0a69c1-4821-430c-8547-fe4ba0a928d6.pp:257
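
The Puppet error above is what a hierarchical lookup reports when no data level defines the requested key and the caller supplies no default. The sketch below is a simplified Python model of that behaviour, not Hiera's implementation and not the change that fixed this bug. The names hiera_lookup and HieraLookupError, the example data, and the 192.0.2.20 address are all assumptions made for illustration.

```python
class HieraLookupError(LookupError):
    """Raised when a key is absent from every level and no default is given."""


_NO_DEFAULT = object()  # sentinel so that None can be a legitimate default


def hiera_lookup(key, hierarchy, default=_NO_DEFAULT):
    """Return the first value found for `key`, searching levels in order."""
    for level in hierarchy:
        if key in level:
            return level[key]
    if default is not _NO_DEFAULT:
        return default
    raise HieraLookupError(
        f"Could not find data item {key} in any Hiera data file "
        "and no default supplied"
    )


# Hypothetical data: two levels, neither of which defines redis_vip.
hierarchy = [
    {"example_node_setting": "value-a"},
    {"example_global_setting": "value-b"},
]

try:
    hiera_lookup("redis_vip", hierarchy)
except HieraLookupError as exc:
    print(f"Error: {exc}")

# Once some level supplies the key, the same lookup succeeds.
hierarchy.append({"redis_vip": "192.0.2.20"})  # documentation-range address
print(hiera_lookup("redis_vip", hierarchy))
```

In this model the manifest fails for the same reason as in the report: the key is requested, but nothing in the deployed data provides it.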


Version-Release number of selected component (if applicable):

[stack@instack ~]$ rpm -qa  | grep openstack
openstack-nova-console-2015.1.0-10.el7ost.noarch
openstack-neutron-2015.1.0-2.el7ost.noarch
openstack-ironic-conductor-2015.1.0-4.el7ost.noarch
openstack-ceilometer-alarm-2015.1.0-2.el7ost.noarch
openstack-swift-account-2.3.0-1.el7ost.noarch
openstack-tuskar-ui-0.3.0-2.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-4.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.0-3.el7ost.noarch
openstack-ceilometer-notification-2015.1.0-2.el7ost.noarch
openstack-neutron-openvswitch-2015.1.0-2.el7ost.noarch
openstack-nova-api-2015.1.0-10.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-1.el7ost.noarch
python-openstackclient-1.0.3-2.el7ost.noarch
openstack-ironic-discoverd-1.1.0-3.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-2.el7ost.noarch
openstack-swift-object-2.3.0-1.el7ost.noarch
openstack-tripleo-0.0.6-0.1.git812abe0.el7ost.noarch
openstack-utils-2014.2-1.el7ost.noarch
openstack-nova-common-2015.1.0-10.el7ost.noarch
openstack-heat-common-2015.1.0-3.el7ost.noarch
openstack-tuskar-0.4.18-2.el7ost.noarch
python-django-openstack-auth-1.2.0-2.el7ost.noarch
openstack-dashboard-theme-2015.1.0-9.el7ost.noarch
openstack-tuskar-ui-extras-0.0.3-3.el7ost.noarch
openstack-tempest-kilo-20150507.2.el7ost.noarch
openstack-swift-2.3.0-1.el7ost.noarch
openstack-neutron-ml2-2015.1.0-2.el7ost.noarch
openstack-nova-novncproxy-2015.1.0-10.el7ost.noarch
openstack-keystone-2015.1.0-1.el7ost.noarch
openstack-swift-plugin-swift3-1.7-3.el7ost.noarch
openstack-tripleo-common-0.0.1.dev6-0.git49b57eb.el7ost.noarch
openstack-neutron-common-2015.1.0-2.el7ost.noarch
openstack-heat-engine-2015.1.0-3.el7ost.noarch
openstack-ceilometer-common-2015.1.0-2.el7ost.noarch
openstack-heat-api-cfn-2015.1.0-3.el7ost.noarch
openstack-ceilometer-api-2015.1.0-2.el7ost.noarch
openstack-ironic-api-2015.1.0-4.el7ost.noarch
openstack-swift-proxy-2.3.0-1.el7ost.noarch
openstack-ceilometer-collector-2015.1.0-2.el7ost.noarch
openstack-ironic-common-2015.1.0-4.el7ost.noarch
openstack-selinux-0.6.31-1.el7ost.noarch
openstack-nova-compute-2015.1.0-10.el7ost.noarch
openstack-nova-conductor-2015.1.0-10.el7ost.noarch
openstack-swift-container-2.3.0-1.el7ost.noarch
redhat-access-plugin-openstack-7.0.0-0.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch
openstack-glance-2015.1.0-6.el7ost.noarch
openstack-heat-api-2015.1.0-3.el7ost.noarch
openstack-ceilometer-central-2015.1.0-2.el7ost.noarch
openstack-puppet-modules-2015.1.4-1.el7ost.noarch
openstack-nova-scheduler-2015.1.0-10.el7ost.noarch
openstack-nova-cert-2015.1.0-10.el7ost.noarch
openstack-dashboard-2015.1.0-9.el7ost.noarch
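
The list above shows openstack-tripleo-heat-templates-0.8.6-4.el7ost installed, while the Fixed In Version field names openstack-tripleo-heat-templates-0.8.6-6.el7ost. The sketch below shows one way to check whether an installed package predates a fixed build. It is a deliberate simplification: real RPM ordering follows the rpmvercmp rules, whereas this version only handles dotted numeric versions and a release that starts with an integer. The helper names and the KNOWN_ARCHES tuple are assumptions.

```python
import re

KNOWN_ARCHES = ("noarch", "x86_64")  # assumed set of architecture suffixes


def split_nvr(package):
    """Split 'name-version-release[.arch]' into (name, version, release)."""
    for arch in KNOWN_ARCHES:
        suffix = "." + arch
        if package.endswith(suffix):
            package = package[: -len(suffix)]
            break
    name, version, release = package.rsplit("-", 2)
    return name, version, release


def numeric_key(version, release):
    """Build a sortable key from a dotted numeric version and the leading
    integer of the release (simplified; not full rpmvercmp semantics)."""
    version_parts = tuple(int(part) for part in version.split("."))
    release_number = int(re.match(r"\d+", release).group())
    return version_parts, release_number


def predates_fix(installed, fixed):
    """Return True if `installed` is the same package as `fixed` but older."""
    inst_name, inst_ver, inst_rel = split_nvr(installed)
    fix_name, fix_ver, fix_rel = split_nvr(fixed)
    if inst_name != fix_name:
        raise ValueError("package names differ")
    return numeric_key(inst_ver, inst_rel) < numeric_key(fix_ver, fix_rel)


installed = "openstack-tripleo-heat-templates-0.8.6-4.el7ost.noarch"
fixed = "openstack-tripleo-heat-templates-0.8.6-6.el7ost"
print(predates_fix(installed, fixed))
```

For anything beyond a quick check, a comparison based on the RPM library's own version logic is the safer choice.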



How reproducible:
Always with the latest poodle; confirmed with two installs.

Steps to Reproduce:
1. Install and set up virt env with bits from latest poodle (06/11)
2. Run instack-deploy-overcloud --tuskar
3. See failures/ERRORS/CREATE_FAILED in the output of heat stack-show overcloud

Actual results:
Overcloud deploy is CREATE_FAILED

Expected results:
Should be CREATE_COMPLETE

Additional info:

Comment 4 Ronelle Landy 2015-06-11 22:37:07 UTC
A three-controller deployment showed some other issues:

[heat-admin@ov-ik3glkjldcc-0-bgdxz5dw33jc-controller-b6mgf742iqfj ~]$ sudo grep -i error /var/log/messages
Jun 11 17:51:39 localhost kdumpctl: cat: write error: Broken pipe
Jun 11 18:23:58 localhost pengine[17793]: error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
Jun 11 18:23:58 localhost pengine[17793]: error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
Jun 11 18:23:58 localhost pengine[17793]: error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Jun 11 18:23:58 localhost pengine[17793]: notice: process_pe_message: Configuration ERRORs found during PE processing.  Please run "crm_verify -L" to identify issues.
[heat-admin@ov-ik3glkjldcc-0-bgdxz5dw33jc-controller-b6mgf742iqfj ~]$ 
[heat-admin@ov-ik3glkjldcc-0-bgdxz5dw33jc-controller-b6mgf742iqfj ~]$ crm_verify -L
Live CIB query failed: Transport endpoint is not connected


The overcloud was still CREATE_IN_PROGRESS; assuming this will time out shortly.

Comment 5 Mike Burns 2015-06-12 11:01:32 UTC
Comment 4 appears to be a distinct issue from the redis_vip issue, so I am splitting it into a separate bug.

Comment 7 Giulio Fidente 2015-06-12 15:58:21 UTC
Should be fixed by: https://review.openstack.org/#/c/191026/

Comment 8 Dan Sneddon 2015-06-16 21:10:58 UTC
This should be fixed in the most recent puddles/poodles by this change, which was merged downstream: https://review.openstack.org/#/c/191026/

Comment 10 Alexander Chuzhoy 2015-06-19 15:47:48 UTC
Verified:

Environment:
instack-undercloud-2.1.2-1.el7ost.noarch


The command to deploy overcloud is now:
openstack overcloud deploy --plan-uuid [UUID]

The deployment completes successfully.

Comment 12 errata-xmlrpc 2015-08-05 13:53:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549

