Bug 1284797 - Failed to deploy Instance HA on RHOS 8 beta
Summary: Failed to deploy Instance HA on RHOS 8 beta
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: chris alfonso
QA Contact: Asaf Hirshberg
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-11-24 09:28 UTC by Asaf Hirshberg
Modified: 2016-02-01 02:37 UTC (History)
4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-24 10:37:21 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
/var/log/messages from controller-0 (268.08 KB, application/x-gzip)
2015-11-24 09:28 UTC, Asaf Hirshberg
no flags
commands used following the guide (18.30 KB, text/plain)
2015-11-24 09:30 UTC, Asaf Hirshberg
no flags

Description Asaf Hirshberg 2015-11-24 09:28:02 UTC
Created attachment 1098094 [details]
/var/log/messages from controller-0

Description of problem:

Compute node resources fail to start after configuring Instance HA using the guide https://access.redhat.com/articles/1544823, specifically after following step 18:

 "Create Compute node resources and set the stonith level 1 to include both the node's physical fence device and fence-nova:"

pcs resource create overcloud-compute-0 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
pcs property set --node overcloud-compute-0 osprole=compute
pcs stonith level add 1 overcloud-compute-0 compute0-ipmilan,fence-nova
pcs stonith
pcs resource create overcloud-compute-1 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
pcs property set --node overcloud-compute-1 osprole=compute
pcs stonith level add 1 overcloud-compute-1 compute1-ipmilan,fence-nova
pcs stonith
pcs resource create overcloud-compute-2 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
pcs property set --node overcloud-compute-2 osprole=compute
pcs stonith level add 1 overcloud-compute-2 compute2-ipmilan,fence-nova
pcs stonith
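
Not part of the original report: start failures for ocf:pacemaker:remote resources commonly mean the controllers cannot reach the pacemaker_remote daemon on the compute nodes. A hedged pre-flight check, assuming the default port (TCP 3121) and the standard authkey path, might look like:

```shell
# Hypothetical checks -- run on each compute node before creating the
# remote resources (hostnames taken from the report).

# 1. pacemaker_remote must be enabled and running:
systemctl status pacemaker_remote

# 2. The cluster authkey must exist and match the controllers' copy:
ls -l /etc/pacemaker/authkey

# 3. TCP port 3121 must be reachable from the controllers
#    (firewalld example; adjust for your firewall):
firewall-cmd --list-ports | grep 3121
```

If any of these checks fail, the remote connection attempt from the controllers will fail with exactly the kind of opaque 'unknown error' start result shown below.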

The result shown in "pcs status":

Failed Actions:
* overcloud-compute-0_start_0 on overcloud-controller-0 'unknown error' (1): call=310, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:09 2015', queued=0ms, exec=0ms
* overcloud-compute-1_start_0 on overcloud-controller-0 'unknown error' (1): call=312, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:10 2015', queued=0ms, exec=0ms
* overcloud-compute-2_start_0 on overcloud-controller-0 'unknown error' (1): call=313, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:10 2015', queued=0ms, exec=0ms
* overcloud-compute-0_start_0 on overcloud-controller-2 'unknown error' (1): call=313, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:11 2015', queued=0ms, exec=0ms
* overcloud-compute-1_start_0 on overcloud-controller-2 'unknown error' (1): call=312, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:11 2015', queued=0ms, exec=0ms
* overcloud-compute-2_start_0 on overcloud-controller-2 'unknown error' (1): call=310, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:09 2015', queued=0ms, exec=0ms
* overcloud-compute-0_start_0 on overcloud-controller-1 'unknown error' (1): call=312, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:10 2015', queued=0ms, exec=0ms
* overcloud-compute-2_start_0 on overcloud-controller-1 'unknown error' (1): call=314, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:11 2015', queued=0ms, exec=0ms
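
Not from the original report: with exitreason='none' these messages carry no detail, so a hedged next step would be to reproduce one start in the foreground and pull the corresponding Pacemaker log window, for example:

```shell
# Hypothetical debugging commands, run on a controller.

# Re-attempt a single start with verbose agent output:
pcs resource debug-start overcloud-compute-0 --full

# Inspect Pacemaker's log around the failure timestamps above:
journalctl -u pacemaker --since "2015-11-24 04:16"
```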


How reproducible:
4/4

Steps to Reproduce:
1. Deploy Instance HA following the guide mentioned above.

Actual results:
The resources configured in step 18 fail to start.

Expected results:
The resources should be active, with no related errors in Pacemaker.

Additional info:
bare-metal setup: 3 controllers, 3 compute, external-ceph storage.

[root@overcloud-controller-0 ~]# rpm -qa | egrep '(pacemaker|fence-agents|resource-agents)'
pacemaker-libs-1.1.13-10.el7.x86_64
fence-agents-cisco-ucs-4.0.11-27.el7.x86_64
fence-agents-cisco-mds-4.0.11-27.el7.x86_64
fence-agents-vmware-soap-4.0.11-27.el7.x86_64
fence-agents-ilo2-4.0.11-27.el7.x86_64
fence-agents-emerson-4.0.11-27.el7.x86_64
fence-agents-rsb-4.0.11-27.el7.x86_64
pacemaker-cluster-libs-1.1.13-10.el7.x86_64
fence-agents-eps-4.0.11-27.el7.x86_64
fence-agents-drac5-4.0.11-27.el7.x86_64
fence-agents-mpath-4.0.11-27.el7.x86_64
fence-agents-ifmib-4.0.11-27.el7.x86_64
fence-agents-hpblade-4.0.11-27.el7.x86_64
fence-agents-bladecenter-4.0.11-27.el7.x86_64
fence-agents-apc-snmp-4.0.11-27.el7.x86_64
fence-agents-ipmilan-4.0.11-27.el7.x86_64
fence-agents-all-4.0.11-27.el7.x86_64
pacemaker-remote-1.1.13-10.el7.x86_64
fence-agents-kdump-4.0.11-27.el7.x86_64
fence-agents-rhevm-4.0.11-27.el7.x86_64
fence-agents-ipdu-4.0.11-27.el7.x86_64
fence-agents-ilo-moonshot-4.0.11-27.el7.x86_64
fence-agents-brocade-4.0.11-27.el7.x86_64
fence-agents-apc-4.0.11-27.el7.x86_64
fence-agents-compute-4.0.11-27.el7.x86_64
pacemaker-1.1.13-10.el7.x86_64
fence-agents-common-4.0.11-27.el7.x86_64
fence-agents-wti-4.0.11-27.el7.x86_64
fence-agents-ilo-ssh-4.0.11-27.el7.x86_64
fence-agents-ilo-mp-4.0.11-27.el7.x86_64
fence-agents-scsi-4.0.11-27.el7.x86_64
pacemaker-cli-1.1.13-10.el7.x86_64
fence-agents-rsa-4.0.11-27.el7.x86_64
fence-agents-ibmblade-4.0.11-27.el7.x86_64
fence-agents-intelmodular-4.0.11-27.el7.x86_64
fence-agents-eaton-snmp-4.0.11-27.el7.x86_64
resource-agents-3.9.5-54.el7.x86_64
[root@overcloud-controller-0 ~]# 

Attached files:
1) /var/log/messages from controller-0
2) commands issued following the guide on the environment.

Comment 1 Asaf Hirshberg 2015-11-24 09:30:02 UTC
Created attachment 1098095 [details]
commands used following the guide

