Bug 1284797 - Failed to deploy instance-ha on rhos-8 beta
Status: CLOSED NOTABUG
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: chris alfonso
Asaf Hirshberg
Reported: 2015-11-24 04:28 EST by Asaf Hirshberg
Modified: 2016-01-31 21:37 EST
Doc Type: Bug Fix
Last Closed: 2015-11-24 05:37:21 EST
Type: Bug

Attachments
/var/log/messages from controller-0 (268.08 KB, application/x-gzip)
2015-11-24 04:28 EST, Asaf Hirshberg
commands used following the guide (18.30 KB, text/plain)
2015-11-24 04:30 EST, Asaf Hirshberg

Description Asaf Hirshberg 2015-11-24 04:28:02 EST
Created attachment 1098094
/var/log/messages from controller-0

Description of problem:

Compute node resources fail to start after configuring instance-ha with the guide at https://access.redhat.com/articles/1544823. The failure appears after following step 18:

 "Create Compute node resources and set the stonith level 1 to include both the nodes's physical fence device and fence-nova:"

pcs resource create overcloud-compute-0 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
pcs property set --node overcloud-compute-0 osprole=compute
pcs stonith level add 1 overcloud-compute-0 compute0-ipmilan,fence-nova
pcs stonith
pcs resource create overcloud-compute-1 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
pcs property set --node overcloud-compute-1 osprole=compute
pcs stonith level add 1 overcloud-compute-1 compute1-ipmilan,fence-nova
pcs stonith
pcs resource create overcloud-compute-2 ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20
pcs property set --node overcloud-compute-2 osprole=compute
pcs stonith level add 1 overcloud-compute-2 compute2-ipmilan,fence-nova
pcs stonith
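
For reference, the commands above repeat one four-line pattern per compute node. The sketch below only prints that pattern so it can be compared against what was typed; it does not run pcs. The helper name print_compute_cmds is hypothetical and is not part of pcs or the guide, and it assumes the node and fence device names follow the overcloud-compute-N / computeN-ipmilan scheme used above.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the step 18 commands for N compute nodes.
# print_compute_cmds is a hypothetical helper, not part of pcs or the guide.
# Assumes nodes are named overcloud-compute-<i> and their physical fence
# devices compute<i>-ipmilan, as in the commands in this report.
print_compute_cmds() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        node="overcloud-compute-${i}"
        echo "pcs resource create ${node} ocf:pacemaker:remote reconnect_interval=60 op monitor interval=20"
        echo "pcs property set --node ${node} osprole=compute"
        echo "pcs stonith level add 1 ${node} compute${i}-ipmilan,fence-nova"
        echo "pcs stonith"
        i=$((i + 1))
    done
}

# Three compute nodes, as in this environment.
print_compute_cmds 3
```

Printing rather than executing keeps the sketch safe to run anywhere; the output can be diffed against the attached command log.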

The result shown in "pcs status":

Failed Actions:
* overcloud-compute-0_start_0 on overcloud-controller-0 'unknown error' (1): call=310, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:09 2015', queued=0ms, exec=0ms
* overcloud-compute-1_start_0 on overcloud-controller-0 'unknown error' (1): call=312, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:10 2015', queued=0ms, exec=0ms
* overcloud-compute-2_start_0 on overcloud-controller-0 'unknown error' (1): call=313, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:10 2015', queued=0ms, exec=0ms
* overcloud-compute-0_start_0 on overcloud-controller-2 'unknown error' (1): call=313, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:11 2015', queued=0ms, exec=0ms
* overcloud-compute-1_start_0 on overcloud-controller-2 'unknown error' (1): call=312, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:11 2015', queued=0ms, exec=0ms
* overcloud-compute-2_start_0 on overcloud-controller-2 'unknown error' (1): call=310, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:09 2015', queued=0ms, exec=0ms
* overcloud-compute-0_start_0 on overcloud-controller-1 'unknown error' (1): call=312, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:10 2015', queued=0ms, exec=0ms
* overcloud-compute-2_start_0 on overcloud-controller-1 'unknown error' (1): call=314, status=Error, exitreason='none',
    last-rc-change='Tue Nov 24 04:16:11 2015', queued=0ms, exec=0ms
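
To make the list above easier to scan, the failed start actions can be reduced to resource/node pairs. This is a rough sketch that assumes the "Failed Actions" line layout shown above (a leading "*", then the operation name, "on", and the node); summarize_failed_starts is a hypothetical helper name, not a pcs subcommand.

```shell
#!/bin/sh
# Sketch only: reduce "Failed Actions" entries to "<resource> <node>" pairs.
# summarize_failed_starts is a hypothetical helper; it reads `pcs status`
# style text on stdin and assumes the line layout shown in this report.
summarize_failed_starts() {
    awk '$1 == "*" && $3 == "on" && $2 ~ /_start_0$/ {
        res = $2
        sub(/_start_0$/, "", res)   # strip the operation suffix
        print res, $4               # resource name, node it failed on
    }' | sort
}

# Intended usage on a controller (not executed here):
#   pcs status | summarize_failed_starts
```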


How reproducible:
4/4

Steps to Reproduce:
1. Deploy instance-ha following the guide mentioned above, through step 18.

Actual results:
The resources configured in step 18 fail to start.

Expected results:
The resources should be "active", with no related errors in Pacemaker.

Additional info:
Bare-metal setup: 3 controllers, 3 compute nodes, external Ceph storage.

[root@overcloud-controller-0 ~]# rpm -qa | egrep '(pacemaker|fence-agents|resource-agents)'
pacemaker-libs-1.1.13-10.el7.x86_64
fence-agents-cisco-ucs-4.0.11-27.el7.x86_64
fence-agents-cisco-mds-4.0.11-27.el7.x86_64
fence-agents-vmware-soap-4.0.11-27.el7.x86_64
fence-agents-ilo2-4.0.11-27.el7.x86_64
fence-agents-emerson-4.0.11-27.el7.x86_64
fence-agents-rsb-4.0.11-27.el7.x86_64
pacemaker-cluster-libs-1.1.13-10.el7.x86_64
fence-agents-eps-4.0.11-27.el7.x86_64
fence-agents-drac5-4.0.11-27.el7.x86_64
fence-agents-mpath-4.0.11-27.el7.x86_64
fence-agents-ifmib-4.0.11-27.el7.x86_64
fence-agents-hpblade-4.0.11-27.el7.x86_64
fence-agents-bladecenter-4.0.11-27.el7.x86_64
fence-agents-apc-snmp-4.0.11-27.el7.x86_64
fence-agents-ipmilan-4.0.11-27.el7.x86_64
fence-agents-all-4.0.11-27.el7.x86_64
pacemaker-remote-1.1.13-10.el7.x86_64
fence-agents-kdump-4.0.11-27.el7.x86_64
fence-agents-rhevm-4.0.11-27.el7.x86_64
fence-agents-ipdu-4.0.11-27.el7.x86_64
fence-agents-ilo-moonshot-4.0.11-27.el7.x86_64
fence-agents-brocade-4.0.11-27.el7.x86_64
fence-agents-apc-4.0.11-27.el7.x86_64
fence-agents-compute-4.0.11-27.el7.x86_64
pacemaker-1.1.13-10.el7.x86_64
fence-agents-common-4.0.11-27.el7.x86_64
fence-agents-wti-4.0.11-27.el7.x86_64
fence-agents-ilo-ssh-4.0.11-27.el7.x86_64
fence-agents-ilo-mp-4.0.11-27.el7.x86_64
fence-agents-scsi-4.0.11-27.el7.x86_64
pacemaker-cli-1.1.13-10.el7.x86_64
fence-agents-rsa-4.0.11-27.el7.x86_64
fence-agents-ibmblade-4.0.11-27.el7.x86_64
fence-agents-intelmodular-4.0.11-27.el7.x86_64
fence-agents-eaton-snmp-4.0.11-27.el7.x86_64
resource-agents-3.9.5-54.el7.x86_64
[root@overcloud-controller-0 ~]# 

Attached files:
1) /var/log/messages from controller-0
2) Commands issued on the environment while following the guide.
Comment 1 Asaf Hirshberg 2015-11-24 04:30 EST
Created attachment 1098095
commands used following the guide