Description of problem:
On RHEL6 with pacemaker and fence-agents, fencing fails to operate correctly when configured to use fence_ipmilan with either of these 2 methods:

1) pcmk_host_check=none
   * Fencing does not occur (even on a 2 node cluster)
2) pcmk_host_check=static-list pcmk_host_list='node1 node2 node3 node4'
   * The fencing operation does not occur on the correct node.

Version-Release number of selected component (if applicable):
pacemaker-1.1.2-7.el6.x86_64
fence-agents-3.0.12-8.el6.x86_64

How reproducible:
Force a fence on node2 by stopping the network interface (ifdown on the heartbeat interface)

Actual results:
node4 is fenced instead of node2

Expected results:
node2 should be fenced

Additional info:
http://www.gossamer-threads.com/lists/linuxha/users/67266
http://www.gossamer-threads.com/lists/linuxha/users/65098

Attached files for analysis within fence-pb-260111.tar:
crm-configure-show - shows the crm configuration of the HA cluster
crm_mon.after-ifdown-eth0-on-perou3 - shows the crm monitoring just after the ifdown
crm_mon.before.ifdown-eth0-on-perou3 - shows the crm monitoring before the ifdown on perou3
syslog.perou2.during-fencing-pb - shows the status of node perou2
syslog.perou6.during-fencing-pb - shows the status of node perou6
IIRC, IPMI devices can only fence the machine of which they are a part. So this device definition looks wrong:

primitive restofenceperou2 stonith:fence_ipmilan \
        params ipaddr="10.11.0.103" login="administrator" passwd="administrator" pcmk_host_check="static-list" pcmk_host_list="perou2 perou3 perou6 perou7" action="reboot"

Contrary to the advice in:
http://www.gossamer-threads.com/lists/linuxha/users/67410#67410
each device is advertising that it can fence _all_ nodes in the cluster.

Set pcmk_host_list (for each device) to _only_ the host name associated with the device's ipaddr instead.

For example (guessing at the node/ip mapping):

primitive restofenceperou2 stonith:fence_ipmilan \
        params ipaddr="10.11.0.103" login="administrator" passwd="administrator" pcmk_host_check="static-list" pcmk_host_list="perou2" action="reboot" \
        meta target-role="Started"
primitive restofenceperou3 stonith:fence_ipmilan \
        params ipaddr="10.11.0.104" login="administrator" passwd="administrator" pcmk_host_check="static-list" pcmk_host_list="perou3" action="reboot" \
        meta target-role="Started"
primitive restofenceperou6 stonith:fence_ipmilan \
        params ipaddr="10.11.0.107" login="administrator" passwd="administrator" pcmk_host_check="static-list" pcmk_host_list="perou6" action="reboot" \
        meta target-role="Started"
primitive restofenceperou7 stonith:fence_ipmilan \
        params ipaddr="10.11.0.108" login="administrator" passwd="administrator" pcmk_host_check="static-list" pcmk_host_list="perou7" action="reboot"
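To spot this kind of misconfiguration in a saved copy of the crm configuration, a small script along these lines may help. It is only a rough sketch, not part of pacemaker or fence-agents: the function name overbroad_ipmi_devices and the SAMPLE text are made up for illustration, and it assumes the configuration is plain text in the same style as the primitives quoted above (backslash line continuations, quoted pcmk_host_list values). It reports every fence_ipmilan primitive whose pcmk_host_list does not name exactly one host.

```python
import re

# Hypothetical sample in the style of the configuration quoted in this report.
SAMPLE = (
    'primitive restofenceperou2 stonith:fence_ipmilan \\\n'
    '        params ipaddr="10.11.0.103" pcmk_host_check="static-list" '
    'pcmk_host_list="perou2 perou3 perou6 perou7" action="reboot"\n'
    'primitive restofenceperou3 stonith:fence_ipmilan \\\n'
    '        params ipaddr="10.11.0.104" pcmk_host_check="static-list" '
    'pcmk_host_list="perou3" action="reboot" \\\n'
    '        meta target-role="Started"\n'
)


def overbroad_ipmi_devices(crm_config):
    """Return {primitive name: host list} for fence_ipmilan primitives
    whose pcmk_host_list does not contain exactly one host."""
    # Fold backslash-continued lines so each primitive sits on one line.
    joined = re.sub(r"\\\s*\n\s*", " ", crm_config)
    problems = {}
    for line in joined.splitlines():
        prim = re.match(r"\s*primitive\s+(\S+)\s+stonith:fence_ipmilan\b", line)
        if not prim:
            continue
        # Accept either single or double quotes around the host list.
        hosts_match = re.search(r"pcmk_host_list=([\"'])(.*?)\1", line)
        hosts = hosts_match.group(2).split() if hosts_match else []
        if len(hosts) != 1:
            problems[prim.group(1)] = hosts
    return problems
```

Calling overbroad_ipmi_devices(SAMPLE) flags restofenceperou2 (four hosts) and leaves restofenceperou3 alone. A primitive with no pcmk_host_list at all is also reported, with an empty host list.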
Feedback indicates that things are now working. Closing.