Created attachment 499781 [details]
fence_virt.conf

Description of problem:

Version-Release number of selected component (if applicable):
RHEL6 Host: fence-virtd-0.2.1-5.el6.x86_64
RHEL6 Guest: cman-3.0.12-23.el6_0.6.x86_64, rgmanager-3.0.12-10.el6.x86_64

How reproducible:
I set up fence_virtd + fence_xvm (fence_virt multicast mode). I can fence a dedicated node using the "fence_xvm -H node1.example.com" or "fence_node node2.example.com" command, and I can also use "fence_xvm -o list" to list the status of all nodes. For example, the result from my demo environment is as below:

[root@node2 cluster]# fence_xvm -o list
node1.example.com    ec5ccf1c-b5c3-c0cc-bff3-c8f9c634be8a on
node2.example.com    6f93b066-7a9a-b1c5-07b9-c4a6e5d827d1 on

But when I try to trigger fencing by crashing a node or disconnecting the heartbeat device, I get the following error messages:

May 19 01:00:30 fenced node1.example.com not a cluster member after 0 sec post_fail_delay
May 19 01:00:30 fenced fencing node node1.example.com
May 19 01:00:30 fenced fence node1.example.com dev 0.0 agent fence_xvm result: error from agent
May 19 01:00:30 fenced fence node1.example.com failed

Steps to Reproduce:
1. Set up fence_virtd + fence_xvm.
2. Use sysrq to crash a cluster node in order to trigger the fencing mechanism.

Actual results:
Fencing fails.

Expected results:
Fencing should succeed.

Additional info:
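For reference, a minimal multicast-mode fence_virt.conf on the host looks roughly like the sketch below; the multicast address, port, interface, and key-file path here are illustrative assumptions, the actual values are in the attached file.

fence_virtd {
	listener = "multicast";
	backend = "libvirt";
}

listeners {
	multicast {
		key_file = "/etc/cluster/fence_xvm.key";
		address = "225.0.0.12";
		port = "1229";
		interface = "virbr0";
		family = "ipv4";
	}
}

backends {
	libvirt {
		uri = "qemu:///system";
	}
}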
Need the cluster.conf from the guest cluster as well.
Also, you can "crash" your node, then run "fence_node" or "fence_xvm" manually and get more useful output than what is in syslog.
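As a rough sketch (the -d debug flag is from the fence_xvm man page; repeat it to raise verbosity):

# on the node you want to kill (console access):
echo c > /proc/sysrq-trigger

# then, from a surviving node:
fence_xvm -d -H node1.example.com
fence_node node1.example.com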
Created attachment 499972 [details]
cluster.conf + fence log

The attachment contains cluster.conf and the logs from the crash test. The crash-then-fence_node file shows the result of the "fence_node node2.example.com" command; the crash-then-fence_xvm file shows the result of the "fence_xvm -H node2.example.com" command.
It sounds like the request is not getting sent, possibly due to some interaction with SELinux or another problem in the environment. This should be resolved by simply upgrading to the most current release of selinux-policy and selinux-policy-targeted (3.7.19-93.el6 as of this writing). Is that the version you have installed?
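To check and update, something like:

rpm -q selinux-policy selinux-policy-targeted
yum update selinux-policy selinux-policy-targeted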
Just tested the most current version of selinux-policy and selinux-policy-targeted. Fencing fails for the reason shown in the following audit.log entries:

type=AVC msg=audit(1306189841.550:44972): avc: denied { name_bind } for pid=4095 comm="fence_xvm" src=1229 scontext=system_u:system_r:fenced_t:s0 tcontext=system_u:object_r:port_t:s0 tclass=tcp_socket
type=SYSCALL msg=audit(1306189841.550:44972): arch=c000003e syscall=49 success=no exit=-13 a0=4 a1=7fffb7aa1d80 a2=10 a3=7fffb7aa1d7c items=0 ppid=1522 pid=4095 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="fence_xvm" exe="/usr/sbin/fence_virt" subj=system_u:system_r:fenced_t:s0 key=(null)

Setting enforcement to permissive after triggering fence_xvm causes fencing to execute successfully immediately. I'm not too keen on SELinux, so I'm going to set permissive mode permanently until there is a better workaround.
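For anyone following along, the permissive switch is the standard one:

setenforce 0        # permissive until reboot; getenforce should now print "Permissive"
# to make it persistent across reboots, set SELINUX=permissive in /etc/selinux/config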
Thanks for all your suggestions. I checked my audit.log and found it really is an SELinux permission issue. I updated selinux-policy and selinux-policy-targeted to 3.7.19-93.el6, but fence_xvm still didn't work, so I used the audit2allow command to add a new allow module to my SELinux policy. fence_xvm finally works as expected now. Thanks for all your help.
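In case it helps anyone else, the audit2allow recipe was roughly the following (the module name "fencexvmlocal" is arbitrary):

grep fence_xvm /var/log/audit/audit.log | audit2allow -M fencexvmlocal
semodule -i fencexvmlocal.pp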
Moving to the selinux-policy component so that we can get an updated policy and this doesn't happen again.
fence_xvm is attempting to listen on port 1229. Is this a standard port for fence to listen on?
Yes, 1229 is normal for fence_xvm to listen on.
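A quick way to see how the current policy labels that port (assuming semanage is available from policycoreutils-python):

semanage port -l | grep 1229

The AVC above shows it is still the generic port_t, which is why fenced_t cannot name_bind to it.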
OK, we have this policy in F15 and F16. Miroslav, can you backport it to RHEL6?
It should be fixed in the latest RHEL6.2 policy.
*** This bug has been marked as a duplicate of bug 705489 ***