openAIS versions tried:
  openais-0.80.6-16.el5
  openais-0.80.6-27.el5

'clogd' - the cluster log daemon used by cluster mirrors - makes use of openAIS checkpoints when starting up. Before a node joins a CPG (which will cause existing nodes to send a checkpoint), it attempts to remove any stale checkpoints that may be left over from previous sessions. It does this using the following logic:
1) open checkpoint
2) open failed because checkpoint doesn't exist? Done.
3) checkpoint exists - unlink checkpoint
4) close checkpoint

Sometimes (most times), there may not be a residual checkpoint. 'clogd' is stuck indefinitely in saCkptCheckpointOpen (step 1 above). Here is the backtrace from GDB:

#0  0x0000003348ad517a in semtimedop () from /lib64/libc.so.6
#1  0x0000003178c01bfc in ipc_sem_wait (ipc_context=0x14f3d110, iov=<value optimized out>, iov_len=<value optimized out>, res_msg=0x7fff7e9c7470, res_len=32) at util.c:490
#2  openais_reply_receive (ipc_context=0x14f3d110, iov=<value optimized out>, iov_len=<value optimized out>, res_msg=0x7fff7e9c7470, res_len=32) at util.c:681
#3  openais_msg_send_reply_receive (ipc_context=0x14f3d110, iov=<value optimized out>, iov_len=<value optimized out>, res_msg=0x7fff7e9c7470, res_len=32) at util.c:720
#4  0x00000031790037a2 in saCkptCheckpointOpen (ckptHandle=7749363892505018368, checkpointName=0x7fff7e9c78d0, checkpointCreationAttributes=0x0, checkpointOpenFlags=1, timeout=<value optimized out>, checkpointHandle=0x7fff7e9c78c8) at ckpt.c:620
#5  0x000000000041ada6 in remove_checkpoint (entry=0x14f3fee0) at cluster.c:1455
#6  0x000000000041bca9 in create_cluster_cpg (uuid=0x6385d4 "LVM-fHV1NOvCOdTcZMTYYlGPyob3LCkjCRNyKDWk4ORfIjMfpLdmK8qM7Wn0vaJ5qJ00", uuid_instance=1) at cluster.c:1519
#7  0x0000000000424428 in local_resume (tfr=0x6385c4) at functions.c:882
#8  0x000000000042ac66 in do_local_work (data=0x0) at local.c:232
#9  0x00000000004298ee in links_issue_callbacks () at link_mon.c:134
#10 0x0000000000401fbc in main (argc=1, argv=0x7fff7e9c8098) at clogd.c:51

Cluster mirrors cannot be started or tested because of this condition.
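For readers following the four-step cleanup logic described above, here is a minimal sketch of the control flow in Python. It is not the clogd source and does not use the openAIS API: `FakeCheckpointService`, `CheckpointNotFound`, and `remove_stale_checkpoint` are hypothetical names over an in-memory stand-in, used only to show the order of operations. In the failure reported here, the real open call (step 1) never returns, which this sketch does not model.

```python
class CheckpointNotFound(Exception):
    """Raised by the stand-in service when a checkpoint does not exist."""


class FakeCheckpointService:
    """In-memory stand-in for a checkpoint service (hypothetical, not openAIS)."""

    def __init__(self, names=()):
        self._names = set(names)
        self.open_handles = 0

    def open(self, name):
        # Opening without create semantics fails if the checkpoint is absent.
        if name not in self._names:
            raise CheckpointNotFound(name)
        self.open_handles += 1
        return name  # the name doubles as the handle in this sketch

    def unlink(self, name):
        self._names.discard(name)

    def close(self, handle):
        self.open_handles -= 1

    def exists(self, name):
        return name in self._names


def remove_stale_checkpoint(service, name):
    """Return True if a stale checkpoint was found and removed."""
    try:
        handle = service.open(name)   # step 1: open checkpoint
    except CheckpointNotFound:
        return False                  # step 2: nothing to clean up, done
    try:
        service.unlink(name)          # step 3: checkpoint exists, unlink it
    finally:
        service.close(handle)         # step 4: close checkpoint
    return True
```

The `try`/`finally` around the unlink is a design choice of this sketch so that the handle is closed even if unlinking fails; the original description does not say how clogd handles that case.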
Jon used the repository version as well as cmirror-1.1.39-10.el5 (the 5.6 version). The RHEL5.5 cmirrord was functional. Note that openais-0.80.6-16.el5 is the RHEL5.5 version, so the failure appears even with the RHEL5.5 openais.
type=AVC msg=audit(1284489475.292:38): avc: denied { unix_read unix_write } for pid=7402 comm="aisexec" key=1714636915 scontext=root:system_r:aisexec_t:s0 tcontext=root:system_r:unconfined_t:s0-s0:c0.c1023 tclass=shm

type=SYSCALL msg=audit(1284489475.292:38): arch=c000003e syscall=29 success=no exit=-13 a0=66334873 a1=2dc6c8 a2=180 a3=100 items=0 ppid=1 pid=7402 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=2 comm="aisexec" exe="/usr/sbin/aisexec" subj=root:system_r:aisexec_t:s0 key=(null)

type=AVC msg=audit(1284489475.292:39): avc: denied { unix_read unix_write } for pid=7402 comm="aisexec" key=1957747793 scontext=root:system_r:aisexec_t:s0 tcontext=root:system_r:unconfined_t:s0-s0:c0.c1023 tclass=sem
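To make the denials above easier to read, here is a rough Python sketch that pulls the source type, target type, object class, and permissions out of an AVC record and prints the matching raw allow rule. The helper names (`parse_avc`, `to_allow_rule`) are made up for illustration, and the regular expression only assumes the field layout seen in the records quoted in this report. On a real system `audit2allow` is the usual tool for this.

```python
import re

# Matches the fields of an "avc: denied" record in the layout quoted above.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}"
    r".*?comm=\"(?P<comm>[^\"]+)\""
    r".*?scontext=(?P<scontext>\S+)"
    r".*?tcontext=(?P<tcontext>\S+)"
    r".*?tclass=(?P<tclass>\S+)"
)


def parse_avc(record):
    """Return a dict describing the denial, or None for non-AVC records."""
    m = AVC_RE.search(record)
    if m is None:
        return None
    return {
        "comm": m.group("comm"),
        # A context is user:role:type:level; the type is the third field.
        "source_type": m.group("scontext").split(":")[2],
        "target_type": m.group("tcontext").split(":")[2],
        "tclass": m.group("tclass"),
        "perms": m.group("perms").split(),
    }


def to_allow_rule(denial):
    """Render the raw allow rule that would permit this denial."""
    return "allow {source_type} {target_type}:{tclass} {{ {p} }};".format(
        p=" ".join(denial["perms"]), **denial
    )


SHM_DENIAL = (
    'type=AVC msg=audit(1284489475.292:38): avc: denied '
    '{ unix_read unix_write } for pid=7402 comm="aisexec" key=1714636915 '
    'scontext=root:system_r:aisexec_t:s0 '
    'tcontext=root:system_r:unconfined_t:s0-s0:c0.c1023 tclass=shm'
)

if __name__ == "__main__":
    print(to_allow_rule(parse_avc(SHM_DENIAL)))
```

Run against both AVC records, this shows that aisexec_t is denied unix_read and unix_write on unconfined_t objects of two classes, shm and sem.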
Dan, Is it possible to do the removal of the openais boolean as was done in corosync, or does this require a different solution? Regards -steve
No, this is a different problem. Miroslav, I think we should just allow this. We allow it in RHEL6.

sesearch -A -s aisexec_t -t unconfined_t -c shm
Found 1 semantic av rules:
   allow aisexec_t unpriv_userdomain : shm { getattr read write associate unix_read unix_write lock } ;
Fixed in selinux-policy-2.4.6-284.el5.
# cat > myaisexec.te << _EOF
policy_module(myaisexec, 1.0)

require {
    type aisexec_t;
    type unconfined_t;
}

allow aisexec_t unconfined_t:shm create_shm_perms;
allow aisexec_t unconfined_t:shm rw_shm_perms;
_EOF
# make -f /usr/share/selinux/devel/Makefile
# semodule -i myaisexec.pp
needinfo is set for sdake, but no questions are asked.
Does the workaround in comment #9 persist through reboots? If not, what is the recommended procedure to make it persistent? Thanks
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Due to incorrect SELinux policy, cmirror was unable to start properly, and as a result, cluster mirrors could not be started at all. This error has been fixed, and SELinux no longer prevents cluster mirrors from being started.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Due to incorrect SELinux policy, cmirror was unable to start properly, and as a result, cluster mirrors could not be started at all. This error has been fixed, and SELinux no longer prevents cluster mirrors from being started.
+Due to an incorrect SELinux policy, the aisexec service was unable to use shared memory segments as an unprivileged user. This error has been fixed, the relevant SELinux policy has been corrected, and aisexec now works as expected.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0026.html