Created attachment 315353 [details] Patch to close the file descriptors in the event of errors. Description of problem: I have attached a patch which solves this problem by closing the file descriptors which were opened to partitions in the event of an error accessing that partition in disk.c:qdisk_open(). In certain circumstances, qdiskd can encounter errors while scanning for a quorum disk label which causing file descriptors used during that scan to not be closed properly. When qdiskd forks/execs the heuristic those open FDs are inherited by the heuristic program. This is particularly annoying if you are using ping as your heuristic on a SELinux enabled system as you get AVC denials at every ping check filling your logs. This is harmless as the check still can process, but severely clutters the SELinux logs making them less useful for finding real security issues. Version-Release number of selected component (if applicable): # rpm -q cman cman-2.0.84-2.el5 How reproducible: Every time if the conditions are proper. To reproduce this you need: - A host system with a partition that causes the error. On my systems the extended partition container (/dev/sda4 and /dev/sdb4) triggered this. - SELinux set to enforcing with the targeted policy and install-time defaults. - A working cman configured for a quorum disk using a ping heuristic AND configured to look for a qdisk label instead of a specific block device. Steps to Reproduce: 1. Have a SELinux enabled and enforcing system using the targeted policy. 2. Setup a working cluster with a quorum disk using /bin/ping as a heuristic and a qdisk found via label. 3. Watch /var/log/audit/auditd.log (or /var/log/messages depending on config) for SELinux denials. Actual results: If you manage to catch one of the ping processes long enough to list its open file descriptors you see 0 -> /dev/null 1 -> /dev/null 2 -> /dev/null 3 -> /dev/sda4 4 -> /dev/sdb4 (where /dev/sda4 and /dev/sdb4 are the extended partition type) SELinux denials of the form are the obvious sign: type=AVC msg=audit(1219980241.221:6801): avc: denied { read write } for pid=3810 comm="ping" path="/dev/sda4" dev=tmpfs ino=1051 scontext=root:system_r:ping_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file type=AVC msg=audit(1219980241.221:6801): avc: denied { read write } for pid=3810 comm="ping" path="/dev/sdb4" dev=tmpfs ino=985 scontext=root:system_r:ping_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file type=SYSCALL msg=audit(1219980241.221:6801): arch=c000003e syscall=59 success=yes exit=0 a0=93d96b0 a1=93d9720 a2=93d8240 a3=3 items=0 ppid=1 pid=3810 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=220 com m="ping" exe="/bin/ping" subj=root:system_r:ping_t:s0 key=(null) Expected results: No errors and only FDs 0-2 open to /dev/null on the fork/exec. Additional info:
Pushed to master, STABLE2 and RHEL5 branches
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Actually, I just discovered that this is an incomplete fix for the problem. There is a race condition between the heuristic forks and the open() calls in at least disc_util.c:getuptime() and main.c:update_local_status(). I just received two SELinux alerts relating to these open filehandles surviving the fork() into my ping test. The ideal solution would be to use the O_CLOEXEC flag to open(2), but the manpage says that isn't available until Linux 2.6.23. The only solution here might be to add a pthread_mutex_lock to synchronize the threads such that fork() can't be called until the open either fails or returns successfully and a fcntl(fd,F_SETFD,FD_CLOEXEC) can be called.
Created attachment 315593 [details] Adds pthread_mutex_lock()ing around the opens and fork I believe something like this patch is necessary to fix the race condition. Please review it carefully as my C is quite rusty - feedback welcome. It is hard to know for certain if this fixes the problem as it is tricky to reproduce the race condition, but I have this running on a test cluster right now and will report back if it doesn't seem to solve the problem for me.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-0189.html