Bug 460645 - qdiskd isn't always closing file descriptors properly before forking
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Lon Hohberger
QA Contact: Cluster QE
Blocks: 456578
Reported: 2008-08-29 09:06 EDT by Sean E. Millichamp
Modified: 2009-04-16 18:38 EDT (History)
3 users

Doc Type: Bug Fix
Last Closed: 2009-01-20 16:51:40 EST

Attachments
Patch to close the file descriptors in the event of errors. (725 bytes, patch)
2008-08-29 09:06 EDT, Sean E. Millichamp
Adds pthread_mutex_lock()ing around the opens and fork (3.95 KB, patch)
2008-09-02 16:21 EDT, Sean E. Millichamp

Description Sean E. Millichamp 2008-08-29 09:06:12 EDT
Created attachment 315353 [details]
Patch to close the file descriptors in the event of errors.

Description of problem:

In certain circumstances, qdiskd can encounter errors while scanning for a quorum disk label, which causes the file descriptors used during that scan not to be closed properly.  When qdiskd forks/execs the heuristic, those open FDs are inherited by the heuristic program.

I have attached a patch which solves this problem by closing the file descriptors opened to partitions when an error occurs while accessing a partition in disk.c:qdisk_open().

This is particularly annoying if you are using ping as your heuristic on an SELinux-enabled system, as you get AVC denials at every ping check filling your logs.  This is harmless, since the check can still proceed, but it severely clutters the SELinux logs, making them less useful for finding real security issues.
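A minimal sketch of the error-path fix the attached patch makes. The function and device names here are hypothetical stand-ins, not the actual disk.c code: the point is that the label probe must close its fd on failure instead of leaking it into later fork()/exec() calls.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the label check done in qdisk_open() */
static int check_label(int fd)
{
	char magic[4];

	/* A real check would validate the qdisk header; here we only
	   require that 4 bytes are readable from the device. */
	return (read(fd, magic, sizeof(magic)) == sizeof(magic)) ? 0 : -1;
}

/* Open a candidate partition; return its fd, or -1 on any error. */
int qdisk_try_open(const char *dev)
{
	int fd = open(dev, O_RDONLY);

	if (fd < 0)
		return -1;
	if (check_label(fd) < 0) {
		close(fd);	/* the fix: don't leak the fd on the error path */
		return -1;
	}
	return fd;
}
```

Without that close(), each failed probe (e.g. of an extended-partition container like /dev/sda4) leaves an fd open that survives into the heuristic's address space.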

Version-Release number of selected component (if applicable):

# rpm -q cman

How reproducible:

Every time, given the right conditions.  To reproduce this you need:

- A host system with a partition that causes the error.  On my systems the extended partition container (/dev/sda4 and /dev/sdb4) triggered this.

- SELinux set to enforcing with the targeted policy and install-time defaults.

- A working cman configured for a quorum disk using a ping heuristic AND configured to look for a qdisk label instead of a specific block device.

Steps to Reproduce:
1. Have a SELinux enabled and enforcing system using the targeted policy.
2. Setup a working cluster with a quorum disk using /bin/ping as a heuristic and a qdisk found via label.
3. Watch /var/log/audit/audit.log (or /var/log/messages depending on config) for SELinux denials.
Actual results:

If you manage to catch one of the ping processes long enough to list its open file descriptors, you see:
0 -> /dev/null
1 -> /dev/null
2 -> /dev/null
3 -> /dev/sda4
4 -> /dev/sdb4

(where /dev/sda4 and /dev/sdb4 are the extended partition type)

SELinux denials of the following form are the obvious sign:

type=AVC msg=audit(1219980241.221:6801): avc:  denied  { read write } for  pid=3810 comm="ping" path="/dev/sda4" dev=tmpfs ino=1051 scontext=root:system_r:ping_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
type=AVC msg=audit(1219980241.221:6801): avc:  denied  { read write } for  pid=3810 comm="ping" path="/dev/sdb4" dev=tmpfs ino=985 scontext=root:system_r:ping_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
type=SYSCALL msg=audit(1219980241.221:6801): arch=c000003e syscall=59 success=yes exit=0 a0=93d96b0 a1=93d9720 a2=93d8240 a3=3 items=0 ppid=1 pid=3810 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=220 comm="ping" exe="/bin/ping" subj=root:system_r:ping_t:s0 key=(null)

Expected results:

No errors and only FDs 0-2 open to /dev/null on the fork/exec.

Additional info:
Comment 1 Lon Hohberger 2008-09-02 11:36:08 EDT
Pushed to master, STABLE2 and RHEL5 branches
Comment 2 RHEL Product and Program Management 2008-09-02 11:42:10 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update release.
Comment 3 Sean E. Millichamp 2008-09-02 14:59:09 EDT
Actually, I just discovered that this is an incomplete fix for the problem.

There is a race condition between the heuristic fork()s and the open() calls in at least disc_util.c:getuptime() and main.c:update_local_status().  I just received two SELinux alerts relating to these open file descriptors surviving the fork() into my ping test.

The ideal solution would be to use the O_CLOEXEC flag to open(2), but the manpage says that isn't available until Linux 2.6.23.  The only solution here might be to add a pthread_mutex_lock to synchronize the threads such that fork() can't be called until the open either fails or returns successfully and a fcntl(fd,F_SETFD,FD_CLOEXEC) can be called.
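The locking scheme suggested above can be sketched as follows. The wrapper names are hypothetical (the actual patch touches the existing call sites directly): a single mutex serializes every open()+fcntl(FD_SETFD) pair against the fork() in the heuristic path, so no fd can be inherited before its close-on-exec flag is set. O_CLOEXEC would make this atomic without a lock, but it requires Linux >= 2.6.23.

```c
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t fork_mutex = PTHREAD_MUTEX_INITIALIZER;

/* All of qdiskd's opens would go through a wrapper like this. */
int open_cloexec(const char *path, int flags)
{
	int fd;

	pthread_mutex_lock(&fork_mutex);
	fd = open(path, flags);
	if (fd >= 0)
		fcntl(fd, F_SETFD, FD_CLOEXEC);	/* set before fork() can run */
	pthread_mutex_unlock(&fork_mutex);
	return fd;
}

/* The heuristic thread takes the same mutex around fork(), so it can
   never observe an fd that is open but not yet marked close-on-exec. */
pid_t fork_heuristic(void)
{
	pid_t pid;

	pthread_mutex_lock(&fork_mutex);
	pid = fork();
	pthread_mutex_unlock(&fork_mutex);
	return pid;
}
```

Any fd opened outside the lock, or opened inside it, is either already FD_CLOEXEC by the time fork() returns or not yet open at all; either way the exec'd heuristic never sees it.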
Comment 4 Sean E. Millichamp 2008-09-02 16:21:15 EDT
Created attachment 315593 [details]
Adds pthread_mutex_lock()ing around the opens and fork

I believe something like this patch is necessary to fix the race condition.  Please review it carefully as my C is quite rusty - feedback welcome.

It is hard to know for certain if this fixes the problem as it is tricky to reproduce the race condition, but I have this running on a test cluster right now and will report back if it doesn't seem to solve the problem for me.
Comment 7 errata-xmlrpc 2009-01-20 16:51:40 EST
An advisory has been issued which should help the problem
described in this bug report.  This report is therefore being
closed with a resolution of ERRATA.  For more information
on the solution and/or where to find the updated files,
please follow the link below.  You may reopen this bug report
if the solution does not work for you.

