Created attachment 366475 [details] test case

I'll attach a test case and a patch. The test case succeeds on F11 and fails on RHEL 5.4.

The issue is that threads waiting on the futex_q list acquire the mutex in the order they were queued rather than in priority order.

From man pthread_attr_setschedpolicy: "When threads executing with the scheduling policy SCHED_FIFO, SCHED_RR, or SCHED_SPORADIC are waiting on a mutex, they shall acquire the mutex in priority order when the mutex is unlocked."

From http://www.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_trylock.html: "If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock() is called, resulting in the mutex becoming available, the scheduling policy shall determine which thread shall acquire the mutex."

The problem is that if you set the policy to SCHED_FIFO, waiting threads still get the mutex first-in, first-out, regardless of what their priority is set to. I think the reason is that in RHEL 5, robust mutexes are still FIFO in terms of the order in which threads acquire the lock once it is unlocked: plist is not used, and the queue is basically a normal Linux list.

In RHEL 5:

struct futex_q {
        struct list_head list;
        wait_queue_head_t waiters;
        .....

In later kernels:

struct futex_q {
        struct plist_node list;
        wait_queue_head_t waiters;
        ....
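To make the difference between the two list types concrete, here is a minimal user-space sketch. It is not kernel code: struct waiter, enqueue_fifo(), enqueue_prio() and dequeue() are hypothetical names invented for illustration, and the real plist implementation is more involved. It assumes the kernel convention that a lower numeric value means a higher priority, and that a wake-up takes the waiter at the head of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued waiter; NOT the kernel's struct futex_q. */
struct waiter {
        int prio;               /* lower value = higher priority (assumed) */
        int id;                 /* arrival order, for illustration only */
        struct waiter *next;
};

/* Plain-list behaviour: append at the tail, so wake order is arrival order. */
static void enqueue_fifo(struct waiter **head, struct waiter *w)
{
        assert(w != NULL);
        w->next = NULL;
        while (*head)
                head = &(*head)->next;
        *head = w;
}

/* plist-like behaviour: keep the list sorted by prio, FIFO among equals. */
static void enqueue_prio(struct waiter **head, struct waiter *w)
{
        assert(w != NULL);
        while (*head && (*head)->prio <= w->prio)
                head = &(*head)->next;
        w->next = *head;
        *head = w;
}

/* A wake-up removes the waiter at the head of the list in both schemes. */
static struct waiter *dequeue(struct waiter **head)
{
        struct waiter *w = *head;

        if (w) {
                *head = w->next;
                w->next = NULL;
        }
        return w;
}
```

With three waiters arriving with priorities 30, 10 and 20, the FIFO list hands them back in arrival order, while the sorted list hands back 10, then 20, then 30. That is the ordering difference the test case appears to trip over.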
Created attachment 366476 [details] patch against 5.4

I found an upstream commit that addresses this, and I have attached a port to 5.4. It fixes the issue for realtime processes and for the test case. Apparently normal-priority processes are still woken in FIFO order.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ec92d08292d3e9b0823eba138a4564d2d39f25c7
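As far as I understand the upstream change, the reason normal-priority processes stay FIFO is that the key used to queue a waiter is clamped: realtime threads are queued under their own kernel priority, while every non-realtime thread is queued under the same value, so those threads tie and keep arrival order. A rough sketch of that idea follows. SKETCH_MAX_RT_PRIO, rt_to_kernel_prio() and wait_key() are hypothetical names, and the value 100 and the priority mapping are my assumptions about kernels of this era, not something taken from the attached patch.

```c
#include <assert.h>

/* Assumed value of the kernel's MAX_RT_PRIO; labelled as a sketch constant. */
#define SKETCH_MAX_RT_PRIO 100

/*
 * Hypothetical helper: map a user-visible realtime priority (1..99) to the
 * kernel-internal priority, where a lower value means a higher priority.
 */
static int rt_to_kernel_prio(int rt_priority)
{
        assert(rt_priority >= 1 && rt_priority < SKETCH_MAX_RT_PRIO);
        return SKETCH_MAX_RT_PRIO - 1 - rt_priority;
}

/*
 * Hypothetical helper: the key a waiter would be queued under. Realtime
 * threads keep their own priority; all others collapse to a single key.
 */
static int wait_key(int kernel_prio)
{
        assert(kernel_prio >= 0);
        return kernel_prio < SKETCH_MAX_RT_PRIO ? kernel_prio
                                                : SKETCH_MAX_RT_PRIO;
}
```

Under these assumptions a SCHED_FIFO thread at priority 99 gets key 0 and is woken ahead of one at priority 1 (key 98), while two SCHED_OTHER threads with kernel priorities 120 and 139 both get key 100 and are woken in the order they arrived.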
Given the changes to the futex_q and futex_hash_bucket structures, did you build your test kernel via brew in order to verify that there are no KABI issues?
We built a test kernel via brew and gave it to the customer to test. You can find it at: https://brewweb.devel.redhat.com/taskinfo?taskID=2051141
Do you want to post your patch to rhkernel-list?
Hi Dave, I sent it out
Thanks Jon -- setting POST: http://post-office.corp.redhat.com/archives/rhkernel-list/2009-October/msg00852.html
Created attachment 366869 [details] same patch with tabbage fixed
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Created attachment 367305 [details] patch with tabbage fixed #2
in kernel-2.6.18-173.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0178.html