Bug 447871
Summary: | prio-wake testcase failures
---|---
Product: | Red Hat Enterprise MRG
Reporter: | IBM Bug Proxy <bugproxy>
Component: | realtime-kernel
Assignee: | Red Hat Real Time Maintenance <rt-maint>
Status: | CLOSED WONTFIX
Severity: | high
Priority: | low
Version: | beta
CC: | bhu, williams
Hardware: | x86_64
OS: | All
Doc Type: | Bug Fix
Last Closed: | 2012-01-05 21:12:00 UTC

Description
IBM Bug Proxy
2008-05-22 08:48:25 UTC
Created attachment 306350
fixes to test pass/fail reporting
Created attachment 306351
make output of prio-wake easily readable
------- Comment From sripathi.com 2008-05-26 02:17 EDT-------
Have the LTP patches been sent to their ML? Have they been accepted?

------- Comment From ankigarg.com 2008-05-26 02:19 EDT-------
(In reply to comment #11)
> Have the LTP patches been sent to their ML? Have they been accepted?

Sending out the patches now...

------- Comment From ankigarg.com 2008-05-27 08:31 EDT-------
Trying sched_switch tracer now.

------- Comment From dvhltc.com 2008-06-02 14:07 EDT-------
I'd like to see if the -62 (alpha17) kernel improves this scenario since it fixes some PI related bugs.

------- Comment From ankigarg.com 2008-06-03 07:57 EDT-------
Seeing an equivalent number of failures even with the -62 kernel.

------- Comment From dvhltc.com 2008-06-03 18:33 EDT-------
After discussing this a bit with various folks, I believe that the test case IS actually valid, but that we can't expect it to pass until after we resolve some issues with the current pthread_cond_* implementations (both glibc and kernel).

As it stands, priority inversion is possible with pthread_cond_* as the condition variables do not use PI mutexes internally, and they do not have an explicit ownership handoff at signal/broadcast time. So if a broadcast is sent and the implementation wakes the highest prio thread first (and the CPU is in interrupt context), that thread won't be able to grab the mutex immediately. As the implementation then signals the next thread (which gets to run immediately on its runqueue), that thread will grab the mutex, so the higher priority thread will end up blocking once the CPU returns execution to it. An explicit handoff of ownership prior to returning control to the threads should eliminate this scenario.

So I think we will need to defer this bug until such time as we get the long-standing "requeue_pi/condvar" issues sorted out. That said, before we defer the bug, I would like to make sure we agree that the test is valid.
I know Steven Rostedt had concerns that it wasn't valid if multiple CPUs were involved. Given my explanation above, are there still concerns over the validity of the test case?

------- Comment From ankigarg.com 2008-06-04 02:59 EDT-------
(In reply to comment #16)
> After discussing this a bit with various folks, I believe that the test case IS
> actually valid, but that we can't expect it to pass until after we resolve some
> issues with the current pthread_cond_* implementations (both glibc and kernel).
>
> As it stands, priority inversion is possible with pthread_cond_* as the
> condition variables do not use PI mutexes internally, and they do not have an
> explicit ownership handoff at signal/broadcast time. So if a broadcast is sent
> and the implementation wakes the highest prio thread first (and the CPU is in
> interrupt context) it won't be able to grab the mutex immediately, so as the
> implementation signals the next thread (which gets to run immediately on its
> runqueue) it will grab the mutex so that the higher priority thread will end up
> blocking once the CPU returns execution to it. An explicit handoff of ownership
> prior to returning control to the threads should eliminate this scenario.
>
> So I think we will need to defer this bug until such time as we get the
> long-standing "requeue_pi/condvar" issues sorted out. That said, before we
> defer the bug, I would like to make sure we agree that the test is valid. I
> know Steven Rostedt had concerns that it wasn't valid if multiple CPUs were
> involved. Given my explanation above, are there still concerns over the
> validity of the test-case?

I too agree that the testcase is valid and in line with the expected behavior. But with the current implementation of pthread_* and the RT scheduler, the testcase would be expected to fail on SMP. That is why no failures were observed when the testcase was bound to a single CPU.
We could defer the bug. I only wanted to track down the exact reason why the high prio thread was not being woken up first. Could it be due to anything in the scheduler, besides the point that the higher prio thread could be waking up in interrupt context?

------- Comment From dvhltc.com 2008-06-23 15:11 EDT-------
I agree with the DEFERRED state, but not so much with P5. I think this is a real issue we need to address, but we can't until the pi_requeue work is sorted out. Doesn't that keep it at least at the P3 level?

------- Comment From sripathi.com 2008-06-24 01:25 EDT-------
(In reply to comment #23)
> I agree with the DEFERRED state, but not so much with P5. I think this is a
> real issue we need to address, but we can't until the pi_requeue work is sorted
> out. Doesn't that keep it at least at the P3 level?

OK.

------- Comment From dvhltc.com 2009-04-10 13:41 EDT-------
*** Bug 52280 has been marked as a duplicate of this bug. ***

------- Comment From dvhltc.com 2009-04-10 13:44 EDT-------
I've tested this with the kernel fixes for Bug 48484 and a preliminary hacked glibc, with over 13k successful runs. Moving back to Open state and documenting its dependency on the glibc fix as well.

------- Comment From dvhltc.com 2009-04-10 13:45 EDT-------
This bug will not be fixed in R2-SR1. We are hoping for MRG 1.2, but for now marking it as upstream.

------- Comment From dino.com 2009-05-07 08:48 EDT-------
Sent the glibc patches for requeue_pi to Clark on the Rhel-rt-ibm list.

------- Comment From johnstul.com 2009-05-28 12:59 EDT-------
*** Bug 51506 has been marked as a duplicate of this bug. ***

The glibc guys won't take the requeue_pi patches and I don't want to deliver a special glibc just for realtime. Closing WONTFIX.

------- Comment From niv.com 2012-02-16 16:32 EDT-------
Closing bug from our end WILL_NOT_FIX. We should yank the test from LTP, or at least modify the test to indicate it will fail on SMP. Test cleanups will be handled under a separate bug.
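
For readers without the LTP source at hand, the scenario discussed in this bug can be sketched roughly as follows. This is a minimal illustration, not the actual prio-wake testcase: the function name `run_wake_order`, the `pin_to_cpu0` flag, and the priority range 1..n are all assumptions made for the example. Waiters of increasing SCHED_FIFO priority block on one condition variable, a broadcast wakes them, and the order in which they reacquire the mutex is recorded. Setting the scheduling policy is best effort, so without realtime privileges the recorded order says nothing about priority handling.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define MAX_WAITERS 16

/* Shared state; everything below is protected by `lock`. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int go;                      /* predicate the waiters sleep on */
static int waiting;                 /* waiters parked in pthread_cond_wait */
static int woken;                   /* waiters that have reacquired the lock */
static int wake_order[MAX_WAITERS]; /* priorities in wake-up order */

static void *waiter(void *arg)
{
    int prio = *(int *)arg;
    struct sched_param sp = { .sched_priority = prio };

    /* Best effort: SCHED_FIFO needs privileges, so a failure is ignored. */
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);

    pthread_mutex_lock(&lock);
    waiting++;
    while (!go)
        pthread_cond_wait(&cond, &lock);
    wake_order[woken++] = prio;     /* order of mutex reacquisition */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/*
 * Hypothetical helper: starts n waiters with priorities 1..n, broadcasts,
 * copies the wake-up order into order_out, and returns how many adjacent
 * pairs woke in the "wrong" order (lower priority before higher priority).
 */
int run_wake_order(int n, int pin_to_cpu0, int *order_out)
{
    pthread_t tid[MAX_WAITERS];
    int prio[MAX_WAITERS];
    int i, rc, ready, inversions = 0;

    assert(n > 0 && n <= MAX_WAITERS);
    if (pin_to_cpu0) {
        /* Mirrors the single-CPU binding mentioned in the comments;
         * threads created afterwards inherit the mask. Errors ignored. */
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);
        sched_setaffinity(0, sizeof(set), &set);
    }
    go = waiting = woken = 0;
    for (i = 0; i < n; i++) {
        prio[i] = i + 1;
        rc = pthread_create(&tid[i], NULL, waiter, &prio[i]);
        assert(rc == 0);
        (void)rc;
    }
    /* Wait until every waiter is blocked on the condition variable. */
    do {
        pthread_mutex_lock(&lock);
        ready = (waiting == n);
        pthread_mutex_unlock(&lock);
        if (!ready)
            sched_yield();
    } while (!ready);

    pthread_mutex_lock(&lock);
    go = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);

    for (i = 0; i < n; i++)
        pthread_join(tid[i], NULL);
    for (i = 0; i < n; i++) {
        order_out[i] = wake_order[i];
        if (i > 0 && wake_order[i] > wake_order[i - 1])
            inversions++;
    }
    return inversions;
}
```

Calling `run_wake_order(4, 0, order)` on an SMP machine with realtime privileges is the configuration in which the comments above expect a nonzero result to be possible, while passing `pin_to_cpu0 = 1` corresponds to the single-CPU binding under which no failures were observed. The sketch uses a plain (non-PI) mutex and makes no attempt at the explicit ownership handoff described in the comments.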