Bug 487382 - pselect with timeout=NULL triggers a warning at kernel/hrtimer.c:439 hrtimer_reprogram()
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Development
All Linux
Priority: low, Severity: medium
: 1.1.1
Assigned To: Red Hat Real Time Maintenance
David Sommerseth
Reported: 2009-02-25 13:40 EST by Luis Claudio R. Goncalves
Modified: 2016-05-22 19:28 EDT
Doc Type: Bug Fix
Last Closed: 2009-03-26 20:15:22 EDT

Attachments
Reproducer for the pselect issue (574 bytes, text/plain)
2009-02-25 13:40 EST, Luis Claudio R. Goncalves
Possible fix for the NULL timeout handling in sys_pselect7() (1.58 KB, patch)
2009-02-26 08:44 EST, Luis Claudio R. Goncalves

Description Luis Claudio R. Goncalves 2009-02-25 13:40:47 EST
Created attachment 333203
Reproducer for the pselect issue

[root@void ~]# uname -r
2.6.24.7-103.el5rt


WARNING: at kernel/hrtimer.c:439 hrtimer_reprogram()
Pid: 666, comm: test_pselect Not tainted 2.6.24.7-103.el5rt #1

Call Trace:
 [<ffffffff8128818c>] ? rt_spin_lock_slowlock+0x226/0x24c
 [<ffffffff81054f35>] hrtimer_reprogram+0x60/0xb2
 [<ffffffff81054fe6>] enqueue_hrtimer+0x5f/0xe8
 [<ffffffff81055a01>] hrtimer_start+0x111/0x17f
 [<ffffffff8128714e>] schedule_hrtimeout+0x9e/0xeb
 [<ffffffff81055314>] ? hrtimer_wakeup+0x0/0x21
 [<ffffffff8128714e>] ? schedule_hrtimeout+0x9e/0xeb
 [<ffffffff810be499>] do_select+0x4bf/0x52d
 [<ffffffff810be95b>] ? __pollwait+0x0/0xdf
 [<ffffffff810346fc>] ? default_wake_function+0x0/0x11
 [<ffffffff810346fc>] ? default_wake_function+0x0/0x11
 [<ffffffff810346fc>] ? default_wake_function+0x0/0x11
 [<ffffffff810346fc>] ? default_wake_function+0x0/0x11
 [<ffffffff810346fc>] ? default_wake_function+0x0/0x11
 [<ffffffff810346fc>] ? default_wake_function+0x0/0x11
 [<ffffffff8107fb76>] ? __rcu_read_unlock+0x5a/0x5c
 [<ffffffff81085013>] ? find_get_page+0x161/0x173
 [<ffffffff8103470b>] ? default_wake_function+0xf/0x11
 [<ffffffff810304fe>] ? __wake_up_common+0x41/0x74
 [<ffffffff81085399>] ? find_lock_page+0x1e/0x5d
 [<ffffffff81087561>] ? filemap_fault+0x1fd/0x399
 [<ffffffff810be6cd>] core_sys_select+0x1c6/0x275
 [<ffffffff8105ef91>] ? __rt_mutex_adjust_prio+0x11/0x24
 [<ffffffff8105f896>] ? rt_mutex_adjust_prio+0x35/0x3e
 [<ffffffff8128779b>] ? rt_read_slowunlock+0x473/0x4ac
 [<ffffffff8106047b>] ? rt_mutex_up_read+0x9d/0xa1
 [<ffffffff81060b11>] ? rt_up_read+0x9/0xb
 [<ffffffff810be837>] sys_pselect7+0xbb/0x139
 [<ffffffff810b112d>] ? vfs_write+0x13b/0x170
 [<ffffffff810bebcd>] sys_pselect6+0x5d/0x6a
 [<ffffffff8100c22e>] system_call_ret+0x0/0x5
Comment 1 Luis Claudio R. Goncalves 2009-02-26 08:44:55 EST
Created attachment 333325
Possible fix for the NULL timeout handling in sys_pselect7()

This patch has been added to CVS (probably -105) for testing and the commit log says:

commit 62568510b8e2679cbc331d7de10ea9ba81ae8b3d
Author: Bernd Schmidt <bernds_cb1@t-online.de>
Date:   Tue Jan 13 22:14:48 2009 +0100

    Fix timeouts in sys_pselect7
    
    Since we (Analog Devices) updated our Blackfin kernel to 2.6.28, we've
    seen occasional 5-second hangs from telnet.  telnetd calls select with a
    NULL timeout, but with the new kernel, the system call occasionally
    returns 0, which causes telnet to call sleep (5).  This did not happen
    with earlier kernels.
    
    The code in sys_pselect7 looks a bit strange, in particular the variable
    "to" is initialized to NULL, then changed if a non-null timeout was
    passed in, but not used further.  It needs to be passed to
    core_sys_select instead of &end_time.
    
    This bug was introduced by 8ff3e8e85fa6c312051134b3953e397feb639f51
    ("select: switch select() and poll() over to hrtimers").
    
    Signed-off-by: Bernd Schmidt <bernd.schmidt@analog.com>
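
The control flow described in the commit log can be illustrated with a small user-space model. This is only a sketch, not the kernel source: `waits_forever`, `pselect_model_buggy` and `pselect_model_fixed` are hypothetical names, and the stand-in for core_sys_select() merely reports whether it received a NULL end time (block indefinitely) or a deadline.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for core_sys_select(): returns 1 if the wait would
 * block indefinitely (NULL end time), 0 if a deadline was supplied. */
int waits_forever(const struct timespec *end_time)
{
    return end_time == NULL;
}

/* Models the flow the commit log describes as broken: `to` is set up
 * correctly but &end_time is passed on regardless. */
int pselect_model_buggy(const struct timespec *user_timeout)
{
    struct timespec end_time = {0, 0};
    struct timespec *to = NULL;

    if (user_timeout != NULL) {
        end_time = *user_timeout; /* the kernel would compute a deadline here */
        to = &end_time;
    }
    (void)to; /* computed, then never used */
    return waits_forever(&end_time);
}

/* Models the corrected flow: pass `to`, which stays NULL when the caller
 * gave no timeout. */
int pselect_model_fixed(const struct timespec *user_timeout)
{
    struct timespec end_time = {0, 0};
    struct timespec *to = NULL;

    if (user_timeout != NULL) {
        end_time = *user_timeout;
        to = &end_time;
    }
    return waits_forever(to);
}

/* Small self-check of the two models. */
void check_models(void)
{
    struct timespec five_seconds = {5, 0};

    assert(pselect_model_buggy(NULL) == 0);          /* NULL timeout is lost */
    assert(pselect_model_fixed(NULL) == 1);          /* NULL timeout blocks */
    assert(pselect_model_fixed(&five_seconds) == 0); /* real timeout kept */
}
```

In the model, a NULL timeout reaches the wait routine as a pointer to a zeroed end time in the first variant, which matches the symptom reported here of pselect returning at once instead of blocking.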
Comment 2 Luis Claudio R. Goncalves 2009-02-26 09:43:20 EST
The patch above fixed both the behavior and the backtrace in pselect7().

Before applying the patch, every time the timeout was NULL one would notice: a) a backtrace in dmesg and b) that pselect, which should block until an event occurred (in this example a key press), returned immediately, as if the timeout had expired.

Now it works:

[root@void ~]# dmesg -c
<cut the long messages>

[root@void ~]# /tmp/test_pselect 
Entering the test loop...
<keypress>
Out of the test loop...

[root@void ~]# /tmp/test_pselect 
Entering the test loop...
<keypress>
Out of the test loop...

[root@void ~]# dmesg
[root@void ~]#
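
The attached reproducer is not quoted in this report. A minimal sketch of a test along the same lines, assuming it simply waits on standard input with a NULL timeout, could look like the following. The helper names `wait_readable` and `wait_for_stdin` are hypothetical and not taken from the attachment.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable.
 * timeout == NULL asks pselect() to block until an event occurs.
 * Returns the pselect() result: 1 if fd is readable, 0 on timeout,
 * -1 on error. */
int wait_readable(int fd, const struct timespec *timeout)
{
    fd_set readfds;

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* Last argument NULL: leave the signal mask unchanged. */
    return pselect(fd + 1, &readfds, NULL, NULL, timeout, NULL);
}

/* Blocks until standard input becomes readable (NULL timeout). */
int wait_for_stdin(void)
{
    return wait_readable(STDIN_FILENO, NULL);
}
```

A main() that prints a message, calls wait_for_stdin() and prints a second message would show the difference: on a kernel with the fix the call should not return until input arrives, whereas the behavior described above is an immediate return of 0. Note that on a terminal in canonical mode the descriptor only becomes readable once a full line has been entered.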
Comment 4 David Sommerseth 2009-03-24 11:30:06 EDT
Found upstream commit 62568510b8e2679cbc331d7de10ea9ba81ae8b3d as mrg-rt-v1.git commit a595346a5640b2f11e3f2f0257274ad6da23333b, implemented in 2.6.24.7-107.

Verified by code review.
Comment 6 errata-xmlrpc 2009-03-26 20:15:22 EDT
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0360.html
