Red Hat Bugzilla – Bug 251038
poll system call behavior during suspend/resume
Last modified: 2013-08-15 08:26:30 EDT
Description of problem: poll system call retunrs with EINTR when system goes to standby. Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: it should not return. below is the email exchange with Chris - * Agarwal, Lomesh (lomesh.agarwal@intel.com) wrote: > > Chris Wright (chrisw@redhat.com) wrote: > > Yes, please. > [Agarwal, Lomesh] how do I open a bug against RHEL? bugzilla.redhat.com -> enter new bug Red Hat Enterprise Linux Version 5 Component Kernel Feel free to put me on Cc: list. > > Also note, turns out the patch will not work as is. If you > > specify a (non-negative) timeout value and send STOP/CONT before > > timeout expires the poll will restart with original timeout. Repeat > > this and you can see this process never returning to userspace. > [Agarwal, Lomesh] non-negative timeout means infinite timeout and app > is Negative timeout means infinite timeout, non-negative timeout means a real timeout value. IOW, if you specifiy a timeout you'd expect to return to userspace at the end of the timeout instead of never. > supposed to hang until one of the file descriptor is ready for IO or a > signal is sent. The only behavior change due to this patch would be > that app won't get control due to any signal which it does not handle > (signal which it handles will still return the control). I don't think > any app writer will rely on unhandled signal to get the control back. > Also if this is a acceptable behavior in select then why is it not > acceptable in poll? Well, it's slightly more subtle than that. Both select and ppoll have a timeout structure that can be rewritten with time left until timeout, so they can easily restart without redoing the full timeout. If updating the timeout structure fails, then select/ppoll will still return early with EINTR. > The full solution > is more invasive, and we'll need to re-evaluate when it's been done. Doesn't look as bad as I thought it might. Here's a first cut. fs/select.c | 34 +++++++++++++++++++++++++++++++++- 1 files changed, 33 insertions(+), 1 deletions(-) diff --git a/fs/select.c b/fs/select.c index a974082..50e6d8e 100644 --- a/fs/select.c +++ b/fs/select.c @@ -736,10 +736,29 @@ out_fds: return err; } +long do_restart_poll(struct restart_block *restart_block) { + struct pollfd __user *ufds = (struct pollfd __user*)restart_block- >arg0; + int nfds = restart_block->arg1; + s64 timeout = ((s64)restart_block->arg3<<32) | (s64)restart_block- >arg2; + int ret; + + restart_block->fn = do_no_restart_syscall; + ret = do_sys_poll(ufds, nfds, &timeout); + if (ret == -EINTR) { + restart_block->fn = do_restart_poll; + restart_block->arg2 = timeout & 0xFFFFFFFF; + restart_block->arg3 = timeout >> 32; + ret = -ERESTART_RESTARTBLOCK; + } + return ret; +} + asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds, long timeout_msecs) { s64 timeout_jiffies; + int ret; if (timeout_msecs > 0) { #if HZ > 1000 @@ -754,7 +773,20 @@ asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds, timeout_jiffies = timeout_msecs; } - return do_sys_poll(ufds, nfds, &timeout_jiffies); + ret = do_sys_poll(ufds, nfds, &timeout_jiffies); + if (ret == -EINTR) { + if (timeout_msecs > 0) { + struct restart_block *restart_block; + restart_block = ¤t_thread_info()->restart_block; + restart_block->fn = do_restart_poll; + restart_block->arg0 = (unsigned long)ufds; + restart_block->arg1 = nfds; + restart_block->arg2 = timeout_jiffies & 0xFFFFFFFF; + restart_block->arg3 = timeout_jiffies >> 32; + ret = -ERESTART_RESTARTBLOCK; + } + } + return ret; } #ifdef TIF_RESTORE_SIGMASK Additional info:
Unfortunately, given the age of the problem report, it's relatively low severity, and the lifecycle stage that RHEL 5 is in, this problem is not going to be fixed in RHEL 5. -Lenny.