+++ This bug was initially created as a clone of Bug #675883 +++

Description of problem:
In several places, an epoll fd can call another file's ->f_op->poll() method with ep->mtx held. This is unsafe in general, because that other file could itself be an epoll fd that contains the original epoll fd. The code defends against this possibility in its own ->poll() method using ep_call_nested, but several other calls to ->poll are unprotected and can be made to deadlock. For example, the testcase linked below causes the call in ep_insert to recursively call the original fd's ->poll, leading to deadlock.

Testcase: https://lkml.org/lkml/2011/2/5/220
Proposed patch: https://lkml.org/lkml/2011/2/7/369

The deadlock is limited to the console, not the entire system.
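For readers without access to the linked testcase, here is a minimal userspace sketch of the kind of circular epoll structure described above. It is not the LKML testcase. It assumes Linux and Python's select.epoll wrapper, and the helper name try_epoll_cycle is purely illustrative. According to epoll_ctl(2), a kernel that rejects circular epoll structures should fail the second registration with ELOOP. An unpatched kernel may accept it and create the cycle that the deadlock depends on, so run this only on a disposable machine.

```python
import errno
import select


def try_epoll_cycle():
    """Try to build the cycle e2 -> e1 -> e2.

    Returns the errno raised by the registration that would close the
    cycle, or 0 if the kernel accepted it. (Hypothetical helper, not
    taken from the bug report or the linked testcase.)
    """
    e1 = select.epoll()
    e2 = select.epoll()
    try:
        # e2 watches e1: a legal one-level nesting.
        e2.register(e1.fileno(), select.EPOLLIN)
        try:
            # e1 watches e2: this would make the two instances
            # monitor each other.
            e1.register(e2.fileno(), select.EPOLLIN)
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        e1.close()
        e2.close()


if __name__ == "__main__":
    err = try_epoll_cycle()
    if err == errno.ELOOP:
        print("kernel rejected the circular epoll structure (ELOOP)")
    elif err == 0:
        print("kernel accepted the cycle; it may be affected")
    else:
        print("unexpected error:", errno.errorcode.get(err, err))
```

A return value of 0 does not by itself demonstrate the deadlock. It only shows that the kernel allowed the cycle to be created.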
Updated proposed patch: https://lkml.org/lkml/2011/2/12/108
Andrew has revised the patch: https://lkml.org/lkml/2011/2/14/497
mv 1.3 to 2.0
Upstream commit: 22bacca48a1755f79b7e0f192ddb9fbb7fc6e64e
Verified by reproducing the issue. As there were no RT kernels old enough to reproduce this issue, I had to install kernel-2.6.32-130.el6.x86_64 to trigger it.

No issues found on the following kernels:
+ kernel-rt-3.0.25-rt44.57.el6rt.x86_64 (MRG 2.1)
+ kernel-rt-3.0.18-rt34.53.el6rt.x86_64 (MRG 2.1)
+ kernel-2.6.32-220.el6.x86_64 (RHEL 6.2)

-> VERIFIED
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
* Cause: epoll code was calling another open file's poll operation without that file's lock held
* Consequence: potential deadlock when using the system console
* Fix: apply upstream patch to use ep_call_nested
* Result: no deadlock on system console
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0496.html