Bug 675885

Summary: kernel: epoll: Prevent deadlock through unsafe ->f_op->poll() calls [mrg-2.1]
Product: Red Hat Enterprise MRG Reporter: Eugene Teo (Security Response) <eteo>
Component: realtime-kernel Assignee: John Kacur <jkacur>
Status: CLOSED ERRATA QA Contact: David Sommerseth <davids>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 2.0 CC: bhu, jkacur, lgoncalv, ovasik, williams
Target Milestone: 2.1.5   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
* Cause: epoll code was calling another open file's poll operation without that file's lock held
* Consequence: potential deadlock when using the system console
* Fix: apply upstream patch to use ep_call_nested
* Result: no deadlock on system console
Story Points: ---
Clone Of: 675883 Environment:
Last Closed: 2012-04-18 20:03:29 UTC Type: ---
Bug Depends On: 675883, 681683    
Bug Blocks:    

Description Eugene Teo (Security Response) 2011-02-08 06:33:51 UTC
+++ This bug was initially created as a clone of Bug #675883 +++

Description of problem:
In several places, an epoll fd can call another file's ->f_op->poll() method with ep->mtx held. This is in general unsafe, because that other file could itself be an epoll fd that contains the original epoll fd.

The code defends against this possibility in its own ->poll() method using ep_call_nested, but there are several other unsafe calls to ->poll() elsewhere that can be made to deadlock. For example, the following simple program causes the call in ep_insert to recursively call the original fd's ->poll(), leading to deadlock.

Testcase: https://lkml.org/lkml/2011/2/5/220
Proposed patch: https://lkml.org/lkml/2011/2/7/369

Deadlock is limited to the console, not the entire system.
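
For illustration only, the circular nesting described above can be sketched as follows. This is not the linked testcase; the helper name try_epoll_cycle is hypothetical, and the sketch only builds the two-node cycle (e2 watches e1, then e1 watches e2) without queueing any ready events. It assumes a current kernel, where epoll_ctl(2) is documented to reject such an EPOLL_CTL_ADD with ELOOP. On a kernel without the fix the second call may instead succeed, so it is best tried only on a disposable machine.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the bug report or the linked testcase).
 * Tries to make two epoll instances watch each other.
 * Returns 0 if the kernel accepted the cycle, the errno value if it
 * rejected the second EPOLL_CTL_ADD, or -1 if setup failed earlier.
 */
int try_epoll_cycle(void)
{
    struct epoll_event evt = { .events = EPOLLIN };
    int result = -1;
    int e1 = epoll_create1(0);
    int e2 = epoll_create1(0);

    if (e1 < 0 || e2 < 0)
        goto out;

    /* e2 watches e1: a legitimate nested epoll setup. */
    evt.data.fd = e1;
    if (epoll_ctl(e2, EPOLL_CTL_ADD, e1, &evt) < 0)
        goto out;

    /* e1 watches e2: this would close the cycle e1 -> e2 -> e1. */
    evt.data.fd = e2;
    if (epoll_ctl(e1, EPOLL_CTL_ADD, e2, &evt) < 0)
        result = errno;   /* save errno before close() can change it */
    else
        result = 0;

out:
    if (e1 >= 0)
        close(e1);
    if (e2 >= 0)
        close(e2);
    return result;
}
```

Calling try_epoll_cycle() from a small main() and comparing the result with ELOOP gives a quick, non-destructive check of whether the running kernel refuses circular epoll structures.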

Comment 1 Eugene Teo (Security Response) 2011-02-14 02:52:05 UTC
Updated proposed patch:
https://lkml.org/lkml/2011/2/12/108

Comment 2 Chuck Ebbert 2011-02-16 16:30:40 UTC
Andrew has revised the patch:
https://lkml.org/lkml/2011/2/14/497

Comment 4 Eugene Teo (Security Response) 2011-06-02 09:11:12 UTC
mv 1.3 to 2.0

Comment 8 Eugene Teo (Security Response) 2011-10-03 02:37:41 UTC
Upstream commit: 22bacca48a1755f79b7e0f192ddb9fbb7fc6e64e

Comment 14 David Sommerseth 2012-04-04 15:50:05 UTC
Verified by reproducing the issue.

As there were no RT kernels old enough to reproduce this issue, I had to install kernel-2.6.32-130.el6.x86_64 to trigger it. The issue did not reproduce on the following kernels:

+ kernel-rt-3.0.25-rt44.57.el6rt.x86_64 (MRG 2.1)
+ kernel-rt-3.0.18-rt34.53.el6rt.x86_64 (MRG 2.1)
+ kernel-2.6.32-220.el6.x86_64 (RHEL 6.2)

-> VERIFIED

Comment 15 Clark Williams 2012-04-11 15:06:04 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
* Cause: epoll code was calling another open file's poll operation without that file's lock held
* Consequence: potential deadlock when using the system console
* Fix: apply upstream patch to use ep_call_nested
* Result: no deadlock on system console

Comment 16 errata-xmlrpc 2012-04-18 20:03:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0496.html