Bug 190611

Summary: Race condition between i_mapping and iput() causes system panic
Product: Red Hat Enterprise Linux 4 Reporter: Bob Miller <rem>
Component: kernel Assignee: Eric Sandeen <esandeen>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 4.0 CC: jbaron, rwheeler
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-03-16 18:52:17 UTC Type: ---
Attachments:
Description Flags
Patch to eliminate the i_mapping race in bd_acquire(). none

Description Bob Miller 2006-05-03 22:02:58 UTC
Description of problem: 
Assignee: Thorsten Kukuk <kukuk>
Reporter: PolyServe Employee <suse>
QA Contact: E-mail List <qa>
2006-05-03 15:52:22 2 172493

Summary:
Race condition between ->i_mapping and iput() causes system panic.

Description:
Our product does direct I/O to block devices to monitor filesystem status, and
because of this bug the system panics whenever we hit the race. The stack from
the panic is included below. The following thread suggests that others are
hitting this issue as well:

http://www.ussg.iu.edu/hypermail/linux/kernel/0603.1/1261.html

That thread contains the first version of a patch that later made it into
2.6.17-rc2-mm at kernel.org.
 
Stack traceback for pid 11950 
0xc8585980    11950    11947  1    0   R  0xc8585c40 *stress 
EBP        EIP        Function (args) 
0xd3d41d60 0xc01643ca find_get_pages_tag+0x1a (0x10, 0xd3d41db0, 0xf8aaf880, 0x0)
0xd3d41d78 0xc01740d9 pagevec_lookup_tag+0x29 (0x1, 0x10, 0x0, 0xd3d41dd4, 0xffffffff)
0xd3d41e04 0xc01669f4 wait_on_page_writeback_range_wq+0x84 (0x0)
0xd3d41e10 0xc0166cb9 filemap_fdatawait+0x19 (0xdb018980, 0xc8585b54, 0xd3d41e28, 0xf8d6e681, 0x0)
0xd3d41e50 0xc01bcaa8 __writeback_single_inode+0x258 (0x40, 0x8, 0xf6ee4e88, 0xd3d40000, 0x0)
0xd3d41e8c 0xc01bd206 sync_sb_inodes+0x136 (0xd3d41f14, 0xd3d41ec4, 0xf6ee4df8, 0x10, 0x1)
0xd3d41f84 0xc01bd4b9 sync_inodes_sb+0x99 (0x1, 0x0) 
0xd3d41f94 0xc01bd573 sync_inodes+0x53 (0x0, 0x0) 
0xd3d41fa4 0xc01959cd do_sync+0x3d (0x0, 0x0, 0x0, 0xbb8) 
0xd3d41fbc 0xc0195a30 sys_sync+0x40 
           0xc010a679 sysenter_past_esp+0x52 
Version-Release number of selected component (if applicable): 
 
 
How reproducible: 
The shell code below does the trick. 
 
Steps to Reproduce: 
 while true; do 
 dd if=/dev/zero of=/dev/hdg1 bs=512 count=1000 & sync 
 done 
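
A bounded variant of the loop above may be easier to run under supervision. The sketch below is not from the original report: the function name race_repro and its target and iteration-count parameters are hypothetical, and it assumes a POSIX shell with dd and sync available. As in the original loop, dd overwrites the start of the target, so it should only be pointed at a scratch device.

```shell
#!/bin/sh
# Bounded sketch of the reproducer loop from this report.
# race_repro, its target argument, and its iteration count are
# hypothetical additions; the original loop ran forever on /dev/hdg1.
# WARNING: dd overwrites the first 512000 bytes of the target.
race_repro() {
    target=$1
    iterations=${2:-100}
    i=0
    while [ "$i" -lt "$iterations" ]; do
        # Start the write in the background so that sync runs
        # concurrently with it, as in the original loop.
        dd if=/dev/zero of="$target" bs=512 count=1000 2>/dev/null &
        sync
        i=$((i + 1))
    done
    # Let any dd that is still running finish before returning.
    wait
}
```

For example, race_repro /dev/hdg1 1000 would run the original dd and sync pair one thousand times against the device named in the report.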
   
Actual results: 
 
 
Expected results: 
 
 
Additional info:

Comment 1 Elan Kaplan 2007-01-26 16:52:16 UTC
Folks, this bug has been sitting here for a long time with an easy reproduction
case. Do you have any plans to address it? We would appreciate an update.

Thanks.

Comment 2 Elan Kaplan 2007-01-26 16:55:05 UTC
BTW - we have a patch for this. I will post it to the bug.

Comment 3 Bob Miller 2007-01-26 21:30:52 UTC
Created attachment 146716 [details]
Patch to eliminate the i_mapping race in bd_acquire().

Comment 4 Eric Sandeen 2007-08-23 18:14:58 UTC
Upstream commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=09d967c6f32b35eab15b45862ae16e4f06259d8e

Looks like there have been no further changes to these functions since then.

Comment 5 Eric Sandeen 2007-08-24 21:19:56 UTC
For what it's worth, I have not yet been able to reproduce with the
aforementioned reproducer...

-Eric

Comment 6 Eric Sandeen 2007-09-26 16:29:49 UTC
Bob, I see that the original report looks pasted from a SuSE bug report.  I
still have not been able to reproduce this on a RHEL4 kernel, and while it is
not an absolute requirement before we include this patch (since it does appear
that RHEL4 has this same problem), it's always best if we can demonstrate the
problem before we go fixing it.

Have you actually been able to reproduce this on a RHEL4 machine?

Thanks,

-Eric

Comment 7 Eric Sandeen 2008-12-02 18:01:33 UTC
PolyServe, ping?  Have you actually seen or reproduced this on RHEL?

Comment 8 Ric Wheeler 2010-03-16 18:52:17 UTC
If this is still an issue, please reopen the bug.