Bug 190611 - Race condition between i_mapping and iput() causes system panic
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All Linux
Priority: medium
Severity: high
Assigned To: Eric Sandeen
QA Contact: Brian Brock

Reported: 2006-05-03 18:02 EDT by Bob Miller
Modified: 2010-03-16 14:52 EDT
Doc Type: Bug Fix
Last Closed: 2010-03-16 14:52:17 EDT

Attachments
Patch to eliminate the i_mapping race in bd_acquire(). (1.53 KB, patch)
2007-01-26 16:30 EST, Bob Miller

Description Bob Miller 2006-05-03 18:02:58 EDT
Description of problem: 
Assignee: Thorsten Kukuk <kukuk@suse.de>
Reporter: PolyServe Employee <suse@polyserve.com>
QA Contact: E-mail List <qa@suse.de>
2006-05-03 15:52:22 2 172493

Summary: Race condition between ->i_mapping and iput() causes system panic.

Description:
Our product does direct I/O to block devices to monitor filesystem status, and
because of this bug our system panics whenever we hit the race.  The stack from
the panic is included below.  It appears from the thread:

http://www.ussg.iu.edu/hypermail/linux/kernel/0603.1/1261.html

that others are hitting this issue as well.  That thread contains the first
version of a patch that later made it into 2.6.17-rc2-mm at kernel.org.
 
Stack traceback for pid 11950 
0xc8585980    11950    11947  1    0   R  0xc8585c40 *stress 
EBP        EIP        Function (args) 
0xd3d41d60 0xc01643ca find_get_pages_tag+0x1a (0x10, 0xd3d41db0, 0xf8aaf880, 0x0)
0xd3d41d78 0xc01740d9 pagevec_lookup_tag+0x29 (0x1, 0x10, 0x0, 0xd3d41dd4, 0xffffffff)
0xd3d41e04 0xc01669f4 wait_on_page_writeback_range_wq+0x84 (0x0)
0xd3d41e10 0xc0166cb9 filemap_fdatawait+0x19 (0xdb018980, 0xc8585b54, 0xd3d41e28, 0xf8d6e681, 0x0)
0xd3d41e50 0xc01bcaa8 __writeback_single_inode+0x258 (0x40, 0x8, 0xf6ee4e88, 0xd3d40000, 0x0)
0xd3d41e8c 0xc01bd206 sync_sb_inodes+0x136 (0xd3d41f14, 0xd3d41ec4, 0xf6ee4df8, 0x10, 0x1)
0xd3d41f84 0xc01bd4b9 sync_inodes_sb+0x99 (0x1, 0x0) 
0xd3d41f94 0xc01bd573 sync_inodes+0x53 (0x0, 0x0) 
0xd3d41fa4 0xc01959cd do_sync+0x3d (0x0, 0x0, 0x0, 0xbb8) 
0xd3d41fbc 0xc0195a30 sys_sync+0x40 
           0xc010a679 sysenter_past_esp+0x52 
Version-Release number of selected component (if applicable): 
 
 
How reproducible: 
The shell code below does the trick. 
 
Steps to Reproduce: 
 while true; do 
 dd if=/dev/zero of=/dev/hdg1 bs=512 count=1000 & sync 
 done 
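
A bounded variant of the loop above, as a minimal sketch. The function name
race_repro and its TARGET and ITERATIONS arguments are hypothetical and not
part of the original report. It issues the same dd-plus-sync pair a fixed
number of times and reaps the background writers before returning, so it can
be pointed at a scratch device (or at a plain file for a dry run) without
looping forever. Note that dd overwrites the start of whatever target is given.

```shell
#!/bin/sh
# Hypothetical wrapper around the reproducer from this report.
# race_repro, TARGET and ITERATIONS are illustrative names only.
# WARNING: dd overwrites the first 512000 bytes of TARGET.

race_repro() {
    target=$1
    iterations=${2:-100}

    if [ -z "$target" ]; then
        echo "usage: race_repro TARGET [ITERATIONS]" >&2
        return 2
    fi

    i=0
    while [ "$i" -lt "$iterations" ]; do
        # Start a writer in the background, then call sync while it is
        # still running, exactly as the original one-liner does.
        dd if=/dev/zero of="$target" bs=512 count=1000 2>/dev/null &
        sync
        i=$((i + 1))
    done

    # Reap any dd processes that are still running.
    wait
    echo "completed $iterations iterations against $target"
}
```

For example, race_repro /dev/hdg1 1000 would run the pair 1000 times against
the device used in the original report; whether that is enough to hit the race
on a given kernel is not something this sketch can promise.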
   
Actual results: 
 
 
Expected results: 
 
 
Additional info:
Comment 1 Elan Kaplan 2007-01-26 11:52:16 EST
Folks, this bug has been sitting here for a long time with an easy reproduction
case. Do you have any plans to address it? We would appreciate an update.

Thanks.
Comment 2 Elan Kaplan 2007-01-26 11:55:05 EST
BTW - we have a patch for this. I will post it to the bug.
Comment 3 Bob Miller 2007-01-26 16:30:52 EST
Created attachment 146716
Patch to eliminate the i_mapping race in bd_acquire().
Comment 4 Eric Sandeen 2007-08-23 14:14:58 EDT
upstream commit at
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=09d967c6f32b35eab15b45862ae16e4f06259d8e

Looks like there have been no further changes to these functions since then.
Comment 5 Eric Sandeen 2007-08-24 17:19:56 EDT
For what it's worth, I have not yet been able to reproduce with the
aforementioned reproducer...

-Eric
Comment 6 Eric Sandeen 2007-09-26 12:29:49 EDT
Bob, I see that the original report looks pasted from a SuSE bug report.  I
still have not been able to reproduce this on a RHEL4 kernel, and while it is
not an absolute requirement before we include this patch (since it does appear
that RHEL4 has this same problem), it's always best if we can demonstrate the
problem before we go fixing it.

Have you actually been able to reproduce this on a RHEL4 machine?

Thanks,

-Eric
Comment 7 Eric Sandeen 2008-12-02 13:01:33 EST
PolyServe, ping?  Have you actually seen or reproduced this on RHEL?
Comment 8 Ric Wheeler 2010-03-16 14:52:17 EDT
If this is still an issue, please reopen the bug.
