Bug 1156661 - Kernel crash when unmounting Ext4 filesystem
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.6
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Eric Sandeen
QA Contact: Xiong Murphy Zhou
Depends On:
Blocks: 1128951
Reported: 2014-10-24 21:12 EDT by wshilong
Modified: 2015-07-22 04:29 EDT
CC List: 10 users

See Also:
Fixed In Version: kernel-2.6.32-527.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-22 04:29:03 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2015:1272
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: kernel security, bug fix, and enhancement update
Last Updated: 2015-07-22 07:56:25 EDT

Description wshilong 2014-10-24 21:12:54 EDT
Description of problem:

When using an Ext4 filesystem, the filesystem can hit an error that forces it
to be remounted read-only. However, the error paths in ext4_delete_inode() do not
remove the inode from the orphan list, which makes the kernel crash at unmount time.

See the following messages:
<2>LDISKFS-fs error (device dm-2): __ldiskfs_ext_check_block: bad header/extent in inode #659: invalid magic - magic e000, entries 456, max 0(0), depth 51424(0)
<3>Aborting journal on device dm-2-8.
<2>LDISKFS-fs error (device dm-2) in ldiskfs_free_blocks: Journal has aborted
<2>LDISKFS-fs error (device dm-2) in ldiskfs_free_blocks: Journal has aborted
<2>LDISKFS-fs (dm-2): Remounting filesystem read-only
<2>LDISKFS-fs error (device dm-2) in ldiskfs_free_blocks: Journal has aborted
<2>LDISKFS-fs error (device dm-2) in ldiskfs_ext_remove_space: Journal has aborted
<2>LDISKFS-fs error (device dm-2) in ldiskfs_reserve_inode_write: Journal has aborted
<2>LDISKFS-fs error (device dm-2) in ldiskfs_ext_truncate: Journal has aborted
<4>LDISKFS-fs warning (device dm-2): ldiskfs_delete_inode: couldn't extend journal (err -5)
<3>LDISKFS-fs (dm-2): Inode 280 (ffff8803a9ecb6d8): orphan list check failed!

Here ldiskfs is ext4. When unmounting, the kernel hits an assertion and crashes.
We originally hit this problem with Lustre; see this report:
https://jira.hpdd.intel.com/browse/LU-5771?filter=-2

Version-Release number of selected component (if applicable):
This problem exists in the RHEL 6 series.

The following upstream commit fixes the kernel crash. Could you please merge it into the RHEL 6 kernel?

commit 4538821993f4486c76090dfb377c60c0a0e71ba3
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Thu Jul 29 15:06:10 2010 -0400

    ext4: drop inode from orphan list if ext4_delete_inode() fails
    
    There were some error paths in ext4_delete_inode() which was not
    dropping the inode from the orphan list.  This could lead to a BUG_ON
    on umount when the orphan list is discovered to be non-empty.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
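
The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: the names `SuperBlock`, `delete_inode`, `umount`, and the `drop_on_error` flag are hypothetical, the journal failure is simulated with a boolean, and the model assumes the inode is already on the orphan list when the error occurs.

```python
class OrphanListNotEmpty(Exception):
    """Stands in for the BUG_ON that fires at umount in the real kernel."""


class SuperBlock:
    """Toy model of a mounted filesystem that tracks orphan inodes."""

    def __init__(self):
        self.orphans = set()   # inode numbers still on the orphan list
        self.read_only = False

    def orphan_add(self, ino):
        self.orphans.add(ino)

    def orphan_del(self, ino):
        self.orphans.discard(ino)


def delete_inode(sb, ino, journal_aborted, drop_on_error):
    """Model of deleting an inode that sits on the orphan list.

    journal_aborted simulates the error path (for example, the journal
    could not be extended). drop_on_error selects the fixed behaviour.
    Returns True if the deletion completed, False if it bailed out.
    """
    if journal_aborted:
        sb.read_only = True
        if drop_on_error:
            # Fixed behaviour: the error path still removes the inode
            # from the orphan list before giving up.
            sb.orphan_del(ino)
        return False
    # Normal path: the inode is freed and its orphan record is dropped.
    sb.orphan_del(ino)
    return True


def umount(sb):
    """Unmount; a non-empty orphan list is treated as a fatal error."""
    if sb.orphans:
        raise OrphanListNotEmpty(sorted(sb.orphans))


if __name__ == "__main__":
    for fixed in (False, True):
        sb = SuperBlock()
        sb.orphan_add(280)  # inode number taken from the log above
        delete_inode(sb, 280, journal_aborted=True, drop_on_error=fixed)
        try:
            umount(sb)
            print("drop_on_error=%s: clean umount" % fixed)
        except OrphanListNotEmpty as exc:
            print("drop_on_error=%s: orphan list not empty: %s" % (fixed, exc))
```

In this model, the only difference between the two runs is whether the error path drops the inode from the orphan list, which is the change the commit subject describes.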
Comment 2 Ondrej Vasik 2014-10-25 15:46:43 EDT
The filesystem component has nothing to do with the ext4 filesystem (it just handles the basic directory layout). You want the kernel component and the File systems subcomponent. Reassigning...
Comment 3 Eric Sandeen 2014-10-25 15:49:43 EDT
Looks reasonable, thanks for the report.
Comment 5 Xiong Murphy Zhou 2014-10-27 21:26:11 EDT
(In reply to wshilong from comment #0)
> Description of problem:
> 
> original problem we hit is coming from Lustre, see this reports:
> https://jira.hpdd.intel.com/browse/LU-5771?filter=-2

This requires an account login. It would be great if there were a reproducer or a specific procedure to follow. Thanks very much!
Comment 6 wshilong 2014-10-28 00:12:46 EDT
(In reply to xzhou from comment #5)
> (In reply to wshilong from comment #0)
> > Description of problem:
> > 
> > original problem we hit is coming from Lustre, see this reports:
> > https://jira.hpdd.intel.com/browse/LU-5771?filter=-2
> 
> This requires an account login. It will be great that if there is a
> reproducer or any particular procedures. Thanks very much!

Hello xzhou,

Unfortunately, I don't have a simple reproducer for this problem; however, it did happen in our environment.

Previously, I confirmed both the problem and the patch by hacking the code to force the expected error path. Normally this problem does not happen, but it can in some error cases.

Best regards,
Wang Shilong
Comment 7 RHEL Product and Program Management 2014-11-10 18:09:49 EST
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.
Comment 8 Rafael Aquini 2015-01-30 12:11:21 EST
Patch(es) available on kernel-2.6.32-527.el6
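
To check whether a given RHEL 6 kernel is at or past this build, one option is to compare the build number in the kernel release string. The following is a minimal sketch, assuming the release string has the form `2.6.32-<build>[.<z-stream>].el6...` as in the package name above. The helper name `has_orphan_fix` is hypothetical, and the check does not account for fixes backported to z-stream kernels with a lower build number.

```python
import platform
import re

FIXED_BUILD = 527  # from kernel-2.6.32-527.el6


def has_orphan_fix(release):
    """Return True/False for RHEL 6 2.6.32 kernels, None if not applicable."""
    m = re.match(r"2\.6\.32-(\d+)(?:\.\d+)*\.el6", release)
    if m is None:
        return None  # not a RHEL 6 2.6.32 release string; cannot tell
    return int(m.group(1)) >= FIXED_BUILD


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` reports
    print(release, has_orphan_fix(release))
```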
Comment 13 errata-xmlrpc 2015-07-22 04:29:03 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1272.html
