Bug 981741 - BUG on dentry still in use when unmounting fuse
Summary: BUG on dentry still in use when unmounting fuse
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Niels de Vos
QA Contact: Zorro Lang
URL:
Whiteboard:
Duplicates: 1031614
Depends On:
Blocks: 988312 988708
Reported: 2013-07-05 16:44 UTC by Niels de Vos
Modified: 2014-01-14 03:36 UTC (History)
CC List: 12 users

Fixed In Version: kernel-2.6.32-408.el6
Doc Type: Bug Fix
Doc Text:
A dentry leak occurred in the FUSE code when, after a negative lookup, a negative dentry was neither dropped nor was the reference counter of the dentry decremented. This triggered a BUG() macro when unmounting a FUSE subtree containing the dentry, resulting in a kernel panic. A series of patches related to this problem has been applied to the FUSE code and negative dentries are now properly dropped so that triggering the BUG() macro is now avoided.
Clone Of:
Clones: 988312
Environment:
Last Closed: 2013-11-21 19:24:57 UTC
Target Upstream Version:
Embargoed:


Attachments
Proposed patch (838 bytes, patch)
2013-07-05 16:53 UTC, Niels de Vos
Flags: ndevos: review-
Disable readdirplus for testing (482 bytes, patch)
2013-07-15 07:27 UTC, Niels de Vos
Flags: none
fix the dentry leak (402 bytes, patch)
2013-07-15 09:41 UTC, Niels de Vos
Flags: none


Links
System ID: Red Hat Product Errata RHSA-2013:1645
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 6 kernel update
Last Updated: 2013-11-20 22:04:18 UTC

Description Niels de Vos 2013-07-05 16:44:26 UTC
Description of problem:

While running the GlusterFS testsuite on RHEL-6.4 the kernel panics reliably:

<3>BUG: Dentry ffff8801eb4e9800{i=0,n=files9995} still in use (1) [unmount of fuse fuse]
<4>------------[ cut here ]------------
<2>kernel BUG at fs/dcache.c:670!
<4>invalid opcode: 0000 [#1] SMP 
<4>last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/infiniband/mlx4_0/node_guid
<4>CPU 3 
<4>Modules linked in: xfs exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 fuse r8169 mii sg serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support shpchp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core xhci_hcd ext4 mbcache jbd2 sd_mod crc_t10dif ahci i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 16277, comm: umount Not tainted 2.6.32-358.11.1.el6.x86_64 #1 System manufacturer System Product Name/P8Z77-V LX2
<4>RIP: 0010:[<ffffffff8119a9d8>]  [<ffffffff8119a9d8>] shrink_dcache_for_umount_subtree+0x2a8/0x2b0
<4>RSP: 0018:ffff8801f71dfdb8  EFLAGS: 00010292
<4>RAX: 000000000000005c RBX: ffff8801eb4e9800 RCX: 000000000000f21b
<4>RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
<4>RBP: ffff8801f71dfdf8 R08: 0000000000000000 R09: 0000000000000001
<4>R10: ffffffff81641bc0 R11: ffff880217537b8b R12: 0000000000000d0b
<4>R13: ffffffff81a83fc0 R14: ffff8801b3f58780 R15: ffff8801eb4e9860
<4>FS:  00007f66d09e4740(0000) GS:ffff88002c380000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>CR2: 00007f66d004b360 CR3: 00000001af6e6000 CR4: 00000000000407e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process umount (pid: 16277, threadinfo ffff8801f71de000, task ffff8801f7364040)
<4>Stack:
<4> ffff880204cf5e70 ffff8801f7364040 0000000000000015 ffff880204cf5c00
<4><d> ffffffffa034c200 ffff8801f7102b38 ffff880204cf5c00 ffff88021c1ee380
<4><d> ffff8801f71dfe18 ffffffff8119aa16 0000000000000000 ffff880204cf5c00
<4>Call Trace:
<4> [<ffffffff8119aa16>] shrink_dcache_for_umount+0x36/0x60
<4> [<ffffffff8118336f>] generic_shutdown_super+0x1f/0xe0
<4> [<ffffffff81183496>] kill_anon_super+0x16/0x60
<4> [<ffffffffa03495d2>] fuse_kill_sb_anon+0x52/0x60 [fuse]
<4> [<ffffffff81183c37>] deactivate_super+0x57/0x80
<4> [<ffffffff811a1c2f>] mntput_no_expire+0xbf/0x110
<4> [<ffffffff811a269b>] sys_umount+0x7b/0x3a0
<4> [<ffffffff810dc847>] ? audit_syscall_entry+0x1d7/0x200
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<4>Code: 50 30 4c 8b 0a 31 d2 48 85 f6 74 04 48 8b 56 40 48 05 70 02 00 00 48 89 de 48 c7 c7 80 39 7b 81 48 89 04 24 31 c0 e8 e8 2b 37 00 <0f> 0b eb fe 0f 0b eb fe 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 
<1>RIP  [<ffffffff8119a9d8>] shrink_dcache_for_umount_subtree+0x2a8/0x2b0
<4> RSP <ffff8801f71dfdb8>


Version-Release number of selected component (if applicable):
kernel-2.6.32-358.11.1.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Follow the instructions at http://www.gluster.org/community/documentation/index.php/Using_the_Gluster_Test_Framework
2. Run the tests with: prove -rf --timer $(dirname $0)/tests/bugs

Actual results:
Kernel panic (on RHEL), BUG+calltrace on Fedora.

Expected results:
No panic/BUG.

Additional info:

Comment 2 Niels de Vos 2013-07-05 16:53:58 UTC
Created attachment 769332
Proposed patch

jclift tested this patch successfully on RHEL-6.4.

Comment 4 Niels de Vos 2013-07-08 09:19:48 UTC
Comment on attachment 769332
Proposed patch

This patch does not correctly decrease the sb->s_active counter. This makes it impossible to unload the module after using it. Testing some variations now.

Comment 5 Niels de Vos 2013-07-15 07:27:06 UTC
Created attachment 773572
Disable readdirplus for testing

When I run the tests and the fuse module does not support readdirplus, I cannot reproduce the crashes. This narrows down the search for the cause considerably.

I'll read through the code over the next few days, do some further testing and see if there is anything obvious.

Comment 6 Niels de Vos 2013-07-15 09:41:38 UTC
Created attachment 773675
fix the dentry leak

There is a dentry leak when d_lookup() returns a dentry that does not have a valid d_inode set. The attached patch fixes it for me.

Doing some further verification tests before posting upstream for review.
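
The kind of leak described in this comment can be illustrated with a minimal userspace sketch. This is a toy model under stated assumptions, not the attached patch and not the real fuse code: toy_dentry, toy_lookup(), toy_put(), link_entry_leaky(), link_entry_fixed() and still_in_use() are hypothetical names that only mimic the reference-counting contract of d_lookup() and dput(). A NULL inode pointer stands in for a dentry without a valid d_inode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; NOT the real struct inode/dentry. */
struct toy_inode {
    int ino;
};

struct toy_dentry {
    int refcount;            /* models the dentry reference counter */
    struct toy_inode *inode; /* NULL models a dentry without a valid d_inode */
};

/* Models d_lookup(): a cache hit is returned with one extra reference. */
static struct toy_dentry *toy_lookup(struct toy_dentry *cached)
{
    if (cached)
        cached->refcount++;
    return cached;
}

/* Models dput(): drop one reference taken earlier. */
static void toy_put(struct toy_dentry *d)
{
    if (!d)
        return;
    assert(d->refcount > 0);
    d->refcount--;
}

/* Leaky variant: returns early on a negative dentry and keeps the reference. */
static bool link_entry_leaky(struct toy_dentry *cached)
{
    struct toy_dentry *d = toy_lookup(cached);

    if (!d)
        return false;
    if (!d->inode)
        return false; /* leak: the reference from toy_lookup() is never dropped */

    /* ... the real code would use d->inode here ... */
    toy_put(d);
    return true;
}

/* Fixed variant: every exit path after a successful lookup drops the reference. */
static bool link_entry_fixed(struct toy_dentry *cached)
{
    struct toy_dentry *d = toy_lookup(cached);
    bool ok;

    if (!d)
        return false;
    ok = (d->inode != NULL);
    toy_put(d);
    return ok;
}

/* Models the unmount-time check: a non-zero count means "still in use". */
static bool still_in_use(const struct toy_dentry *d)
{
    return d->refcount != 0;
}
```

In this model, calling link_entry_leaky() on a dentry with a NULL inode leaves its count at 1, so still_in_use() reports it at "unmount" time, which matches the shape of the "still in use (1)" message in the description. link_entry_fixed() leaves the count at 0.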

Comment 7 Niels de Vos 2013-07-15 13:26:02 UTC
This regression was introduced with the new READDIRPLUS support in fuse.

In order to hit the BUG() (which results in a kernel panic on RHEL), some
stressing of the VFS and the fuse mount seems needed. The GlusterFS tests make
a reliable reproducer:
 - http://www.gluster.org/community/documentation/index.php/Using_the_Gluster_Test_Framework

After some stressing of the VFS and fuse mountpoints, bug-860663.t will hit
the BUG(). It does not happen when this test is run stand-alone.

Patch posted upstream for review:
- https://lkml.org/lkml/2013/7/15/203

Comment 8 RHEL Program Management 2013-07-15 13:37:42 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 14 Niels de Vos 2013-08-01 08:03:00 UTC
RHEL-6 test-packages and the upstream patch can be found here:
- http://people.redhat.com/ndevos/bz981741/

Comment 17 Rafael Aquini 2013-08-07 15:48:39 UTC
Patch(es) available on kernel-2.6.32-408.el6

Comment 22 errata-xmlrpc 2013-11-21 19:24:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1645.html

Comment 23 Raghavendra Talur 2013-12-23 09:22:12 UTC
*** Bug 1031614 has been marked as a duplicate of this bug. ***

