Bug 1032633

Summary: GFS2: kernel BUG at fs/dcache.c:1387
Product: [Fedora] Fedora Reporter: Andrew Price <anprice>
Component: kernel Assignee: Steve Whitehouse <swhiteho>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: anprice, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 3.12.6-300.fc20 Doc Type: Bug Fix
Last Closed: 2014-01-06 12:54:36 UTC Type: Bug

Description Andrew Price 2013-11-20 14:01:22 UTC
Description of problem:

I tried my reproducer for bug #1031320 on the rawhide kernel. It did not reproduce the same failure, but on umount it hit a BUG in a different part of the dcache code. It could be a different manifestation of the same problem, but I figured it best to open a new bz for this one anyway.

Version-Release number of selected component (if applicable):

3.13.0-0.rc0.git7.1.fc21.x86_64

How reproducible:

Every time.

Steps to Reproduce:

[root@rawhide1 ~]# mkfs.gfs2 -Op lock_nolock testdev
[root@rawhide1 ~]# mount testdev /mnt/test
[root@rawhide1 ~]# mkdir /mnt/test/foo
[root@rawhide1 ~]# umount /mnt/test
[root@rawhide1 ~]# mount testdev /mnt/test
[root@rawhide1 ~]# cat boom.c
#include <dirent.h>
int main(void)
{
	DIR *dirp = opendir("/mnt/test/foo");
	closedir(dirp);
	return 0;
}
[root@rawhide1 ~]# gcc boom.c -o boom
[root@rawhide1 ~]# ./boom
[root@rawhide1 ~]# umount /mnt/test
[   48.191507] BUG: Dentry ffff8800232aea20{i=a40016,n=foo} still in use (-1) [unmount of gfs2 loop0]
[   48.192879] ------------[ cut here ]------------
[   48.193517] kernel BUG at fs/dcache.c:1387!
[   48.193811] invalid opcode: 0000 [#1] SMP 
[   48.193811] Modules linked in: gfs2 loop dlm sctp libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi cfg80211 rfkill microcode 8139too joydev serio_raw virtio_net 8139cp mii virtio_console virtio_balloon i2c_piix4 virtio_scsi virtio_blk qxl drm_kms_helper ttm drm virtio_pci virtio_ring virtio i2c_core ata_generic pata_acpi
[   48.193811] CPU: 0 PID: 381 Comm: umount Not tainted 3.13.0-0.rc0.git7.1.fc21.x86_64 #1
[   48.193811] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   48.193811] task: ffff88002b89b3a0 ti: ffff88002bdde000 task.ti: ffff88002bdde000
[   48.193811] RIP: 0010:[<ffffffff812111a1>]  [<ffffffff812111a1>] umount_collect+0x101/0x120
[   48.193811] RSP: 0018:ffff88002bddfd88  EFLAGS: 00010282
[   48.193811] RAX: 0000000000000056 RBX: ffff8800232aea20 RCX: 0000000000000006
[   48.193811] RDX: 00000000000024a0 RSI: ffff88002b89be58 RDI: 0000000000000246
[   48.193811] RBP: ffff88002bddfda0 R08: 0000000000000000 R09: 0000000000000000
[   48.193811] R10: 0000000000000001 R11: ffff88002bddfab6 R12: ffff88002bddfe38
[   48.193811] R13: ffff8800232aea20 R14: ffff8800232aeb30 R15: ffff8800232aeab0
[   48.193811] FS:  00007f2b87b19880(0000) GS:ffff88002fc00000(0000) knlGS:0000000000000000
[   48.193811] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   48.193811] CR2: 00007fd3eca693d0 CR3: 000000002875e000 CR4: 00000000000006f0
[   48.193811] Stack:
[   48.193811]  ffff880027cfa908 ffff88002325c798 ffff88002325c8b8 ffff88002bddfe28
[   48.193811]  ffffffff81212afb ffffffff812142e8 0000000000000000 01000000001d5cc0
[   48.193811]  ffff88002325c828 ffff88002325c798 ffff88002325c828 ffff88002325c8b8
[   48.193811] Call Trace:
[   48.193811]  [<ffffffff81212afb>] d_walk+0xdb/0x470
[   48.193811]  [<ffffffff812142e8>] ? shrink_dcache_for_umount+0x88/0x140
[   48.193811]  [<ffffffff812110a0>] ? check_and_collect+0x30/0x30
[   48.193811]  [<ffffffff812142e8>] shrink_dcache_for_umount+0x88/0x140
[   48.193811]  [<ffffffff811fb051>] generic_shutdown_super+0x21/0xf0
[   48.193811]  [<ffffffff811fb357>] kill_block_super+0x27/0x70
[   48.193811]  [<ffffffffa02ae462>] gfs2_kill_sb+0x72/0x80 [gfs2]
[   48.193811]  [<ffffffff811fb77d>] deactivate_locked_super+0x3d/0x60
[   48.193811]  [<ffffffff811fbd36>] deactivate_super+0x46/0x60
[   48.193811]  [<ffffffff8121e19d>] mntput_no_expire+0x17d/0x1f0
[   48.193811]  [<ffffffff8121e037>] ? mntput_no_expire+0x17/0x1f0
[   48.193811]  [<ffffffff8121fa1e>] SyS_umount+0x8e/0x100
[   48.193811]  [<ffffffff8175fc29>] system_call_fastpath+0x16/0x1b
[   48.193811] Code: 00 00 48 8b 40 28 4c 8b 08 48 8b 43 68 48 85 c0 74 2d 48 8b 50 40 48 89 34 24 48 c7 c7 78 3d a5 81 48 89 de 31 c0 e8 43 82 53 00 <0f> 0b 0f 1f 44 00 00 48 89 f7 e8 10 fd ff ff e9 56 ff ff ff 31 
[   48.193811] RIP  [<ffffffff812111a1>] umount_collect+0x101/0x120
[   48.193811]  RSP <ffff88002bddfd88>
[   48.242102] ---[ end trace d508dcba41033e87 ]---
[   48.243204] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:20
[   48.245097] in_atomic(): 1, irqs_disabled(): 0, pid: 381, name: umount
[   48.246621] INFO: lockdep is turned off.
[   48.247545] CPU: 0 PID: 381 Comm: umount Tainted: G      D      3.13.0-0.rc0.git7.1.fc21.x86_64 #1
[   48.249537] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   48.250848]  ffffffff81a3e4f3 ffff88002bddfa88 ffffffff8174d2c7 0000000000000000
[   48.251752]  ffff88002bddfab0 ffffffff810ac2b5 ffff88002b91a768 ffff88002b91a7c8
[   48.252848]  ffff88002b89b3a0 ffff88002bddfad8 ffffffff8175492a ffff88002bddfae8
[   48.253910] Call Trace:
[   48.254271]  [<ffffffff8174d2c7>] dump_stack+0x4d/0x66
[   48.255274]  [<ffffffff810ac2b5>] __might_sleep+0x175/0x230
[   48.256544]  [<ffffffff8175492a>] down_read+0x2a/0xa0
[   48.257683]  [<ffffffff8108b534>] exit_signals+0x24/0x130
[   48.258908]  [<ffffffff81077573>] do_exit+0xc3/0xcf0
[   48.260240]  [<ffffffff810e2828>] ? kmsg_dump+0x1b8/0x230
[   48.261505]  [<ffffffff810e2695>] ? kmsg_dump+0x25/0x230
[   48.262708]  [<ffffffff8175798c>] oops_end+0x9c/0xe0
[   48.263830]  [<ffffffff8101cf8b>] die+0x4b/0x70
[   48.264947]  [<ffffffff81757230>] do_trap+0x60/0x170
[   48.266107]  [<ffffffff8101a115>] do_invalid_op+0x95/0xb0
[   48.267314]  [<ffffffff812111a1>] ? umount_collect+0x101/0x120
[   48.268629]  [<ffffffff8138d2fa>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[   48.270066]  [<ffffffff81756a63>] ? restore_args+0x30/0x30
[   48.271279]  [<ffffffff8176136e>] invalid_op+0x1e/0x30
[   48.272433]  [<ffffffff812111a1>] ? umount_collect+0x101/0x120
[   48.273724]  [<ffffffff812111a1>] ? umount_collect+0x101/0x120
[   48.275036]  [<ffffffff81212afb>] d_walk+0xdb/0x470
[   48.276133]  [<ffffffff812142e8>] ? shrink_dcache_for_umount+0x88/0x140
[   48.277611]  [<ffffffff812110a0>] ? check_and_collect+0x30/0x30
[   48.279014]  [<ffffffff812142e8>] shrink_dcache_for_umount+0x88/0x140
[   48.280479]  [<ffffffff811fb051>] generic_shutdown_super+0x21/0xf0
[   48.281799]  [<ffffffff811fb357>] kill_block_super+0x27/0x70
[   48.283089]  [<ffffffffa02ae462>] gfs2_kill_sb+0x72/0x80 [gfs2]
[   48.284439]  [<ffffffff811fb77d>] deactivate_locked_super+0x3d/0x60
[   48.285943]  [<ffffffff811fbd36>] deactivate_super+0x46/0x60
[   48.287209]  [<ffffffff8121e19d>] mntput_no_expire+0x17d/0x1f0
[   48.288520]  [<ffffffff8121e037>] ? mntput_no_expire+0x17/0x1f0
[   48.289831]  [<ffffffff8121fa1e>] SyS_umount+0x8e/0x100
[   48.291011]  [<ffffffff8175fc29>] system_call_fastpath+0x16/0x1b
[   48.292392] note: umount[381] exited with preempt_count 2
Segmentation fault
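
For anyone re-running the steps above, here is a slightly more defensive sketch of boom.c. It assumes that the opendir()/closedir() pair on the directory is all that matters for triggering the BUG on umount. The helper name open_close_dir is hypothetical and not part of the original reproducer. It only adds error reporting so that a failed opendir() is not passed on to closedir().

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the original boom.c): open a directory
 * and close it again straight away.
 * Returns 0 on success and -1 if opendir() or closedir() fails.
 */
int open_close_dir(const char *path)
{
	DIR *dirp = opendir(path);

	if (dirp == NULL) {
		/* Report the failure instead of handing NULL to closedir(). */
		fprintf(stderr, "opendir(%s): %s\n", path, strerror(errno));
		return -1;
	}
	if (closedir(dirp) != 0) {
		fprintf(stderr, "closedir(%s): %s\n", path, strerror(errno));
		return -1;
	}
	return 0;
}
```

Calling open_close_dir("/mnt/test/foo") from main() should correspond to what the original boom.c does, with the difference that a missing or unmounted /mnt/test/foo is reported rather than silently ignored.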

Comment 1 Steve Whitehouse 2014-01-02 10:34:52 UTC
The fix for this has gone upstream:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/gfs2?id=ea0341e071527d5cec350917b01ab901af09d758

and to -stable as well. Can you confirm that this is no longer a problem on kernels that include this patch?

Comment 2 Andrew Price 2014-01-06 12:26:54 UTC
Yes, tested and confirmed fixed in 3.12.6-300.fc20.