When using gfs2 as the rootfs, I get the following oops, I believe on mount. This is with 2.6.17-1.2510.fc6xen on x86_64 as a domU:

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/gfs2/glock.c:1173
invalid opcode: 0000 [1] SMP
last sysfs file: /block/ram0/dev
CPU 0
Modules linked in: dm_emc dm_round_robin dm_multipath dm_snapshot dm_mirror dm_zero dm_mod xfs jfs reiserfs lock_nolock gfs2 ext3 jbd msdos raid1 raid0 xenblk xennet iscsi_tcp libiscsi scsi_transport_iscsi sr_mod sd_mod scsi_mod ide_cd cdrom ipv6 squashfs pcspkr loop nfs nfs_acl fscache lockd sunrpc vfat fat cramfs
Pid: 339, comm: anaconda Not tainted 2.6.17-1.2510.fc6xen #1
RIP: e030:[<ffffffff882a2360>] [<ffffffff882a2360>] :gfs2:gfs2_glock_nq+0x9d/0x184
RSP: e02b:ffff88000dce39d8  EFLAGS: 00010296
RAX: 0000000000000029 RBX: ffff88000dce3af8 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff804b9780
RBP: ffff880009dbfe20 R08: ffffffff804b9798 R09: ffff88000dce3658
R10: 0000000000000003 R11: 0000000000000000 R12: ffff880009dbfe20
R13: 0000000000000000 R14: ffffc2000029d000 R15: ffff880009dbfe20
FS:  00002aaaabbea120(0000) GS:ffffffff8063b000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process anaconda (pid: 339, threadinfo ffff88000dce2000, task ffff88000eb3a7f0)
Stack: ffffc2000029d330 ffff88000dce3af8 ffff8800098242b8 ffffc2000029d000
 ffff8800098242b8 ffffffff882a3df8 00000e1000000003 ffff880009dbfe00
 0000420209dbfe00 ffffffff802614cc
Call Trace:
 [<ffffffff882a3df8>] :gfs2:gfs2_glock_nq_atime+0xf4/0x2a2
 [<ffffffff802614cc>] __mutex_lock_slowpath+0x27a/0x285
 [<ffffffff882ab0d6>] :gfs2:gfs2_readpages+0x75/0x1d3
 [<ffffffff882a1728>] :gfs2:gfs2_glock_put+0x92/0x99
 [<ffffffff802b3847>] __rmqueue+0x4a/0xe8
 [<ffffffff8020aade>] get_page_from_freelist+0x231/0x408
 [<ffffffff882ab0c9>] :gfs2:gfs2_readpages+0x68/0x1d3
 [<ffffffff802131c6>] __do_page_cache_readahead+0x145/0x218
 [<ffffffff8026262c>] _spin_lock_irqsave+0x26/0x2b
 [<ffffffff802227ac>] __up_read+0x19/0x7f
 [<ffffffff883d3b82>] :dm_mod:dm_any_congested+0x3b/0x42
 [<ffffffff80213aaa>] filemap_nopage+0x14a/0x34f
 [<ffffffff882b062b>] :gfs2:gfs2_sharewrite_nopage+0xcc/0x2ee
 [<ffffffff882b05b1>] :gfs2:gfs2_sharewrite_nopage+0x52/0x2ee
 [<ffffffff802614cc>] __mutex_lock_slowpath+0x27a/0x285
 [<ffffffff80208f3a>] __handle_mm_fault+0x65d/0xf5e
 [<ffffffff80264f3c>] do_page_fault+0xe69/0x1203
 [<ffffffff8020e2d4>] do_mmap_pgoff+0x608/0x773
 [<ffffffff8026262c>] _spin_lock_irqsave+0x26/0x2b
 [<ffffffff8023135a>] __up_write+0x27/0xf2
 [<ffffffff8025e173>] error_exit+0x0/0x6e
Code: 0f 0b 68 5a 99 2b 88 c2 95 04 48 8b 73 18 49 8b 84 24 90 00
RIP  [<ffffffff882a2360>] :gfs2:gfs2_glock_nq+0x9d/0x184
 RSP <ffff88000dce39d8>
Happens on i386 as well. Things seem fine if I do a gfs2 /scratch, though.
Also reproducible with gfs2 as /usr, although we then start to get some bits installed before things blow up:

------------[ cut here ]------------
kernel BUG at fs/gfs2/glock.c:1173!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /block/xvda/dev
Modules linked in: dm_emc dm_round_robin dm_multipath dm_snapshot dm_mirror dm_zero dm_mod xfs jfs reiserfs lock_nolock gfs2 ext3 jbd msdos raid1 raid0 xenblk xennet iscsi_tcp libiscsi scsi_transport_iscsi sr_mod sd_mod scsi_mod ide_cd cdrom ipv6 squashfs pcspkr loop nfs nfs_acl fscache lockd sunrpc vfat fat cramfs
CPU:    0
EIP:    0061:[<d92bc16d>]    Not tainted VLI
EFLAGS: 00210296   (2.6.17-1.2510.fc6xen #1)
EIP is at gfs2_glock_nq+0x8f/0x14c [gfs2]
eax: 00000029   ebx: ccbe5ca4   ecx: ccbe5a80   edx: d92d1be1
esi: d38f63d4   edi: d38f63d4   ebp: d38f640c   esp: ccbe5bfc
ds: 007b   es: 007b   ss: 0069
Process build-locale-ar (pid: 424, ti=ccbe5000 task=c0c88930 task.ti=ccbe5000)
Stack: d95f0000 00000000 d95f02f0 ccbe5ca4 d35a9aa8 d95f0000 d92bd847 d35a9aa8
 d35a9f38 00000000 d38f63d4 00000003 00000e10 00000000 d3ae7000 00004202
 c0430a75 ccbe5ca4 d35a9b90 d35a9aa8 d35a9b80 d92c46af ccbe5ca4 ccbe5d4c
Call Trace:
 [<d92bd847>] gfs2_glock_nq_atime+0xd7/0x2a5 [gfs2]
 [<c0430a75>] init_waitqueue_head+0x12/0x1d
 [<d92c46af>] gfs2_readpages+0x5a/0x199 [gfs2]
 [<d92c2187>] gfs2_meta_reread+0x59/0xc2 [gfs2]
 [<c044bcb1>] get_page_from_freelist+0x1f2/0x380
 [<d92c46a3>] gfs2_readpages+0x4e/0x199 [gfs2]
 [<d92c4655>] gfs2_readpages+0x0/0x199 [gfs2]
 [<c044d38b>] __do_page_cache_readahead+0x120/0x1c0
 [<d92bb726>] gfs2_glock_put+0x7b/0x81 [gfs2]
 [<c05f094e>] _spin_unlock_irq+0x5/0x27
 [<c044a0c1>] filemap_nopage+0x150/0x333
 [<d92c962d>] gfs2_sharewrite_nopage+0xb5/0x29e [gfs2]
 [<d92c95b4>] gfs2_sharewrite_nopage+0x3c/0x29e [gfs2]
 [<c0453c37>] __handle_mm_fault+0x64c/0x1076
 [<c05ef82c>] __mutex_unlock_slowpath+0xb0/0x10f
 [<c05efaad>] __mutex_lock_slowpath+0x21d/0x225
 [<d92bb8e7>] gfs2_glmutex_lock+0x72/0x78 [gfs2]
 [<c04b6a6e>] selinux_vm_enough_memory+0x3b/0x51
 [<c0458804>] __vm_enough_memory+0xc/0xd0
 [<c0457566>] expand_stack+0x10f/0x118
 [<c05f1e4c>] do_page_fault+0x704/0xc07
 [<c044fd10>] vma_prio_tree_insert+0x17/0x2a
 [<c0458e15>] do_mmap_pgoff+0x54d/0x6a0
 [<c05f1748>] do_page_fault+0x0/0xc07
 [<c0404e9b>] error_code+0x2b/0x30
Code: 0c 74 0e 89 d0 8b 10 0f 18 02 90 39 e8 75 ef eb 22 8b 50 3c b8 d0 1b 2d d9 e8 cf da 17 e7 8b 53 3c b8 e1 1b 2d d9 e8 c2 da 17 e7 <0f> 0b 95 04 2b 1b 2d d9 8b 6b 0c 8d 4f 50 8b 47 50 eb 07 39 68
EIP: [<d92bc16d>] gfs2_glock_nq+0x8f/0x14c [gfs2] SS:ESP 0069:ccbe5bfc
The important information is the "new:" and "original:" lines, which should have been printed right before the stack trace. Can you get those for me please? The problem is due to an attempt at recursive locking (which the glock layer no longer allows), and those lines will tell me which locks were involved.
I'm not seeing anything along the lines of "new:" or "original:" logged anywhere. But it happens trivially just by booting with today's rawhide (or the test2 candidate tree) with 'linux gfs2' and then selecting gfs2 as the fs type to use.
I think I know why this happens. I believe it's related to taking page faults in an mmap()ed area of memory. I'm surprised that you don't see the printks, though, as lines 1171-1173 of glock.c read:

    print_symbol(KERN_WARNING "original: %s\n", existing->gh_ip);
    print_symbol(KERN_WARNING "new: %s\n", gh->gh_ip);
    BUG();

Perhaps warnings get put somewhere different and I should use another log level? In the meantime I'm looking at the path from fs/gfs2/ops_vm.c:gfs2_sharewrite_nopage through to ops_address.c:readpage(s), and wondering how best to indicate to the latter that they've been called via this path so that they don't do their own locking as they usually would. I think this is the correct solution to the problem.
Created attachment 133633 [details] Test patch to fix this bug Once I've confirmed that this is indeed the correct fix with a bit more testing, I'll commit it to the git tree for gfs2.
I've done some more testing and it looks like it's the right fix, but I've run into what looks like another manifestation of bug #201082, so I'm going to commit the patch as it is and then transfer further work to that bug unless anybody finds any evidence otherwise.