Description of problem:

I grew an XFS-on-top-of-LVM partition like this:

# lvm lvextend -L+5G /dev/data/bck
# xfs_growfs /dev/data/bck

All went fine. To recover more free space, I then decided to delete some old mail backups (Maildir: plenty of small files, many hardlinked across directories), but the rm command segfaulted. I ran it again; that process is now defunct, and I got this from dmesg:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
82c6c6ba
*pde = 00003001
Oops: 0000 [#1]
SMP
Modules linked in: nfsd exportfs lockd md5 ipv6 sunrpc tg3 ip_conntrack_ftp ipt_limit ipt_state ip_conntrack ipt_multiport iptable_filter ip_tables floppy sg microcode xfs button battery asus_acpi ac ext3 jbd dm_mod megaraid sd_mod scsi_mod
CPU:    2
EIP:    0060:[<82c6c6ba>]    Not tainted
EFLAGS: 00010206   (2.6.8-1.521smp)
EIP is at validate_fields+0x1a/0x8c [xfs]
eax: 00000000   ebx: 00000000   ecx: 00000080   edx: 0f8c6eb8
esi: 6e9ef0a4   edi: 6e9ef0a4   ebp: 2200ea64   esp: 0f8c6eb8
ds: 007b   es: 007b   ss: 0068
Process rm (pid: 8120, threadinfo=0f8c6000 task=809862b0)
Stack: 000020c0 00000002 000201c0 00000064 00000065 0e9c9082 00000000 0017a000
       00000000 00010000 4178cc16 0969e318 4178cc17 0b7d8b50 4178cc17 0b7d8b50
       00008180 00000000 00000288 00000000 00000000 ffffffff ffffffff 7e73732c
Call Trace:
 [<82c6cbeb>] linvfs_unlink+0x27/0x2f [xfs]
 [<0216aa80>] vfs_unlink+0x182/0x1c1
 [<0216ab6e>] sys_unlink+0xaf/0x131
 [<02158bb5>] put_user_size+0x29/0x2d
 [<0216d651>] sys_getdents64+0xa0/0xaa
Code: 8b 58 08 6a 00 ff 53 18 5a 85 c0 75 5d 0f b7 44 24 0a 89 46

Version-Release number of selected component (if applicable):
2.6.8-1.521smp

How reproducible:
Unknown. This is a 2TB LVM volume heavily accessed through NFS. Rebooting will be a problem...

Additional info:
I have already grown that very partition at least 2 or 3 times, as well as other XFS-formatted volumes on the same machine, without any problems until now.
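For anyone comparing the traces in this report, the call trace entries in the oops above follow a regular pattern: a bracketed return address, then symbol+offset/size, then an optional module name in square brackets. A minimal sketch of pulling those fields apart, assuming that layout holds for every entry (the names ENTRY and parse_trace are hypothetical helpers, not part of any kernel tool):

```python
import re

# One call-trace entry looks like: [<82c6cbeb>] linvfs_unlink+0x27/0x2f [xfs]
# The trailing [module] part is absent for symbols built into the kernel.
ENTRY = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_trace(text):
    """Return one dict per call-trace entry found in text."""
    frames = []
    for m in ENTRY.finditer(text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("module"),  # None when no module is listed
        })
    return frames

# First three entries of the rm oops, copied from the report.
sample = (
    "[<82c6cbeb>] linvfs_unlink+0x27/0x2f [xfs] "
    "[<0216aa80>] vfs_unlink+0x182/0x1c1 "
    "[<0216ab6e>] sys_unlink+0xaf/0x131"
)
for frame in parse_trace(sample):
    print(frame["symbol"], frame["module"] or "(core kernel)")
```

Grouping frames by module this way makes it easier to see that the rm oops enters the xfs module through linvfs_unlink, with the faulting EIP in validate_fields.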
I just ran dmesg again and now have this appended to the above:

<4>xfs_inotobp: xfs_imap() returned an error 22 on dm-1. Returning error.
xfs_iunlink_remove: xfs_inotobp() returned an error 22 on dm-1. Returning error.
xfs_inactive: xfs_ifree() returned an error = 22 on dm-1
xfs_force_shutdown(dm-1,0x1) called from line 1759 of file fs/xfs/xfs_vnodeops.c. Return address = 0x82c6f03e
Filesystem "dm-1": I/O Error Detected. Shutting down filesystem: dm-1
Please umount the filesystem, and rectify the problem(s)
------------[ cut here ]------------
kernel BUG at fs/inode.c:1122!
invalid operand: 0000 [#2]
SMP
Modules linked in: nfsd exportfs lockd md5 ipv6 sunrpc tg3 ip_conntrack_ftp ipt_limit ipt_state ip_conntrack ipt_multiport iptable_filter ip_tables floppy sg microcode xfs button battery asus_acpi ac ext3 jbd dm_mod megaraid sd_mod scsi_mod
CPU:    0
EIP:    0060:[<0217554b>]    Not tainted
EFLAGS: 00010246   (2.6.8-1.521smp)
EIP is at iput+0x19/0x61
eax: 82c83300   ebx: 370799a4   ecx: 370799b4   edx: 37079900
esi: 42619f24   edi: 370799a4   ebp: 00000020   esp: 7d2d8bec
ds: 007b   es: 007b   ss: 0068
Process nfsd (pid: 13808, threadinfo=7d2d8000 task=7d1ad270)
Stack: 42619f2c 02171cf7 00000000 00000000 7d2d8000 00000000 00000001 00000000
       08b140a4 00000000 0217243d 022db178 00000000 00000000 2c11402c 08b140a4
       00000000 08b140a4 11270000 02172704 7d2d8c60 82c60a49 2c11402c 82c82e40
Call Trace:
 [<02171cf7>] prune_dcache+0x1ff/0x2db
 [<0217243d>] d_alloc+0xa2/0x25e
 [<02172704>] d_alloc_anon+0x2c/0x1cd
 [<82c60a49>] xfs_vget+0x95/0x9b [xfs]
 [<82c6ec1a>] linvfs_get_dentry+0x57/0x6c [xfs]
 [<82a5e02d>] find_exported_dentry+0x2d/0x7fe [exportfs]
 [<0228d335>] qdisc_restart+0x13/0x230
 [<82b923da>] ip_refrag+0x1a/0x58 [ip_conntrack]
 [<02158ba8>] put_user_size+0x1c/0x2d
 [<0227e9dc>] memcpy_toiovec+0x27/0x49
 [<0227ef36>] skb_copy_datagram_iovec+0x4f/0x1e1
 [<0227c921>] release_sock+0xa5/0xab
 [<022a0819>] tcp_recvmsg+0x63b/0x676
 [<0227ca0d>] sock_common_recvmsg+0x30/0x46
 [<02279796>] sock_recvmsg+0xae/0xcb
 [<0211b20d>] recalc_task_prio+0x128/0x133
 [<0211b29e>] activate_task+0x86/0x93
 [<02115e55>] smp_send_reschedule+0x1a/0x1b
 [<0211d0bc>] __wake_up_common+0x36/0x5b
 [<0211d130>] __wake_up+0x4f/0x7f
 [<82cc8df9>] svc_sock_enqueue+0x255/0x25d [sunrpc]
 [<82cc9f5c>] svc_tcp_recvfrom+0x2fb/0x36d [sunrpc]
 [<0212eae4>] set_current_groups+0xb2/0xba
 [<82a5eac9>] export_decode_fh+0x50/0x56 [exportfs]
 [<82d64bcc>] nfsd_acceptable+0x0/0x11a [nfsd]
 [<82a5ea79>] export_decode_fh+0x0/0x56 [exportfs]
 [<82d65026>] fh_verify+0x340/0x4b3 [nfsd]
 [<82d64bcc>] nfsd_acceptable+0x0/0x11a [nfsd]
 [<02128cd6>] process_timeout+0x0/0x5
 [<82d6c53c>] nfsd3_proc_getattr+0x6d/0x76 [nfsd]
 [<82d6db17>] nfs3svc_decode_fhandle+0x0/0x6b [nfsd]
 [<82d6377d>] nfsd_dispatch+0xbf/0x162 [nfsd]
 [<82cc894e>] svc_process+0x323/0x55f [sunrpc]
 [<82d634dd>] nfsd+0x275/0x456 [nfsd]
 [<82d63268>] nfsd+0x0/0x456 [nfsd]
 [<82d63268>] nfsd+0x0/0x456 [nfsd]
 [<021041f1>] kernel_thread_helper+0x5/0xb
Code: 0f 0b 62 04 50 38 2f 02 85 c0 74 0b 8b 50 14 85 d2 74 04 89

It seems like the filesystem and the kernel really didn't like the unlink problem at all. Any ideas about what the cause could have been?
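For reference, the value 22 in the xfs_imap(), xfs_inotobp() and xfs_ifree() messages above can be looked up as an errno. A minimal sketch, assuming those messages carry standard Linux errno numbers and that the lookup is run on a Linux host (describe_errno is a hypothetical helper name):

```python
import errno
import os

def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno number."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

# 22 is the error number reported three times before xfs_force_shutdown().
name, message = describe_errno(22)
print(f"error 22 = {name} ({message})")
```

Under that assumption, 22 is EINVAL (invalid argument) rather than EIO, even though the shutdown message itself says "I/O Error Detected".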
This same server, which was running 2.6.9-1.6_FC2smp at the time, has now frozen. The messages below were found in /var/log/messages after the reboot (none of the system partitions are XFS). It is now running 2.6.10-1.9_FC2smp. This problem is probably unrelated, but just in case...

[...]
Jan 25 04:26:21 filer02 kernel: nfsd: page allocation failure. order:4, mode:0x50
Jan 25 04:26:21 filer02 kernel: [<02140445>] __alloc_pages+0x2a4/0x2be
Jan 25 04:26:21 filer02 kernel: [<02140477>] __get_free_pages+0x18/0x24
Jan 25 04:26:21 filer02 kernel: [<02143518>] kmem_getpages+0x1c/0xbf
Jan 25 04:26:21 filer02 kernel: [<02144188>] cache_grow+0xff/0x1e4
Jan 25 04:26:21 filer02 kernel: [<0213ed67>] mempool_alloc+0x79/0x18e
Jan 25 04:26:21 filer02 kernel: [<0214441d>] cache_alloc_refill+0x1b0/0x1ec
Jan 25 04:26:21 filer02 kernel: [<021448f8>] __kmalloc+0x76/0x88
Jan 25 04:26:21 filer02 kernel: [<82c66ea5>] kmem_alloc+0x49/0x97 [xfs]
Jan 25 04:26:21 filer02 kernel: [<82c66f6b>] kmem_realloc+0x17/0x52 [xfs]
Jan 25 04:26:21 filer02 kernel: [<82c4b00a>] xfs_iext_realloc+0xc8/0xdb [xfs]
Jan 25 04:26:21 filer02 kernel: [<82c288df>] xfs_bmap_insert_exlist+0x22/0x75 [xfs]
Jan 25 04:26:21 filer02 kernel: [<82c25d36>] xfs_bmap_add_extent_hole_delay+0x42f/0x485 [xfs]
Jan 25 04:26:21 filer02 kernel: [<02219e6b>] __elv_add_request+0x35/0x6a
Jan 25 04:26:21 filer02 kernel: [<0221cb23>] __make_request+0x479/0x4e7
Jan 25 04:26:21 filer02 kernel: [<82c23757>] xfs_bmap_add_extent+0x152/0x3a5 [xfs]
Jan 25 04:26:21 filer02 kernel: [<82c2a786>] xfs_bmapi+0x96c/0x1073 [xfs]
Jan 25 04:26:21 filer02 kernel: [<82c28f41>] xfs_bmap_search_extents+0x53/0x5a [xfs]
do_IRQ: stack overflow: 456
Fedora Core 2 has now reached end of life, and no further updates will be provided by Red Hat. The Fedora legacy project will be producing further kernel updates for security problems only. If this bug has not been fixed in the latest Fedora Core 2 update kernel, please try to reproduce it under Fedora Core 3, and reopen if necessary, changing the product version accordingly. Thank you.