Bug 228273
Summary: | nfsd process, kernel BUG at include/asm/spinlock.h:133! | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Primoz Tolar <primoz.tolar> | ||||
Component: | kernel | Assignee: | Peter Staubach <staubach> | ||||
Status: | CLOSED DUPLICATE | QA Contact: | Brian Brock <bbrock> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 4.4 | CC: | leroy.vanlogchem, steved | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i386 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2007-02-28 21:33:43 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Primoz Tolar
2007-02-12 12:59:03 UTC
Created attachment 147889 [details]
kernel panic messages
Tonight, a different kernel module crashed the server: Feb 13 01:26:11 adore kernel: eip: c0172b81 Feb 13 01:26:11 adore kernel: ------------[ cut here ]------------ Feb 13 01:26:11 adore kernel: kernel BUG at include/asm/spinlock.h:133! Feb 13 01:26:11 adore kernel: invalid operand: 0000 [#1] Feb 13 01:26:11 adore kernel: SMP Feb 13 01:26:11 adore kernel: Modules linked in: mptctl mptbase 8021q sg cpqci(U) nfs nfsd exportfs lockd nfs_acl parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc button battery ac md5 ipv6 joydev ehci_hcd uhci_hcd bnx2 qla2400 qla2300 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2322 cciss qla2xxx scsi_transport_fc sd_mod scsi_mod Feb 13 01:26:11 adore kernel: CPU: 2 Feb 13 01:26:11 adore kernel: EIP: 0060:[<c02d3296>] Tainted: P VLI Feb 13 01:26:11 adore kernel: EFLAGS: 00010216 (2.6.9-42.ELsmp) Feb 13 01:26:11 adore kernel: EIP is at _spin_lock+0x1c/0x34 Feb 13 01:26:11 adore kernel: eax: c02e6f92 ebx: f708221c ecx: df2adee0 edx: c0172b81 Feb 13 01:26:11 adore kernel: esi: f7082214 edi: f708221c ebp: 00000020 esp: df2adee4 Feb 13 01:26:11 adore kernel: ds: 007b es: 007b ss: 0068 Feb 13 01:26:11 adore kernel: Process vbda (pid: 17506, threadinfo=df2ad000 task=f7102d30) Feb 13 01:26:11 adore kernel: Stack: 00000000 c0172b81 00000000 df2adf20 f58669d8 000001f0 c0172482 f7082214 Feb 13 01:26:11 adore kernel: 45d105a3 00000000 f5866a48 00000000 f58669d8 df2adf60 c0159c47 000001f0 Feb 13 01:26:11 adore kernel: 00000000 00000000 00000000 f7e2c600 f890e7f5 45d092cc 00000000 45ca4bd7 Feb 13 01:26:11 adore kernel: Call Trace: Feb 13 01:26:11 adore kernel: [<c0172b81>] dnotify_parent+0x1b/0x6e Feb 13 01:26:11 adore kernel: [<c0172482>] notify_change+0x1e2/0x1ec Feb 13 01:26:11 adore kernel: [<c0159c47>] sys_utime+0x110/0x134 Feb 13 01:26:11 adore kernel: [<c01268d8>] current_fs_time+0x44/0x4c Feb 13 01:26:11 adore kernel: [<c016b158>] filldir64+0x0/0x11a Feb 13 01:26:11 adore kernel: [<c0171bc9>] update_atime+0x6e/0x93 Feb 13 01:26:11 adore kernel: [<c016f070>] dput+0x34/0x1a7 Feb 13 01:26:11 adore kernel: [<c015bc5b>] __fput+0xda/0x100 Feb 13 01:26:11 adore kernel: [<c02d4703>] syscall_call+0x7/0xb Feb 13 01:26:11 adore kernel: [<c02d007b>] packet_rcv+0x242/0x307 Feb 13 01:26:11 adore kernel: Code: 00 75 09 f0 81 02 00 00 00 01 30 c9 89 c8 c3 53 89 c3 81 78 04 ad 4e ad de 74 18 ff 74 24 04 68 92 6f 2e c0 e8 0a f6 e4 ff 58 5a <0f> 0b 85 00 fd 5f 2e c0 f0 fe 0b 79 09 f3 90 80 3b 00 7e f9 eb Feb 13 01:26:11 adore kernel: <0>Fatal exception: panic in 5 seconds Yesterday we had three of these panics too. Listed below are two of them: Feb 12 09:58:15 kernel: eip: c016ff1d Feb 12 09:58:15 kernel: ------------[ cut here ]------------ Feb 12 09:58:15 kernel: kernel BUG at include/asm/spinlock.h:133! Feb 12 09:58:15 kernel: invalid operand: 0000 [0000001] Feb 12 09:58:15 kernel: SMP Feb 12 09:58:15 kernel: Modules linked in: nfsd exportfs nfs lockd nfs_acl sunrpc drbd(U) md5 ipv6 autofs4 dm_mod button battery ac uhci_hcd ehci_hcd e752x_edac edac_mc e1000 floppy aic79xx sg ext3 jbd arcmsr(U) sd_mod scsi_mod Feb 12 09:58:15 kernel: CPU: 0 Feb 12 09:58:15 kernel: EIP: 0060:[<c02d335a>] Not tainted VLI Feb 12 09:58:15 kernel: EFLAGS: 00010216 (2.6.9-42.0.3.ELsmp) Feb 12 09:58:15 kernel: EIP is at _spin_lock+0x1c/0x34 Feb 12 09:58:15 kernel: eax: c02e7076 ebx: d64656dc ecx: cbd98d38 edx: c016ff1d Feb 12 09:58:15 kernel: esi: d646573c edi: 00000000 ebp: edd84e8c esp: cbd98d3c Feb 12 09:58:15 kernel: ds: 007b es: 007b ss: 0068 Feb 12 09:58:15 kernel: Process smbd (pid: 13923, threadinfo=cbd98000 task=e5ba42b0) Feb 12 09:58:15 kernel: Stack: d64656d4 c016ff1d 00000000 c5782058 eb0296a9 e6648c8e 00000007 cbd98db4 Feb 12 09:58:15 kernel: e6648c8e cbd98f0c 00000000 cbd98db4 c0166777 ee942100 cbd98dac e6648c8e Feb 12 09:58:15 kernel: d9e3c0f8 e6648c8e cbd98f0c c0166b19 00000000 00000003 e032ba94 00000000 Feb 12 09:58:15 kernel: Call Trace: Feb 12 09:58:15 kernel: [<c016ff1d>] __d_lookup+0x8a/0x109 Feb 12 09:58:15 kernel: [<c0166777>] do_lookup+0x1f/0x8f Feb 12 09:58:15 kernel: [<c0166b19>] __link_path_walk+0x332/0xbb5 Feb 12 09:58:15 kernel: [<c01673df>] link_path_walk+0x43/0xbe Feb 12 09:58:15 kernel: [<c016720b>] __link_path_walk+0xa24/0xbb5 Feb 12 09:58:15 kernel: [<c01673df>] link_path_walk+0x43/0xbe Feb 12 09:58:15 kernel: [<c0167774>] path_lookup+0x14b/0x17f Feb 12 09:58:15 kernel: [<c01678bc>] __user_walk+0x21/0x51 Feb 12 09:58:15 kernel: [<c0162b41>] vfs_stat+0x14/0x3a Feb 12 09:58:15 kernel: [<c016314a>] sys_stat64+0xf/0x23 Feb 12 09:58:15 kernel: [<c02d47cb>] syscall_call+0x7/0xb Feb 12 09:58:15 kernel: [<c02d007b>] packet_rcv+0x17e/0x307 Feb 12 09:58:15 kernel: Code: 00 75 09 f0 81 02 00 00 00 01 30 c9 89 c8 c3 53 89 c3 81 78 04 ad 4e ad de 74 18 ff 74 24 04 68 76 70 2e c0 e8 46 f5 e4 ff 58 5a <0f> 0b 85 00 e1 60 2e c0 f0 fe 0b 79 09 f3 90 80 3b 00 7e f9 eb Feb 12 09:58:15 kernel: <0>Fatal exception: panic in 5 seconds And another one a while later Feb 12 10:30:26 kernel: eip: c016ff1d Feb 12 10:30:26 kernel: ------------[ cut here ]------------ Feb 12 10:30:26 kernel: kernel BUG at include/asm/spinlock.h:133! Feb 12 10:30:26 kernel: invalid operand: 0000 [0000001] Feb 12 10:30:26 kernel: SMP Feb 12 10:30:26 kernel: Modules linked in: nfsd exportfs lockd nfs_acl sunrpc drbd(U) md5 ipv6 autofs4 dm_mod button battery ac uhci_hcd ehci_hcd e752x_edac edac_mc e1000 floppy aic79xx sg ext3 jbd arcmsr(U) sd_mod scsi_mod Feb 12 10:30:26 kernel: CPU: 0 Feb 12 10:30:26 kernel: EIP: 0060:[<c02d335a>] Not tainted VLI Feb 12 10:30:26 kernel: EFLAGS: 00010216 (2.6.9-42.0.3.ELsmp) Feb 12 10:30:26 kernel: EIP is at _spin_lock+0x1c/0x34 Feb 12 10:30:26 kernel: eax: c02e7076 ebx: d46fb644 ecx: d1eace9c edx: c016ff1d Feb 12 10:30:26 kernel: esi: d46fb6a4 edi: e8e6c2ac ebp: e8e6c2ac esp: d1eacea0 Feb 12 10:30:26 kernel: ds: 007b es: 007b ss: 0068 Feb 12 10:30:26 kernel: Process nfsd (pid: 5103, threadinfo=d1eac000 task=ef4ed6b0) Feb 12 10:30:26 kernel: Stack: d46fb63c c016ff1d 00000000 c57bdbf0 d1ea1090 11dd5c7d 00000005 d1eacf00 Feb 12 10:30:26 kernel: 00000000 d1eacf00 e8e6c2ac 00000000 c01664dc 00000000 e8e6c2ac d1eacf00 Feb 12 10:30:26 kernel: 00000000 c01677ee d1f9cbc0 ffffffff d1ea1095 11dd5c7d e8e6c2ac c016788c Feb 12 10:30:26 kernel: Call Trace: Feb 12 10:30:26 kernel: [<c016ff1d>] __d_lookup+0x8a/0x109 Feb 12 10:30:26 kernel: [<c01664dc>] cached_lookup+0xf/0x56 Feb 12 10:30:26 kernel: [<c01677ee>] __lookup_hash+0x46/0x89 Feb 12 10:30:26 kernel: [<c016788c>] lookup_one_len+0x54/0x63 Feb 12 10:30:26 kernel: [<fac0fe18>] nfsd_lookup+0x321/0x3ad [nfsd] Feb 12 10:30:26 kernel: [<fabbb442>] svcauth_unix_set_client+0xa7/0xb5 [sunrpc] Feb 12 10:30:26 kernel: [<fac18275>] nfsd3_proc_lookup+0xa9/0xb3 [nfsd] Feb 12 10:30:26 kernel: [<fac1a253>] nfs3svc_decode_diropargs+0x0/0xfa [nfsd] Feb 12 10:30:26 kernel: [<fac0d681>] nfsd_dispatch+0xba/0x16d [nfsd] Feb 12 10:30:26 kernel: [<fabb8603>] svc_process+0x432/0x6e3 [sunrpc] Feb 12 10:30:26 kernel: [<fac0d45a>] nfsd+0x1cc/0x339 [nfsd] Feb 12 10:30:26 kernel: [<fac0d28e>] nfsd+0x0/0x339 [nfsd] Feb 12 10:30:26 kernel: [<c01041f5>] kernel_thread_helper+0x5/0xb Feb 12 10:30:26 kernel: Code: 00 75 09 f0 81 02 00 00 00 01 30 c9 89 c8 c3 53 89 c3 81 78 04 ad 4e ad de 74 18 ff 74 24 04 68 76 70 2e c0 e8 46 f5 e4 ff 58 5a <0f> 0b 85 00 e1 60 2e c0 f0 fe 0b 79 09 f3 90 80 3b 00 7e f9 eb Feb 12 10:30:26 kernel: <0>Fatal exception: panic in 5 seconds The netdump service has been enabled, next time I hope to gather a coredump. Reproduce is easy. Make shell script: mount -o vers=2 nfsserver:/share /point cp some file sleep 1; umount repeat disabling hyper-threading in cpu helps, also booting uni-cpu kernel makes crash not soo often and in different place http://people.redhat.com/~jbaron/rhel4/SRPMS.kernel/kernel-2.6.9-48.EL.src.rpm solved the problem here, YMMV. I think, this is duplicate of BUG 191831. Thanks for this. Is this latest kernel tested? Is it supported? We are going to put it on production server. (after some testing, of course :) Hmm, i expected different result with -48 kernel: (from Solaris 8/Sparc box) mount -o vers=2 srv2:/share /mnt/nfs/share NFS getattr failed for server srv2: error 9 (RPC: Program/version mismatch) nfs mount: mount: /mnt/nfs/share: I/O error The situation described in Comment #7 is a bug in Solaris. I recall fixing it quite sometime ago in a newer Solaris release. I would suggest upgrading or at least complaining to the Solaris folks. It is definitely fixed in Solaris 10 GA. *** This bug has been marked as a duplicate of 191831 *** |