Bug 179672
Summary: withdraw causes kernel panic
Product: [Retired] Red Hat Cluster Suite
Component: dlm
Version: 4
Hardware: i686
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Reporter: Ryan O'Hara <rohara>
Assignee: Ryan O'Hara <rohara>
QA Contact: Cluster QE <mspqa-list>
CC: aaranya, ccaulfie, cluster-maint, juanino, k.georgiou, rohara
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2007-08-22 18:30:10 UTC

Description
Ryan O'Hara
2006-02-01 23:23:42 UTC
Created attachment 124017: log file showing withdraw and panic.

This bug happened for me over NFS with the 2.6.9-34.ELsmp kernel. I put it in the Linux kernel bugzilla (http://bugzilla.kernel.org/show_bug.cgi?id=3986), but the kernel maintainers are unwilling to help with a 2.6.9 kernel. It occurred when I tried to interrupt a parallel make that writes over NFS.

Hardware Environment: 2 dual-core AMD Opteron, 4GB RAM
Software Environment: RHEL 4 SMP kernel. Process g++ while trying to write over NFS.
Problem Description:

Apr 14 15:10:21 hfs12 kernel: kernel BUG at fs/locks.c:1799!
Apr 14 15:10:21 hfs12 kernel: invalid operand: 0000 [#1]
Apr 14 15:10:21 hfs12 kernel: SMP
Apr 14 15:10:21 hfs12 kernel: Modules linked in: nfsd exportfs parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc dm_mirror dm_mod button battery ac md5 ipv6 ohci_hcd hw_random e100 mii tg3 floppy ext3 jbd raid0 sata_sil libata sd_mod scsi_mod
Apr 14 15:10:21 hfs12 kernel: CPU: 1
Apr 14 15:10:21 hfs12 kernel: EIP: 0060:[<c016dd4c>] Not tainted VLI
Apr 14 15:10:21 hfs12 kernel: EFLAGS: 00010246 (2.6.9-34.ELsmp)
Apr 14 15:10:21 hfs12 kernel: EIP is at locks_remove_flock+0xa1/0xe1
Apr 14 15:10:21 hfs12 kernel: eax: f64efa8c ebx: f5be620c ecx: 00000000 edx: 00000081
Apr 14 15:10:21 hfs12 kernel: esi: 00000000 edi: f5be6164 ebp: f58c06c0 esp: f40b3f2c
Apr 14 15:10:21 hfs12 kernel: ds: 007b es: 007b ss: 0068
Apr 14 15:10:21 hfs12 kernel: Process g++-4.0 (pid: 14863, threadinfo=f40b3000 task=f36ef830)
Apr 14 15:10:21 hfs12 kernel: Stack: f58c06c0 f896643a f40b3f44 f8966e2a f8c3abd7 c016dca4 f40b3f6c 00000001
Apr 14 15:10:21 hfs12 kernel: 00000000 00000001 f5be60f8 f378c3c0 00003a0f f8c426ac 00000000 ffffffff
Apr 14 15:10:21 hfs12 kernel: f6020f40 f58c06c0 00000201 00000000 00000000 00000246 00000000 f58c06c0
Apr 14 15:10:21 hfs12 kernel: Call Trace:
Apr 14 15:10:21 hfs12 kernel: [<f896643a>] nlm_put_lockowner+0x11/0x49 [lockd]
Apr 14 15:10:21 hfs12 kernel: [<f8966e2a>] nlmclnt_locks_release_private+0xb/0x14 [lockd]
Apr 14 15:10:21 hfs12 kernel: [<f8c3abd7>] nfs_lock+0x0/0xc7 [nfs]
Apr 14 15:10:21 hfs12 kernel: [<c016dca4>] locks_remove_posix+0x130/0x137
Apr 14 15:10:21 hfs12 kernel: [<f8c426ac>] nfs_wait_on_requests+0x7e/0xba [nfs]
Apr 14 15:10:21 hfs12 kernel: [<c015b0c6>] __fput+0x41/0x100
Apr 14 15:10:21 hfs12 kernel: [<c0159d21>] filp_close+0x59/0x5f
Apr 14 15:10:21 hfs12 kernel: [<c02d2657>] syscall_call+0x7/0xb
Apr 14 15:10:21 hfs12 kernel: Code: 38 39 68 2c 75 2d 0f b6 50 30 f6 c2 02 74 09 89 d8 e8 b3 df ff ff eb 1d f6 c2 20 74 0e ba 02 00 00 00 89 d8 e8 ce ec ff ff eb 0a <0f> 0b 07 07 6c 74 2e c0 89 c3 8b 03 eb c4 b8 00 f0 ff ff 21 e0
Apr 14 15:10:21 hfs12 kernel: <0>Fatal exception: panic in 5 seconds

I don't think that the original bug reported here is related to comment #2. The original post was seen while using GFS in a clustered environment when a node withdrew from the filesystem. The problem reported in comment #2 is occurring without GFS being involved. While they could be related, I think it is more likely that they are two different problems that happen to panic in the same place.

Seems like flocks are not being cleaned up properly.

I'm letting Ryan decide what the status of this is.

Moving this to NEEDINFO; I haven't been able to recreate this one recently.

Have not seen this problem in quite some time. Closing.
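
For readers following the flock remark: the trace above shows the BUG firing in locks_remove_flock, reached from filp_close and __fput, which is the lock cleanup the kernel performs when a file is closed. The user-space sketch below only illustrates the expected behavior of that cleanup (an exclusive flock held through one descriptor is dropped once that descriptor is closed). It assumes Linux and a local filesystem, does not reproduce the panic, and the function name flock_released_on_close is hypothetical.

```python
import fcntl
import os
import tempfile


def flock_released_on_close(path):
    """Return (blocked_while_held, acquired_after_close) for two opens of path.

    Hypothetical helper for illustration; not taken from the bug report.
    """
    holder = os.open(path, os.O_RDWR)
    contender = os.open(path, os.O_RDWR)
    try:
        # First descriptor takes an exclusive flock.
        fcntl.flock(holder, fcntl.LOCK_EX)

        # A second, independent open of the same file must not get the lock.
        try:
            fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
            blocked = False
        except BlockingIOError:
            blocked = True

        # Closing the holder is what should trigger the flock cleanup.
        os.close(holder)
        holder = None

        # With the cleanup done, the second descriptor can take the lock.
        try:
            fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
            acquired = True
        except BlockingIOError:
            acquired = False

        return blocked, acquired
    finally:
        if holder is not None:
            os.close(holder)
        os.close(contender)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(flock_released_on_close(tmp.name))
```

On a healthy system both values should be true; a stale lock left behind after close would show up as the second value being false.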