Bug 214543

Summary: kernel crash with nfs rman backup
Product: Red Hat Enterprise Linux 4
Reporter: zhang <e4glez>
Component: kernel
Assignee: Ric Wheeler <rwheeler>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 4.3
CC: jbaron
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2010-03-16 17:25:25 UTC Type: ---

Description zhang 2006-11-08 03:16:59 UTC
Description of problem:
The kernel crashes during an RMAN backup over NFS.
When I back up an Oracle database to NFS storage with RMAN,
"alter database backup controlfile to *" never returns,
and the system crashes later.
It may be an nfslock problem.

Version-Release number of selected component (if applicable):
2.6.9-34.0.1.ELsmp

How reproducible:
   file locks over nfs

Steps to Reproduce:
1. mount nfs:/export /mnt/nfs
2. rman target /
3. sql "alter database backup controlfile to ''/mnt/nfs/control.ctl''"
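
For anyone without an Oracle instance at hand, the sketch below only approximates the file-locking part of these steps: it takes a POSIX (fcntl) write lock on a file, writes to it, and closes the descriptor while the lock is still held, so the kernel has to drop the lock during close. The function name lock_write_close, the 4096-byte payload, and the default path are illustrative assumptions, not anything RMAN is known to do, and it is not confirmed that this triggers the BUG reported here.

```python
import fcntl
import os
import sys
import tempfile


def lock_write_close(path, payload=b"x" * 4096):
    """Take a POSIX write lock on path, write payload, then close
    the descriptor without unlocking first.

    Hypothetical helper: it mimics only the locking pattern, not the
    actual I/O that RMAN performs. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # lockf() issues an fcntl byte-range lock (a POSIX lock);
        # on an NFS mount the request is forwarded to lockd.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        written = os.write(fd, payload)
    finally:
        # Closing with the lock still held makes the kernel release it
        # as part of close.
        os.close(fd)
    return written


if __name__ == "__main__":
    # Pass a path on the NFS mount, e.g. /mnt/nfs/locktest.dat (assumed name).
    default = os.path.join(tempfile.gettempdir(), "locktest.dat")
    target = sys.argv[1] if len(sys.argv) > 1 else default
    print(lock_write_close(target))
```

Running it in a loop, or from several processes against the same file, would be closer to a database workload, but that is a guess as well.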
  
Actual results:

no response

Expected results:


Additional info:
Failure messages in /var/log/messages:

Nov  2 12:35:33 localhost kernel: ------------[ cut here ]------------
Nov  2 12:35:33 localhost kernel: kernel BUG at fs/locks.c:1799!
Nov  2 12:35:33 localhost kernel: invalid operand: 0000 [#1]
Nov  2 12:35:33 localhost kernel: SMP
Nov  2 12:35:33 localhost kernel: Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler nfs lockd nfs_acl parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc button battery ac uhci_hcd ehci_hcd hw_random shpchp bnx2 dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod megaraid_sas sd_mod scsi_mod
Nov  2 12:35:33 localhost kernel: CPU:    3
Nov  2 12:35:33 localhost kernel: EIP:    0060:[<c016dd19>]    Not tainted VLI
Nov  2 12:35:33 localhost kernel: EFLAGS: 00210246   (2.6.9-34.0.1.ELsmp)
Nov  2 12:35:33 localhost kernel: EIP is at locks_remove_flock+0xa1/0xe1
Nov  2 12:35:33 localhost kernel: eax: f25687ec   ebx: e9bbd73c   ecx: 00000000   edx: 00000000
Nov  2 12:35:33 localhost kernel: esi: 00000000   edi: e9bbd694   ebp: cbaa8180   esp: e76def2c
Nov  2 12:35:33 localhost kernel: ds: 007b   es: 007b   ss: 0068
Nov  2 12:35:33 localhost kernel: Process oracle (pid: 10145, threadinfo=e76de000 task=f726ae30)
Nov  2 12:35:33 localhost kernel: Stack: cbaa8180 f8d8443e e76def44 f8d84e2b f8dd5bdb c016dc71 00000000 00000000
Nov  2 12:35:33 localhost kernel:        00000000 00d9000c 00000000 f2050280 000027a1 45495d61 00000000 45495d61
Nov  2 12:35:33 localhost kernel:        00000000 cbaa8180 00000201 00000000 00000000 00200246 00000000 cbaa8180
Nov  2 12:35:33 localhost kernel: Call Trace:
Nov  2 12:35:33 localhost kernel:  [<f8d8443e>] nlm_put_lockowner+0x11/0x49 [lockd]
Nov  2 12:35:33 localhost kernel:  [<f8d84e2b>] nlmclnt_locks_release_private+0xb/0x14 [lockd]
Nov  2 12:35:33 localhost kernel:  [<f8dd5bdb>] nfs_lock+0x0/0xc7 [nfs]
Nov  2 12:35:33 localhost kernel:  [<c016dc71>] locks_remove_posix+0x130/0x137
Nov  2 12:35:33 localhost kernel:  [<c015affe>] __fput+0x41/0x100
Nov  2 12:35:33 localhost kernel:  [<c0159c3d>] filp_close+0x59/0x5f
Nov  2 12:35:33 localhost kernel:  [<c02cffe7>] syscall_call+0x7/0xb
Nov  2 12:35:33 localhost kernel: Code: 38 39 68 2c 75 2d 0f b6 50 30 f6 c2 02 74 09 89 d8 e8 4e df ff ff eb 1d 80 e2 20 74 0e ba 02 00 00 00 89 d8 e8 9a ec ff ff eb 0a <0f> 0b 07 07 77 4a 2e c0 89 c3 8b 03 eb c4 b8 00 f0 ff ff 21 e0
Nov  2 12:35:33 localhost kernel:  <0>Fatal exception: panic in 5 seconds

Comment 1 Buck Huppmann 2006-11-20 18:21:09 UTC
FYI: this looks like it could potentially be the same as bz 207737

Comment 3 Ric Wheeler 2010-03-16 17:25:25 UTC
The last update here is 3+ years old. Please reopen if this is still an issue.