Description of problem:

When we run one of our Cadence tools, the kernel panics at fs/locks.c line 1799. Here is the back trace (log prefix shown on the first line only):

Jan 15 23:37:39 ldvlinux33 kernel BUG at fs/locks.c:1799!
invalid operand: 0000 [#1]
Modules linked in: netconsole nfs mvfs(U) vnode(U) nfsd exportfs lockd nfs_acl md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc dm_mirror dm_multipath dm_mod joydev uhci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore 8139too mii floppy ext3 jbd aic7xxx sd_mod scsi_mod
CPU: 0
EIP: 0060:[<c01810c1>] Tainted: PF VLI
EFLAGS: 00010246 (2.6.9-34.EL)
EIP is at locks_remove_flock+0x119/0x1b4
eax: f7ccfd10 ebx: f57cebd8 ecx: 00000000 edx: 00000081
esi: f3c50480 edi: f57ceb00 ebp: f43ee780 esp: f5ca0e24
ds: 007b es: 007b ss: 0068
Process ncvhdl_p (pid: 9951, threadinfo=f5ca0000 task=f49bf8f0)
Stack: 00000001 f8d1eafe f43ee780 f57cebd8 f5ca0f30 c0180e2d 00000000 00001000
       00000000 00000000 00000803 00000000 000026df 00000000 006d45de 00000000
       45abc2eb 00000000 459a9d3d 00000000 459a9d3d f43ee780 00000201 00000000
Call Trace:
 [<f8d1eafe>] nfs_lock+0x0/0xc7 [nfs]
 [<c0180e2d>] locks_remove_posix+0x8f/0x20a
 [<c016993e>] __fput+0x41/0xee
 [<c016825a>] filp_close+0x59/0x5f
 [<f8b0440a>] mvop_linux_close_kernel+0xb/0x12 [vnode]
 [<f8b0386d>] mvop_linux_close+0x78/0x9c [vnode]
 [<f8c2624d>] mvfs_closev_ctx+0x15d/0x230 [mvfs]
 [<f8b00fd9>] vnode_fop_release+0x62/0x7d [vnode]
 [<c0169952>] __fput+0x55/0xee
 [<c016825a>] filp_close+0x59/0x5f
 [<c0123324>] put_files_struct+0x56/0xbf
 [<c0124317>] do_exit+0x2df/0x59c
 [<c012476c>] sys_exit_group+0x0/0xd
 [<c0311443>] syscall_call+0x7/0xb
 [<c031007b>] rwsem_down_read_failed+0x19f/0x204
Code: 38 39 68 3c 75 2d 0f b6 50 40 f6 c2 02 74 09 89 d8 e8 52 d8 ff ff eb 1d f6 c2 20 74 0e ba 02 00 00 00 89 d8 e8 19 e9 ff ff eb 0a <0f> 0b 07 07 94 38 32 c0 89 c3 8b 03 eb c4 b8 00 f0 ff ff 21 e0
<0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception

Version-Release number of selected component (if applicable):

Other kernels tried:
1. Kernel 2.6.9-34 (kernel panic)
2. Kernel 2.6.9-11 (works fine)
3. Kernel 2.6.9-22 (works fine)
4. Kernel 2.6.19.2 (kernel panic)

What we found:

It appears this bug was introduced in kernel 2.6.9-34 while fixing bug 160844 ("dangling POSIX locks after close"). Further debugging identified the root cause.

There are two types of file lock:
1. FL_POSIX locks are created with calls to fcntl().
2. FL_FLOCK locks are created with calls to flock() (which was implemented via fcntl() in older C libraries).

In our case the problem arises when we use FL_POSIX locks. In the call trace, the locks_remove_posix() function (fs/locks.c) is used to remove FL_POSIX locks; similarly, locks_remove_flock() is used to remove FL_FLOCK locks. The call trace shows a call to the __fput() function (fs/file_table.c), which is called from task context when aio completion releases the last use of a struct file *.
__fput() in turn calls locks_remove_flock(file). Here is the code snippet that introduced the bug: __fput() simply calls locks_remove_flock(file) without checking whether the remaining locks are FL_POSIX or FL_FLOCK.

----------------
fs/file_table.c

	eventpoll_release(file);
	locks_remove_flock(file);
	if (file->f_op && file->f_op->release)
		file->f_op->release(inode, file);
----------------

My suggestion is that the above code should have been written like this (just a suggestion; I haven't tried it):

----------------
fs/file_table.c

	eventpoll_release(file);
	if (file->f_op->flock) {		/* FL_FLOCK */
		locks_remove_flock(file);
	} else if (file->f_op->lock) {		/* FL_POSIX */
		locks_remove_posix(file);
	}
----------------

OR restore the old code in fs/locks.c (tried successfully):

----------------
fs/locks.c

	if (IS_FLOCK(fl) || IS_POSIX(fl)) {
		locks_delete_lock(before);
		continue;
	}
	if (IS_LEASE(fl)) {
		lease_modify(before, F_UNLCK);
		continue;
	}
----------------

instead of the version introduced in 2.6.9-34:

----------------
	if (IS_FLOCK(fl)) {
		locks_delete_lock(before);
		continue;
	}
	if (IS_LEASE(fl)) {
		lease_modify(before, F_UNLCK);
		continue;
	}
----------------

Earlier, in 2.6.9-11, locks_remove_flock() (fs/locks.c) checked for both FL_FLOCK and FL_POSIX as below, so we did not get any crash:

	if (IS_FLOCK(fl) || IS_POSIX(fl)) {
		locks_delete_lock(before);
		continue;
	}
	if (IS_LEASE(fl)) {
		lease_modify(before, F_UNLCK);
		continue;
	}
	/* What? */
	BUG();

In 2.6.9-34 and 2.6.9-42 (U3 and U4), to fix "dangling POSIX locks after close", the code was altered as follows:

	if (IS_FLOCK(fl)) {
		locks_delete_lock(before);
		continue;
	}
	if (IS_LEASE(fl)) {
		lease_modify(before, F_UNLCK);
		continue;
	}
	/* What? */
	BUG();

Since the lock here is FL_POSIX, it skips both conditions and reaches the BUG(), hence the kernel panic.

So the fix could be either to introduce the f_op->flock / f_op->lock check in fs/file_table.c shown above (just a suggestion; I haven't tried it), or to re-introduce the IS_POSIX() check in fs/locks.c. Of course we need to check whether this breaks anything else.

Conclusion:

In fact I tried the second option. I edited fs/locks.c in 2.6.9-42 as follows:

	if (IS_FLOCK(fl) || IS_POSIX(fl)) {
		locks_delete_lock(before);
		continue;
	}
	if (IS_LEASE(fl)) {
		lease_modify(before, F_UNLCK);
		continue;
	}

It works fine, with no more kernel panics.

How reproducible:
Only reproducible in house.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Here is the proposed patch for this issue. It has been widely tested by Cadence in their environment and does solve the problem:

--- ./linux-2.6.9-42/fs/locks.c	2007-05-15 19:35:17.000000000 +0530
+++ ./linux-2.6.9-42CDN1/fs/locks.c	2007-05-15 19:21:02.000000000 +0530
@@ -1786,7 +1786,7 @@
 	while ((fl = *before) != NULL) {
 		if (fl->fl_file == filp) {
-			if (IS_FLOCK(fl)) {
+			if (IS_FLOCK(fl) || IS_POSIX(fl)) {
 				locks_delete_lock(before);
 				continue;
 			}
Ok, in looking over this briefly, I have some questions that need to be answered before we can consider this patch. Given the stack trace above, locks_remove_posix() should already have been called on this filp when the machine crashed. Why are there any POSIX locks left at all? That BUG() call seems appropriate to me: there should be no reason for a function intended to remove flock locks to have to deal with POSIX locks. I see two possibilities:

1) The locks were iterated over in the loop in locks_remove_posix(), but were not released for some reason (maybe the posix_same_owner(fl, &lock) check failed?).

2) The locks were slipped into the list during or after the locks_remove_posix() call, but before locks_remove_flock() was run.

What might be appropriate is some instrumentation that tries to determine why these locks were left, though it may also be possible to track down the lock in a core and see whether the lock owner was correct. Unfortunately, I'm guessing that these are mvfs locks, and I presume they have their own lock ops. We'll likely need IBM to take the lead on this and tell us why there are still POSIX locks on the list.
It's also possible that this is a dupe of bz 211092, which recently had a patch posted. Have they tested a kernel with that patch?
Actually this looks identical to bz 240403... *** This bug has been marked as a duplicate of 240403 ***