Description of problem:
This is a request to backport the "iprune_sem" from 2.5.62/2.6.0 into AS2.1, as reported in IT#30898. Without the semaphore, the system can crash as follows:

Oops: 0000
Kernel 2.4.9-e.12enterprise
CPU:    0
EIP:    0010:[<c015caea>]    Tainted: P
EFLAGS: 00010286
EIP is at clear_inode [kernel] 0x11a
eax: ba860e90   ebx: 00000000   ecx: e5d98e48   edx: d21edf80
esi: e5d98e40   edi: f094f3c8   ebp: 00000001   esp: d21edf38
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 18, stackpage=d21ed000)
Stack: f094f3c8 c015bdba d2106564 e7e6fe40 e5d98e40 d21edf80 c015cb85 e5d98e40
       00000000 00000000 c015d007 d21edf80 000002e0 000005bf 00000001 00000003
       000002dd e641c588 e4ffee48 e39dee48 d21ec000 00000008 0008e000 c01385b3
Call Trace: [<c015bdba>] destroy_inode [kernel] 0x2a
[<c015cb85>] dispose_list [kernel] 0x45
[<c015d007>] prune_icache [kernel] 0x2d7
[<c01385b3>] __kmem_cache_shrink_locked [kernel] 0x53
[<c015d061>] shrink_icache_memory [kernel] 0x21
[<c013d76c>] kswapd [kernel] 0x13c
[<c013d630>] kswapd [kernel] 0x0
[<c0105000>] stext [kernel] 0x0
[<c02f0018>] __kallsyms [kernel] 0x72e00
[<c0105000>] stext [kernel] 0x0
[<c0105836>] kernel_thread [kernel] 0x26
[<c013d630>] kswapd [kernel] 0x0

Code: 8b 40 30 85 c0 74 04 56 ff d0 58 8b 86 fc 00 00 00 85 c0 74
<0>Kernel panic: not continuing
<0>Rebooting in 120 seconds..

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
1. Patch will follow shortly.
2. Don Howard from sustaining engineering is handling this issue.
Does this happen on untainted kernels too?
It will, but we're not able to recreate it reliably.
Wendy, can you point me to a case where this happens on an untainted kernel? I've looked at IT 26999, 31051, and 30898; they are all tainted. Veritas has a known issue with a similar footprint to 31051 and 30898, where vx inodes end up being free()d via the kernel inode reap code. Veritas has pushed a patch upstream that corrects this. It's included in e.31. I've been trying to wade through 26999 today to see if Veritas is the cause there. Blocker bug clerical chores are getting in the way, though.
Created attachment 96970 [details] iprune semaphore patch
The problem is in the base kernel; whether the kernel is tainted is not relevant. If this is accepted, we may want to add a parallel semaphore for the dcache list.
ehm, the VM won't destroy inodes with a usage count... and the other code is guaranteed to have the usage count set, afaics
also what is the point of the semaphore if the spinlock protects exactly the same....
I thought about this (why not just extend the inode_lock's coverage?) but concluded that a new semaphore was probably added for performance reasons. Note that the inode_lock is widely used, so you want to release it as soon as possible, while the new semaphore is just there to prevent the inode code and the cache-releasing code from stepping on each other's toes.
but.. if you look at your patch, you take the semaphore right before you take the spinlock, and drop it right after you drop the spinlock... so what's the point?
First, it is not "MY" patch, but I redid it on e.34 for testing. And second, read the code carefully: there is a "dispose_list()" between the unlock and the up.

  spin_unlock(&inode_lock);
  dispose_list(&throw_away);
+ up(&iprune_sem);
yes there is; a list of inodes nothing can get to anymore since it's a local (stack variable) list of inodes.... Finding inodes means searching the real list, something you need the inode lock for.
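For readers following the locking argument, here is a minimal userspace sketch of the pattern under discussion. It is not the patch from attachment 96970 and not kernel code: pthread mutexes stand in for inode_lock and iprune_sem, and struct node, cache_add(), cache_count() and prune_cache() are hypothetical names invented for illustration. It only shows the shape of the sequence: unused entries are unlinked from the shared list under the list lock, then freed from a private list after the list lock is dropped but while the outer lock is still held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached inode; not the kernel's struct inode. */
struct node {
    int use_count;              /* > 0 means a reference is still held */
    struct node *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;  /* plays inode_lock */
static pthread_mutex_t prune_lock = PTHREAD_MUTEX_INITIALIZER; /* plays iprune_sem */
static struct node *cache_head;  /* shared list; searched only under list_lock */

/* Put a new node on the shared list. Returns 0 on success, -1 on failure. */
int cache_add(int use_count)
{
    struct node *n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->use_count = use_count;
    pthread_mutex_lock(&list_lock);
    n->next = cache_head;
    cache_head = n;
    pthread_mutex_unlock(&list_lock);
    return 0;
}

/* Count the nodes still on the shared list. */
int cache_count(void)
{
    int count = 0;
    pthread_mutex_lock(&list_lock);
    for (struct node *n = cache_head; n != NULL; n = n->next)
        count++;
    pthread_mutex_unlock(&list_lock);
    return count;
}

/* Free every node on a private list and return how many were freed.
 * The list is reachable only through the caller's local pointer. */
static int dispose_list(struct node *head)
{
    int freed = 0;
    while (head != NULL) {
        struct node *next = head->next;
        assert(head->use_count == 0);   /* only unused nodes are collected */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}

/* Prune unused nodes, mirroring the down / spin_lock / spin_unlock /
 * dispose_list / up ordering quoted in the thread. */
int prune_cache(void)
{
    struct node *throw_away = NULL;     /* local (stack variable) list */
    struct node **link;
    int freed;

    pthread_mutex_lock(&prune_lock);    /* down(&iprune_sem) */
    pthread_mutex_lock(&list_lock);     /* spin_lock(&inode_lock) */
    link = &cache_head;
    while (*link != NULL) {
        struct node *n = *link;
        if (n->use_count == 0) {
            *link = n->next;            /* unlink from the shared list */
            n->next = throw_away;       /* move to the private list */
            throw_away = n;
        } else {
            link = &n->next;
        }
    }
    pthread_mutex_unlock(&list_lock);   /* spin_unlock(&inode_lock) */
    freed = dispose_list(throw_away);   /* runs without list_lock held */
    pthread_mutex_unlock(&prune_lock);  /* up(&iprune_sem) */
    return freed;
}
```

Calling cache_add() a few times with mixed use counts and then prune_cache() frees only the zero-count nodes. In this sketch the outer lock serializes whole prune passes, including the disposal step; whether that extra serialization closes a real race in the 2.4.9 code is exactly the point the comments above disagree on.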
Ok, my dear gate keeper, this bugzilla is tracked by issue tracker. Our ping-pong discussions would make the issue record extremely long and hard to read. Let's move to email exchange. After we agree on something, we'll update it here. Make sense ?
We're going to hold the discussion using issue tracker #30898.
lsmod output from one of the affected hosts:

Module                  Size  Used by    Tainted: P
ide-cd                 35328   0 (autoclean)
cdrom                  35520   0 (autoclean) [ide-cd]
sg                     35076   0 (autoclean)
gab                   238400   3
iscsi                  40704   0
nfs                    92192   3 (autoclean)
lockd                  61184   1 (autoclean) [nfs]
sunrpc                 86128   1 (autoclean) [nfs lockd]
llt                   111296   8
autofs                 13796   0 (autoclean) (unused)
openafs               560880   2
e100_2124k2            66968   1
tg3_12e3               49376   2
appletalk              29708   0 (autoclean)
ipx                    25492   0 (autoclean)
vxio                  657056   3 (autoclean)
vxspec                  4872   2
vxdmp                 121304   4
lpfcdd                298216   4
megaraid               28288   6
aic7xxx               127200   0
cpqarray               23936   0 (unused)
sd_mod                 13888  10
scsi_mod              125980   6 [sg iscsi lpfcdd megaraid aic7xxx sd_mod]
ext3                   74240   4
jbd                    55176   4 [ext3]
We haven't been able to find any hole within dispose_list() that would cause this oops. The merit we have identified so far for this semaphore is the protection of inodes_stat.nr_inodes, which is used for statistics reporting. The customer will be told that this patch did not pass code review and that we no longer consider this a high-priority item unless it can be recreated and/or a full dump is provided.
Cleaning up old RHEL21 BZs. Closing this one as WONTFIX.