RHTS abort panic - http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=157128
2.6.33.3-rt19.15.el5rt
rteval 1.21
x86_64
ibm-x3650m2-01.rhts.eng.bos.redhat.com
I tried to attach to the console, but never got a system prompt to log in.
http://rhts.redhat.com/testlogs/2010/05/157128/403079/3266481/console.txt shows:
Call Trace:
 [<ffffffff81143da8>] proc_flush_task+0xac/0x1ce
 [<ffffffff8106e90b>] ? rt_mutex_adjust_prio+0x3a/0x43
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
Code: 8b 7d b0 48 83 c7 08 e8 5f 18 25 00 4c 8b 65 b0 c7 45 c4 00 00 00 00 4d 8b ac 24 a0 00 00 00 e9 cf 00 00 00 49 8d 9d 70 ff ff ff <4d> 8b 6d 00 4c 8d 73 08 4c 89 f7 e8 30 18 25 00 48 89 df e8 53
RIP [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 RSP <ffff88011a56fd28>
CR2: 0000000000000000
---[ end trace 7d5c2d6aabc01615 ]---
Kernel panic - not syncing: Fatal exception
Adding a few missing bits:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
PGD 14516f067 PUD 10b82e067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/system/node/node1/cpumap
CPU 7
Pid: 5287, comm: sh Not tainted 2.6.33.3-rt19.15.el5rt #1 46M7165 /IBM System x -[7947AC1]-
RIP: 0010:[<ffffffff81105bfc>] [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
RSP: 0018:ffff88011a56fd28 EFLAGS: 00010207
RAX: ffff88011009acb0 RBX: ffffffffffffff70 RCX: ffffffff816c5860
RDX: ffffffff816c5860 RSI: 0000000000000003 RDI: ffff88011a56e000
RBP: ffff88011a56fd78 R08: ffff88011a56fc18 R09: ffff88011a56fd18
R10: ffff88011a56fc18 R11: ffff880172c47358 R12: ffff88011009ac10
R13: 0000000000000000 R14: ffff880168cfe740 R15: 000000000002555e
FS: 00007f633d5006e0(0000) GS:ffff880183cc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000117c25000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 5287, threadinfo ffff88011a56e000, task ffff88016f27c500)
Stack:
 ffff880172c47358 ffff88017cc8d800 000000001a56fd98 ffff88011a56fda8
<0> 000000007c801168 ffff880172c47358 ffff88017cc88c80 0000000000002373
<0> ffff88011a56fdb8 ffff880264556df0 ffff88011a56fe08 ffffffff81143da8
Call Trace:
 [<ffffffff81143da8>] proc_flush_task+0xac/0x1ce
 [<ffffffff8106e90b>] ? rt_mutex_adjust_prio+0x3a/0x43
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
Code: 8b 7d b0 48 83 c7 08 e8 5f 18 25 00 4c 8b 65 b0 c7 45 c4 00 00 00 00 4d 8b ac 24 a0 00 00 00 e9 cf 00 00 00 49 8d 9d 70 ff ff ff <4d> 8b 6d 00 4c 8d 73 08 4c 89 f7 e8 30 18 25 00 48 89 df e8 53
RIP [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 RSP <ffff88011a56fd28>
CR2: 0000000000000000
---[ end trace 7d5c2d6aabc01615 ]---
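As a sanity check on the oops above, here is a minimal Python sketch that tests whether the register dump is consistent with the bytes on the "Code:" line. It assumes the usual x86-64 decoding of the two instructions around the marked `<4d>` byte (noted in the comments); the names `lea_disp32`, `expected_rbx` and `fault_addr` are made up for this illustration. It only shows that the faulting load went through R13 while R13 was 0; it says nothing about why the pointer was NULL.

```python
MASK64 = (1 << 64) - 1

# Values copied from the register dump in the oops.
R13 = 0x0000000000000000
RBX = 0xFFFFFFFFFFFFFF70
CR2 = 0x0000000000000000

# Bytes copied from the "Code:" line. Assumed decoding (AT&T syntax):
LEA = bytes.fromhex("498d9d70ffffff")  # lea -0x90(%r13),%rbx
MOV = bytes.fromhex("4d8b6d00")        # mov 0x0(%r13),%r13  <- faulting insn


def lea_disp32(insn):
    """Return the signed 32-bit displacement of the LEA (bytes 3..6)."""
    return int.from_bytes(insn[3:7], "little", signed=True)


disp = lea_disp32(LEA)

# RBX should equal R13 + disp, wrapped to 64 bits.
expected_rbx = (R13 + disp) & MASK64

# The faulting load reads from R13 + disp8; the disp8 byte here is 0.
fault_addr = (R13 + MOV[3]) & MASK64

print(hex(disp), hex(expected_rbx), hex(fault_addr))
```

With the values above, `expected_rbx` matches the dumped RBX and `fault_addr` matches CR2, so the dump looks internally consistent with a load through a NULL R13.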
Yesterday I briefly discussed this issue with John Stultz on IRC and he pointed out that this BUG is close to the one Clark has seen in the RT (non-MRG) kernel. I will copy here the original email from Clark:
Date: Fri, 14 May 2010 14:45:06 -0500
From: Clark Williams
Subject: backtrace from tmpfs umount on 2.6.33.4-rt19 (tip/rt/2.6.33)
Thomas/Peter,
I got the below backtrace while running the 'mock' unit-tests (most of which make heavy use of tmpfs). Basically it's creating a chroot build environment for a particular distro (in this case fedora-12-x86_64) and building a source RPM inside that chroot.
I'm running 2.6.33.4-rt19 from tip/rt/2.6.33
Clark
BUG: Dentry ffff880053401928{i=1ef806,n=ptmx} still in use (-1) [unmount of tmpfs tmpfs]
------------[ cut here ]------------
kernel BUG at fs/dcache.c:835!
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:00/PNP0C09:00/PNP0C0A:00/power_supply/BAT0/status
CPU 0
Pid: 14563, comm: umount Not tainted 2.6.33.4-rt19 #36 /
RIP: 0010:[<ffffffff81107f12>] [<ffffffff81107f12>] shrink_dcache_for_umount_subtree+0x119/0x253
RSP: 0018:ffff880080073da8 EFLAGS: 00010292
RAX: 000000000000005f RBX: ffff880053401928 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffff880080073c88
RBP: ffff880080073de8 R08: ffff88009823c000 R09: 0000000000000073
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880011f7ad18
R13: ffff8800534019c8 R14: ffff880011f7ad10 R15: ffff8800b4bb6e08
FS: 00007f62035d4740(0000) GS:ffff88000a200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f6202c9b488 CR3: 0000000076b36000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 14563, threadinfo ffff880080072000, task ffff88003d6da5c0)
Stack:
 ffff88004329b2f0 0000000000000000 ffff88003d6da5c0 ffff880037ddeb18
<0> ffff88004329b000 ffff880037ddeb20 ffffffff81af26d0 ffff880080073fd8
<0> ffff880080073e18 ffffffff81108092 ffff880037ddeb20 ffff88004329b000
Call Trace:
 [<ffffffff81108092>] shrink_dcache_for_umount+0x46/0x5b
 [<ffffffff810f824f>] generic_shutdown_super+0x1f/0xf9
 [<ffffffff810f837e>] kill_anon_super+0x16/0x54
 [<ffffffff810f83e3>] kill_litter_super+0x27/0x2b
 [<ffffffff810f8aab>] deactivate_super+0x6d/0x82
 [<ffffffff8110eadb>] mntput_no_expire+0x1a5/0x218
 [<ffffffff8110f0f5>] sys_umount+0x2d5/0x300
 [<ffffffff8108af27>] ? audit_syscall_entry+0x1ec/0x218
 [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b
Code: 0a 48 8b 4b 70 31 d2 48 85 f6 74 04 48 8b 56 40 48 05 f0 02 00 00 48 89 de 48 89 04 24 48 c7 c7 6f 9a 78 81 31 c0 e8 93 be 32 00 <0f> 0b eb fe 4c 8b 63 60 4c 39 e3 75 3c 48 8b 93 90 00 00 00 48
RIP [<ffffffff81107f12>] shrink_dcache_for_umount_subtree+0x119/0x253
 RSP <ffff880080073da8>
---[ end trace 34e97e0ec2c5ae6a ]---
It's in the same area, yes. Can you please decode the source line for your crash?
addr2line -e vmlinux ffffffff81105bfc
# addr2line -e /usr/lib/debug/lib/modules/2.6.33.3-rt19.15.el5rt/vmlinux ffffffff81105bfc
/usr/src/debug/kernel-rt-2.6.33.3-rt19.15.el5rt/linux-2.6.33.3.x86_64/fs/dcache.c:1030
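For repeated reports like this one, the address-to-source lookup can be scripted. The following is a rough Python sketch, not an existing tool: the helper names `rip_address` and `addr2line_command` are hypothetical, and it assumes the oops text contains a RIP line in the format seen in this report. It only builds the same `addr2line -e <vmlinux> <address>` command used above; it does not run it.

```python
import re

# One RIP line copied from the oops in this report.
OOPS_LINE = (
    "RIP: 0010:[<ffffffff81105bfc>] [<ffffffff81105bfc>] "
    "shrink_dcache_parent+0xad/0x210"
)


def rip_address(oops_text):
    """Return the first bracketed 64-bit address on a RIP line, or None."""
    match = re.search(r"RIP:?\s.*?\[<([0-9a-f]{16})>\]", oops_text)
    return match.group(1) if match else None


def addr2line_command(vmlinux, address):
    """Build the argv list for the addr2line call shown in this thread."""
    return ["addr2line", "-e", vmlinux, address]


addr = rip_address(OOPS_LINE)
cmd = addr2line_command("vmlinux", addr)
print(" ".join(cmd))
```

The `vmlinux` path must point at the debuginfo image that matches the crashing kernel exactly, as in the command above; the resulting list can be passed to `subprocess.run` on a machine that has that debuginfo installed.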
Here is the kernel panic again (NULL pointer dereference in shrink_dcache_parent), this time on 2.6.33.4-rt20.17.el5rt.
Thomas, would you mind having a look at this backtrace? As we discussed on IRC, it looks close to the tmpfs issue Clark saw a while ago.
This backtrace came from: http://rhts.redhat.com/testlogs/2010/05/158429/406370/3307697/console.txt
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
PGD 316166067 PUD 315bd5067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/system/node/node1/cpumap
CPU 3
Pid: 1078, comm: sh Not tainted 2.6.33.4-rt20.17.el5rt #1 49Y5114 /IBM System x -[7870AC1]-
RIP: 0010:[<ffffffff81105bfc>] [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
RSP: 0018:ffff880302b11d28 EFLAGS: 00010213
RAX: ffff8802dde9f3f8 RBX: ffffffffffffff70 RCX: ffffffff816c5850
RDX: ffffffff816c5850 RSI: 0000000000000003 RDI: ffff880302b10000
RBP: ffff880302b11d78 R08: ffff880302b11c18 R09: ffff880302b11d18
R10: ffff880302b11c18 R11: ffff8802dde9f358 R12: ffff8802dde9f358
R13: 0000000000000000 R14: ffff8802b0e16550 R15: 000000000004fd3a
FS: 00007fea667486e0(0000) GS:ffff880204a40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000003128cf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 1078, threadinfo ffff880302b10000, task ffff880319712580)
Stack:
 ffff8802dde9f358 ffff8801f9cbd800 0000000019712580 ffff880302b11da8
<0> 0000000039801168 ffff8802dde9f358 ffff8801f9cb8c80 0000000000000491
<0> ffff880302b11db8 ffff8802f86753f0 ffff880302b11e08 ffffffff81143dd8
Call Trace:
 [<ffffffff81143dd8>] proc_flush_task+0xac/0x1ce
 [<ffffffff81046b1c>] ? div_u64+0x16/0x18
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
Code: 8b 7d b0 48 83 c7 08 e8 cf 19 25 00 4c 8b 65 b0 c7 45 c4 00 00 00 00 4d 8b ac 24 a0 00 00 00 e9 cf 00 00 00 49 8d 9d 70 ff ff ff <4d> 8b 6d 00 4c 8d 73 08 4c 89 f7 e8 a0 19 25 00 48 89 df e8 53
RIP [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 RSP <ffff880302b11d28>
CR2: 0000000000000000
---[ end trace 578af5cbf3c98777 ]---
Kernel panic - not syncing: Fatal exception
I believe the shrink_dcache_parent issue in this bug is a dup of bug #595825.
I agree; let's close this one as a dupe of 595825.
*** This bug has been marked as a duplicate of bug 595825 ***