592989 – 2.6.33.3-rt19.15 x86_64 panic

Summary: 2.6.33.3-rt19.15 x86_64 panic

Keywords:
Status:	CLOSED DUPLICATE of bug 595825
Alias:	None
Product:	Red Hat Enterprise MRG
Classification:	Red Hat
Component:	realtime-kernel
Sub Component:
Version:	Development
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	1.3
Target Release:	---
Assignee:	Red Hat Real Time Maintenance
QA Contact:	David Sommerseth
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-05-17 14:51 UTC by Beth Uptagrafft
Modified:	2016-05-22 23:30 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-06-04 20:12:39 UTC
Target Upstream Version:
Embargoed:

Description Beth Uptagrafft 2010-05-17 14:51:01 UTC

RHTS abort panic - http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=157128

    2.6.33.3-rt19.15.el5rt
    rteval 1.21
    x86_64 ibm-x3650m2-01.rhts.eng.bos.redhat.com

I tried to attach to the console, but never got a system prompt to log in.

http://rhts.redhat.com/testlogs/2010/05/157128/403079/3266481/console.txt
shows:

Call Trace:
 [<ffffffff81143da8>] proc_flush_task+0xac/0x1ce
 [<ffffffff8106e90b>] ? rt_mutex_adjust_prio+0x3a/0x43
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
Code: 8b 7d b0 48 83 c7 08 e8 5f 18 25 00 4c 8b 65 b0 c7 45 c4 00 00 00 00 4d 8b ac 24 a0 00 00 00 e9 cf 00 00 00 49 8d 9d 70 ff ff ff <4d> 8b 6d 00 4c 8d 73 08 4c 89 f7 e8 30 18 25 00 48 89 df e8 53 
RIP  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 RSP <ffff88011a56fd28>
CR2: 0000000000000000
---[ end trace 7d5c2d6aabc01615 ]---
Kernel panic - not syncing: Fatal exception

Comment 1 Luis Claudio R. Goncalves 2010-05-17 14:58:34 UTC

Adding a few missing bits:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
PGD 14516f067 PUD 10b82e067 PMD 0 
Oops: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/devices/system/node/node1/cpumap
CPU 7 
Pid: 5287, comm: sh Not tainted 2.6.33.3-rt19.15.el5rt #1 46M7165     /IBM System x -[7947AC1]-
RIP: 0010:[<ffffffff81105bfc>]  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
RSP: 0018:ffff88011a56fd28  EFLAGS: 00010207
RAX: ffff88011009acb0 RBX: ffffffffffffff70 RCX: ffffffff816c5860
RDX: ffffffff816c5860 RSI: 0000000000000003 RDI: ffff88011a56e000
RBP: ffff88011a56fd78 R08: ffff88011a56fc18 R09: ffff88011a56fd18
R10: ffff88011a56fc18 R11: ffff880172c47358 R12: ffff88011009ac10
R13: 0000000000000000 R14: ffff880168cfe740 R15: 000000000002555e
FS:  00007f633d5006e0(0000) GS:ffff880183cc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000117c25000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 5287, threadinfo ffff88011a56e000, task ffff88016f27c500)
Stack:
 ffff880172c47358 ffff88017cc8d800 000000001a56fd98 ffff88011a56fda8
<0> 000000007c801168 ffff880172c47358 ffff88017cc88c80 0000000000002373
<0> ffff88011a56fdb8 ffff880264556df0 ffff88011a56fe08 ffffffff81143da8
Call Trace:
 [<ffffffff81143da8>] proc_flush_task+0xac/0x1ce
 [<ffffffff8106e90b>] ? rt_mutex_adjust_prio+0x3a/0x43
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
Code: 8b 7d b0 48 83 c7 08 e8 5f 18 25 00 4c 8b 65 b0 c7 45 c4 00 00 00 00 4d 8b ac 24 a0 00 00 00 e9 cf 00 00 00 49 8d 9d 70 ff ff ff <4d> 8b 6d 00 4c 8d 73 08 4c 89 f7 e8 30 18 25 00 48 89 df e8 53 
RIP  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 RSP <ffff88011a56fd28>
CR2: 0000000000000000
---[ end trace 7d5c2d6aabc01615 ]---
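
The faulting instruction can be read off the `Code:` line, where the byte in angle brackets is the first byte at RIP. A minimal Python sketch, assuming the usual x86 oops layout; `faulting_bytes` is a hypothetical helper name, not an existing tool:

```python
def faulting_bytes(code_line):
    """Split an oops 'Code:' line into (bytes before RIP, bytes from RIP onward)."""
    tokens = code_line.split(":", 1)[1].split()
    for i, tok in enumerate(tokens):
        # The kernel marks the byte at RIP as <xx>.
        if tok.startswith("<") and tok.endswith(">"):
            before = bytes(int(t, 16) for t in tokens[:i])
            at_rip = bytes(int(t.strip("<>"), 16) for t in tokens[i:])
            return before, at_rip
    raise ValueError("no <xx> marker in Code: line")

# Code: line copied from the oops above.
code = (
    "Code: 8b 7d b0 48 83 c7 08 e8 5f 18 25 00 4c 8b 65 b0 c7 45 c4 00 00 00 00 "
    "4d 8b ac 24 a0 00 00 00 e9 cf 00 00 00 49 8d 9d 70 ff ff ff <4d> 8b 6d 00 "
    "4c 8d 73 08 4c 89 f7 e8 30 18 25 00 48 89 df e8 53"
)
before, at_rip = faulting_bytes(code)
print(len(before), " ".join(f"{b:02x}" for b in at_rip[:4]))
```

For this oops the bytes at RIP are `4d 8b 6d 00`, which decode to `mov 0x0(%r13),%r13`. With R13 = 0 in the register dump, that load is consistent with the NULL dereference reported in CR2. The preceding `49 8d 9d 70 ff ff ff` (`lea -0x90(%r13),%rbx`) likewise matches RBX = ffffffffffffff70.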

Comment 2 Luis Claudio R. Goncalves 2010-05-19 11:21:04 UTC

Yesterday I briefly discussed this issue with John Stultz on IRC, and he pointed out that this BUG is close to the one Clark saw in the RT (non-MRG) kernel.

Here is the original email from Clark:

Date: Fri, 14 May 2010 14:45:06 -0500
From: Clark Williams
Subject: backtrace from tmpfs umount on 2.6.33.4-rt19 (tip/rt/2.6.33)

Thomas/Peter,

I got the below backtrace while running the 'mock' unit-tests (most of
which make heavy use of tmpfs). Basically it's creating a chroot
build environment for a particular distro (in this case
fedora-12-x86_64) and building a source RPM inside that chroot.

I'm running 2.6.33.4-rt19 from tip/rt/2.6.33

Clark

BUG: Dentry ffff880053401928{i=1ef806,n=ptmx} still in use (-1) [unmount of tmpfs tmpfs]
------------[ cut here ]------------
kernel BUG at fs/dcache.c:835!
invalid opcode: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:00/PNP0C09:00/PNP0C0A:00/power_supply/BAT0/status
CPU 0
Pid: 14563, comm: umount Not tainted 2.6.33.4-rt19 #36        /
RIP: 0010:[<ffffffff81107f12>]  [<ffffffff81107f12>] shrink_dcache_for_umount_subtree+0x119/0x253
RSP: 0018:ffff880080073da8  EFLAGS: 00010292
RAX: 000000000000005f RBX: ffff880053401928 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffff880080073c88
RBP: ffff880080073de8 R08: ffff88009823c000 R09: 0000000000000073
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880011f7ad18
R13: ffff8800534019c8 R14: ffff880011f7ad10 R15: ffff8800b4bb6e08
FS:  00007f62035d4740(0000) GS:ffff88000a200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f6202c9b488 CR3: 0000000076b36000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 14563, threadinfo ffff880080072000, task ffff88003d6da5c0)
Stack:
 ffff88004329b2f0 0000000000000000 ffff88003d6da5c0 ffff880037ddeb18
<0> ffff88004329b000 ffff880037ddeb20 ffffffff81af26d0 ffff880080073fd8
<0> ffff880080073e18 ffffffff81108092 ffff880037ddeb20 ffff88004329b000
Call Trace:
 [<ffffffff81108092>] shrink_dcache_for_umount+0x46/0x5b
 [<ffffffff810f824f>] generic_shutdown_super+0x1f/0xf9
 [<ffffffff810f837e>] kill_anon_super+0x16/0x54
 [<ffffffff810f83e3>] kill_litter_super+0x27/0x2b
 [<ffffffff810f8aab>] deactivate_super+0x6d/0x82
 [<ffffffff8110eadb>] mntput_no_expire+0x1a5/0x218
 [<ffffffff8110f0f5>] sys_umount+0x2d5/0x300
 [<ffffffff8108af27>] ? audit_syscall_entry+0x1ec/0x218
 [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b
Code: 0a 48 8b 4b 70 31 d2 48 85 f6 74 04 48 8b 56 40 48 05 f0 02 00 00 48 89 de 48 89 04 24 48 c7 c7 6f 9a 78 81 31 c0 e8 93 be 32 00 <0f> 0b eb fe 4c 8b 63 60 4c 39 e3 75 3c 48 8b 93 90 00 00 00 48
RIP  [<ffffffff81107f12>] shrink_dcache_for_umount_subtree+0x119/0x253
 RSP <ffff880080073da8>
---[ end trace 34e97e0ec2c5ae6a ]---

Comment 3 Thomas Gleixner 2010-05-19 15:43:08 UTC

It's in the same area, yes.

Can you please decode the source line for your crash?

addr2line -e vmlinux ffffffff81105bfc

Comment 4 Luis Claudio R. Goncalves 2010-05-19 16:17:07 UTC

# addr2line -e /usr/lib/debug/lib/modules/2.6.33.3-rt19.15.el5rt/vmlinux ffffffff81105bfc
/usr/src/debug/kernel-rt-2.6.33.3-rt19.15.el5rt/linux-2.6.33.3.x86_64/fs/dcache.c:1030
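
When more than one address needs decoding, the same lookup can be scripted. The sketch below is only an illustration, assuming `addr2line` from binutils and a `vmlinux` with debug info. `trace_addresses` and `addr2line_cmd` are hypothetical helper names, and the script only builds the command line rather than running it:

```python
import re

# Matches "[<addr>] symbol+0xoff/0xsize", with an optional "? " marking
# frames the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{16})>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def trace_addresses(oops_text):
    """Return (address, symbol, reliable) for each frame found in oops text."""
    return [
        (m.group(1), m.group(3), m.group(2) is None)
        for m in FRAME_RE.finditer(oops_text)
    ]

def addr2line_cmd(vmlinux, addresses):
    """Build an addr2line invocation; -f also prints the function name."""
    return ["addr2line", "-f", "-e", vmlinux, *addresses]

# Lines copied from the oops in comment 1.
sample = """\
RIP: 0010:[<ffffffff81105bfc>]  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 [<ffffffff81143da8>] proc_flush_task+0xac/0x1ce
 [<ffffffff8106e90b>] ? rt_mutex_adjust_prio+0x3a/0x43
"""
frames = trace_addresses(sample)
cmd = addr2line_cmd("vmlinux", [addr for addr, _sym, _ok in frames])
print(" ".join(cmd))
```

Call-trace entries are return addresses, so the line reported for them may be the one following the call rather than the call itself. The RIP address points at the faulting instruction directly.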

Comment 5 Luis Claudio R. Goncalves 2010-05-21 16:27:43 UTC

Here is the kernel panic again (NULL pointer dereference in shrink_dcache_parent), this time on 2.6.33.4-rt20.17.el5rt.

Thomas, would you mind having a look at this backtrace? As we discussed on IRC, it looks close to the tmpfs issue Clark saw a while ago.

This backtrace came from: http://rhts.redhat.com/testlogs/2010/05/158429/406370/3307697/console.txt

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
PGD 316166067 PUD 315bd5067 PMD 0 
Oops: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/devices/system/node/node1/cpumap
CPU 3 
Pid: 1078, comm: sh Not tainted 2.6.33.4-rt20.17.el5rt #1 49Y5114     /IBM System x -[7870AC1]-
RIP: 0010:[<ffffffff81105bfc>]  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
RSP: 0018:ffff880302b11d28  EFLAGS: 00010213
RAX: ffff8802dde9f3f8 RBX: ffffffffffffff70 RCX: ffffffff816c5850
RDX: ffffffff816c5850 RSI: 0000000000000003 RDI: ffff880302b10000
RBP: ffff880302b11d78 R08: ffff880302b11c18 R09: ffff880302b11d18
R10: ffff880302b11c18 R11: ffff8802dde9f358 R12: ffff8802dde9f358
R13: 0000000000000000 R14: ffff8802b0e16550 R15: 000000000004fd3a
FS:  00007fea667486e0(0000) GS:ffff880204a40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000003128cf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 1078, threadinfo ffff880302b10000, task ffff880319712580)
Stack:
 ffff8802dde9f358 ffff8801f9cbd800 0000000019712580 ffff880302b11da8
<0> 0000000039801168 ffff8802dde9f358 ffff8801f9cb8c80 0000000000000491
<0> ffff880302b11db8 ffff8802f86753f0 ffff880302b11e08 ffffffff81143dd8
Call Trace:
 [<ffffffff81143dd8>] proc_flush_task+0xac/0x1ce
 [<ffffffff81046b1c>] ? div_u64+0x16/0x18
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
Code: 8b 7d b0 48 83 c7 08 e8 cf 19 25 00 4c 8b 65 b0 c7 45 c4 00 00 00 00 4d 8b ac 24 a0 00 00 00 e9 cf 00 00 00 49 8d 9d 70 ff ff ff <4d> 8b 6d 00 4c 8d 73 08 4c 89 f7 e8 a0 19 25 00 48 89 df e8 53 
RIP  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 RSP <ffff880302b11d28>
CR2: 0000000000000000
---[ end trace 578af5cbf3c98777 ]---
Kernel panic - not syncing: Fatal exception
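
The two panics in this bug can be compared mechanically by symbol rather than by raw address, since the builds differ. A rough Python sketch, assuming that frames marked `?` are unreliable stack leftovers and can be ignored; `signature` is a hypothetical helper name:

```python
import re

FRAME_RE = re.compile(
    r"\[<[0-9a-f]{16}>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def signature(oops_text):
    """Return the reliable frame symbols in order, skipping '?' entries."""
    return [m.group(2) for m in FRAME_RE.finditer(oops_text) if m.group(1) is None]

# RIP and call trace from the 2.6.33.3-rt19.15.el5rt oops.
first = """\
RIP  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 [<ffffffff81143da8>] proc_flush_task+0xac/0x1ce
 [<ffffffff8106e90b>] ? rt_mutex_adjust_prio+0x3a/0x43
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
"""

# RIP and call trace from the 2.6.33.4-rt20.17.el5rt oops.
second = """\
RIP  [<ffffffff81105bfc>] shrink_dcache_parent+0xad/0x210
 [<ffffffff81143dd8>] proc_flush_task+0xac/0x1ce
 [<ffffffff81046b1c>] ? div_u64+0x16/0x18
 [<ffffffff81044972>] release_task+0x2d/0x3a3
 [<ffffffff810451ba>] wait_consider_task+0x4d2/0x7af
 [<ffffffff81045589>] do_wait+0xf2/0x228
 [<ffffffff81045766>] sys_wait4+0xa7/0xc4
 [<ffffffff8104442a>] ? child_wait_callback+0x0/0x60
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
"""

print(signature(first) == signature(second))
```

The only frames that differ between the two traces are `?` entries (`rt_mutex_adjust_prio` versus `div_u64`), so both panics reach `shrink_dcache_parent` through the same `sys_wait4` to `proc_flush_task` path.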

Comment 6 john stultz 2010-06-04 18:59:15 UTC

I believe the shrink_dcache_parent issue in this bug is a dup of bug #595825.

Comment 7 Clark Williams 2010-06-04 20:12:39 UTC

I agree; let's close this one as a dupe of bug 595825.

*** This bug has been marked as a duplicate of bug 595825 ***
