Bug 144905 - 2.6.10-1.737_FC3 kernel panic
Summary: 2.6.10-1.737_FC3 kernel panic
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: All
OS: Linux
medium
high
Target Milestone: ---
Assignee: Dave Jones
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-01-12 17:04 UTC by Dax Kelson
Modified: 2015-01-04 22:15 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-07-30 06:25:24 UTC
Type: ---
Embargoed:


Attachments
Picture of the kernel crash signature (121.33 KB, image/jpeg)
2005-01-12 17:06 UTC, Dax Kelson
Trace from oops on Thinkpad T40 (23.83 KB, text/plain)
2005-01-12 19:15 UTC, Rahul Karnik

Description Dax Kelson 2005-01-12 17:04:18 UTC
Description of problem:
I've run many FC2/FC3 kernels without trouble on my IBM ThinkPad T42p.
I think this is related to the new 2.6.10-1.737_FC3 kernel. My kernel
is not tainted.

I have now gotten three kernel panics in a row on booting. I snapped a
picture of the last one with my cell phone and transcribed it to text.

On the fourth boot it came up OK.

Version-Release number of selected component (if applicable):
2.6.10-1.737_FC3

Note that I transcribed the following by hand, and it's possible I've
swapped the digits eight and zero in a few places.


ds: 007b  es: 007b ss: 0068
Process udevd (pid: 1501, threadinfo=f744e000 task=f7ae0bb0)
Stack: f7ffb110 f7fffac0 f77e3000 f7ffb110 f7ffe2c0 c0145b0b 0000000c f7ffb100
       f7ffb100 f77e3000 f7ffb110 00000286 c0145d43 f7b78750 f7b78750 00000000
       f70ad22c c0118679 f7b78750 c011c696 f7b707fc f7b78750 00000000 0000ea00
Call Trace:
 [<c0145b0b>] cache_flusharray+0xc8/0x144
 [<c0145d43>] kfree+0x3b/0x49
 [<c0118679>] free_task+0xb/0x18
 [<c011c696>] release_task+0x1b1/0x1c0
 [<c011ec4f>] wait_task_zombie+0x650/0x66b
 [<c011fed4>] do_wait+0x173/0x403
 [<c0117643>] default_wake_function+0x0/0xc
 [<c0117643>] default_wake_function+0x0/0xc
 [<c015dc77>] vfs_read+0xb6/0xe2
 [<c011f6f7>] sys_wait4+0x27/0x2a
 [<c011f70d>] sys_waitpid+0x13/0x17
 [<c0103337>] syscall_call+0x7/0xb
Code: 40 24 39 fd 0f 0d 99 00 00 00 0b 04 24 0b 15 10 0e 3f c0 8b 0c
a8 8d 81 00 00 00 40 c1 e0 0c c1 e0 05 8b 5c 10 1c 0b 53 04 8b 03 <89>
50 04 89 02 31 d2 2b <0>Kernel panic - not syncing: mm/slab.c:2202:
spin_lock(mm/slab.c:f7fffb04) already locked by mm/slab.c/2202
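
Because the trace above was transcribed by hand, a quick sanity check on the call-trace frames can help catch slips. The following is a minimal sketch in Python, not a tool anyone in this report used. The names `FRAME_RE`, `parse_frames` and `suspicious_frames` are made up for illustration. It assumes the 32-bit `[<address>] symbol+0xoffset/0xsize` frame format shown above, and flags any frame whose offset is larger than the reported function size, which would suggest a mistyped digit.

```python
import re

# Matches frames such as " [<c0145d43>] kfree+0x3b/0x49", optionally followed
# by a module tag like " [ext3]". Assumes 8-digit (32-bit) addresses.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\][ \t]+"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return one dict per call-trace frame found in the given text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("module"),  # None when no module tag is present
        })
    return frames


def suspicious_frames(frames):
    """Frames whose offset lies beyond the function size (likely a typo)."""
    return [f for f in frames if f["offset"] > f["size"]]
```

Passing the trace text to `parse_frames` and then to `suspicious_frames` returns only the frames worth re-checking against the photograph. It cannot detect a wrong digit that still yields a plausible offset.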

Comment 1 Dax Kelson 2005-01-12 17:06:50 UTC
Created attachment 109677
Picture of the kernel crash signature

I've attached a picture of the crash signature in case I screwed up the
transcribing of it.

Comment 2 Dave Jones 2005-01-12 18:30:10 UTC
If this is reproducible, could you try booting with vga=1 to get more
lines of text on the screen? Then we'll be able to see the top of the
dump too.

Comment 3 Dax Kelson 2005-01-12 18:48:11 UTC
I added that to my kernel cmd line. If it crashes again, I'll add
another comment.

Comment 4 Rahul Karnik 2005-01-12 18:49:03 UTC
Just wanted to add I can reproduce this issue on my Thinkpad T40,
although in my case I saw different symptoms on every boot. Will try
to get a trace.

Comment 5 Rahul Karnik 2005-01-12 19:15:20 UTC
Created attachment 109683
Trace from oops on Thinkpad T40

Comment 6 Rahul Karnik 2005-01-12 19:16:02 UTC
Looks like a different bug; different trace, oops and not panic.

Comment 7 Thomas Chung 2005-01-12 19:43:08 UTC
http://fedoranews.org/blog/index.php?p=263

Comment 8 Thomas Chung 2005-01-12 21:55:00 UTC
I just performed a test on Dell Precision Workstation 370n with SATA
drive.
http://www1.us.dell.com/content/products/productdetails.aspx/precn_370n

Here are the steps I took:
1) Wipe entire HD
2) Install a fresh FC3
3) Install over 100 updates including kernel-2.6.10-1.737_FC3 via yum

Since this system came with Hyper-Threading enabled, I made sure that it
could boot both 2.6.10-1.737_FC3smp and 2.6.10-1.737_FC3.

Apparently, this system did not have any problem with the latest kernel.

Please let me know if you'd like to know any other information from
this test system.

Thomas

Comment 9 Clive Long 2005-06-30 12:41:37 UTC
Hello,

  I get a kernel panic (??) with FC3 and 2.6.11. All I can say is that the machine
is a generic, no-name device with what is recognized as an i686 processor. The problem
occurred when the machine was running over-night with no users logged on. In the
morning the screen was blank (probably due to the screen saver) and there was no
response to keyboard or mouse. I had to power off and power on to recover.

I have had the machine freeze before but this is the first time I have known
where to find diagnostic / error messages. I will run the machine over-night
with a 2.6.10 kernel and see what happens. 

I previously had hangs on this machine within a few minutes of starting X, when
I was running 2.6.11 but not with 2.6.10. I think that has been fixed by
installing the ATI supplied X.org drivers - but I'm not sure as I only follow
the install instructions. My experience of installations is really at the level
of rpm, not make.

I don't know if this is the same problem as this bug. I reproduce below what
seem to be the relevant messages from /var/log/messages
****************************************

Jun 30 02:51:34 localhost nmbd[3873]: [2005/06/30 02:51:34, 0]
nmbd/nmbd_browsesync.c:find_domain_master_name_query_fail(353) 
Jun 30 02:51:34 localhost nmbd[3873]:   find_domain_master_name_query_fail: 
Jun 30 02:51:34 localhost nmbd[3873]:   Unable to find the Domain Master Browser
name CRLGROUP<1b> for the workgroup CRLGROUP. 
Jun 30 02:51:34 localhost nmbd[3873]:   Unable to sync browse lists in this
workgroup. 
Jun 30 03:00:49 localhost smartd[3592]: Device: /dev/hda, 1 Currently unreadable
(pending) sectors 
Jun 30 03:01:01 localhost crond(pam_unix)[5120]: session opened for user root by
(uid=0)
Jun 30 03:01:01 localhost crond(pam_unix)[5120]: session closed for user root
Jun 30 03:06:45 localhost nmbd[3873]: [2005/06/30 03:06:45, 0]
nmbd/nmbd_browsesync.c:find_domain_master_name_query_fail(353) 
Jun 30 03:06:45 localhost nmbd[3873]:   find_domain_master_name_query_fail: 
Jun 30 03:06:45 localhost nmbd[3873]:   Unable to find the Domain Master Browser
name CRLGROUP<1b> for the workgroup CRLGROUP. 
Jun 30 03:06:45 localhost nmbd[3873]:   Unable to sync browse lists in this
workgroup. 
Jun 30 03:21:51 localhost nmbd[3873]: [2005/06/30 03:21:51, 0]
nmbd/nmbd_browsesync.c:find_domain_master_name_query_fail(353) 
Jun 30 03:21:51 localhost nmbd[3873]:   find_domain_master_name_query_fail: 
Jun 30 03:21:51 localhost nmbd[3873]:   Unable to find the Domain Master Browser
name CRLGROUP<1b> for the workgroup CRLGROUP. 
Jun 30 03:21:51 localhost nmbd[3873]:   Unable to sync browse lists in this
workgroup. 
Jun 30 03:30:48 localhost smartd[3592]: Device: /dev/hda, 1 Currently unreadable
(pending) sectors 
Jun 30 03:36:49 localhost nmbd[3873]: [2005/06/30 03:36:49, 0]
nmbd/nmbd_browsesync.c:find_domain_master_name_query_fail(353) 
Jun 30 03:36:49 localhost nmbd[3873]:   find_domain_master_name_query_fail: 
Jun 30 03:36:49 localhost nmbd[3873]:   Unable to find the Domain Master Browser
name CRLGROUP<1b> for the workgroup CRLGROUP. 
Jun 30 03:36:49 localhost nmbd[3873]:   Unable to sync browse lists in this
workgroup. 
Jun 30 03:52:01 localhost nmbd[3873]: [2005/06/30 03:52:01, 0]
nmbd/nmbd_browsesync.c:find_domain_master_name_query_fail(353) 
Jun 30 03:52:01 localhost nmbd[3873]:   find_domain_master_name_query_fail: 
Jun 30 03:52:01 localhost nmbd[3873]:   Unable to find the Domain Master Browser
name CRLGROUP<1b> for the workgroup CRLGROUP. 
Jun 30 03:52:01 localhost nmbd[3873]:   Unable to sync browse lists in this
workgroup. 
Jun 30 04:00:48 localhost smartd[3592]: Device: /dev/hda, 1 Currently unreadable
(pending) sectors 
Jun 30 04:01:01 localhost crond(pam_unix)[5127]: session opened for user root by
(uid=0)
Jun 30 04:01:01 localhost crond(pam_unix)[5127]: session closed for user root
Jun 30 04:02:01 localhost crond(pam_unix)[5129]: session opened for user root by
(uid=0)
Jun 30 04:06:18 localhost kernel: Unable to handle kernel NULL pointer
dereference at virtual address 00000470
Jun 30 04:06:18 localhost kernel:  printing eip:
Jun 30 04:06:18 localhost kernel: c0159948
Jun 30 04:06:18 localhost kernel: *pde = 00000000
Jun 30 04:06:18 localhost kernel: Oops: 0000 [#1]
Jun 30 04:06:18 localhost kernel: Modules linked in: md5 ipv6 autofs4 sunrpc
microcode vfat fat dm_mod video button battery ac ohci1394 ieee1394 yenta_socket
rsrc_nonstatic pcmcia_core ohci_hcd ehci_hcd i2c_sis96x snd_intel8x0
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore
snd_page_alloc 8139too mii floppy usbvision(U) i2c_algo_usb(U) i2c_core videodev
ext3 jbd
Jun 30 04:06:18 localhost kernel: CPU:    0
Jun 30 04:06:18 localhost kernel: EIP:    0060:[<c0159948>]    Not tainted VLI
Jun 30 04:06:18 localhost kernel: EFLAGS: 00010046   (2.6.11-1.27_FC3) 
Jun 30 04:06:18 localhost kernel: EIP is at wakeup_kswapd+0x46/0x88
Jun 30 04:06:18 localhost kernel: eax: 00000000   ebx: c07b3840   ecx: 00000000
  edx: 00000000
Jun 30 04:06:18 localhost kernel: esi: 00000000   edi: 00000000   ebp: 00000000
  esp: cf800b30
Jun 30 04:06:18 localhost kernel: ds: 007b   es: 007b   ss: 0068
Jun 30 04:06:18 localhost kernel: Process prelink (pid: 5614,
threadinfo=cf800000 task=c7f042b0)
Jun 30 04:06:18 localhost kernel: Stack: 00000000 00000000 00000000 c07b3840
00000002 00000220 c014fd3f 00000001 
Jun 30 04:06:19 localhost kernel:        00000000 00000000 c169cdf0 00000001
00000001 c7f042b0 00000000 c03b3c3c 
Jun 30 04:06:19 localhost kernel:        c10013a0 00000220 00000220 dfd4ae80
00000000 c015366b dfd4ae80 00000220 
Jun 30 04:06:19 localhost kernel: Call Trace:
Jun 30 04:06:19 localhost kernel:  [<c014fd3f>] __alloc_pages+0x1cc/0x3b8
Jun 30 04:06:19 localhost kernel:  [<c015366b>] kmem_getpages+0x2a/0x77
Jun 30 04:06:19 localhost kernel:  [<c0154595>] cache_grow+0xd6/0x2fe
Jun 30 04:06:19 localhost kernel:  [<c0154a1e>] cache_alloc_refill+0x261/0x319
Jun 30 04:06:19 localhost kernel:  [<c0177cb1>] __bread+0x10/0x2f
Jun 30 04:06:19 localhost kernel:  [<e088d2bb>] ext3_get_branch+0x69/0xdf [ext3]
Jun 30 04:06:19 localhost kernel:  [<c0154dc1>] kmem_cache_alloc+0x47/0x49
Jun 30 04:06:19 localhost kernel:  [<c0201070>] radix_tree_node_alloc+0x10/0x49
Jun 30 04:06:19 localhost kernel:  [<c020124d>] radix_tree_insert+0x74/0x10b
Jun 30 04:06:19 localhost kernel:  [<c014a290>] add_to_page_cache+0x63/0x18e
Jun 30 04:06:19 localhost kernel:  [<c01a47e2>] mpage_readpages+0x73/0x10d
Jun 30 04:06:19 localhost kernel:  [<c014f1c6>] prep_new_page+0x5c/0x5f
Jun 30 04:06:19 localhost kernel:  [<c014f946>] buffered_rmqueue+0x154/0x2e2
Jun 30 04:06:19 localhost kernel:  [<e088e928>] ext3_readpages+0x0/0x15 [ext3]
Jun 30 04:06:19 localhost kernel:  [<c0152af4>] read_pages+0xf5/0x105
Jun 30 10:18:34 localhost syslogd 1.4.1: restart.
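
The oops in an excerpt like the one above is easier to read once the unrelated nmbd, smartd and crond entries are filtered out. The following is a minimal Python sketch, under the assumption that each log entry sits on a single line in the usual `Mon DD HH:MM:SS host tag: message` syslog layout. The names `SYSLOG_RE` and `kernel_messages` are hypothetical. Continuation lines produced by the wrapping in this paste do not match the pattern and are skipped.

```python
import re

# Typical syslog line: "Jun 30 04:06:18 localhost kernel: Oops: 0000 [#1]"
SYSLOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<tag>[^:]+):\s?(?P<msg>.*)$"
)


def kernel_messages(lines):
    """Return (timestamp, message) pairs for entries logged by the kernel."""
    out = []
    for line in lines:
        m = SYSLOG_RE.match(line.rstrip("\n"))
        # Entries from daemons carry tags like "nmbd[3873]" and are dropped.
        if m and m.group("tag") == "kernel":
            out.append((m.group("stamp"), m.group("msg")))
    return out
```

Reading the file with `open("/var/log/messages")` and passing the file object to `kernel_messages` should leave only the oops report and other kernel output.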


Comment 10 Clive Long 2005-07-01 07:13:27 UTC
Hello again

I ran FC2 / 2.6.10 last night on the same machine. No X was running - not even
logged on.

No error message on the console this morning.

I was able to successfully start X.

Not much evidence I know, but looks like the error I have reported is for the
2.6.11  kernel, not 2.6.10 as in this thread.

Clive

Comment 11 Dave Jones 2005-07-15 19:59:30 UTC
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.

Comment 12 Clay Campbell 2005-07-20 12:32:27 UTC
Same here

Dell Poweredge 1550 SCSI (pizza box)

Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36

kernel panic at boot time

reverting to last kernel is current fix

I don't see the update cited in comment #11 in yum yet




Comment 13 Clay Campbell 2005-07-20 13:20:27 UTC
I didn't realize this bug was about an earlier kernel.  No coffee yet.  

My kernel panic was with kernel-smp.i686 2.6.12-1.1372_FC3 the first attempt
after installation

switching back to 2.6.11-1.35_FC3smp for now

2.6.12-1.1372_FC3 worked fine on a poweredge 1400SC ( not smp )

Clay

Comment 14 Dave Jones 2005-07-30 06:25:24 UTC
This bug has become a dumping ground for various unrelated oopses (the vague
summary line probably didn't help). Dax's original oops doesn't seem to have
recurred, so I'm going to close this.

If any of you are still seeing oopses with the 1372 kernel (after installing
yesterday's mkinitrd update, and reinstalling the kernel), please file a new bug.

Thanks.


