From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050415 Firefox/1.0+

Description of problem:
Kernel is crashing when the machine is started, even if I haven't booted into X. This is 100% reproducible, and has been a problem for a few weeks now (haven't had time to report it so far). Backtrace is attached.

Version-Release number of selected component (if applicable):
2.6.11-1.1261_FC4smp and earlier

How reproducible:
Always

Steps to Reproduce:
1. Boot up FC -devel
2. Shutdown

Actual Results:
Machine started to shut down, then part way through, crashed out

Expected Results:
Shut down gracefully

Additional info:
File system is ext3. The box has an Intel 925 board with P4-2.8, and is NOT overclocked. Also has a dual serial port Netmos PCI card for a total of 3 ttySx ports.
Created attachment 113591: output of console when crashing (repeats over and over)
hrmm, looks like a use-after-free bug.
Hold this one for now. I have found a patch to serport.c from Dmitry Torokhov, included in -rc2-mm3, which seems to help the situation:

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/broken-out/serport-oops-fix.patch

I'm still getting an oops, but it looks different now, so I'm confirming with Dmitry whether it is still a serial problem or something else.
Ok I can confirm that the patch mentioned above definitely does fix the problem. However, I am now oopsing out in another place further down. Here's the trace:

Shutting down mDNSResponder services: [ OK ]
Shutting down nifd services: [ OK ]
Shutting down ntpd: [ OK ]
Unmounting NFS filesystems: RPC: error 5 connecting to server localhost
RPC: failed to contact portmap (errno -5).
Unable to handle kernel paging request at virtual address[ OK ] f325ff2c
 printing eip:
c0133819
*pde = 5a5a5a5a
Oops: 0000 [#1]
SMP DEBUG_PAGEALLOC
Modules linked in: nfsd exportfs md5 ipv6 lp autofs4 eeprom lm85 i2c_sensor hci_usb rfcomm l2cap bluetooth nfs lockd sunrpc snd_usb_audio usb_storage dm_mod snd_usb_lib pwc videodev video button battery ac ohci1394 ieee1394 uhci_hcd ehci_hcd parport_serial parport_pc parport hw_random i2c_i801 i2c_core emu10k1_gp gameport snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore e100 mii floppy ext3 jbd ata_piix libata sd_mod scsi_mod
CPU: 1
EIP: 0060:[<c0133819>] Not tainted VLI
EFLAGS: 00010083 (2.6.12-rc2-mm3)
EIP is at worker_thread+0x149/0x230
eax: 00000001 ebx: 00000216 ecx: f7ea4098 edx: f325ff20
esi: f325ff24 edi: f7ea4080 ebp: 00000000 esp: f7f99f7c
ds: 007b es: 007b ss: 0068
Process events/1 (pid: 9, threadinfo=f7f99000 task=f7f98ad0)
Stack: f7ea40a8 f7ea4090 f7ea4098 f7f99000 f325ff20 c01ba160 00000001 00000000
       f7f99000 00010000 00000000 00000000 f7f98ad0 c011eca0 00100100 00200200
       ffffffff ffffffff fffffffc f7e46f58 f7ea4080 c01336d0 c01376e4 ffffffff
Call Trace:
 [<c01ba160>] key_cleanup+0x0/0xe0
 [<c011eca0>] default_wake_function+0x0/0x10
 [<c01336d0>] worker_thread+0x0/0x230
 [<c01376e4>] kthread+0x94/0xa0
 [<c0137650>] kthread+0x0/0xa0
 [<c01023f5>] kernel_thread_helper+0x5/0x10
Code: 00 00 89 f8 e8 99 ee 1e 00 89 c3 8b 47 40 40 89 47 40 83 f8 03 0f 8f bd 00 00 00 8b 77 10 3b 74 24 04 74 71 8d 56 fc 89 54 24 10 <8b> 42 0c 89 44 24 14 8b 6a 10 8b 46 04 8b 16 89 10 89 36 89 42

Dmitry suggested:

--------------------------
It looks like it crashes in key management code. Unfortunately I have never looked there so last time you posted the new trace I kept my mouth shut. Could you try disabling serport module and shutting down. If you still see the same oops (and I suspect you will) I'd suggest contacting David Howells:

>> /* key.c: basic authentication token and access key management
>>  *
>>  * Copyright (C) 2004 Red Hat, Inc. All Rights Reserved.
>>  * Written by David Howells (dhowells)
>>  *
--------------------------------------------------

I'm about to build a kernel without the key stuff in and with Dmitry's patch to see if that confirms the theory. FWIW, I am not using SELINUX.
Cool! I see the serio patch was included in the latest kernel-smp-2.6.11-1.1267_FC4, and as a result now I've been able to shut down cleanly a couple of times in a row. Thanks Dave :) If the key_cleanup crash reappears I'll open a new bugzilla report for it.