Bug 684345

Summary: GPF on almost every boot
Product: [Fedora] Fedora
Reporter: Ronny Buchmann <ronny-rhbugzilla>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 14
CC: esandeen, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-10-11 19:58:42 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description             Flags
crash outputs           none
syslog 1st boot         none
crash output 2nd boot   none
crash output 3rd boot   none
crash data 2.6.35.11    none

Description Ronny Buchmann 2011-03-11 20:57:51 UTC
Created attachment 483820
crash outputs

Description of problem:
I get a general protection fault on almost every boot, after which the machine reboots.

I have set up kdump to capture the messages.

After kdump has written the dump, it reboots into the normal kernel, and strangely that boot usually works.

Version-Release number of selected component (if applicable):
2.6.35.6-48

How reproducible:
At least on every cold boot, I think.

From the crash dump it seems to happen during nfslock (rpc.statd) start.

If necessary I can provide the complete dump file.

For now I'm attaching the output of these crash commands:

sys
log
set
bt
ps
ps -t
foreach bt
foreach task
foreach files
mount
mount -f
net
foreach net -s
mod
sym -l
kmem -i
kmem -s
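
The command list above could be collected into an input file so the same set of outputs can be regenerated for each new dump. The following is only a minimal sketch: it assumes the installed crash utility accepts an input file via `-i`, and the file names and the `<vmlinux>` / `<vmcore>` placeholders in the final comment are hypothetical, not taken from this report.

```python
# Commands copied from the list in the report, in the same order.
CRASH_COMMANDS = [
    "sys", "log", "set", "bt", "ps", "ps -t",
    "foreach bt", "foreach task", "foreach files",
    "mount", "mount -f", "net", "foreach net -s",
    "mod", "sym -l", "kmem -i", "kmem -s",
]


def build_crash_input(commands=CRASH_COMMANDS):
    """Return input-file text: one crash command per line, then 'exit'."""
    return "\n".join(list(commands) + ["exit"]) + "\n"


if __name__ == "__main__":
    # Print the file contents; redirect to a file of your choice.
    print(build_crash_input(), end="")
    # Assumed usage (placeholders, adjust to the local setup):
    #   crash -i crash-commands.txt <vmlinux> <vmcore> > crash-output.txt
```

The trailing `exit` is added so that a non-interactive run ends after the last command instead of waiting at the crash prompt.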

Comment 1 Dave Jones 2011-03-11 22:53:03 UTC
the first oops is in the ext4 htree code.  Adding Eric to cc, perhaps he has ideas.

Comment 2 Ronny Buchmann 2011-03-12 09:28:52 UTC
Today's bootup went like this:

1st attempt (cold boot): oops but no automatic reboot; did a manual reboot with sysrq-B (will attach syslog)
2nd attempt: oops with automatic reboot and kdump (will attach crash data)
3rd attempt: oops with no automatic reboot; triggered kdump with sysrq-C (will attach crash data)
4th attempt: successful startup without errors

I'm confused.

Comment 3 Ronny Buchmann 2011-03-12 09:32:22 UTC
Created attachment 483875
syslog 1st boot

Comment 4 Ronny Buchmann 2011-03-12 09:35:13 UTC
Created attachment 483876
crash output 2nd boot

Comment 5 Ronny Buchmann 2011-03-12 09:37:39 UTC
Created attachment 483877
crash output 3rd boot

Comment 6 Eric Sandeen 2011-03-13 17:59:48 UTC
(In reply to comment #1)
> the first oops is in the ext4 htree code.  Adding Eric to cc, perhaps he has
> ideas.

Down an htree path, but oopsed in __kmalloc?  Hrm.  That's a good trick... In fact all of the oopses seem to be in memory allocation (kmem_cache_alloc, __kmalloc, etc) - and not just down the ext4 paths....

Comment 7 Ronny Buchmann 2011-03-21 18:17:27 UTC
Created attachment 486663
crash data 2.6.35.11

After updating to 2.6.35.11-83 yesterday, it has crashed only once so far.

Comment 8 Ronny Buchmann 2011-03-21 22:09:57 UTC
It might be hardware related: I found another report, for Ubuntu 9.10 (2.6.31).
That user has the same wifi adapter, a ZyXEL G-202.

http://forum.ubuntuusers.de/topic/ubuntu-startet-nicht-immer-1/#post-2325202
(The post is in German; basically he has the same problem as me: sometimes the machine doesn't start up and shows a GPF in kmem_cache_alloc.)

My lsusb output:
Bus 002 Device 003: ID 093a:2468 Pixart Imaging, Inc. SoC PC-Camera
Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 008: ID 0586:3410 ZyXEL Communications Corp. ZyAIR G-202 802.11bg
Bus 001 Device 007: ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode)
Bus 001 Device 006: ID 046d:c504 Logitech, Inc. Cordless Mouse+Keyboard Receiver
Bus 001 Device 003: ID 05e3:0608 Genesys Logic, Inc. USB-2.0 4-Port HUB
Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
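
To confirm whether the suspect adapter is attached before a test boot, the lsusb listing can be checked by vendor:product ID. This is a rough sketch, not part of the original report: the function names are made up for illustration, and the parser only assumes the one-line-per-device format shown in the listing above.

```python
import re

# Matches lines such as:
#   Bus 001 Device 008: ID 0586:3410 ZyXEL Communications Corp. ZyAIR G-202 802.11bg
LSUSB_LINE = re.compile(
    r"^Bus (?P<bus>\d{3}) Device (?P<device>\d{3}): "
    r"ID (?P<vendor>[0-9a-f]{4}):(?P<product>[0-9a-f]{4})\s*(?P<name>.*)$"
)

# The listing from this comment, used as sample input.
SAMPLE = """\
Bus 002 Device 003: ID 093a:2468 Pixart Imaging, Inc. SoC PC-Camera
Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 008: ID 0586:3410 ZyXEL Communications Corp. ZyAIR G-202 802.11bg
Bus 001 Device 007: ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode)
Bus 001 Device 006: ID 046d:c504 Logitech, Inc. Cordless Mouse+Keyboard Receiver
Bus 001 Device 003: ID 05e3:0608 Genesys Logic, Inc. USB-2.0 4-Port HUB
Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
"""


def parse_lsusb(text):
    """Parse lsusb output into a list of dicts, skipping unrecognised lines."""
    devices = []
    for line in text.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match:
            devices.append(match.groupdict())
    return devices


def find_device(devices, vendor, product):
    """Return all parsed devices with the given vendor and product ID."""
    return [d for d in devices
            if d["vendor"] == vendor and d["product"] == product]


if __name__ == "__main__":
    hits = find_device(parse_lsusb(SAMPLE), "0586", "3410")
    for dev in hits:
        print(f"bus {dev['bus']} device {dev['device']}: {dev['name']}")
    if not hits:
        print("adapter 0586:3410 not attached")
```

Running the same check against live `lsusb` output after unplugging the adapter would show whether a given boot was actually made without it.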

Comment 9 Eric Sandeen 2011-03-21 22:16:27 UTC
Could be memory corruption; try disabling/removing the adapter, or boot a debug kernel and see if it's caught any earlier, or with more info?

Comment 10 Josh Boyer 2011-08-24 14:50:45 UTC
Is this still happening and have you tried the steps Eric suggested in comment #9?