Bug 155285 - System hangs. Kernel Oops, Unable to handle kernel NULL pointer
Status: CLOSED CANTFIX
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: x86_64 Linux
Priority: medium Severity: high
Assigned To: Stephen Tweedie
QA Contact: Brian Brock
Reported: 2005-04-18 15:38 EDT by Ian Watson
Modified: 2007-11-30 17:11 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-10-02 21:20:54 EDT

Attachments
/var/log/messages (411.56 KB, text/plain)
2005-04-18 15:40 EDT, Ian Watson
system info: lspci -vvv, /proc/cpuinfo (11.36 KB, text/plain)
2005-04-18 15:44 EDT, Ian Watson
Full messages log Apr17 to Apr18 (500.22 KB, text/plain)
2005-04-19 15:54 EDT, Ian Watson

Description Ian Watson 2005-04-18 15:38:57 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.6) Gecko/20040115 Epiphany/1.0.7

Description of problem:
Hardware: AMD64 (x86_64), Gigabyte K8NS Pro, Nvidia 3 chipset.
FC2 x86_64 and MDK 10.0 i586 run on this machine without problems.

Installed FC3.
It hung on the first boot, then hung 15 minutes into the 2nd boot while up2date
was running.
Used the rescue disk to comment out the RAID disks and some other entries in fstab.
Another boot hung after the password was entered at the gdm login.
The relevant part of the messages log follows:

Apr 18 19:07:04 pool fstab-sync[3035]: added mount point /media/cdrecorder for /dev/hdc
Apr 18 19:07:21 pool kernel: Unable to handle kernel NULL pointer dereference at 0000000000000031 RIP: 
Apr 18 19:07:21 pool kernel: <ffffffffa002af81>{:ext3:init_once+119}
Apr 18 19:07:21 pool kernel: PML4 37a10067 PGD 0 
Apr 18 19:07:21 pool kernel: Oops: 0002 [1] 
Apr 18 19:07:21 pool kernel: CPU 0 
Apr 18 19:07:21 pool kernel: Modules linked in: md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core ds yenta_socket pcmcia_core sunrpc ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables usblp joydev dm_mod button battery ac ohci1394 ieee1394 ohci_hcd ehci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore sk98lin floppy ext3 jbd raid1
Apr 18 19:07:21 pool kernel: Pid: 3058, comm: gdm-binary Not tainted 2.6.9-1.667
Apr 18 19:07:21 pool kernel: RIP: 0010:[<ffffffffa002af81>] <ffffffffa002af81>{:ext3:init_once+119}
Apr 18 19:07:21 pool kernel: RSP: 0018:000001003725bc70  EFLAGS: 00010286
Apr 18 19:07:21 pool kernel: RAX: ffffffffa002af65 RBX: 0000000000000000 RCX: 0000010001000000
Apr 18 19:07:21 pool kernel: RDX: 0000000000000001 RSI: 0000010037ce7580 RDI: 0000000040000188
Apr 18 19:07:21 pool kernel: RBP: 0000010037ce7580 R08: 0000000000000000 R09: 0000010037d4d14c
Apr 18 19:07:21 pool kernel: R10: 0000010037d4d14c R11: 000000000000001c R12: 0000010039fd0140
Apr 18 19:07:21 pool kernel: R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000050
Apr 18 19:07:21 pool kernel: FS:  0000002a97d74660(0000) GS:ffffffff80503480(0000) knlGS:0000000000000000
Apr 18 19:07:21 pool kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 18 19:07:21 pool kernel: CR2: 0000000000000031 CR3: 0000000000101000 CR4: 00000000000006e0
Apr 18 19:07:21 pool gdm[2566]: gdm_cleanup_children: child 3058 crashed of signal 9
Apr 18 19:07:21 pool kernel: Process gdm-binary (pid: 3058, threadinfo 000001003725a000, task 00000100377f0a20)
Apr 18 19:07:21 pool gdm[2566]: gdm_cleanup_children: Slave crashed, killing its children
Apr 18 19:07:21 pool kernel: Stack: ffffffff8016bbd9 0000010001caf5b8 0000010037ce7580 0000010039fd0000 
Apr 18 19:07:21 pool kernel:        0000010039fd0140 0000000000000050 ffffffff8016c20f 0000000000001000 
Apr 18 19:07:21 pool kernel:        0000000000000050 0000010037ce7580 
Apr 18 19:07:21 pool kernel: Call Trace:<ffffffff8016bbd9>{cache_init_objs+54} <ffffffff8016c20f>{cache_alloc_refill+1152} 
Apr 18 19:07:21 pool kernel:        <ffffffff8016bd31>{kmem_cache_alloc+79} <ffffffffa002aeb8>{:ext3:ext3_alloc_inode+18} 
Apr 18 19:07:21 pool kernel:        <ffffffffa002aea6>{:ext3:ext3_alloc_inode+0} <ffffffff801ac4a1>{alloc_inode+21} 
Apr 18 19:07:21 pool kernel:        <ffffffff801ae5a3>{iget_locked+611} <ffffffffa00282b2>{:ext3:ext3_lookup+86} 
Apr 18 19:07:21 pool kernel:        <ffffffff8019c204>{do_lookup+229} <ffffffff8019d37a>{link_path_walk+3921} 
Apr 18 19:07:21 pool kernel:        <ffffffff8019d915>{path_lookup+366} <ffffffff8019e1ae>{open_namei+172} 
Apr 18 19:07:21 pool kernel:        <ffffffff8018958b>{filp_open+39} <ffffffff80214ba1>{strncpy_from_user+74} 
Apr 18 19:07:21 pool kernel:        <ffffffff80189661>{get_unused_fd+179} <ffffffff80189a12>{sys_open+58} 
Apr 18 19:07:21 pool kernel:        <ffffffff801107a2>{system_call+126} 
Apr 18 19:07:21 pool kernel: 
Apr 18 19:07:21 pool kernel: Code: 48 89 42 30 48 89 42 38 e8 1d 17 18 e0 48 83 c4 30 5b c3 51 
Apr 18 19:07:21 pool kernel: RIP <ffffffffa002af81>{:ext3:init_once+119} RSP <000001003725bc70>

Apr 18 19:07:21 pool kernel: CR2: 0000000000000031
Apr 18 19:09:20 pool syslogd 1.4.1: restart.
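
A minimal sketch, not part of the original report, of how the fault in the oops above can be read. It assumes the standard x86 page-fault error code layout for the value printed after "Oops:", and it assumes the "Code:" bytes start at the faulting RIP, which the log itself does not state. The helper names decode_fault and decode_mov_store are hypothetical, and the decoder handles only the one instruction form seen here.

```python
# Registers addressable by a 3-bit ModRM field without REX extension bits.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_fault(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def decode_mov_store(insn: bytes):
    """Decode only 'REX.W 89 /r' with an 8-bit displacement,
    i.e. mov %src, disp8(%base). Anything else is rejected."""
    rex, opcode, modrm, disp = insn[0], insn[1], insn[2], insn[3]
    if rex != 0x48 or opcode != 0x89 or (modrm >> 6) != 0b01:
        raise ValueError("not the simple form this sketch handles")
    if (modrm & 7) == 0b100:
        raise ValueError("SIB byte follows; not handled here")
    src = REGS[(modrm >> 3) & 7]
    base = REGS[modrm & 7]
    if disp >= 0x80:          # disp8 is a signed byte
        disp -= 0x100
    return src, base, disp


# Values copied from the oops above.
oops_error = int("0002", 16)           # "Oops: 0002"
cr2 = 0x0000000000000031               # faulting address
rdx = 0x0000000000000001               # RDX at the time of the fault
first_insn = bytes.fromhex("48894230")  # first four "Code:" bytes

fault = decode_fault(oops_error)
src, base, disp = decode_mov_store(first_insn)

print(fault)
print(f"mov %{src}, {disp:#x}(%{base})")
print("RDX + disp == CR2:", rdx + disp == cr2)
```

Under these assumptions the oops is a kernel-mode write to an unmapped page, and the bad address comes from RDX holding 1 where a pointer was expected. The sketch does not say how RDX came to hold that value.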


Version-Release number of selected component (if applicable):
kernel-2.6.9-1.667

How reproducible:
Sometimes

Steps to Reproduce:
1. Log in; the system hangs soon afterwards.


Additional info:

Message and system info in attachments.
Comment 1 Ian Watson 2005-04-18 15:40:41 EDT
Created attachment 113341
/var/log/messages
Comment 2 Ian Watson 2005-04-18 15:44:02 EDT
Created attachment 113342
system info: lspci -vvv, /proc/cpuinfo
Comment 3 Stephen Tweedie 2005-04-18 16:36:27 EDT
You said this was "sometimes" reproducible; do you have records of other oopses?
 Also, is this reproducible on a more recent FC3 update kernel?
Comment 4 Ian Watson 2005-04-19 15:54:04 EDT
Created attachment 113372
Full messages log Apr17 to Apr18

Full messages from installing FC3 to last use.
Comment 5 Ian Watson 2005-04-19 16:03:13 EDT
(In reply to comment #3)
> You said this was "sometimes" reproducible; do you have records of other oopses?
>  Also, is this reproducible on a more recent FC3 update kernel?
> 

  I have attached the full log 'messages_full', covering the period from the
upgrade from FC2 x86_64 to FC3 x86_64 on 17 Apr 2005 to the last use on 18 Apr.
A summary follows; hopefully it helps.
  It has hung on 1 of 3 attempts with the latest kernel, 2.6.11-1.14_FC3. No
trace. I am downloading all updates before continuing.

11:50 to 12:36
  Backed up system.

Installed FC3

13:56:50  1st boot. Note:
  pcmcia: cardmgr[3055]: open_sock(socket 0) failed: Bad file descriptor
  The machine has no PCMCIA hardware.
13:56:58  Hangs at/after haldaemon. No stack trace.

  Booted into Mandrake 10 (32 bit) to check things out. Everything seems fine.

14:20:30  2nd boot. Same bad PCMCIA file descriptor message; it appears every time.
14:20:54  No nvidia driver. Machine hangs. No trace.
  Booted into Mandrake to change the X driver from nvidia to nv. It would be
nice if the Anaconda upgrader had run the X configuration to remove this.

15:01:45  Restart.
15:58:24  Hangs while up2date running, trying to get latest kernel.
  kernel: 50-fstab-sync.h[4284]: segfault at 0000000000000000 rip 0000002a95b50c12 rsp 0000007fbffff668 error 4

16:01:34  Restart.
16:41:31  Login, shuts down OK.

16:56:03  Restart.
16:57:16  Hangs after login password accepted. No trace.

16:59:03  Restart.
17:23:57  Login, shuts down OK.

18 Apr 2005.

19:06:54  Restart.
19:07:21  Crash as reported. Occurs just after haldaemon, as gdm would be starting.

19:09:20  Restart.
19:11:28  Hung after login again. No trace.

20:47:46  Restart.
  Installed latest kernel 2.6.11-1.14_FC3.
20:53:16  Login, shuts down OK.

20:55:27  Restart.
21:01:01  Hang.

21:12:27  Restart.
  gftp will not run. Was trying to connect to ftp.mirror.ac.uk to download updates.
  kernel: gftp-gtk[4906]: segfault at 000000000000807f rip 0000000000439203 rsp 00000000409fd160 error 6
23:01:52  Shuts down OK.
Comment 6 Stephen Tweedie 2005-04-20 16:46:02 EDT
  kernel: 50-fstab-sync.h[4284]: segfault at 0000000000000000 rip 0000002a95b50c12 rsp 0000007fbffff668 error 4

  kernel: gftp-gtk[4906]: segfault at 000000000000807f rip 0000000000439203 rsp 00000000409fd160 error 6

Do you have any other parts of those traces?  The backtraces are really useful,
but the

kernel: RIP <ffffffffa002af81>{:ext3:init_once+119} RSP <000001003725bc70>

line is helpful even without full backtrace.

From the extent of the problems you're reporting, I wonder if it's not a
hardware problem; have you tried a memtest86 run?
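
For collecting the lines asked about here, a rough sketch in Python that scans syslog text for kernel oops "RIP <...>{...}" lines and user-space "segfault at" lines. The regular expressions are written against the line shapes quoted in this bug only and may not match other kernel versions. The names scan and describe are hypothetical, and the sketch assumes the number after "error" is a hexadecimal x86 page-fault error code.

```python
import re

# Oops summary line, e.g. "kernel: RIP <addr>{symbol+offset} RSP <addr>".
OOPS_RIP = re.compile(r"kernel: RIP <([0-9a-f]{16})>\{([^}]+)\}")

# User-space fault line, e.g. "kernel: prog[pid]: segfault at ... error N".
SEGFAULT = re.compile(
    r"kernel: (\S+)\[(\d+)\]: segfault at ([0-9a-f]+) "
    r"rip ([0-9a-f]+) rsp ([0-9a-f]+) error ([0-9a-f]+)"
)


def describe(code: int) -> str:
    """Summarise the low three bits of an x86 page-fault error code."""
    mode = "user-mode" if code & 4 else "kernel-mode"
    access = "write" if code & 2 else "read"
    cause = "protection violation" if code & 1 else "page not present"
    return f"{mode} {access}, {cause}"


def scan(lines):
    """Return one tuple per oops RIP line or segfault line found."""
    hits = []
    for line in lines:
        m = OOPS_RIP.search(line)
        if m:
            hits.append(("oops", m.group(2), int(m.group(1), 16)))
            continue
        m = SEGFAULT.search(line)
        if m:
            prog = m.group(1)
            addr = int(m.group(3), 16)
            err = int(m.group(6), 16)  # assumed hexadecimal
            hits.append(("segfault", prog, addr, describe(err)))
    return hits


# Lines quoted in this bug, with wrapped lines joined back together.
sample = [
    "Apr 18 19:07:21 pool kernel: RIP <ffffffffa002af81>"
    "{:ext3:init_once+119} RSP <000001003725bc70>",
    "kernel: 50-fstab-sync.h[4284]: segfault at 0000000000000000 "
    "rip 0000002a95b50c12 rsp 0000007fbffff668 error 4",
    "kernel: gftp-gtk[4906]: segfault at 000000000000807f "
    "rip 0000000000439203 rsp 00000000409fd160 error 6",
]

for hit in scan(sample):
    print(hit)
```

To run it over a real log, pass an open file instead of the sample list, for example scan(open("/var/log/messages")). The path is the one named in the attachments; reading it usually needs root.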
Comment 7 Dave Jones 2005-07-15 15:17:45 EDT
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.
Comment 8 Dave Jones 2005-10-02 21:20:54 EDT
This bug has been automatically closed as part of a mass update.
It had been in NEEDINFO state since July 2005.
If this bug still exists in current errata kernels, please reopen this bug.

There are a large number of inactive bugs in the database, and this is the only
way to purge them.

Thank you.
