Bug 132229
Summary: | poll_wait oops. | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Marco Varanda <fedora> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED WONTFIX | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 2 | CC: | bonomo, pfrields, wtogami |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2004-11-27 21:52:19 UTC | Type: | --- |
Description
Marco Varanda
2004-09-10 00:56:46 UTC
This same thing is happening to my desktop system as well. Here is what is logged:

Sep 13 07:22:00 bren kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000004
Sep 13 07:22:00 bren kernel: printing eip:
Sep 13 07:22:00 bren kernel: 42adba19
Sep 13 07:22:00 bren kernel: *pde = 00000000
Sep 13 07:22:00 bren kernel: Oops: 0002 [#1]
Sep 13 07:22:00 bren kernel: Modules linked in: loop snd_seq_midi snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_oss snd_seq_midi_event snd_seq nls_utf8 vmnet(U) vmmon(U) sg sr_mod snd_pcm_oss sd_mod scsi_mod floppy snd_intel8x0 gameport snd_mpu401_uart snd_mixer_oss snd_emu10k1 snd_rawmidi snd_pcm snd_timer snd_seq_device snd_ac97_codec snd_page_alloc snd_util_mem snd_hwdep snd soundcore mga parport_pc lp parport autofs4 sunrpc sk98lin 3c59x microcode udf dm_mod joydev uhci_hcd ehci_hcd button battery asus_acpi ac md5 ipv6 ext3 jbd
Sep 13 07:22:00 bren kernel: CPU: 0
Sep 13 07:22:00 bren kernel: EIP: 0060:[<42adba19>] Tainted: P
Sep 13 07:22:00 bren kernel: EFLAGS: 00210293 (2.6.8-1.521)
Sep 13 07:22:00 bren kernel: EIP is at mga_freelist_put+0x66/0x73 [mga]
Sep 13 07:22:00 bren kernel: eax: 352d5c40 ebx: 352d4f40 ecx: 352d5700 edx: 00000000
Sep 13 07:22:00 bren kernel: esi: 039f5980 edi: 42863938 ebp: 42863898 esp: 388abf1c
Sep 13 07:22:00 bren kernel: ds: 007b es: 007b ss: 0068
Sep 13 07:22:00 bren kernel: Process hypertorus (pid: 30031, threadinfo=388ab000 task=39836740)
Sep 13 07:22:00 bren kernel: Stack: 039f5980 352fdc80 42add231 00000001 0000ffc0 e05c8000 352d4f40 42aeaf20
Sep 13 07:22:00 bren kernel: 039f5980 352d4f40 352fdc80 42aeaf20 42adda11 0000004c 0000ffc0 00000001
Sep 13 07:22:00 bren kernel: 08253000 42aeaf20 1d8cf140 42ae3cdc 400c6445 42ad7b59 feffeeb0 42add8ef
Sep 13 07:22:00 bren kernel: Call Trace:
Sep 13 07:22:00 bren kernel: [<42add231>] mga_dma_dispatch_vertex+0x14f/0x182 [mga]
Sep 13 07:22:00 bren kernel: [<42adda11>] mga_dma_vertex+0x122/0x12e [mga]
Sep 13 07:22:00 bren kernel: [<42ad7b59>] mga_ioctl+0xe3/0xef [mga]
Sep 13 07:22:00 bren kernel: [<42add8ef>] mga_dma_vertex+0x0/0x12e [mga]
Sep 13 07:22:00 bren kernel: [<021752d2>] sys_ioctl+0x29a/0x33c
Sep 13 07:22:00 bren kernel: [<02107e67>] do_IRQ+0x2f7/0x303
Sep 13 07:22:00 bren kernel: Code: 89 42 04 89 10 89 48 04 5b 31 c0 5e c3 55 89 c5 57 56 53 83

The system is unusable and has to be shut down hard. This has happened twice since the last kernel update. Kernel is 2.6.8-1.521. Hardware: Intel(R) Pentium(R) 4 CPU 2.40GHz, 1 GB RAM, IDE drives.

We are having the same problem here. It has gotten worse since updating to 2.6.8-1.521. Daily reboots, usually hard ones, are required (on a production system!). System hangs/lockups are not predictable, but are likely exacerbated by running xfs_dump. (I am going to try to downgrade the kernel to see if that helps.) This may or may not be related, but we are also having trouble dumping to our SCSI tape drive (logs show many SCSI errors). I will not include a log file unless requested, as it would seem redundant with what is already submitted. THIS IS A SERIOUS BUG!!! The owner should change this to HIGH priority!

Marco: Is this still happening for you with current rawhide kernels?
Brendan: Your oops is completely different and unrelated to this bug. It's also tainted with proprietary modules. Please try to repeat with a current kernel without those loaded, and file a separate bug report if it persists.
Richard: Can you also test with a current kernel, and paste the oops you are seeing if it's still reproducible? Thanks.

Thanks for your answer, Mr. Dave Jones. Sorry, I don't understand "rawhide kernels". My kernel is a "vanilla" version 2.6.8-1.521 (yum update) with the latest updates. My server still freezes and is unstable (two or three reboots per week). I am thinking of downgrading to FC1 or changing to Debian Sarge :(

Here is a log file snippet from last August. I THINK I was running the most current kernel then.
Since then, I downgraded a tad and applied a custom patch from the web which dealt with XFS software deficiencies.

Using Linux kernel 2.6.5-1.358:

Aug 17 17:20:17 bob kernel: printing eip:
Aug 17 17:20:17 bob kernel: 4298030e
Aug 17 17:20:17 bob kernel: *pde = 00000000
Aug 17 17:20:17 bob kernel: Oops: 0000 [#1]
Aug 17 17:20:17 bob kernel: CPU: 0
Aug 17 17:20:17 bob kernel: EIP: 0060:[<4298030e>] Not tainted
Aug 17 17:20:17 bob kernel: EFLAGS: 00010206 (2.6.5-1.358)
Aug 17 17:20:17 bob kernel: EIP is at xfs_iget_core+0x4e/0x42b [xfs]
Aug 17 17:20:17 bob kernel: eax: 00004000 ebx: 4040a038 ecx: 00004000 edx: 00000000
Aug 17 17:20:17 bob kernel: esi: 000c0c4f edi: 00000000 ebp: 40f31000 esp: 35ab6d2c
Aug 17 17:20:17 bob kernel: ds: 007b es: 007b ss: 0068
Aug 17 17:20:17 bob kernel: Process xfsdump (pid: 15597, threadinfo=35ab6000 task=412aaeb0)
Aug 17 17:20:17 bob kernel: Stack: 000008d0 39e4fb00 40770200 00000000 00000000 00000000 08e9bc80 00004000
Aug 17 17:20:17 bob kernel: 08e9bc9c 08e9bc80 00000008 40f31000 42980774 000c0c4f 00000000 00000008
Aug 17 17:20:17 bob kernel: 35ab6dd4 00000000 00000000 00000000 00000000 000c0c4f 00000000 00000000
Aug 17 17:20:17 bob kernel: Call Trace:
Aug 17 17:20:17 bob kernel: [<42980774>] xfs_iget+0x89/0x124 [xfs]
Aug 17 17:20:17 bob kernel: [<42986f7a>] xfs_bulkstat_one+0x112/0x551 [xfs]
Aug 17 17:20:17 bob kernel: [<429880fd>] xfs_bulkstat_single+0x37/0xbf [xfs]
Aug 17 17:20:17 bob kernel: [<4296d16c>] xfs_da_read_buf+0x2b/0x31 [xfs]
Aug 17 17:20:17 bob kernel: [<4296d6a6>] xfs_da_brelse+0xb2/0xf2 [xfs]
Aug 17 17:20:17 bob kernel: [<02135342>] follow_page+0xda/0xe5
Aug 17 17:20:17 bob kernel: [<0213f1ba>] rw_vm+0x1ce/0x1ea
Aug 17 17:20:17 bob kernel: [<429a2f69>] xfs_ioc_bulkstat+0xe0/0x15c [xfs]
Aug 17 17:20:17 bob kernel: [<429a2a82>] xfs_ioctl+0x303/0x680 [xfs]
Aug 17 17:20:17 bob kernel: [<429a1dac>] linvfs_readdir+0x1c1/0x1cf [xfs]
Aug 17 17:20:17 bob kernel: [<0214edcd>] filldir64+0x0/0x12e
Aug 17 17:20:17 bob kernel: [<02135342>] follow_page+0xda/0xe5
Aug 17 17:20:17 bob kernel: [<0213f1ba>] rw_vm+0x1ce/0x1ea
Aug 17 17:20:17 bob kernel: [<429a1e42>] linvfs_ioctl+0x18/0x22 [xfs]
Aug 17 17:20:17 bob kernel: [<0214ea0e>] sys_ioctl+0x1f2/0x224
Aug 17 17:20:17 bob kernel:
Aug 17 17:20:17 bob kernel: Code: 8b 51 3c 8b 41 38 39 fa 0f 85 ec 00 00 00 39 f0 0f 85 e4 00
Aug 17 17:23:14 bob kernel: <4>pagebuf_get: failed to lookup pages
Aug 17 17:23:14 bob kernel: pagebuf_get: failed to lookup pages
Aug 17 17:23:17 bob last message repeated 6 times

More (complete cycle):

Aug 29 04:02:04 bob syslogd 1.4.1: restart.
Aug 30 09:55:49 bob rpc.mountd: authenticated unmount request from carol.sal.wisc.edu:874 for /usr/users (/usr/users)
Aug 30 12:02:23 bob sshd(pam_unix)[13597]: session opened for user bonomo by (uid=1110)
Aug 30 12:03:37 bob sshd(pam_unix)[13597]: session closed for user bonomo
Aug 30 13:15:42 bob sshd(pam_unix)[13664]: session opened for user angela by (uid=1160)
Aug 30 14:18:27 bob sshd(pam_unix)[13664]: session closed for user angela
Aug 30 14:20:42 bob sshd(pam_unix)[13710]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=maddog.sal.wisc.edu user=adam
Aug 30 14:20:50 bob sshd(pam_unix)[13712]: session opened for user adam by (uid=1162)
Aug 30 14:23:51 bob sshd(pam_unix)[13712]: session closed for user adam
Aug 30 14:24:21 bob sshd(pam_unix)[13792]: session opened for user adam by (uid=1162)
Aug 30 14:24:34 bob sshd(pam_unix)[13792]: session closed for user adam
Aug 30 14:25:14 bob sshd(pam_unix)[13814]: session opened for user adam by (uid=1162)
Aug 30 14:26:55 bob sshd(pam_unix)[13843]: session opened for user bonomo by (uid=1110)
Aug 30 14:28:26 bob sshd(pam_unix)[13843]: session closed for user bonomo
Aug 30 14:32:30 bob kernel: st0: Block limits 1 - 16777215 bytes.
Aug 30 14:36:37 bob kernel: pagebuf_get: failed to lookup pages
Aug 30 14:36:39 bob last message repeated 5 times
Aug 30 14:36:54 bob sshd(pam_unix)[14019]: session opened for user adam by (uid=1162)
Aug 30 14:39:59 bob kernel: pagebuf_get: failed to lookup pages
Aug 30 14:44:38 bob last message repeated 15 times
Aug 30 14:44:45 bob last message repeated 8 times
Aug 30 14:47:54 bob sshd(pam_unix)[13814]: session closed for user adam
Aug 30 14:52:34 bob sshd(pam_unix)[14019]: session closed for user adam
Aug 30 14:53:40 bob sshd(pam_unix)[14212]: session opened for user adam by (uid=1162)
Aug 30 14:54:53 bob sshd(pam_unix)[14212]: session closed for user adam
Aug 30 14:54:59 bob sshd(pam_unix)[14257]: session opened for user adam by (uid=1162)
Aug 30 14:55:13 bob sshd(pam_unix)[14257]: session closed for user adam
Aug 30 15:45:56 bob sshd(pam_unix)[14290]: session opened for user bonomo by (uid=1110)
Aug 30 16:13:39 bob sshd(pam_unix)[14290]: session closed for user bonomo
Aug 30 17:07:35 bob sshd(pam_unix)[14386]: session opened for user bonomo by (uid=1110)
Aug 30 17:16:48 bob sshd(pam_unix)[14386]: session closed for user bonomo
Aug 31 09:24:06 bob sshd(pam_unix)[15395]: session opened for user angela by (uid=1160)
Aug 31 09:24:58 bob sshd(pam_unix)[15395]: session closed for user angela
Aug 31 11:36:44 bob sshd(pam_unix)[15479]: session opened for user bonomo by (uid=1110)
Aug 31 11:38:17 bob pam_rhosts_auth[15523]: denied to root.wisc.edu as root: access not allowed
Aug 31 11:38:17 bob in.rshd[15523]: rsh denied to root.wisc.edu as root: Permission denied.
Aug 31 11:38:17 bob in.rshd[15523]: rsh command was 'rdistd -S'
Aug 31 11:38:17 bob rdist[15511]: bob: LOCAL ERROR: Unexpected input from server: "".
Aug 31 11:40:11 bob sshd(pam_unix)[15479]: session closed for user bonomo
Aug 31 13:12:20 bob sshd(pam_unix)[15561]: session opened for user adam by (uid=1162)
Aug 31 13:13:15 bob sshd(pam_unix)[15649]: session opened for user adam by (uid=1162)
Aug 31 13:22:16 bob sshd(pam_unix)[15649]: session closed for user adam
Aug 31 13:22:20 bob sshd(pam_unix)[15561]: session closed for user adam
Aug 31 13:28:06 bob sshd(pam_unix)[15698]: session opened for user bonomo by (uid=1110)
Aug 31 14:34:20 bob sshd(pam_unix)[15698]: session closed for user bonomo
Aug 31 22:46:00 bob kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000040
Aug 31 22:46:00 bob kernel: printing eip:
Aug 31 22:46:00 bob kernel: 4299a599
Aug 31 22:46:00 bob kernel: *pde = 00000000
Aug 31 22:46:00 bob kernel: Oops: 0000 [#1]
Aug 31 22:46:00 bob kernel: Modules linked in: snd_mixer_oss snd soundcore nfsd exportfs lockd md5 ipv6 parport_pc lp parport autofs4 sunrpc e1000 floppy sg sr_mod microcode st dm_mod uhci_hcd ehci_hcd button battery asus_acpi ac xfs gdth aic79xx sd_mod scsi_mod
Aug 31 22:46:00 bob kernel: CPU: 0
Aug 31 22:46:00 bob kernel: EIP: 0060:[<4299a599>] Not tainted
Aug 31 22:46:00 bob kernel: EFLAGS: 00010202 (2.6.8-1.521)
Aug 31 22:46:00 bob kernel: EIP is at xfs_iget_core+0x58/0x56e [xfs]
Aug 31 22:46:00 bob kernel: eax: 00000004 ebx: 3fe78b70 ecx: 00000004 edx: 00000000
Aug 31 22:46:00 bob kernel: esi: 120b0d00 edi: 00000000 ebp: 00000008 esp: 3ee60d98
Aug 31 22:46:00 bob kernel: ds: 007b es: 007b ss: 0068
Aug 31 22:46:00 bob kernel: Process xfsdump (pid: 15864, threadinfo=3ee60000 task=417200b0)
Aug 31 22:46:00 bob kernel: Stack: 3ee60da0 022f4c1a 00000000 00000000 00000000 40eaf800 39089a80 00000004
Aug 31 22:46:00 bob kernel: 39089ab4 39089a80 00000008 40eaf800 4299ab38 120b0d00 00000000 00000008
Aug 31 22:46:00 bob kernel: 3ee60e40 00000000 00000000 00000000 00000000 120b0d00 00000000 00000000
Aug 31 22:46:00 bob kernel: Call Trace:
Aug 31 22:46:00 bob kernel: [<022f4c1a>] __cond_resched+0x14/0x3b
Aug 31 22:46:00 bob kernel: [<4299ab38>] xfs_iget+0x89/0x124 [xfs]
Aug 31 22:46:00 bob kernel: [<429a1562>] xfs_bulkstat_one+0xfe/0x4b9 [xfs]
Aug 31 22:46:00 bob kernel: [<429a25c7>] xfs_bulkstat_single+0x37/0xb8 [xfs]
Aug 31 22:46:00 bob kernel: [<429c1e14>] xfs_ioc_bulkstat+0xdb/0x154 [xfs]
Aug 31 22:46:00 bob kernel: [<429c1932>] xfs_ioctl+0x303/0x680 [xfs]
Aug 31 22:46:00 bob kernel: [<429c0814>] linvfs_readdir+0x1c1/0x1cf [xfs]
Aug 31 22:46:00 bob kernel: [<0214d7e2>] follow_page_pfn+0xec/0xfd
Aug 31 22:46:00 bob kernel: [<0215d9ab>] rw_vm+0x3db/0x467
Aug 31 22:46:00 bob kernel: [<429c0942>] linvfs_ioctl+0xb0/0x23a [xfs]
Aug 31 22:46:00 bob kernel: [<021752d2>] sys_ioctl+0x29a/0x33c
Aug 31 22:46:00 bob kernel: Code: 8b 51 3c 8b 41 38 39 fa 0f 85 fe 00 00 00 39 f0 0f 85 f6 00
Sep 1 04:02:21 bob kernel: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000014
Sep 1 04:02:21 bob kernel: printing eip:
Sep 1 04:02:21 bob kernel: 4299adee
Sep 1 04:02:21 bob kernel: *pde = 00000000
Sep 1 04:02:21 bob kernel: Oops: 0002 [#2]
Sep 1 04:02:21 bob kernel: Modules linked in: snd_mixer_oss snd soundcore nfsd exportfs lockd md5 ipv6 parport_pc lp parport autofs4 sunrpc e1000 floppy sg sr_mod microcode st dm_mod uhci_hcd ehci_hcd button battery asus_acpi ac xfs gdth aic79xx sd_mod scsi_mod
Sep 1 04:02:21 bob kernel: CPU: 0
Sep 1 04:02:21 bob kernel: EIP: 0060:[<4299adee>] Not tainted
Sep 1 04:02:21 bob kernel: EFLAGS: 00010202 (2.6.8-1.521)
Sep 1 04:02:21 bob kernel: EIP is at xfs_iextract+0x12/0x243 [xfs]
Sep 1 04:02:21 bob kernel: eax: 101fc730 ebx: 09d8a02c ecx: 09d8a02c edx: 00000004
Sep 1 04:02:21 bob kernel: esi: 00000001 edi: 00000257 ebp: 09d8a02c esp: 41eefeb0
Sep 1 04:02:21 bob kernel: ds: 007b es: 007b ss: 0068
Sep 1 04:02:21 bob kernel: Process pdflush (pid: 46, threadinfo=41eef000 task=41eaecd0)
Sep 1 04:02:21 bob kernel: Stack: 00000001 00000001 09d8a02c 00000001 00000257 09d8c780 4299ad97 09d8a02c
Sep 1 04:02:21 bob kernel: 429baec9 09d8c780 41eeff48 429c4eaf 00000000 429c58a5 40eaf800 00000046
Sep 1 04:02:21 bob kernel: 41eeff48 00000001 00000000 fffffffe ffffffff ffffffff 00000000 00000000
Sep 1 04:02:21 bob kernel: Call Trace:
Sep 1 04:02:21 bob kernel: [<4299ad97>] xfs_ireclaim+0xe/0x53 [xfs]
Sep 1 04:02:21 bob kernel: [<429baec9>] xfs_finish_reclaim+0xb8/0xbd [xfs]
Sep 1 04:02:21 bob kernel: [<429c4eaf>] vn_reclaim+0x16/0xfa [xfs]
Sep 1 04:02:21 bob kernel: [<429c58a5>] vn_purge+0x2b0/0x2cd [xfs]
Sep 1 04:02:21 bob kernel: [<02145c28>] slab_destroy+0x3d/0x54
Sep 1 04:02:21 bob kernel: [<429b8422>] xfs_inactive+0x10e/0x3e3 [xfs]
Sep 1 04:02:21 bob kernel: [<429c5bdb>] vn_remove+0x3d/0x42 [xfs]
Sep 1 04:02:21 bob kernel: [<429c4402>] linvfs_clear_inode+0xf/0x19 [xfs]
Sep 1 04:02:21 bob kernel: [<0217e8cf>] clear_inode+0xcc/0xf8
Sep 1 04:02:21 bob kernel: [<0217e942>] dispose_list+0x47/0x160
Sep 1 04:02:21 bob kernel: [<0217f079>] prune_icache+0x366/0x3a3
Sep 1 04:02:21 bob kernel: [<0214435f>] pdflush+0x0/0x1e
Sep 1 04:02:21 bob kernel: [<021441dd>] __pdflush+0x349/0x4cb
Sep 1 04:02:21 bob kernel: [<02144379>] pdflush+0x1a/0x1e
Sep 1 04:02:21 bob kernel: [<02143238>] wb_kupdate+0x0/0xee
Sep 1 04:02:21 bob kernel: [<0214435f>] pdflush+0x0/0x1e
Sep 1 04:02:21 bob kernel: [<02134705>] kthread+0x69/0x91
Sep 1 04:02:21 bob kernel: [<0213469c>] kthread+0x0/0x91
Sep 1 04:02:21 bob kernel: [<021041d9>] kernel_thread_helper+0x5/0xb
Sep 1 04:02:21 bob kernel: Code: 89 42 10 8b 45 10 89 10 8b 45 14 31 d2 89 44 24 04 8b 4c 24
Sep 1 09:03:38 bob syslogd 1.4.1: restart.

You can see that the restart occurred before any shutdown ==> hang!

Marco: Try with the 2.6.9-based update for FC2 released today.
Richard: Your oops is also completely unrelated (it's XFS related). Please file a separate bug (try the latest kernel update also; it has various XFS fixes).

Sorry Mr. Dave, I needed to downgrade to FC1, because this server is my production server.
FC1 is much more stable for me; the server has been online since Nov/07/2004 with just one reboot until today :-) I have another identical server (backup), but it is not possible to test it under the same conditions (same users, link, load ...). Thanks for your efforts.

Dave: I submitted bug #139450 with reference to XFS crashes in kernel 2.6.9-1.667 not long ago. It's pretty bad. (Feel like taking it on, too?) I had also submitted somewhat related bugs 133424 and 130729 quite some time ago. I have been running (well, limping along with) a patched version of 2.6.5-1.358 which still crashes, but does so less frequently.

Marco: If you get a chance to test the latest kernel and this is still a problem, please reopen. Otherwise, there's not really much I can do.
Everyone else: I'll deal with your issues in their separate bugs.