Bug 67946

Summary: kernel crashes due to invalid paging request in pte_chain_alloc
Product: [Retired] Red Hat Linux
Reporter: marc.lefranc
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED CURRENTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 7.3
CC: jonmisc
Target Milestone: ---   
Target Release: ---   
Hardware: athlon   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2004-09-30 15:39:43 UTC
Type: ---

Description marc.lefranc 2002-07-04 16:27:02 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020605

Description of problem:
Sometimes the machine suddenly slows down critically. Some commands can no
longer be launched, and the load increases steadily. If I kill the X server, it
does not come back. The only way to get the machine back to a normal state is
to reboot it.

After that, I find oopses in /var/log/messages such as those given in the
additional information section. They seem to point at the VM and at the rmap
patches. I have had many such crashes with kernel 2.4.18-4, but they have
become somewhat less frequent with 2.4.18-5. I thought the bug was cured until
I experienced it again today. In fact, the first glitch occurred yesterday, as
the log shows. It killed the X server, but everything else was normal until
today.

I don't know if it is a hardware problem. Before this problem, the machine
(Athlon XP1800/Asus A7V266-E/512 MB RAM) ran several intensive numerical
simulation programs concurrently, with a total virtual size of 750 MB, for days
without the slightest problem, so I believe the machine can stand the heat.


Version-Release number of selected component (if applicable):


How reproducible:
Didn't try


Additional info:

Jul  3 15:16:28 platon kernel: Unable to handle kernel paging request at virtual
address 2316857e
Jul  3 15:16:28 platon kernel:  printing eip:
Jul  3 15:16:28 platon kernel: c0137b1c
Jul  3 15:16:28 platon kernel: *pde = 00000000
Jul  3 15:16:28 platon kernel: Oops: 0000
Jul  3 15:16:28 platon kernel: sg nfs lockd sunrpc cmpci soundcore mga agpgart
autofs via-rhine mii 3c59x ide
Jul  3 15:16:28 platon kernel: CPU:    0
Jul  3 15:16:28 platon kernel: EIP:    0010:[<c0137b1c>]    Not tainted
Jul  3 15:16:28 platon kernel: EFLAGS: 00013206
Jul  3 15:16:28 platon kernel: 
Jul  3 15:16:28 platon kernel: EIP is at pte_chain_alloc [kernel] 0x1c (2.4.18-5)
Jul  3 15:16:28 platon kernel: eax: 00000001   ebx: c02cfc24   ecx: c1000030  
edx: 2316857e
Jul  3 15:16:28 platon kernel: esi: d978229c   edi: d978229c   ebp: 00000025  
esp: dacf5e50
Jul  3 15:16:30 platon kernel: ds: 0018   es: 0018   ss: 0018
Jul  3 15:16:30 platon kernel: Process X (pid: 1316, stackpage=dacf5000)
Jul  3 15:16:31 platon kernel: Stack: c12303b0 c013776a c02cfc24 0a010067
c12303b0 c0127756 dacf4000 dacf4000 
Jul  3 15:16:32 platon kernel:        ffff037f ffff0020 d3521140 de6813c0
de6813c0 d3521140 c01277a3 de6813c0 
Jul  3 15:16:32 platon gnome-name-server[4795]: input condition is: 0x11, exiting
Jul  3 15:16:33 platon kernel:        d3521140 d978229c 00000001 468a7000
c0107fca bffff570 bffff510 dacf4564 
Jul  3 15:16:35 platon kernel: Call Trace: [<c013776a>] page_add_rmap [kernel] 0x3a 
Jul  3 15:16:36 platon kernel: [<c0127756>] do_anonymous_page [kernel] 0xf6 
Jul  3 15:16:37 platon kernel: [<c01277a3>] do_no_page [kernel] 0x33 
Jul  3 15:16:38 platon kernel: [<c0107fca>] setup_sigcontext [kernel] 0xda 
Jul  3 15:16:38 platon kernel: [<c01279ea>] handle_mm_fault [kernel] 0xca 
Jul  3 15:16:39 platon kernel: [<c010853d>] handle_signal [kernel] 0x7d 
Jul  3 15:16:40 platon kernel: [<c01155ca>] do_page_fault [kernel] 0x12a 
Jul  3 15:16:41 platon kernel: [<c0107cc5>] restore_sigcontext [kernel] 0x115 
Jul  3 15:16:42 platon kernel: [<c0107da9>] sys_sigreturn [kernel] 0xb9 
Jul  3 15:16:43 platon gdm(pam_unix)[1315]: session closed for user lefranc
Jul  3 15:16:43 platon kernel: [<c01154a0>] do_page_fault [kernel] 0x0 
Jul  3 15:16:45 platon kernel: [<c0108a04>] error_code [kernel] 0x34 
Jul  3 15:16:46 platon kernel: 
Jul  3 15:16:46 platon kernel: 
Jul  3 15:16:47 platon kernel: Code: 8b 02 89 83 bc 00 00 00 c7 02 00 00 00 00
89 d0 5b c3 89 f6 



Jul  4 16:30:57 platon kernel: Unable to handle kernel paging request at virtual
address 50622f37
Jul  4 16:30:57 platon kernel:  printing eip:
Jul  4 16:30:57 platon kernel: c0137b1c
Jul  4 16:30:57 platon kernel: *pde = 00000000
Jul  4 16:30:57 platon kernel: Oops: 0000
Jul  4 16:30:57 platon kernel: sg nfs lockd sunrpc cmpci soundcore mga agpgart
autofs via-rhine mii 3c59x ide
Jul  4 16:30:57 platon kernel: CPU:    0
Jul  4 16:30:57 platon kernel: EIP:    0010:[<c0137b1c>]    Not tainted
Jul  4 16:30:57 platon kernel: EFLAGS: 00010202
Jul  4 16:30:57 platon kernel: 
Jul  4 16:30:57 platon kernel: EIP is at pte_chain_alloc [kernel] 0x1c (2.4.18-5)
Jul  4 16:30:57 platon kernel: eax: 00000001   ebx: c02cfc24   ecx: c1000030  
edx: 50622f37
Jul  4 16:30:57 platon kernel: esi: cc83227c   edi: cc83227c   ebp: 00000025  
esp: cd5d9e50
Jul  4 16:30:57 platon kernel: ds: 0018   es: 0018   ss: 0018
Jul  4 16:30:57 platon kernel: Process emacs (pid: 13251, stackpage=cd5d9000)
Jul  4 16:30:57 platon kernel: Stack: c1645b10 c013776a c02cfc24 1cac4067
c1645b10 c0127756 00000286 00000286 
Jul  4 16:30:57 platon kernel:        00000001 d31d33c0 c3596940 d920a140
d920a140 c3596940 c01277a3 d920a140 
Jul  4 16:30:57 platon kernel:        c3596940 cc83227c 00000001 08c9f006
00000282 00001000 0001bac4 00000282 
Jul  4 16:30:57 platon kernel: Call Trace: [<c013776a>] page_add_rmap [kernel] 0x3a 
Jul  4 16:30:57 platon kernel: [<c0127756>] do_anonymous_page [kernel] 0xf6 
Jul  4 16:30:57 platon kernel: [<c01277a3>] do_no_page [kernel] 0x33 
Jul  4 16:30:57 platon kernel: [<c01279ea>] handle_mm_fault [kernel] 0xca 
Jul  4 16:30:57 platon kernel: [<c01728ac>] pty_unthrottle [kernel] 0x3c 
Jul  4 16:30:57 platon kernel: [<c01155ca>] do_page_fault [kernel] 0x12a 
Jul  4 16:30:57 platon kernel: [<c016d1fc>] tty_read [kernel] 0xac 
Jul  4 16:30:57 platon kernel: [<c01c6dcb>] kfree_skbmem [kernel] 0xb 
Jul  4 16:30:57 platon kernel: [<c01c6f3a>] __kfree_skb [kernel] 0x11a 
Jul  4 16:30:57 platon kernel: [<c01cace2>] net_tx_action [kernel] 0x52 
Jul  4 16:30:57 platon kernel: [<c011cdab>] do_softirq [kernel] 0x4b 
Jul  4 16:30:57 platon kernel: [<c01154a0>] do_page_fault [kernel] 0x0 
Jul  4 16:30:57 platon kernel: [<c0108a04>] error_code [kernel] 0x34 
Jul  4 16:30:57 platon kernel: 
Jul  4 16:30:57 platon kernel: 
Jul  4 16:30:57 platon kernel: Code: 8b 02 89 83 bc 00 00 00 c7 02 00 00 00 00
89 d0 5b c3 89 f6
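
For anyone triaging the two traces above, the following is a minimal sketch of how the fault address can be cross-checked against the register dump. It is not part of the original report: the function name summarize_oops and the OOPS excerpt (the first oops, with syslog prefixes stripped and wrapped lines rejoined) are illustrative assumptions. In both traces the faulting address equals edx, and the first two bytes of the Code: line, 8b 02, decode to `mov (%edx),%eax`, a load through edx. That is consistent with a bad pointer being dereferenced in pte_chain_alloc, though it does not by itself show where the pointer was corrupted.

```python
import re

# Excerpt of the first oops above (Jul 3), syslog prefixes stripped and
# wrapped lines rejoined. Values are copied from the log, not invented.
OOPS = """\
Unable to handle kernel paging request at virtual address 2316857e
EIP is at pte_chain_alloc [kernel] 0x1c (2.4.18-5)
eax: 00000001   ebx: c02cfc24   ecx: c1000030   edx: 2316857e
esi: d978229c   edi: d978229c   ebp: 00000025   esp: dacf5e50
Code: 8b 02 89 83 bc 00 00 00 c7 02 00 00 00 00 89 d0 5b c3 89 f6
"""

REG_RE = re.compile(r"\b(eax|ebx|ecx|edx|esi|edi|ebp|esp):\s+([0-9a-f]{8})")


def summarize_oops(text):
    """Return (fault address, faulting symbol, registers equal to the fault address).

    Hypothetical helper for illustration; it only understands the i386
    2.4-style oops layout shown in this report.
    """
    fault = int(re.search(r"virtual address ([0-9a-f]{8})", text).group(1), 16)
    symbol = re.search(r"EIP is at (\S+)", text).group(1)
    regs = {name: int(value, 16) for name, value in REG_RE.findall(text)}
    matches = sorted(name for name, value in regs.items() if value == fault)
    return fault, symbol, matches


fault, symbol, matches = summarize_oops(OOPS)
print(f"{symbol}: fault at {fault:#010x}, held in {matches}")
```

Running the same check on the Jul 4 oops (address 50622f37, edx 50622f37) gives the same register, which is why the two traces look like one bug.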

Comment 1 Jonathan Cheyer 2002-07-15 01:44:02 UTC
I just upgraded to 2.4.18-5 using RHN, and I get the same kind of error
whenever I try to restart the X server or log out of KDE. After this occurs, I
need to reboot (Ctrl-Alt-Del works, but Ctrl-Alt-Backspace does not).

When I boot the 2.4.18-3 kernel, I don't get this problem.


Additional info:
Jul 14 18:10:28 syzygy kernel: Unable to handle kernel paging request at virtual
address 0100001b
Jul 14 18:10:28 syzygy kernel:  printing eip:
Jul 14 18:10:28 syzygy kernel: c0128962
Jul 14 18:10:28 syzygy kernel: *pde = 00000000
Jul 14 18:10:28 syzygy kernel: Oops: 0000
Jul 14 18:10:28 syzygy kernel: i810 parport_pc agpgart lp parport autofs 3c59x
ide-scsi scsi_mod ide-cd cdrom
Jul 14 18:10:28 syzygy kernel: CPU:    0
Jul 14 18:10:28 syzygy kernel: EIP:    0010:[<c0128962>]    Not tainted
Jul 14 18:10:28 syzygy kernel: EFLAGS: 00013246
Jul 14 18:10:28 syzygy kernel:
Jul 14 18:10:28 syzygy kernel: EIP is at unlock_page [kernel] 0x2 (2.4.18-5)
Jul 14 18:10:28 syzygy kernel: eax: 01000000   ebx: c161eeb0   ecx: c161eeb0  
edx: 00000000
Jul 14 18:10:28 syzygy kernel: esi: dbfb0000   edi: dec9f800   ebp: 00000000  
esp: ddde1ee4
Jul 14 18:10:28 syzygy kernel: ds: 0018   es: 0018   ss: 0018
Jul 14 18:10:28 syzygy kernel: Process X (pid: 1094, stackpage=ddde1000)
Jul 14 18:10:28 syzygy kernel: Stack: c161eeb0 dbfb0000 e09a2953 c161eeb0
ddb98160 ddb47000 e09a29af dec9f800
Jul 14 18:10:28 syzygy kernel:        dbfb0000 00000000 bffffce0 ddde1f60
e09a2f0c dec9f800 dec9f800 00000002
Jul 14 18:10:28 syzygy kernel:        00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
Jul 14 18:10:28 syzygy kernel: Call Trace: [<e09a2953>] i810_free_page [i810] 0x33
Jul 14 18:10:28 syzygy kernel: [<e09a29af>] i810_dma_cleanup [i810] 0x3f
Jul 14 18:10:28 syzygy kernel: [<e09a2f0c>] i810_dma_init [i810] 0xac
Jul 14 18:10:28 syzygy kernel: [<e099e924>] i810_ioctl [i810] 0xe4
Jul 14 18:10:28 syzygy kernel: [<c0145c47>] sys_ioctl [kernel] 0x217
Jul 14 18:10:28 syzygy kernel: [<c0109e1c>] do_IRQ [kernel] 0x9c
Jul 14 18:10:28 syzygy kernel: [<c0108913>] system_call [kernel] 0x33
Jul 14 18:10:28 syzygy kernel:
Jul 14 18:10:28 syzygy kernel:
Jul 14 18:10:28 syzygy kernel: Code: 0f b6 50 1b 8b 1c 95 2c 55 33 c0 89 c2 69
d2 01 00 37 9e 8b
Jul 14 18:10:29 syzygy kdm[1126]: IO Error in XOpenDisplay
Jul 14 18:10:29 syzygy kdm[1084]: Display :0 cannot be opened
Jul 14 18:10:31 syzygy kdm[1128]: IO Error in XOpenDisplay
Jul 14 18:10:31 syzygy kdm[1084]: Display :0 cannot be opened
Jul 14 18:10:32 syzygy kdm[1130]: IO Error in XOpenDisplay
Jul 14 18:10:32 syzygy kdm[1084]: Display :0 cannot be opened
Jul 14 18:10:34 syzygy kdm[1132]: IO Error in XOpenDisplay
Jul 14 18:10:34 syzygy kdm[1084]: Display :0 cannot be opened
Jul 14 18:10:34 syzygy kdm[1084]: Display :0 is being disabled (restarting too fast)
... [additional duplicates of messages above are removed] ...
Jul 14 18:12:08 syzygy init: Id "x" respawning too fast: disabled for 5 minutes
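
As a rough check on the trace above (not part of the original comment), the faulting address can be reproduced from the register dump and the first bytes of the Code: line. The sketch below assumes the standard x86 encoding, in which 0f b6 is MOVZX r32, r/m8 and the following ModRM byte selects the operands. The helper decode_modrm and the variable names are hypothetical. The result, 0x01000000 + 0x1b = 0x0100001b, matches the reported address, so the fault appears to come from reading a field through the pointer in eax, a value that does not look like the 0xc... and 0xe... kernel addresses elsewhere in this trace.

```python
# 32-bit register names indexed by their 3-bit x86 encoding.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) bit fields."""
    return byte >> 6, (byte >> 3) & 0b111, byte & 0b111


# First four bytes of the Code: line in the unlock_page oops.
code = bytes.fromhex("0f b6 50 1b")
assert code[:2] == b"\x0f\xb6"  # MOVZX r32, r/m8

mod, reg, rm = decode_modrm(code[2])
# mod == 1 means "register + signed 8-bit displacement"; rm == 4 would
# require a SIB byte, which this sketch deliberately does not handle.
assert mod == 1 and rm != 4
disp = int.from_bytes(code[3:4], "little", signed=True)

# Register values copied from the oops in this comment.
regs = {"eax": 0x01000000, "ebx": 0xC161EEB0, "ecx": 0xC161EEB0, "edx": 0x00000000}

effective = (regs[REG32[rm]] + disp) & 0xFFFFFFFF
print(f"movzbl {disp:#x}(%{REG32[rm]}),%{REG32[reg]} reads {effective:#010x}")
```

The same ModRM split applied to the 8b 02 bytes in the pte_chain_alloc oopses gives mod 0, reg eax, rm edx, that is, a plain load through edx with no displacement.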


Comment 2 Arjan van de Ven 2002-07-15 06:53:53 UTC
cheyspam:
That is a different bug, and it is actually fixed in the test kernels at
http://people.redhat.com/arjanv/testkernels

Comment 3 Bugzilla owner 2004-09-30 15:39:43 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/