Bug 175925

Summary: Bad page state at free_hot_cold_page (in process 'kswapd0'...
Product: [Fedora] Fedora
Reporter: Frode Tennebø <frodet>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 5
CC: jonstanley, pfrields, trevor, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard: MassClosed
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-01-20 04:42:00 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
/var/log/messages from the machine in trouble none
/var/log/messages none

Description Frode Tennebø 2005-12-16 12:05:54 UTC

Comment 1 Frode Tennebø 2005-12-16 12:16:26 UTC
The first post was almost unreadable. I have re-submitted the actual contents 
below:
 
Description of problem:

At relatively frequent intervals (approx. every two weeks) the OS hangs and I'm 
forced to do a hardware reset.

Just prior to the incident I have one or two of these in /var/log/messages:

Dec 16 10:03:44 garvin kernel: Bad page state at free_hot_cold_page (in process 'kswapd0', page c115c580)
Dec 16 10:03:44 garvin kernel: flags:0x40000000 mapping:00000000 mapcount:-1 count:0 (Tainted: G    B)
Dec 16 10:03:44 garvin kernel: Backtrace:
Dec 16 10:03:44 garvin kernel:  [<c013f38d>] bad_page+0x8c/0xc3
Dec 16 10:03:44 garvin kernel:  [<c013fbf7>] free_hot_cold_page+0x47/0xca
Dec 16 10:03:44 garvin kernel:  [<c01403a5>] __pagevec_free+0x1f/0x2e
Dec 16 10:03:44 garvin kernel:  [<c0145123>] __pagevec_release_nonlru+0x29/0x8a
Dec 16 10:03:44 garvin kernel:  [<c01460fd>] shrink_list+0x207/0x47b
Dec 16 10:03:44 garvin kernel:  [<c014651e>] shrink_cache+0xe7/0x29a
Dec 16 10:03:44 garvin kernel:  [<c0146b41>] shrink_zone+0x88/0xd6
Dec 16 10:03:44 garvin kernel:  [<c0146f95>] balance_pgdat+0x20d/0x3e7
Dec 16 10:03:44 garvin kernel:  [<c014723a>] kswapd+0xcb/0x109
Dec 16 10:03:44 garvin kernel:  [<c012db56>] autoremove_wake_function+0x0/0x37
Dec 16 10:03:44 garvin kernel:  [<c014716f>] kswapd+0x0/0x109
Dec 16 10:03:44 garvin kernel:  [<c0101301>] kernel_thread_helper+0x5/0xb
Dec 16 10:03:44 garvin kernel: Trying to fix it up, but a reboot is needed
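
For anyone triaging a longer /var/log/messages, the two header lines of the report above can be pulled apart mechanically. The following is a minimal Python sketch, not a kernel tool: the function name `parse_bad_page` and the regular expressions are illustrative assumptions derived only from the two lines quoted in this report, and it expects each log record on a single (unwrapped) line.

```python
import re

# Sample copied from this report, one syslog record per line.
SAMPLE = (
    "Dec 16 10:03:44 garvin kernel: Bad page state at free_hot_cold_page "
    "(in process 'kswapd0', page c115c580)\n"
    "Dec 16 10:03:44 garvin kernel: flags:0x40000000 mapping:00000000 "
    "mapcount:-1 count:0 (Tainted: G    B)\n"
)

# Patterns are assumptions based on the message layout shown above.
HEADER_RE = re.compile(
    r"Bad page state at (?P<func>\w+) "
    r"\(in process '(?P<proc>[^']+)', page (?P<page>[0-9a-f]+)\)"
)
FIELDS_RE = re.compile(
    r"flags:(?P<flags>0x[0-9a-f]+) mapping:(?P<mapping>[0-9a-f]+) "
    r"mapcount:(?P<mapcount>-?\d+) count:(?P<count>-?\d+)"
)


def parse_bad_page(text):
    """Return the fields of the first 'Bad page state' report, or None."""
    header = HEADER_RE.search(text)
    fields = FIELDS_RE.search(text)
    if header is None or fields is None:
        return None
    return {
        "func": header["func"],
        "process": header["proc"],
        "page": int(header["page"], 16),
        "flags": int(fields["flags"], 16),
        "mapping": int(fields["mapping"], 16),
        "mapcount": int(fields["mapcount"]),
        "count": int(fields["count"]),
    }


report = parse_bad_page(SAMPLE)
print(hex(report["page"]), report["mapcount"])
```

Having the page address as a number makes it easy to search the rest of the log for it. In this report the same value, c115c580, also shows up in the edx and esi registers of the 02:57:13 oops quoted further down, which may or may not be related.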


Version-Release number of selected component (if applicable):

It has been like this for all FC4 kernels I have tried. This includes:

kernel-2.6.11-1.1369_FC4
kernel-2.6.12-1.1398_FC4
kernel-2.6.12-1.1447_FC4
kernel-2.6.13-1.1526_FC4
kernel-2.6.13-1.1532_FC4
kernel-2.6.14-1.1637_FC4

Currently I'm running (for a few hours):

kernel-2.6.14-1.1644_FC4

...but will shortly boot into:

kernel-2.6.14-1.1653_FC4

I will confirm when/if it happens again with any of these releases.

How reproducible:

It happens periodically, and I have no deterministic way of reproducing it.


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Prior to the actual hang I have the following in messages:

Dec 15 02:57:13 garvin kernel: ------------[ cut here ]------------
Dec 15 02:57:13 garvin kernel: kernel BUG at mm/rmap.c:487!
Dec 15 02:57:13 garvin kernel: invalid operand: 0000 [#1]
Dec 15 02:57:13 garvin kernel: Modules linked in: loop parport_pc lp parport nfs lockd nfs_acl autofs4 sunrpc dm_mod ipv6 uhci_hcd i2c_piix4 i2c_core snd_es18xx snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_opl3_lib snd_timer snd_hwdep snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tlan floppy ext3 jbd aic7xxx scsi_transport_spi sd_mod scsi_mod
Dec 15 02:57:13 garvin kernel: CPU:    0
Dec 15 02:57:13 garvin kernel: EIP:    0060:[<c014f97b>]    Not tainted VLI
Dec 15 02:57:13 garvin kernel: EFLAGS: 00010286   (2.6.14-1.1637_FC4) 
Dec 15 02:57:13 garvin kernel: EIP is at page_remove_rmap+0x37/0x41
Dec 15 02:57:13 garvin kernel: eax: ffffffff   ebx: c85d5e30   ecx: 00000006   edx: c115c580
Dec 15 02:57:13 garvin kernel: esi: c115c580   edi: 0038c000   ebp: c03f7a7c   esp: cd7ddec8
Dec 15 02:57:13 garvin kernel: ds: 007b   es: 007b   ss: 0068
Dec 15 02:57:13 garvin kernel: Process udev (pid: 4008, threadinfo=cd7dd000 task=c7059ab0)
Dec 15 02:57:13 garvin kernel: Stack: c0149137 00000000 00391000 c03f7a7c c0a7d000 00391000 00391000 00390fff
Dec 15 02:57:13 garvin kernel:        c01492ca 00391000 00000000 c03f7a7c 00009000 00391000 c4ce3ddc 00391000
Dec 15 02:57:13 garvin kernel:        c0149401 00391000 00000000 cd7dd000 cdb671c0 cd7ddf58 002d7000 00000000
Dec 15 02:57:13 garvin kernel: Call Trace:
Dec 15 02:57:13 garvin kernel:  [<c0149137>] zap_pte_range+0xe5/0x1f5
Dec 15 02:57:13 garvin kernel:  [<c01492ca>] unmap_page_range+0x83/0xb7
Dec 15 02:57:13 garvin kernel:  [<c0149401>] unmap_vmas+0x103/0x222
Dec 15 02:57:13 garvin kernel:  [<c014dc05>] exit_mmap+0x7c/0x14c
Dec 15 02:57:13 garvin kernel:  [<c01189a0>] mmput+0x1f/0x95
Dec 15 02:57:13 garvin kernel:  [<c011d33d>] do_exit+0xe0/0x3b8
Dec 15 02:57:13 garvin kernel:  [<c011d66a>] do_group_exit+0x29/0x90
Dec 15 02:57:13 garvin kernel:  [<c0102edd>] syscall_call+0x7/0xb
Dec 15 02:57:13 garvin kernel: Code: ff 0f 98 c0 84 c0 75 01 c3 8b 42 08 83 c0 01 90 78 19 ba ff ff ff ff b8 10 00 00 00 e9 43 0c ff ff 0f 0b e4 01 ad 4a 32 c0 eb d2 <0f> 0b e7 01 ad 4a 32 c0 eb dd 55 57 56 53 83 ec 04 89 c7 89 d3
Dec 15 02:57:13 garvin kernel:  <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
Dec 15 02:57:13 garvin kernel: in_atomic():1, irqs_disabled():0
Dec 15 02:57:13 garvin kernel:  [<c011ba33>] profile_task_exit+0x13/0x48
Dec 15 02:57:13 garvin kernel:  [<c011d278>] do_exit+0x1b/0x3b8
Dec 15 02:57:13 garvin kernel:  [<c0103827>] do_divide_error+0x0/0xa8
Dec 15 02:57:14 garvin kernel:  [<c01039b7>] do_invalid_op+0x0/0xab
Dec 15 02:57:14 garvin kernel:  [<c0103a59>] do_invalid_op+0xa2/0xab
Dec 15 02:57:14 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 02:57:14 garvin kernel:  [<c0148c69>] pte_alloc_map+0x29/0xab
Dec 15 02:57:14 garvin kernel:  [<c0148e3b>] copy_pte_range+0xe8/0x214
Dec 15 02:57:14 garvin kernel:  [<c0103107>] error_code+0x4f/0x54
Dec 15 02:57:14 garvin kernel:  [<c014007b>] __alloc_pages+0x14e/0x403
Dec 15 02:57:14 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 02:57:14 garvin kernel:  [<c0149137>] zap_pte_range+0xe5/0x1f5
Dec 15 02:57:14 garvin kernel:  [<c01492ca>] unmap_page_range+0x83/0xb7
Dec 15 02:57:14 garvin kernel:  [<c0149401>] unmap_vmas+0x103/0x222
Dec 15 02:57:14 garvin kernel:  [<c014dc05>] exit_mmap+0x7c/0x14c
Dec 15 02:57:14 garvin kernel:  [<c01189a0>] mmput+0x1f/0x95
Dec 15 02:57:14 garvin kernel:  [<c011d33d>] do_exit+0xe0/0x3b8
Dec 15 02:57:14 garvin kernel:  [<c011d66a>] do_group_exit+0x29/0x90
Dec 15 02:57:14 garvin kernel:  [<c0102edd>] syscall_call+0x7/0xb
Dec 15 02:57:14 garvin kernel: Fixing recursive fault but reboot is needed!
Dec 15 02:57:14 garvin kernel: scheduling while atomic: udev/0x00000001/4008
Dec 15 02:57:14 garvin kernel:  [<c030b8b4>] schedule+0x504/0x5bb
Dec 15 02:57:14 garvin kernel:  [<c0102edd>] syscall_call+0x7/0xb
Dec 15 02:57:14 garvin kernel:  [<c012b1a6>] __kernel_text_address+0x1c/0x27
Dec 15 02:57:14 garvin kernel:  [<c0103329>] show_trace+0x2a/0x78
Dec 15 02:57:14 garvin kernel:  [<c0102edd>] syscall_call+0x7/0xb
Dec 15 02:57:14 garvin kernel:  [<c011d599>] do_exit+0x33c/0x3b8
Dec 15 02:57:14 garvin kernel:  [<c0103827>] do_divide_error+0x0/0xa8
Dec 15 02:57:14 garvin kernel:  [<c01039b7>] do_invalid_op+0x0/0xab
Dec 15 02:57:14 garvin kernel:  [<c0103a59>] do_invalid_op+0xa2/0xab
Dec 15 02:57:14 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 02:57:14 garvin kernel:  [<c0148c69>] pte_alloc_map+0x29/0xab
Dec 15 02:57:14 garvin kernel:  [<c0148e3b>] copy_pte_range+0xe8/0x214
Dec 15 02:57:15 garvin kernel:  [<c0103107>] error_code+0x4f/0x54
Dec 15 02:57:15 garvin kernel:  [<c014007b>] __alloc_pages+0x14e/0x403
Dec 15 02:57:15 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 02:57:15 garvin kernel:  [<c0149137>] zap_pte_range+0xe5/0x1f5
Dec 15 02:57:15 garvin kernel:  [<c01492ca>] unmap_page_range+0x83/0xb7
Dec 15 02:57:15 garvin kernel:  [<c0149401>] unmap_vmas+0x103/0x222
Dec 15 02:57:15 garvin kernel:  [<c014dc05>] exit_mmap+0x7c/0x14c
Dec 15 02:57:15 garvin kernel:  [<c01189a0>] mmput+0x1f/0x95
Dec 15 02:57:15 garvin kernel:  [<c011d33d>] do_exit+0xe0/0x3b8
Dec 15 02:57:15 garvin kernel:  [<c011d66a>] do_group_exit+0x29/0x90
Dec 15 02:57:15 garvin kernel:  [<c0102edd>] syscall_call+0x7/0xb
Dec 15 04:40:04 garvin kernel: ------------[ cut here ]------------
Dec 15 04:40:04 garvin kernel: kernel BUG at mm/rmap.c:487!
Dec 15 04:40:04 garvin kernel: invalid operand: 0000 [#2]
Dec 15 04:40:04 garvin kernel: Modules linked in: loop parport_pc lp parport nfs lockd nfs_acl autofs4 sunrpc dm_mod ipv6 uhci_hcd i2c_piix4 i2c_core snd_es18xx snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_opl3_lib snd_timer snd_hwdep snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tlan floppy ext3 jbd aic7xxx scsi_transport_spi sd_mod scsi_mod
Dec 15 04:40:04 garvin kernel: CPU:    0
Dec 15 04:40:04 garvin kernel: EIP:    0060:[<c014f97b>]    Not tainted VLI
Dec 15 04:40:04 garvin kernel: EFLAGS: 00010286   (2.6.14-1.1637_FC4) 
Dec 15 04:40:04 garvin kernel: EIP is at page_remove_rmap+0x37/0x41
Dec 15 04:40:04 garvin kernel: eax: ffffffff   ebx: c82ffe30   ecx: 00000002   edx: c111b8c0
Dec 15 04:40:04 garvin kernel: esi: c111b8c0   edi: 09b8c000   ebp: c03f7a7c   esp: c1843de8
Dec 15 04:40:04 garvin kernel: ds: 007b   es: 007b   ss: 0068
Dec 15 04:40:04 garvin kernel: Process udev (pid: 6933, threadinfo=c1843000 task=cd943570)
Dec 15 04:40:04 garvin kernel: Stack: c0149137 00000000 09bec000 c03f7a7c c8043098 09bec000 09bec000 09bebfff
Dec 15 04:40:04 garvin kernel:        c01492ca 09bec000 00000000 c03f7a7c 000c6000 09bec000 cf329284 09bec000
Dec 15 04:40:04 garvin kernel:        c0149401 09bec000 00000000 c1843000 cdb66140 c1843e78 00298000 00000000
Dec 15 04:40:04 garvin kernel: Call Trace:
Dec 15 04:40:04 garvin kernel:  [<c0149137>] zap_pte_range+0xe5/0x1f5
Dec 15 04:40:04 garvin kernel:  [<c01492ca>] unmap_page_range+0x83/0xb7
Dec 15 04:40:04 garvin kernel:  [<c0149401>] unmap_vmas+0x103/0x222
Dec 15 04:40:04 garvin kernel:  [<c014dc05>] exit_mmap+0x7c/0x14c
Dec 15 04:40:04 garvin kernel:  [<c01189a0>] mmput+0x1f/0x95
Dec 15 04:40:04 garvin kernel:  [<c011d33d>] do_exit+0xe0/0x3b8
Dec 15 04:40:04 garvin kernel:  [<c012444e>] __dequeue_signal+0xef/0x1b6
Dec 15 04:40:04 garvin kernel:  [<c011d66a>] do_group_exit+0x29/0x90
Dec 15 04:40:04 garvin kernel:  [<c0125f38>] get_signal_to_deliver+0x260/0x36d
Dec 15 04:40:05 garvin kernel:  [<c030d6c0>] do_page_fault+0x0/0x640
Dec 15 04:40:05 garvin kernel:  [<c0102ced>] do_signal+0x4b/0x105
Dec 15 04:40:05 garvin kernel:  [<c01616a1>] vfs_lstat+0x11/0x37
Dec 15 04:40:05 garvin kernel:  [<c0148c69>] pte_alloc_map+0x29/0xab
Dec 15 04:40:05 garvin kernel:  [<c014ae5a>] __handle_mm_fault+0x14a/0x190
Dec 15 04:40:05 garvin kernel:  [<c0126f96>] notifier_call_chain+0x17/0x27
Dec 15 04:40:05 garvin kernel:  [<c030d9fd>] do_page_fault+0x33d/0x640
Dec 15 04:40:05 garvin kernel:  [<c030d6c0>] do_page_fault+0x0/0x640
Dec 15 04:40:05 garvin kernel:  [<c0102dce>] do_notify_resume+0x27/0x35
Dec 15 04:40:05 garvin kernel:  [<c0102f6e>] work_notifysig+0x13/0x19
Dec 15 04:40:05 garvin kernel: Code: ff 0f 98 c0 84 c0 75 01 c3 8b 42 08 83 c0 01 90 78 19 ba ff ff ff ff b8 10 00 00 00 e9 43 0c ff ff 0f 0b e4 01 ad 4a 32 c0 eb d2 <0f> 0b e7 01 ad 4a 32 c0 eb dd 55 57 56 53 83 ec 04 89 c7 89 d3
Dec 15 04:40:05 garvin kernel:  <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
Dec 15 04:40:05 garvin kernel: in_atomic():1, irqs_disabled():0
Dec 15 04:40:05 garvin kernel:  [<c011ba33>] profile_task_exit+0x13/0x48
Dec 15 04:40:05 garvin kernel:  [<c011d278>] do_exit+0x1b/0x3b8
Dec 15 04:40:05 garvin kernel:  [<c0103827>] do_divide_error+0x0/0xa8
Dec 15 04:40:05 garvin kernel:  [<c01039b7>] do_invalid_op+0x0/0xab
Dec 15 04:40:05 garvin kernel:  [<c0103a59>] do_invalid_op+0xa2/0xab
Dec 15 04:40:05 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 04:40:05 garvin kernel:  [<c030cb5e>] _read_unlock_irq+0x5/0x7
Dec 15 04:40:05 garvin kernel:  [<c013bd4d>] find_get_page+0x36/0x41
Dec 15 04:40:05 garvin kernel:  [<c017039e>] alloc_inode+0xee/0x18c
Dec 15 04:40:05 garvin kernel:  [<c01045b8>] do_IRQ+0x51/0x82
Dec 15 04:40:05 garvin kernel:  [<c0103107>] error_code+0x4f/0x54
Dec 15 04:40:05 garvin kernel:  [<c014007b>] __alloc_pages+0x14e/0x403
Dec 15 04:40:05 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 04:40:05 garvin kernel:  [<c0149137>] zap_pte_range+0xe5/0x1f5
Dec 15 04:40:05 garvin kernel:  [<c01492ca>] unmap_page_range+0x83/0xb7
Dec 15 04:40:05 garvin kernel:  [<c0149401>] unmap_vmas+0x103/0x222
Dec 15 04:40:05 garvin kernel:  [<c014dc05>] exit_mmap+0x7c/0x14c
Dec 15 04:40:06 garvin kernel:  [<c01189a0>] mmput+0x1f/0x95
Dec 15 04:40:06 garvin kernel:  [<c011d33d>] do_exit+0xe0/0x3b8
Dec 15 04:40:06 garvin kernel:  [<c012444e>] __dequeue_signal+0xef/0x1b6
Dec 15 04:40:06 garvin kernel:  [<c011d66a>] do_group_exit+0x29/0x90
Dec 15 04:40:06 garvin kernel:  [<c0125f38>] get_signal_to_deliver+0x260/0x36d
Dec 15 04:40:06 garvin kernel:  [<c030d6c0>] do_page_fault+0x0/0x640
Dec 15 04:40:06 garvin kernel:  [<c0102ced>] do_signal+0x4b/0x105
Dec 15 04:40:06 garvin kernel:  [<c01616a1>] vfs_lstat+0x11/0x37
Dec 15 04:40:06 garvin kernel:  [<c0148c69>] pte_alloc_map+0x29/0xab
Dec 15 04:40:06 garvin kernel:  [<c014ae5a>] __handle_mm_fault+0x14a/0x190
Dec 15 04:40:06 garvin kernel:  [<c0126f96>] notifier_call_chain+0x17/0x27
Dec 15 04:40:06 garvin kernel:  [<c030d9fd>] do_page_fault+0x33d/0x640
Dec 15 04:40:06 garvin kernel:  [<c030d6c0>] do_page_fault+0x0/0x640
Dec 15 04:40:06 garvin kernel:  [<c0102dce>] do_notify_resume+0x27/0x35
Dec 15 04:40:06 garvin kernel:  [<c0102f6e>] work_notifysig+0x13/0x19
Dec 15 04:40:06 garvin kernel: Fixing recursive fault but reboot is needed!
Dec 15 04:40:06 garvin kernel: scheduling while atomic: udev/0x00000001/6933
Dec 15 04:40:06 garvin kernel:  [<c030b8b4>] schedule+0x504/0x5bb
Dec 15 04:40:06 garvin kernel:  [<c0102f6e>] work_notifysig+0x13/0x19
Dec 15 04:40:06 garvin kernel:  [<c012b1a6>] __kernel_text_address+0x1c/0x27
Dec 15 04:40:06 garvin kernel:  [<c0103329>] show_trace+0x2a/0x78
Dec 15 04:40:06 garvin kernel:  [<c0102f6e>] work_notifysig+0x13/0x19
Dec 15 04:40:06 garvin kernel:  [<c011d599>] do_exit+0x33c/0x3b8
Dec 15 04:40:06 garvin kernel:  [<c0103827>] do_divide_error+0x0/0xa8
Dec 15 04:40:06 garvin kernel:  [<c01039b7>] do_invalid_op+0x0/0xab
Dec 15 04:40:06 garvin kernel:  [<c0103a59>] do_invalid_op+0xa2/0xab
Dec 15 04:40:06 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 04:40:06 garvin kernel:  [<c030cb5e>] _read_unlock_irq+0x5/0x7
Dec 15 04:40:07 garvin kernel:  [<c013bd4d>] find_get_page+0x36/0x41
Dec 15 04:40:07 garvin kernel:  [<c017039e>] alloc_inode+0xee/0x18c
Dec 15 04:40:07 garvin kernel:  [<c01045b8>] do_IRQ+0x51/0x82
Dec 15 04:40:07 garvin kernel:  [<c0103107>] error_code+0x4f/0x54
Dec 15 04:40:07 garvin kernel:  [<c014007b>] __alloc_pages+0x14e/0x403
Dec 15 04:40:07 garvin kernel:  [<c014f97b>] page_remove_rmap+0x37/0x41
Dec 15 04:40:07 garvin kernel:  [<c0149137>] zap_pte_range+0xe5/0x1f5
Dec 15 04:40:07 garvin kernel:  [<c01492ca>] unmap_page_range+0x83/0xb7
Dec 15 04:40:07 garvin kernel:  [<c0149401>] unmap_vmas+0x103/0x222
Dec 15 04:40:07 garvin kernel:  [<c014dc05>] exit_mmap+0x7c/0x14c
Dec 15 04:40:07 garvin kernel:  [<c01189a0>] mmput+0x1f/0x95
Dec 15 04:40:07 garvin kernel:  [<c011d33d>] do_exit+0xe0/0x3b8
Dec 15 04:40:07 garvin kernel:  [<c012444e>] __dequeue_signal+0xef/0x1b6
Dec 15 04:40:07 garvin kernel:  [<c011d66a>] do_group_exit+0x29/0x90
Dec 15 04:40:07 garvin kernel:  [<c0125f38>] get_signal_to_deliver+0x260/0x36d
Dec 15 04:40:07 garvin kernel:  [<c030d6c0>] do_page_fault+0x0/0x640
Dec 15 04:40:07 garvin kernel:  [<c0102ced>] do_signal+0x4b/0x105
Dec 15 04:40:07 garvin kernel:  [<c01616a1>] vfs_lstat+0x11/0x37
Dec 15 04:40:07 garvin kernel:  [<c0148c69>] pte_alloc_map+0x29/0xab
Dec 15 04:40:07 garvin kernel:  [<c014ae5a>] __handle_mm_fault+0x14a/0x190
Dec 15 04:40:07 garvin kernel:  [<c0126f96>] notifier_call_chain+0x17/0x27
Dec 15 04:40:07 garvin kernel:  [<c030d9fd>] do_page_fault+0x33d/0x640
Dec 15 04:40:07 garvin kernel:  [<c030d6c0>] do_page_fault+0x0/0x640
Dec 15 04:40:07 garvin kernel:  [<c0102dce>] do_notify_resume+0x27/0x35
Dec 15 04:40:07 garvin kernel:  [<c0102f6e>] work_notifysig+0x13/0x19




Comment 2 Frode Tennebø 2005-12-16 12:18:20 UTC
*** Bug 175924 has been marked as a duplicate of this bug. ***

Comment 3 Dave Jones 2005-12-16 20:58:05 UTC
Can you try the test kernels at
http://people.redhat.com/davej/kernels/Fedora/FC4 ?  There's recently been quite
a bit of churn upstream in this area, and it'll be very interesting to know how
that behaves.


Comment 4 Frode Tennebø 2005-12-22 10:37:34 UTC
I can report that kernel-2.6.14-1.1644_FC4 exhibits the same problem as
previously reported.

I also tried kernel-2.6.14-1.1769_FC4, and it worked as expected for some time.
Then I ran 'find / -name "*.mp3"' and a few minutes later everything froze.
The machine continued to answer ping, but did not respond on either the already
logged-in sessions or the console. There was no indication in
/var/log/messages as to what happened.



Comment 5 Frode Tennebø 2006-01-02 11:07:26 UTC
Created attachment 122685 [details]
/var/log/messages from the machine in trouble

During Christmas (with very little activity) it happened again. This time it
logged various kernel bugs, bad page states and other oopses.

Comment 6 Dave Jones 2006-02-03 06:21:41 UTC
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.


Comment 7 Frode Tennebø 2006-02-07 08:05:32 UTC
It happened again. The printouts were a bit different, but it still appears to
be the same (or a related) problem.

[root@garvin log]# uptime
 09:18:09 up 1 day, 29 min,  4 users,  load average: 1.06, 1.10, 0.76
[root@garvin log]# uname -a
Linux garvin 2.6.15-1.1830_FC4 #1 Thu Feb 2 17:23:41 EST 2006 i686 i686 i386 GNU/Linux

/var/log/messages:

Feb  7 09:04:13 garvin kernel: Eeek! page_mapcount(page) went negative! (-1)
Feb  7 09:04:13 garvin kernel:   page->flags = 80000864
Feb  7 09:04:13 garvin kernel:   page->count = 2
Feb  7 09:04:13 garvin kernel:   page->mapping = cfebb194
Feb  7 09:04:13 garvin kernel: ------------[ cut here ]------------
Feb  7 09:04:13 garvin kernel: kernel BUG at mm/rmap.c:493!
Feb  7 09:04:13 garvin kernel: invalid operand: 0000 [#1]
Feb  7 09:04:13 garvin kernel: last sysfs file: /class/vc/vcs8/dev
Feb  7 09:04:13 garvin kernel: Modules linked in: parport_pc lp parport nfs lockd nfs_acl autofs4 sunrpc dm_mod ipv6 uhci_hcd i2c_piix4 i2c_core snd_es18xx snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_opl3_lib snd_timer snd_hwdep snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore tlan floppy ext3 jbd aic7xxx scsi_transport_spi sd_mod scsi_mod
Feb  7 09:04:13 garvin kernel: CPU:    0
Feb  7 09:04:13 garvin kernel: EIP:    0060:[<c0151d57>]    Not tainted VLI
Feb  7 09:04:13 garvin kernel: EFLAGS: 00010286   (2.6.15-1.1830_FC4) 
Feb  7 09:04:13 garvin kernel: EIP is at page_remove_rmap+0x9a/0xa8
Feb  7 09:04:14 garvin kernel: eax: ffffffff   ebx: c115daa0   ecx: ffffffff   edx: 00000000
Feb  7 09:04:14 garvin kernel: esi: 00b8c000   edi: c115daa0   ebp: 00000020   esp: c9851ea4
Feb  7 09:04:14 garvin kernel: ds: 007b   es: 007b   ss: 0068
Feb  7 09:04:14 garvin kernel: Process udev (pid: 24088, threadinfo=c9851000 task=cc5fdab0)
Feb  7 09:04:14 garvin kernel: Stack: c0332b94 cfebb194 c856fe30 c014b4c0 c4cc6c24 c041dba0 c52a5140 fffffffc
Feb  7 09:04:14 garvin kernel:        00000000 c52a5190 c81ba008 00ba3000 c9851f34 c81ba008 c014b6fb 00b89000
Feb  7 09:04:14 garvin kernel:        00ba3000 c9851f34 00000000 c4cc6c24 c041dba0 c81ba008 00ba2fff 00b89000
Feb  7 09:04:14 garvin kernel: Call Trace:
Feb  7 09:04:14 garvin kernel:  [<c014b4c0>] zap_pte_range+0x105/0x25a     [<c014b6fb>] unmap_page_range+0xe6/0x110
Feb  7 09:04:14 garvin kernel:  [<c014b7f7>] unmap_vmas+0xd2/0x1f1     [<c0150022>] exit_mmap+0x5f/0xda
Feb  7 09:04:14 garvin kernel:  [<c011ad09>] mmput+0x1f/0x95     [<c011f647>] do_exit+0xfc/0x3cf
Feb  7 09:04:14 garvin kernel:  [<c011f96f>] do_group_exit+0x29/0x90     [<c0102e75>] syscall_call+0x7/0xb
Feb  7 09:04:14 garvin kernel: Code: 01 89 44 24 04 c7 04 24 7d 2b 33 c0 e8 b2 b7 fc ff 8b 43 10 89 44 24 04 c7 04 24 94 2b 33 c0 e8 9f b7 fc ff eb 84 8b 53 0c eb d0 <0f> 0b ed 01 52 2b 33 c0 90 e9 79 ff ff ff 55 57 56 53 83 ec 0c
Feb  7 09:04:14 garvin kernel: Continuing in 120 seconds. ^MContinuing in 119 
seconds. ^MContinuing in 118 seconds. ^MContinuing in 117 seconds. ^MContinuing 
in 116 seconds. ^MContinuing in 115 seconds. ^MContinuing in 114 seconds. ^
MContinuing in 113 seconds. ^MContinuing in 112 seconds. ^MContinuing in 111 
seconds. ^MContinuing in 110 seconds. ^MContinuing in 109 seconds. ^MContinuing 
in 108 seconds. ^MContinuing in 107 seconds. ^MContinuing in 106 seconds. ^
MContinuing in 105 seconds. ^MContinuing in 104 seconds. ^MContinuing in 103 
seconds. ^MContinuing in 102 seconds. ^MContinuing in 101 seconds. ^MContinuing 
in 100 seconds. ^MContinuing in 99 seconds. ^MContinuing in 98 seconds. ^
MContinuing in 97 seconds. ^MContinuing in 96 seconds. ^MContinuing in 95 
seconds. ^MContinuing in 94 seconds. ^MContinuing in 93 seconds. ^MContinuing in
 92 seconds. ^MContinuing in 91 seconds. ^MContinuing in 90 seconds. ^
MContinuing in 89 seconds. ^MContinuing in 88 seconds. ^MContinuing in 87 
seconds. ^MContinuing in 86 seconds. ^MContinuing in 85 seconds. ^MCo
Feb  7 09:04:14 garvin kernel: tinuing in 84 seconds. ^MContinuing in 83 
seconds. ^MContinuing in 82 seconds. ^MContinuing in 81 seconds. ^MContinuing in
 80 seconds. ^MContinuing in 79 seconds. ^MContinuing in 78 seconds. ^
MContinuing in 77 seconds. ^MContinuing in 76 seconds. ^MContinuing in 75 
seconds. ^MContinuing in 74 seconds. ^MContinuing in 73 seconds. ^MContinuing in
 72 seconds. ^MContinuing in 71 seconds. ^MContinuing in 70 seconds. ^
MContinuing in 69 seconds. ^MContinuing in 68 seconds. ^MContinuing in 67 
seconds. ^MContinuing in 66 seconds. ^MContinuing in 65 seconds. ^MContinuing in
 64 seconds. ^MContinuing in 63 seconds. ^M^MContinuing in 62 seconds. ^
MContinuing in 61 seconds. ^MContinuing in 60 seconds. ^MContinuing in 59 
seconds. ^MContinuing in 58 seconds. ^MContinuing in 57 seconds. ^MContinuing in
 56 seconds. ^MContinuing in 55 seconds. ^MContinuing in 54 seconds. ^
MContinuing in 53 seconds. ^MContinuing in 52 seconds. ^MContinuing in 51 
seconds. ^MContinuing in 50 seconds. ^MContinuing in 49 seconds. ^MContinuing in
 48 seconds. 
Feb  7 09:04:14 garvin kernel: tinuing in 47 seconds. ^MContinuing in 46 
seconds. ^MContinuing in 45 seconds. ^MContinuing in 44 seconds. ^MContinuing in
 43 seconds. ^MContinuing in 42 seconds. ^MContinuing in 41 seconds. ^
MContinuing in 40 seconds. ^MContinuing in 39 seconds. ^MContinuing in 38 
seconds. ^MContinuing in 37 seconds. ^MContinuing in 36 seconds. ^MContinuing in
 35 seconds. ^MContinuing in 34 seconds. ^MContinuing in 33 seconds. ^
MContinuing in 32 seconds. ^MContinuing in 31 seconds. ^MContinuing in 30 
seconds. ^MContinuing in 29 seconds. ^MContinuing in 28 seconds. ^MContinuing in
 27 seconds. ^MContinuing in 26 seconds. ^M^MContinuing in 25 seconds. ^
MContinuing in 24 seconds. ^MContinuing in 23 seconds. ^MContinuing in 22 
seconds. ^MContinuing in 21 seconds. ^MContinuing in 20 seconds. ^MContinuing in
 19 seconds. ^MContinuing in 18 seconds. ^MContinuing in 17 seconds. ^
MContinuing in 16 seconds. ^MContinuing in 15 seconds. ^MContinuing in 14 
seconds. ^MContinuing in 13 seconds. ^MContinuing in 12 seconds. ^MContinuing in
 11 seconds. 
Feb  7 09:04:14 garvin kernel: tinuing in 10 seconds. ^MContinuing in 9 seconds.
 ^MContinuing in 8 seconds. ^MContinuing in 7 seconds. ^MContinuing in 6 
seconds. ^MContinuing in 5 seconds. ^MContinuing in 4 seconds. ^MContinuing in 3
 seconds. ^MContinuing in 2 seconds. ^MContinuing in 1 seconds. 
Feb  7 09:04:14 garvin kernel:  <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
Feb  7 09:04:14 garvin kernel: in_atomic():1, irqs_disabled():0
Feb  7 09:04:14 garvin kernel:  [<c011dda3>] profile_task_exit+0x13/0x43     [<c011f566>] do_exit+0x1b/0x3cf
Feb  7 09:04:14 garvin kernel:  [<c01041ab>] do_divide_error+0x0/0xa8     [<c010433b>] do_invalid_op+0x0/0xab
Feb  7 09:04:14 garvin kernel:  [<c01043dd>] do_invalid_op+0xa2/0xab     [<c0151d57>] page_remove_rmap+0x9a/0xa8
Feb  7 09:04:14 garvin kernel:  [<c011d2ff>] call_console_drivers+0x80/0x14c     [<c011d8ac>] release_console_sem+0x77/0xb4
Feb  7 09:04:14 garvin kernel:  [<c011d6f5>] vprintk+0x1e7/0x2a9     [<c014c569>] do_wp_page+0x204/0x311
Feb  7 09:04:14 garvin kernel:  [<c01039a7>] error_code+0x4f/0x54     [<c0151d57>] page_remove_rmap+0x9a/0xa8
Feb  7 09:04:15 garvin kernel:  [<c014b4c0>] zap_pte_range+0x105/0x25a     [<c014b6fb>] unmap_page_range+0xe6/0x110
Feb  7 09:04:15 garvin kernel:  [<c014b7f7>] unmap_vmas+0xd2/0x1f1     [<c0150022>] exit_mmap+0x5f/0xda
Feb  7 09:04:15 garvin kernel:  [<c011ad09>] mmput+0x1f/0x95     [<c011f647>] do_exit+0xfc/0x3cf
Feb  7 09:04:15 garvin kernel:  [<c011f96f>] do_group_exit+0x29/0x90     [<c0102e75>] syscall_call+0x7/0xb
Feb  7 09:04:15 garvin kernel: Fixing recursive fault but reboot is needed!
Feb  7 09:04:15 garvin kernel: scheduling while atomic: udev/0x00000001/24088
Feb  7 09:04:15 garvin kernel:  [<c03159d4>] schedule+0x504/0x5bb     [<c012d3a6>] __kernel_text_address+0x1c/0x27
Feb  7 09:04:15 garvin kernel:  [<c0103bcc>] show_trace+0x2d/0xb5     [<c0102e75>] syscall_call+0x7/0xb
Feb  7 09:04:15 garvin kernel:  [<c011f89e>] do_exit+0x353/0x3cf     [<c01041ab>] do_divide_error+0x0/0xa8
Feb  7 09:04:15 garvin kernel:  [<c010433b>] do_invalid_op+0x0/0xab     [<c01043dd>] do_invalid_op+0xa2/0xab
Feb  7 09:04:15 garvin kernel:  [<c0151d57>] page_remove_rmap+0x9a/0xa8     [<c011d2ff>] call_console_drivers+0x80/0x14c
Feb  7 09:04:15 garvin kernel:  [<c011d8ac>] release_console_sem+0x77/0xb4     [<c011d6f5>] vprintk+0x1e7/0x2a9
Feb  7 09:04:15 garvin kernel:  [<c014c569>] do_wp_page+0x204/0x311     [<c01039a7>] error_code+0x4f/0x54
Feb  7 09:04:15 garvin kernel:  [<c0151d57>] page_remove_rmap+0x9a/0xa8     [<c014b4c0>] zap_pte_range+0x105/0x25a
Feb  7 09:04:15 garvin kernel:  [<c014b6fb>] unmap_page_range+0xe6/0x110     [<c014b7f7>] unmap_vmas+0xd2/0x1f1
Feb  7 09:04:15 garvin kernel:  [<c0150022>] exit_mmap+0x5f/0xda     [<c011ad09>] mmput+0x1f/0x95
Feb  7 09:04:15 garvin kernel:  [<c011f647>] do_exit+0xfc/0x3cf     [<c011f96f>] do_group_exit+0x29/0x90
Feb  7 09:04:15 garvin kernel:  [<c0102e75>] syscall_call+0x7/0xb
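
To compare this trace with the earlier ones, it can help to reduce each trace to its list of symbols. The following is only a minimal sketch, not a tool used in this report: the names FRAME_RE and parse_frames are made up for illustration, and it assumes every "[<address>] symbol+0xoffset/0xsize" entry sits intact on one line, so wrapped log lines must be re-joined first.

```python
import re

# Matches one stack entry such as "[<c014b4c0>] zap_pte_range+0x105/0x25a".
# Hypothetical helper pattern; assumes 32-bit addresses as in the logs above.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(log_text):
    """Return (address, symbol, offset, size) tuples in order of appearance."""
    return [
        (int(addr, 16), sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(log_text)
    ]


# Two lines taken from the call trace in this comment (entries un-wrapped).
sample = (
    "Feb  7 09:04:14 garvin kernel:  [<c014b4c0>] zap_pte_range+0x105/0x25a"
    "     [<c014b6fb>] unmap_page_range+0xe6/0x110\n"
    "Feb  7 09:04:14 garvin kernel:  [<c014b7f7>] unmap_vmas+0xd2/0x1f1"
    "     [<c0150022>] exit_mmap+0x5f/0xda\n"
)

frames = parse_frames(sample)
symbols = [sym for _, sym, _, _ in frames]
print(symbols)
```

Running the same function over two different oopses and comparing the symbol lists gives a quick hint as to whether both went through the same exit_mmap/zap_pte_range path.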



Comment 8 Frode Tennebø 2006-02-09 17:23:15 UTC
Created attachment 124444 [details]
/var/log/messages

This happens quite regularly now. I have attached a copy of /var/log/messages.

Also note that udev is behaving unexpectedly:
top - 18:27:37 up 3 days,  9:39,  4 users,  load average: 7.34, 7.61, 7.26
Tasks:	68 total,   1 running,	67 sleeping,   0 stopped,   0 zombie
Cpu(s): 45.4% us,  5.2% sy,  0.0% ni, 49.3% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:	255940k total,	 250736k used,	   5204k free,	   8408k buffers
Swap:	522104k total,	    104k used,	 522000k free,	 151500k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  465 root      13  -4 45104  42m  336 S 42.2 17.1 877:49.38 udevd

Comment 9 Dave Jones 2006-09-17 03:28:29 UTC
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.


Comment 10 Dave Jones 2006-10-16 19:55:14 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 11 Trevor Cordes 2007-03-16 18:46:51 UTC
I never had this happen before today.  I had recently updated this box to
the newest FC5 kernel, 2.6.20-1.2300.fc5smp.  Within 24 hours I got the messages
below and a system crash/hang.  If it happens again, I will switch back to 2288,
which did not do this (at least not in the few weeks it was out).

Oddities about this system: Arco IDE DupliDisk1 hardware RAID 1.  Everything
else is pretty standard.

Message from syslogd@firewall at Fri Mar 16 04:35:37 2007 ...
firewall kernel: Bad page state in process 'cat'

Message from syslogd@firewall at Fri Mar 16 04:35:37 2007 ...
firewall kernel: page:c1307ff0 flags:0x40000000 mapping:c1af05c8 mapcount:0
count:0 (Not tainted)

Message from syslogd@firewall at Fri Mar 16 04:35:37 2007 ...
firewall kernel: Trying to fix it up, but a reboot is needed


Comment 12 Hamish Waterer 2007-03-27 02:49:11 UTC
Since upgrading to the FC5 kernels 2.6.20-1.2300.fc5smp and 2.6.20-1.2307.fc5smp
my system periodically hangs, requiring a hardware reset. I find the following
messages in /var/log/messages:

Mar 21 14:29:30 des119 kernel: Bad page state in process 'grep'

Mar 22 15:28:44 des119 kernel: Bad page state in process 'grep'
Mar 22 15:28:44 des119 kernel: page:c1307ff0 flags:0x40000000 mapping:f7ec35c8
mapcount:0 count:0 (Tainted: PF    )
Mar 22 15:28:44 des119 kernel: Trying to fix it up, but a reboot is needed

Mar 24 10:58:08 des119 kernel: Bad page state in process 'apt-get'

Mar 26 12:42:59 des119 kernel: Bad page state in process 'apt-cache'
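
For anyone wanting to check their own logs for the same signature, here is a rough sketch that counts "Bad page state" reports per process. It is not part of the original report: BAD_PAGE_RE and count_bad_page_reports are illustrative names, and the pattern only assumes the two message shapes seen in this bug ("Bad page state in process 'X'" and the older "Bad page state at ... (in process 'X', ...)").

```python
import re
from collections import Counter

# Hypothetical pattern covering both message shapes quoted in this bug.
BAD_PAGE_RE = re.compile(r"Bad page state.*?in process '([^']+)'")


def count_bad_page_reports(lines):
    """Count 'Bad page state' messages per process name."""
    counts = Counter()
    for line in lines:
        match = BAD_PAGE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The four reports listed in this comment.
sample = [
    "Mar 21 14:29:30 des119 kernel: Bad page state in process 'grep'",
    "Mar 22 15:28:44 des119 kernel: Bad page state in process 'grep'",
    "Mar 24 10:58:08 des119 kernel: Bad page state in process 'apt-get'",
    "Mar 26 12:42:59 des119 kernel: Bad page state in process 'apt-cache'",
]

counts = count_bad_page_reports(sample)
for process, n in counts.most_common():
    print(process, n)
```

To scan a real log, pass an open file instead of the sample list, for example `count_bad_page_reports(open("/var/log/messages"))`. Reading that file usually requires root.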


Comment 13 Trevor Cordes 2007-03-27 03:41:16 UTC
Hamish, you got this behaviour for sure in 2307?  The box this happened on for
me is back to 2288 and 100% stable.  2300 crashed 3 times before I gave up.  I
was hoping 2307 solved it.  2.6.20.* seems very buggy so far (this and other
problems).

Comment 14 Angelyn W. Moore 2007-04-27 00:10:19 UTC
I believe I have had this same problem with 2.6.19-1.2288.fc5 and
2.6.20-1.2312.fc5 but NOT with 2.6.19-1.2288.2.4.fc5.  The machine never hung
during the month 2.6.19-1.2288.2.4.fc5 was booted, but hung frequently while
running 2.6.19-1.2288.fc5 or 2.6.20-1.2312.fc5.  Most of the time it did not log
anything useful, but last night it finally gave me
"kernel: Bad page state in process 'apple2'" when it hung running 2.6.20-1.2312.fc5.


Comment 15 Dave Allan 2007-05-31 02:06:24 UTC
I started seeing this problem with 2.6.20-1.2316.fc5smp.  I had forgotten to turn
on yum nightly updates, so I went directly from 2.6.19-1.2288.2.4.fc5smp to
2.6.20-1.2316.fc5smp.  I have reverted to 2.6.19-1.2288.2.4.fc5smp, which seems
to have been stable for me.

Comment 16 Dave Allan 2007-05-31 02:09:10 UTC
I should add that based on the thread above, I would hazard a guess that we're
looking at a new bug, not the one against which this ticket was originally opened.

Comment 17 Jon Stanley 2008-01-20 04:42:00 UTC
(this is a mass-close to kernel bugs in NEEDINFO state)

As indicated previously, there has been no update on the progress of this bug;
therefore I am closing it as INSUFFICIENT_DATA. Please re-open if the issue
still occurs for you and I will try to assist in its resolution. Thank you for
taking the time to report the initial bug.

If you believe that this bug was closed in error, please feel free to reopen
this bug.