Bug 193269

Summary:	kernel lock ups - Eeek! page_mapcount(page) went negative! (-1)
Product:	[Fedora] Fedora	Reporter:	Matt Olson <redhat>
Component:	kernel	Assignee:	Dave Jones <davej>
Status:	CLOSED NOTABUG	QA Contact:	Brian Brock <bbrock>
Severity:	high	Docs Contact:
Priority:	medium
Version:	5	CC:	pfrields, wtogami
Hardware:	x86_64
OS:	Linux
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2006-05-27 02:56:12 UTC	Type:	---

Description Matt Olson 2006-05-26 18:25:06 UTC

Description of problem:
The system randomly locks up (kernel panic).  I am seeing this with
2.6.16-1.2111_FC5 and 2.6.16-1.2122_FC5, on a laptop that has been a stable
platform for at least 18 months.  Unless a memory module has spontaneously gone
bad, this may be a kernel bug.  I'll pull the binary nvidia and vmware drivers
first; next I'll revert to 2.6.9 and see whether the problem goes away.  There
have been no changes to the system aside from the normal FC5 updates.


Version-Release number of selected component (if applicable):
2.6.16-1.2111_FC5 and 2.6.16-1.2122_FC5

How reproducible:
Random.  Happens 3-4 times per day.


Steps to Reproduce:

Unknown.

Actual results:


Expected results:


Additional info:
Syslog:

May 26 10:56:19 seti kernel: Eeek! page_mapcount(page) went negative! (-1)
May 26 10:56:19 seti kernel:   page->flags = 400
May 26 10:56:19 seti kernel:   page->count = 1
May 26 10:56:19 seti kernel:   page->mapping = 0000000000000000
May 26 10:56:19 seti kernel: ----------- [cut here ] --------- [please bite here ] ---------
May 26 10:56:19 seti kernel: Kernel BUG at mm/rmap.c:560
May 26 10:56:19 seti kernel: invalid opcode: 0000 [1] SMP
May 26 10:56:19 seti kernel: last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
May 26 10:56:19 seti kernel: CPU 0
May 26 10:56:19 seti kernel: Modules linked in: autofs4 vmnet(U) vmmon(U) orinoco hermes ipt_REJECT xt_tcpudp xt_state ip_conntrack nfnetlink iptable_filter ip_tables x_tables dm_mirror dm_mod video button battery ac ipv6 lp parport_pc parport nvram ohci1394 ehci_hcd ieee1394 ohci_hcd snd_intel8x0m snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss nvidia(U) snd_pcm snd_timer 8139cp snd 8139too mii i2c_nforce2 soundcore snd_page_alloc i2c_core ext3 jbd sata_nv libata sd_mod scsi_mod
May 26 10:56:19 seti kernel: Pid: 3175, comm: gij Tainted: P      2.6.16-1.2111_FC5 #1
May 26 10:56:19 seti kernel: RIP: 0010:[<ffffffff8016ed45>] <ffffffff8016ed45>{page_remove_rmap+123}
May 26 10:56:19 seti kernel: RSP: 0018:ffff810015609b98  EFLAGS: 00010286
May 26 10:56:19 seti kernel: RAX: 00000000ffffffff RBX: ffff810001000000 RCX: 0000000000020000
May 26 10:56:19 seti kernel: RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff803c0a30
May 26 10:56:19 seti kernel: RBP: 0000000000000000 R08: 00000000ffffffff R09: ffff8100156098e8
May 26 10:56:19 seti kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 00000000591a1000
May 26 10:56:19 seti kernel: R13: ffff81000ff37d08 R14: 0000000059200000 R15: ffff810002416400
May 26 10:56:19 seti kernel: FS:  000000004e616940(0000) GS:ffffffff8050c000(0000) knlGS:00000000f7fd86c0
May 26 10:56:19 seti kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 26 10:56:19 seti kernel: CR2: 0000003801b76900 CR3: 00000000155bc000 CR4: 00000000000006e0
May 26 10:56:19 seti kernel: Process gij (pid: 3175, threadinfo ffff810015608000, task ffff8100158727e0)
May 26 10:56:19 seti kernel: Stack: ffff810001000000 ffffffff80167647 0000000000000000 ffff810015609c78
May 26 10:56:19 seti kernel:        ffffffffffffffff 0000000000000000 ffff81000e52eb50 ffff810015609c80
May 26 10:56:19 seti kernel:        000000000002b427 0000000000000000
May 26 10:56:19 seti kernel: Call Trace: <ffffffff80167647>{unmap_vmas+1069} <ffffffff8016ae93>{exit_mmap+120}
May 26 10:56:19 seti kernel:        <ffffffff8012fa60>{mmput+37} <ffffffff80188a1c>{flush_old_exec+2438}
May 26 10:56:19 seti kernel:        <ffffffff8017f080>{vfs_read+313} <ffffffff801abb9f>{load_elf_binary+0}
May 26 10:56:19 seti kernel:        <ffffffff801abfe0>{load_elf_binary+1089} <ffffffff80202e12>{find_next_bit+89}
May 26 10:56:19 seti kernel:        <ffffffff8017e6ae>{do_sync_read+199} <ffffffff80160381>{__alloc_pages+113}
May 26 10:56:19 seti kernel:        <ffffffff801abb9f>{load_elf_binary+0} <ffffffff80187af5>{search_binary_handler+177}
May 26 10:56:19 seti kernel:        <ffffffff80189b11>{do_execve+396} <ffffffff80109464>{sys_execve+54}
May 26 10:56:19 seti kernel:        <ffffffff8010abdb>{stub_execve+103}
May 26 10:56:19 seti kernel:
May 26 10:56:19 seti kernel: Code: 0f 0b 68 64 73 36 80 c2 30 02 48 83 ce ff bf 20 00 00 00 5b
May 26 10:56:19 seti kernel: RIP <ffffffff8016ed45>{page_remove_rmap+123} RSP <ffff810015609b98>
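
Triage note: two fields in the oops above are worth checking before anything
else, the "Tainted:" flags and the modules that carry a parenthesised suffix in
"Modules linked in:".  The following is a minimal Python sketch for pulling
both out of a pasted oops.  The function names (flagged_modules, taint_flags)
are made up for illustration, the sample text is abbreviated from the log
above, and the "(U)" suffix is treated as an opaque, distribution-specific
marker rather than interpreted.  The only meaning assumed is that taint flag
"P" indicates a proprietary-licensed module has been loaded.

```python
import re

# Abbreviated from the oops in this report (syslog prefix stripped,
# module list shortened for readability).
OOPS = """\
Modules linked in: autofs4 vmnet(U) vmmon(U) orinoco hermes nvidia(U) snd_pcm
Pid: 3175, comm: gij Tainted: P      2.6.16-1.2111_FC5 #1
"""


def flagged_modules(oops_text):
    """Return {module: flag} for modules listed with a '(X)' suffix."""
    found = {}
    for line in oops_text.splitlines():
        if "Modules linked in:" not in line:
            continue
        names = line.split("Modules linked in:", 1)[1].split()
        for name in names:
            match = re.fullmatch(r"(\w+)\((\w+)\)", name)
            if match:
                found[match.group(1)] = match.group(2)
    return found


def taint_flags(oops_text):
    """Return the set of taint letters, or an empty set if none are shown."""
    match = re.search(r"Tainted: ([A-Z ]+)", oops_text)
    if not match:
        return set()
    return set(match.group(1).replace(" ", ""))


if __name__ == "__main__":
    print("flagged modules:", flagged_modules(OOPS))
    print("taint flags:", taint_flags(OOPS))
```

Run against the full log, this should single out vmnet, vmmon and nvidia,
which are the modules the reporter planned to pull.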

Comment 1 Dave Jones 2006-05-27 02:56:12 UTC

The only times I've seen this reported recently, those binary modules have been
loaded. I'm convinced this is a problem in those drivers, not the core kernel.

2.6.9 won't exhibit this problem as it has a different VM which lacks this
safety check.
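
For readers landing here from a search: the safety check mentioned above can be
pictured with a small userspace model.  This is only a sketch in Python, not
the kernel's code.  It assumes the 2.6-era convention, as far as I understand
it, that a page's internal _mapcount starts at -1 (meaning "not mapped") and
that page_mapcount() reports _mapcount + 1.  The Page class and the
MapcountUnderflow exception are invented for illustration; in the kernel the
failure is a BUG at mm/rmap.c, as in the log above.

```python
class MapcountUnderflow(Exception):
    """Stand-in for the kernel BUG; hypothetical name for this sketch."""


class Page:
    def __init__(self):
        # Assumed convention: -1 means the page is mapped by no page tables.
        self._mapcount = -1


def page_mapcount(page):
    """Number of mappings as reported to callers (internal count + 1)."""
    return page._mapcount + 1


def page_add_rmap(page):
    """Record one more mapping of the page."""
    page._mapcount += 1


def page_remove_rmap(page):
    """Drop one mapping; complain if more are removed than were added."""
    page._mapcount -= 1
    if page_mapcount(page) < 0:
        raise MapcountUnderflow(
            "Eeek! page_mapcount(page) went negative! (%d)" % page_mapcount(page)
        )


if __name__ == "__main__":
    page = Page()
    page_add_rmap(page)
    page_remove_rmap(page)      # balanced: count is back to 0
    try:
        page_remove_rmap(page)  # one unmap too many
    except MapcountUnderflow as err:
        print(err)
```

In this model the check fires only when something unmaps a page more often
than it was mapped, or modifies the count behind the VM's back.  That fits the
suspicion that an out-of-tree driver is corrupting page state, and a kernel
without the check would leave the same imbalance unreported.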

Comment 2 Matt Olson 2006-06-14 21:25:30 UTC

Thanks Dave.  Just wanted to confirm that I have experienced no problems since
pulling the foreign modules.  Here's some more info for searchability.  I was
running a patched:

NVIDIA-Linux-x86_64-1.0-8178-pkg2.run  using patch
NVIDIA_kernel-1.0-8178-U012206.diff.txt

I installed the new nvidia binary driver this morning:

NVIDIA-Linux-x86_64-1.0-8762-pkg2.run

and it seems to be running fine (for now).

I'm currently running 2.6.16-1.2133_FC5 #1 SMP Tue Jun 6 00:51:53 EDT 2006
x86_64 x86_64 x86_64 GNU/Linux.