Bug 748546

Summary: [abrt] kernel: [679635.081940] BUG: Bad page state in process baobab pfn:12715b: TAINTED P----B
Product: [Fedora] Fedora Reporter: joshua
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 15 CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: abrt_hash:e0e611ad311c1e12155e86928c6ddcfd98e7f861
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-11-29 19:53:02 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description joshua 2011-10-24 18:54:40 UTC
abrt version: 2.0.3
architecture:   x86_64
cmdline:        ro root=/dev/mapper/VolGrp00-f15_root rd_LVM_LV=VolGrp00/f15_root rd_LUKS_UUID=luks-8b11dfff-048a-4f4d-8b42-839d17404a06 rd_LVM_LV=VolGrp00/swap rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet
component:      kernel
kernel:         2.6.40.6-0.fc15.x86_64
kernel_tainted: 33
os_release:     Fedora release 15 (Lovelock)
package:        kernel
reason:         [679635.081940] BUG: Bad page state in process baobab  pfn:12715b
time:           Fri Oct 21 12:14:54 2011

backtrace:
:[679635.081940] BUG: Bad page state in process baobab  pfn:12715b
:[679635.081947] page:ffffea000408cbe8 count:0 mapcount:0 mapping:          (null) index:0x6695
:[679635.081949] page flags: 0x40000000000004(referenced)
:[679635.081956] Pid: 4195, comm: baobab Tainted: P    B       2.6.40.6-0.fc15.x86_64 #1
:[679635.081958] Call Trace:
:[679635.081971]  [<ffffffff810e158f>] ? dump_page+0xbe/0xc3
:[679635.081975]  [<ffffffff810e1684>] bad_page+0xf0/0x106
:[679635.081979]  [<ffffffff810e2571>] get_page_from_freelist+0x4b3/0x638
:[679635.081984]  [<ffffffff81041345>] ? should_resched+0xe/0x2d
:[679635.081988]  [<ffffffff810e29be>] __alloc_pages_nodemask+0x15b/0x756
:[679635.081996]  [<ffffffff8110fa8b>] alloc_pages_vma+0xf5/0xfa
:[679635.081999]  [<ffffffff810f9169>] handle_pte_fault+0x16f/0x7ab
:[679635.082002]  [<ffffffff81041345>] ? should_resched+0xe/0x2d
:[679635.082007]  [<ffffffff810f6717>] ? pmd_offset+0x19/0x3f
:[679635.082010]  [<ffffffff810f9b1d>] handle_mm_fault+0x1c8/0x1db
:[679635.082015]  [<ffffffff8148b8c0>] do_page_fault+0x354/0x39b
:[679635.082032]  [<ffffffffa04e083f>] ? hfsplus_lookup+0x228/0x271 [hfsplus]
:[679635.082036]  [<ffffffff81041345>] ? should_resched+0xe/0x2d
:[679635.082040]  [<ffffffff81486eb5>] ? _cond_resched+0xe/0x22
:[679635.082046]  [<ffffffffa04e1156>] ? kmap+0x26/0x5a [hfsplus]
:[679635.082049]  [<ffffffff81041345>] ? should_resched+0xe/0x2d
:[679635.082052]  [<ffffffff81486eb5>] ? _cond_resched+0xe/0x22
:[679635.082057]  [<ffffffff811351bb>] ? might_fault+0x23/0x23
:[679635.082060]  [<ffffffff81488c55>] page_fault+0x25/0x30
:[679635.082063]  [<ffffffff811351bb>] ? might_fault+0x23/0x23
:[679635.082066]  [<ffffffff81135214>] ? filldir+0x59/0xc7
:[679635.082072]  [<ffffffffa04e11dc>] ? hfsplus_bnode_read+0x52/0xa2 [hfsplus]
:[679635.082077]  [<ffffffffa04dfd3e>] hfsplus_readdir+0x15f/0x3a9 [hfsplus]
:[679635.082083]  [<ffffffff8111f5e8>] ? signal_pending+0x17/0x21
:[679635.082086]  [<ffffffff8111f604>] ? fatal_signal_pending+0x12/0x29
:[679635.082089]  [<ffffffff81122057>] ? __mem_cgroup_try_charge+0x104/0x408
:[679635.082095]  [<ffffffff81454dab>] ? unix_stream_recvmsg+0x516/0x536
:[679635.082099]  [<ffffffff8112064b>] ? __mem_cgroup_commit_charge+0x9b/0xa6
:[679635.082102]  [<ffffffff81122856>] ? mem_cgroup_charge_common+0xb1/0xc3
:[679635.082106]  [<ffffffff810e57b5>] ? __lru_cache_add+0x34/0x5b
:[679635.082108]  [<ffffffff810e59d3>] ? lru_cache_add_lru+0x3e/0x40
:[679635.082113]  [<ffffffff811018b3>] ? page_add_new_anon_rmap+0x71/0x84
:[679635.082116]  [<ffffffff810f65ba>] ? set_pte_at+0xe/0x12
:[679635.082119]  [<ffffffff810f9242>] ? handle_pte_fault+0x248/0x7ab
:[679635.082122]  [<ffffffff810f6717>] ? pmd_offset+0x19/0x3f
:[679635.082125]  [<ffffffff810f9b1d>] ? handle_mm_fault+0x1c8/0x1db
:[679635.082129]  [<ffffffff811351bb>] ? might_fault+0x23/0x23
:[679635.082132]  [<ffffffff811351bb>] ? might_fault+0x23/0x23
:[679635.082135]  [<ffffffff8113547a>] vfs_readdir+0x76/0xac
:[679635.082138]  [<ffffffff81135596>] sys_getdents+0x7e/0xce
:[679635.082142]  [<ffffffff8148ed02>] system_call_fastpath+0x16/0x1b

comment:
:sudo yum -y install luci
:sudo service luci start
:crash!!!

kernel_tainted_long:
:Proprietary module has been loaded.
:System has hit bad_page.
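
The kernel_tainted value of 33 above is a bitmask, and the two lines under kernel_tainted_long are its decoded form. A minimal sketch of that decoding follows, assuming the usual kernel taint bit numbering; the descriptions for bits 0 and 5 are copied from this report, the others are paraphrased and only included for context, and the helper name decode_taint is hypothetical.

```python
# Taint bit -> (letter, description). Only the lowest six bits are listed.
# Bits 0 and 5 use the wording from this report; the rest are paraphrased.
TAINT_FLAGS = {
    0: ("P", "Proprietary module has been loaded."),
    1: ("F", "Module has been forcibly loaded."),
    2: ("S", "SMP with CPUs not designed for SMP."),
    3: ("R", "User forced a module unload."),
    4: ("M", "System experienced a machine check exception."),
    5: ("B", "System has hit bad_page."),
}


def decode_taint(value):
    """Return the (letter, description) pairs whose bit is set in value."""
    return [TAINT_FLAGS[bit] for bit in sorted(TAINT_FLAGS) if value & (1 << bit)]


# 33 == 1 + 32, i.e. bits 0 and 5.
for letter, meaning in decode_taint(33):
    print(letter, meaning)
```

This lines up with the "Tainted: P    B" string in the oops header: P is the proprietary-module bit and B is the bad-page bit.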

Comment 1 joshua 2011-10-24 19:26:47 UTC
This coincides with the following SELinux denial; I'm not sure whether it is related:

Raw Audit Messages
type=AVC msg=audit(1319482070.930:685): avc:  denied  { write } for  pid=21425 comm="paster" name="etc" dev=dm-2 ino=1019341 scontext=system_u:system_r:piranha_web_t:s0 tcontext=system_u:object_r:piranha_web_conf_t:s0 tclass=dir


type=SYSCALL msg=audit(1319482070.930:685): arch=x86_64 syscall=open success=no exit=EACCES a0=22cb0b0 a1=241 a2=1b6 a3=9 items=0 ppid=21404 pid=21425 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=paster exe=/usr/bin/python subj=system_u:system_r:piranha_web_t:s0 key=(null)

Hash: paster,piranha_web_t,piranha_web_conf_t,dir,write

Comment 2 Dave Jones 2011-10-24 19:39:10 UTC
Unless you can reproduce this without the proprietary module, there's not much we can do with this.

Comment 3 joshua 2011-10-24 19:45:47 UTC
What proprietary module?  The only thing that I think I have would be kmod-wl for the Broadcom wifi NIC.

$ rpm -qa | grep kmod
kmod-wl-5.60.48.36-2.fc15.9.x86_64
kmod-wl-2.6.40.6-0.fc15.x86_64-5.60.48.36-2.fc15.9.x86_64

Comment 4 joshua 2011-10-24 19:46:58 UTC
I'm confused, as kmod-wl doesn't seem to be related to this kernel dump.  hfsplus seems to be the root of this problem, not kmod-wl.
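
The observation above can be checked mechanically by listing which modules are named in the call trace. The following is a rough sketch, assuming the frame layout seen in this report ("[<address>] [?] symbol+0xoff/0xlen [module]"); the function name modules_in_trace and the regular expression are illustrative, not part of abrt or the kernel. Frames marked with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain, so they are reported as unreliable.

```python
import re

# One call-trace frame: address, optional "?" marker, symbol+offset/length,
# and an optional trailing [module] tag for code that lives in a module.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def modules_in_trace(lines):
    """Map module name -> list of (symbol, reliable) for frames in that module."""
    found = {}
    for line in lines:
        match = FRAME_RE.search(line)
        if match and match.group(3):
            reliable = match.group(1) is None  # no "?" in front of the symbol
            found.setdefault(match.group(3), []).append((match.group(2), reliable))
    return found


# A few frames copied from the backtrace in the description.
SAMPLE = [
    ":[679635.082032]  [<ffffffffa04e083f>] ? hfsplus_lookup+0x228/0x271 [hfsplus]",
    ":[679635.082060]  [<ffffffff81488c55>] page_fault+0x25/0x30",
    ":[679635.082072]  [<ffffffffa04e11dc>] ? hfsplus_bnode_read+0x52/0xa2 [hfsplus]",
    ":[679635.082077]  [<ffffffffa04dfd3e>] hfsplus_readdir+0x15f/0x3a9 [hfsplus]",
]

for module, frames in modules_in_trace(SAMPLE).items():
    print(module, frames)
```

On these frames only hfsplus shows up. That says which module was executing when the bad page was detected, not which code corrupted the page in the first place.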

Comment 5 Dave Jones 2011-10-24 21:14:48 UTC
:[679635.081949] page flags: 0x40000000000004(referenced)

Something has set that high bit, which is why the kernel is freaking out.
It looks like memory corruption of some kind.  The fact that hfsplus blew up may just be a side effect: it happened to be the module that owned the memory those page tables pointed to.

That is why we want to rule out the modules we don't ship.
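
To make the "high bit" remark concrete, the page flags value from the oops can be split into its individual set bits. This is only a sketch: the helper set_bits is hypothetical, and the reading of bit 2 as "referenced" is an assumption based on the annotation the kernel printed next to the value.

```python
def set_bits(value):
    """Return the positions of all bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Value copied from the "page flags:" line of the oops.
flags = 0x40000000000004

bits = set_bits(flags)
print("set bits:", bits)
print("lowest bit:", bits[0], "highest bit:", bits[-1])
```

The value has exactly two bits set: a low one, which the kernel names as referenced, and a single bit far up in the word that the kernel did not name in its output.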