Bug 839700

Summary: kernel: bad_page
Product: [Fedora] Fedora Reporter: Alon Levy <alevy>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17 CC: dblechte, gansalmon, itamar, jforbes, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-01-08 15:48:05 UTC Type: Bug

Description Alon Levy 2012-07-12 15:15:05 UTC
Description of problem:
I get multiple bad_page reports on an untainted kernel-3.4.4-5.fc17.x86_64.

Version-Release number of selected component (if applicable):
kernel-3.4.4-5.fc17.x86_64

How reproducible:
Can't reliably reproduce. However, I've only been running this kernel for a day, and it has already appeared a number of times (60) in this boot session.

Additional Info:

I was going to check for the "page allocation failure" and then noticed these. They first appeared around the time of a segfault in unoconv (the LibreOffice command-line utility); I'm not sure if it's related. I also have a "page allocation failure" stack trace, but since it comes after the bad_page taint I'm not sure it's worth reporting.

Jul 12 00:59:59 garlic kernel: [ 6780.694644] Pid: 5311, comm: pool Not tainted 3.4.4-5.fc17.x86_64 #1
Jul 12 00:59:59 garlic kernel: [ 6780.694646] Call Trace:
Jul 12 00:59:59 garlic kernel: [ 6780.694655]  [<ffffffff815ebc8b>] bad_page+0xe6/0xfb
Jul 12 00:59:59 garlic kernel: [ 6780.694661]  [<ffffffff8112904e>] get_page_from_freelist+0x74e/0x8f0
Jul 12 00:59:59 garlic kernel: [ 6780.694665]  [<ffffffff8112939d>] __alloc_pages_nodemask+0x1ad/0x950
Jul 12 00:59:59 garlic kernel: [ 6780.694671]  [<ffffffff81161ca0>] alloc_pages_current+0xb0/0x120
Jul 12 00:59:59 garlic kernel: [ 6780.694677]  [<ffffffff811206d7>] __page_cache_alloc+0xb7/0xf0
Jul 12 00:59:59 garlic kernel: [ 6780.694682]  [<ffffffff8112c272>] __do_page_cache_readahead+0xf2/0x240
Jul 12 00:59:59 garlic kernel: [ 6780.694685]  [<ffffffff81120330>] ? sleep_on_page+0x20/0x20
Jul 12 00:59:59 garlic kernel: [ 6780.694688]  [<ffffffff8112c6e1>] ra_submit+0x21/0x30
Jul 12 00:59:59 garlic kernel: [ 6780.694691]  [<ffffffff8112c805>] ondemand_readahead+0x115/0x240
Jul 12 00:59:59 garlic kernel: [ 6780.694694]  [<ffffffff8112c9b0>] page_cache_async_readahead+0x80/0xa0
Jul 12 00:59:59 garlic kernel: [ 6780.694696]  [<ffffffff81122743>] generic_file_aio_read+0x543/0x720
Jul 12 00:59:59 garlic kernel: [ 6780.694733]  [<ffffffffa0156875>] xfs_file_aio_read+0x155/0x320 [xfs]
Jul 12 00:59:59 garlic kernel: [ 6780.694738]  [<ffffffff81180a0e>] do_sync_read+0xde/0x120
Jul 12 00:59:59 garlic kernel: [ 6780.694743]  [<ffffffff81269dd2>] ? security_file_permission+0x92/0xb0
Jul 12 00:59:59 garlic kernel: [ 6780.694746]  [<ffffffff81180eb1>] ? rw_verify_area+0x61/0xf0
Jul 12 00:59:59 garlic kernel: [ 6780.694748]  [<ffffffff81181349>] vfs_read+0xa9/0x180
Jul 12 00:59:59 garlic kernel: [ 6780.694751]  [<ffffffff8118146a>] sys_read+0x4a/0x90
Jul 12 00:59:59 garlic kernel: [ 6780.694756]  [<ffffffff815fc9a9>] system_call_fastpath+0x16/0x1b
Jul 12 00:59:59 garlic kernel: [ 6780.694758] Disabling lock debugging due to kernel taint
Jul 12 00:59:59 garlic kernel: [ 6780.696384] BUG: Bad page state in process pool  pfn:3b5ed
Jul 12 00:59:59 garlic kernel: [ 6780.696388] page:ffffea0000ed7b40 count:0 mapcount:0 mapping:          (null) index:0x7fd08b029
Jul 12 00:59:59 garlic kernel: [ 6780.696390] page flags: 0x20000000000014(referenced|dirty)
Jul 12 00:59:59 garlic kernel: [ 6780.696392] Modules linked in: hidp fuse ebtable_nat ebtables rfcomm bnep be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 ipt_MASQUERADE cxgb3i iptable_nat nf_nat cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad xt_CHECKSUM ib_core iptable_mangle iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack xts gf128mul dm_crypt arc4 snd_hda_codec_hdmi uvcvideo snd_hda_codec_conexant videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media btusb bluetooth coretemp microcode iwlwifi mac80211 cfg80211 rfkill snd_hda_intel intel_ips i2c_i801 snd_hda_codec snd_hwdep snd_pcm iTCO_wdt snd_page_alloc iTCO_vendor_support snd_timer snd soundcore e1000e binfmt_misc nfsd vhost_net nfs_acl auth_rpcgss tun lockd macvtap macvlan sunrpc kvm_intel kvm uinput xfs crc32c_intel ghash_clmulni_intel firewire_ohci sdhci_pci sdhci firewi
Jul 12 00:59:59 garlic kernel: re_core mmc_core crc_itu_t mxm_wmi wmi i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
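
For anyone triaging a log like the one above, the following is a minimal sketch (not part of the original report) that pulls the process name, pfn and page flags out of each "BUG: Bad page state" report and decodes the low flag bits. The function names and the bit-to-name table are assumptions for illustration; the bit positions follow the usual order of the kernel's page flag enum in 3.x-era kernels and should be checked against the source of the kernel in question. Higher bits (zone/node/section encoding) are deliberately ignored.

```python
import re

# Assumed bit positions for the low page flags (hypothetical table for
# illustration; verify against the page-flags header of the running kernel).
PAGE_FLAG_BITS = {
    0: "locked",
    1: "error",
    2: "referenced",
    3: "uptodate",
    4: "dirty",
    5: "lru",
    6: "active",
}

BAD_PAGE_RE = re.compile(r"BUG: Bad page state in process (\S+)\s+pfn:([0-9a-f]+)")
FLAGS_RE = re.compile(r"page flags: (0x[0-9a-f]+)")


def decode_low_flags(value):
    """Return the names of the known low flag bits set in value."""
    return [name for bit, name in sorted(PAGE_FLAG_BITS.items()) if value & (1 << bit)]


def parse_bad_pages(log_text):
    """Collect one dict per 'Bad page state' report found in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = BAD_PAGE_RE.search(line)
        if match:
            reports.append({"comm": match.group(1),
                            "pfn": int(match.group(2), 16),
                            "flags": None})
            continue
        match = FLAGS_RE.search(line)
        if match and reports:
            # Attach the flags line to the most recent report.
            reports[-1]["flags"] = int(match.group(1), 16)
    return reports


# Two lines copied from the log in this report.
SAMPLE = (
    "Jul 12 00:59:59 garlic kernel: [ 6780.696384] BUG: Bad page state in process pool  pfn:3b5ed\n"
    "Jul 12 00:59:59 garlic kernel: [ 6780.696390] page flags: 0x20000000000014(referenced|dirty)\n"
)

for report in parse_bad_pages(SAMPLE):
    print(report["comm"], hex(report["pfn"]), decode_low_flags(report["flags"]))
```

On the two sample lines, the decoded names agree with the `(referenced|dirty)` annotation the kernel itself printed, which is a quick sanity check on the assumed bit table.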

Comment 1 Dave Jones 2012-07-12 15:54:10 UTC
Just to rule it out, can you give memtest86 a run for a while?

Comment 2 Alon Levy 2012-07-22 10:45:34 UTC
(In reply to comment #1)
> Just to rule it out, can you give memtest86 a run for a while?

Ran it for 30 minutes, no problems.

Comment 3 Justin M. Forbes 2012-09-11 14:51:02 UTC
Are you still having an issue with this using newer 3.5.3 kernels?

Comment 4 Alon Levy 2012-09-11 15:34:28 UTC
x86_64 garlic:~ alon$ dmesg | grep "Bad page" | wc -l
0

Looks like none. I don't remember seeing it. Just note that
a) I'm running 3.6.0-0.rc4.git2.3.fc18.x86_64
b) it's a release build I created from the latest kernel fedpkg of above version.
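
The `dmesg | grep "Bad page" | wc -l` check above can also be sketched in Python for saved logs. This is an illustrative equivalent only; the function name `count_bad_page_lines` and the sample text are hypothetical, and the text passed in is assumed to be the captured output of `dmesg` or a syslog file.

```python
def count_bad_page_lines(log_text, needle="Bad page"):
    """Count lines containing needle, like `grep "Bad page" | wc -l`."""
    return sum(1 for line in log_text.splitlines() if needle in line)


# Hypothetical sample: two matching lines and one unrelated line.
sample_log = (
    "[ 6780.696384] BUG: Bad page state in process pool  pfn:3b5ed\n"
    "[ 6780.696392] Modules linked in: hidp fuse\n"
    "[ 6781.000000] BUG: Bad page state in process pool  pfn:3b5ee\n"
)

print(count_bad_page_lines(sample_log))
```

As with the shell pipeline, this counts matching lines rather than distinct pages, so repeated reports for the same pfn are counted separately.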