Bug 1546709

Summary: BUG at mm/khugepaged.c:533
Product: [Fedora] Fedora
Reporter: Jeremy Harris <jeharris>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 27
CC: airlied, ajax, bskeggs, ewk, hdegoede, ichavero, itamar, jarodwilson, jglisse, john.j5live, jonathan, josef, kernel-maint, labbott, linville, mchehab, mjg59, rheron, steved, vtgoal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-4.15.10-300.fc27
Last Closed: 2018-03-20 18:21:55 UTC
Type: Bug

Description Jeremy Harris 2018-02-19 11:39:29 UTC
Description of problem:

 Occasional process hang; messages broadcast to ttys via syslog:


  Message from syslogd@lap at Feb 19 10:59:10 ...
   kernel:page:fffffac1800a0000 count:513 mapcount:1
    mapping:ffff95657ef359a1 index:0x7f95d3400 compound_mapcount: 0

  Message from syslogd@lap at Feb 19 10:59:10 ...
   kernel:flags:
    0xffffe00048268(uptodate|lru|active|owner_priv_1|head|swapbacked)


Seen to affect firefox and virt-machine-manager at least; once it has occurred, many programs cannot be started afresh (the WM cursor spins for ~20s).

Version-Release number of selected component (if applicable):

  4.15.3-300.fc27.x86_64


How reproducible:

  Seen about 3 times over the last week.


Steps to Reproduce:
1.  Normal system operations (laptop; multiple VMs running)



Additional info:

  /var/log/messages:

Feb 19 10:59:10 lap kernel: page:fffffac1800a0000 count:513 mapcount:1 mapping:ffff95657ef359a1 index:0x7f95d3400 compound_mapcount: 0
Feb 19 10:59:10 lap kernel: flags: 0xffffe00048268(uptodate|lru|active|owner_priv_1|head|swapbacked)
Feb 19 10:59:10 lap kernel: raw: 000ffffe00048268 ffff95657ef359a1 00000007f95d3400 0000020100000000
Feb 19 10:59:10 lap kernel: raw: fffffac18edea9a0 fffffac18e7085a0 00000000000db400 ffff9567a3269800
Feb 19 10:59:10 lap kernel: page dumped because: VM_BUG_ON_PAGE(PageCompound(page))
Feb 19 10:59:10 lap kernel: page->mem_cgroup:ffff9567a3269800
Feb 19 10:59:10 lap kernel: ------------[ cut here ]------------
Feb 19 10:59:10 lap kernel: kernel BUG at mm/khugepaged.c:533!
Feb 19 10:59:10 lap kernel: invalid opcode: 0000 [#1] SMP PTI
Feb 19 10:59:10 lap kernel: Modules linked in: vhost_net vhost tap fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc vfat fat rmi_smbus rmi_core arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi iwlmvm kvm_intel iTCO_wdt mac80211 snd_hda_codec_realtek iTCO_vendor_support mei_wdt kvm snd_hda_codec_generic irqbypass intel_cstate snd_hda_intel intel_uncore iwlwifi uvcvideo intel_rapl_perf
Feb 19 10:59:10 lap kernel: snd_hda_codec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core snd_hda_core cfg80211 videodev snd_hwdep snd_seq snd_seq_device snd_pcm media mei_me snd_timer thinkpad_acpi wmi_bmof rtsx_pci_ms joydev tpm_tis memstick i2c_i801 mei tpm_tis_core snd soundcore intel_pch_thermal tpm shpchp rfkill dm_crypt hid_logitech_hidpp hid_logitech_dj mmc_block nouveau i915 rtsx_pci_sdmmc mmc_core mxm_wmi ttm e1000e i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ptp drm ghash_clmulni_intel serio_raw rtsx_pci pps_core wmi video
Feb 19 10:59:10 lap kernel: CPU: 2 PID: 66 Comm: khugepaged Not tainted 4.15.3-300.fc27.x86_64 #1
Feb 19 10:59:10 lap kernel: Hardware name: LENOVO 20FXS0BB14/20FXS0BB14, BIOS R07ET63W (2.03 ) 03/15/2016
Feb 19 10:59:10 lap kernel: RIP: 0010:khugepaged+0x1af6/0x2130
Feb 19 10:59:10 lap kernel: RSP: 0018:ffffacacc1b4bdc0 EFLAGS: 00010282
Feb 19 10:59:10 lap kernel: RAX: 0000000000000021 RBX: fffffac1800a0000 RCX: 0000000000000006
Feb 19 10:59:10 lap kernel: RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff9567c14968f0
Feb 19 10:59:10 lap kernel: RBP: fffffac18e3a5b40 R08: 00000000000004a8 R09: 0000000000000004
Feb 19 10:59:10 lap kernel: R10: ffffacacc1b4bd70 R11: ffffffffb995b1ed R12: 00007f95f7e00000
Feb 19 10:59:10 lap kernel: R13: ffff95661113eaf0 R14: ffff9567a9ea0000 R15: 8000000002800825
Feb 19 10:59:10 lap kernel: FS:  0000000000000000(0000) GS:ffff9567c1480000(0000) knlGS:0000000000000000
Feb 19 10:59:10 lap kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 19 10:59:10 lap kernel: CR2: 0000000000481046 CR3: 00000002ee20a005 CR4: 00000000003626e0
Feb 19 10:59:10 lap kernel: Call Trace:
Feb 19 10:59:10 lap kernel: ? finish_wait+0x80/0x80
Feb 19 10:59:10 lap kernel: ? collapse_shmem+0xdd0/0xdd0
Feb 19 10:59:10 lap kernel: kthread+0x113/0x130
Feb 19 10:59:10 lap kernel: ? kthread_create_worker_on_cpu+0x70/0x70
Feb 19 10:59:10 lap kernel: ret_from_fork+0x35/0x40
Feb 19 10:59:10 lap kernel: Code: ff e9 e7 fd ff ff bb 07 00 00 00 49 89 c7 e9 20 fb ff ff 48 83 ea 01 e9 66 fc ff ff 48 c7 c6 d8 3f 0a b9 48 89 df e8 0a 82 fa ff <0f> 0b 31 c9 4c 89 fa 48 89 de 4c 89 f7 e8 58 f1 fd ff e9 2e fa 
Feb 19 10:59:10 lap kernel: RIP: khugepaged+0x1af6/0x2130 RSP: ffffacacc1b4bdc0
Feb 19 10:59:10 lap kernel: ---[ end trace a734c2f4d682e3bd ]---
Feb 19 10:59:11 lap abrt-dump-journal-oops[1080]: abrt-dump-journal-oops: Found oopses: 1
Feb 19 10:59:11 lap abrt-dump-journal-oops[1080]: abrt-dump-journal-oops: Creating problem directories
Feb 19 10:59:11 lap abrt-server[17302]: Deleting problem directory oops-2018-02-19-10:59:11-1080-0 (dup of oops-2018-02-13-14:17:42-1042-0)
Feb 19 10:59:12 lap abrt-notification[17309]: System encountered a non-fatal error in finish_wait()
Feb 19 10:59:12 lap abrt-dump-journal-oops[1080]: Reported 1 kernel oopses to Abrt
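
To check whether another machine has hit the same assertion, the log can be scanned for the "kernel BUG at" line. The following is a minimal sketch and not part of the original report; the function name find_kernel_bugs and the sample list are illustrative assumptions, and the pattern only assumes the message format seen in the excerpt above.

```python
import re

# Matches the "kernel BUG at <file>:<line>!" message as it appears in the
# log excerpt above. The format is taken from that excerpt only.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")


def find_kernel_bugs(lines):
    """Return a (source_file, line_number) tuple for each BUG message found.

    find_kernel_bugs is a hypothetical helper name used for illustration.
    """
    hits = []
    for text in lines:
        match = BUG_RE.search(text)
        if match:
            hits.append((match.group("file"), int(match.group("line"))))
    return hits


# Two lines copied from the log in this report.
sample = [
    "Feb 19 10:59:10 lap kernel: ------------[ cut here ]------------",
    "Feb 19 10:59:10 lap kernel: kernel BUG at mm/khugepaged.c:533!",
]
print(find_kernel_bugs(sample))
```

On a live system the same function could be fed the lines of /var/log/messages (or journal output) instead of the sample list; reading that file typically requires root.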

Comment 1 vt 2018-02-22 02:43:35 UTC
*** Bug 1538411 has been marked as a duplicate of this bug. ***

Comment 2 Laura Abbott 2018-02-22 17:31:07 UTC
Upstream gave a fix. Can you try this scratch build when it finishes? https://koji.fedoraproject.org/koji/taskinfo?taskID=25238698

Comment 3 vt 2018-03-13 01:06:10 UTC
I have been running the scratch build kernel (kernel-4.15.4-301.rhbz1546709.fc27.x86_64) for the past 2 weeks and have not seen the original kernel call trace.

Comment 4 Fedora Update System 2018-03-16 00:35:47 UTC
kernel-4.15.10-300.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-959aac67a3

Comment 5 Fedora Update System 2018-03-16 00:37:13 UTC
kernel-4.15.10-200.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-296bf0c332

Comment 6 Fedora Update System 2018-03-16 17:25:11 UTC
kernel-4.15.10-200.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-296bf0c332

Comment 7 Fedora Update System 2018-03-16 17:56:08 UTC
kernel-4.15.10-300.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-959aac67a3

Comment 8 Fedora Update System 2018-03-20 17:35:37 UTC
kernel-4.15.10-200.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 9 Fedora Update System 2018-03-20 18:21:55 UTC
kernel-4.15.10-300.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.