Bug 1546709 - BUG at mm/khugepaged.c:533
Summary: BUG at mm/khugepaged.c:533
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1538411
Depends On:
Blocks:
Reported: 2018-02-19 11:39 UTC by Jeremy Harris
Modified: 2018-03-20 18:21 UTC (History)
CC List: 20 users

Fixed In Version: kernel-4.15.10-300.fc27
Clone Of:
Environment:
Last Closed: 2018-03-20 18:21:55 UTC
Type: Bug
Embargoed:



Description Jeremy Harris 2018-02-19 11:39:29 UTC
Description of problem:

 Occasional process hang; messages broadcast to ttys via syslog:


  Message from syslogd@lap at Feb 19 10:59:10 ...
   kernel:page:fffffac1800a0000 count:513 mapcount:1
    mapping:ffff95657ef359a1 index:0x7f95d3400 compound_mapcount: 0

  Message from syslogd@lap at Feb 19 10:59:10 ...
   kernel:flags:
    0xffffe00048268(uptodate|lru|active|owner_priv_1|head|swapbacked)


Seen to affect firefox and virt-machine-manager at least; once it has occurred, many programs cannot be started afresh (the WM cursor spins for ~20s).

Version-Release number of selected component (if applicable):

  4.15.3-300.fc27.x86_64


How reproducible:

  Seen about 3 times over the last week.


Steps to Reproduce:
1.  Normal system operations (laptop; multiple VMs running)



Additional info:

  /var/log/messages:

Feb 19 10:59:10 lap kernel: page:fffffac1800a0000 count:513 mapcount:1 mapping:ffff95657ef359a1 index:0x7f95d3400 compound_mapcount: 0
Feb 19 10:59:10 lap kernel: flags: 0xffffe00048268(uptodate|lru|active|owner_priv_1|head|swapbacked)
Feb 19 10:59:10 lap kernel: raw: 000ffffe00048268 ffff95657ef359a1 00000007f95d3400 0000020100000000
Feb 19 10:59:10 lap kernel: raw: fffffac18edea9a0 fffffac18e7085a0 00000000000db400 ffff9567a3269800
Feb 19 10:59:10 lap kernel: page dumped because: VM_BUG_ON_PAGE(PageCompound(page))
Feb 19 10:59:10 lap kernel: page->mem_cgroup:ffff9567a3269800
Feb 19 10:59:10 lap kernel: ------------[ cut here ]------------
Feb 19 10:59:10 lap kernel: kernel BUG at mm/khugepaged.c:533!
Feb 19 10:59:10 lap kernel: invalid opcode: 0000 [#1] SMP PTI
Feb 19 10:59:10 lap kernel: Modules linked in: vhost_net vhost tap fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc vfat fat rmi_smbus rmi_core arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi iwlmvm kvm_intel iTCO_wdt mac80211 snd_hda_codec_realtek iTCO_vendor_support mei_wdt kvm snd_hda_codec_generic irqbypass intel_cstate snd_hda_intel intel_uncore iwlwifi uvcvideo intel_rapl_perf
Feb 19 10:59:10 lap kernel: snd_hda_codec videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core snd_hda_core cfg80211 videodev snd_hwdep snd_seq snd_seq_device snd_pcm media mei_me snd_timer thinkpad_acpi wmi_bmof rtsx_pci_ms joydev tpm_tis memstick i2c_i801 mei tpm_tis_core snd soundcore intel_pch_thermal tpm shpchp rfkill dm_crypt hid_logitech_hidpp hid_logitech_dj mmc_block nouveau i915 rtsx_pci_sdmmc mmc_core mxm_wmi ttm e1000e i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ptp drm ghash_clmulni_intel serio_raw rtsx_pci pps_core wmi video
Feb 19 10:59:10 lap kernel: CPU: 2 PID: 66 Comm: khugepaged Not tainted 4.15.3-300.fc27.x86_64 #1
Feb 19 10:59:10 lap kernel: Hardware name: LENOVO 20FXS0BB14/20FXS0BB14, BIOS R07ET63W (2.03 ) 03/15/2016
Feb 19 10:59:10 lap kernel: RIP: 0010:khugepaged+0x1af6/0x2130
Feb 19 10:59:10 lap kernel: RSP: 0018:ffffacacc1b4bdc0 EFLAGS: 00010282
Feb 19 10:59:10 lap kernel: RAX: 0000000000000021 RBX: fffffac1800a0000 RCX: 0000000000000006
Feb 19 10:59:10 lap kernel: RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff9567c14968f0
Feb 19 10:59:10 lap kernel: RBP: fffffac18e3a5b40 R08: 00000000000004a8 R09: 0000000000000004
Feb 19 10:59:10 lap kernel: R10: ffffacacc1b4bd70 R11: ffffffffb995b1ed R12: 00007f95f7e00000
Feb 19 10:59:10 lap kernel: R13: ffff95661113eaf0 R14: ffff9567a9ea0000 R15: 8000000002800825
Feb 19 10:59:10 lap kernel: FS:  0000000000000000(0000) GS:ffff9567c1480000(0000) knlGS:0000000000000000
Feb 19 10:59:10 lap kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 19 10:59:10 lap kernel: CR2: 0000000000481046 CR3: 00000002ee20a005 CR4: 00000000003626e0
Feb 19 10:59:10 lap kernel: Call Trace:
Feb 19 10:59:10 lap kernel: ? finish_wait+0x80/0x80
Feb 19 10:59:10 lap kernel: ? collapse_shmem+0xdd0/0xdd0
Feb 19 10:59:10 lap kernel: kthread+0x113/0x130
Feb 19 10:59:10 lap kernel: ? kthread_create_worker_on_cpu+0x70/0x70
Feb 19 10:59:10 lap kernel: ret_from_fork+0x35/0x40
Feb 19 10:59:10 lap kernel: Code: ff e9 e7 fd ff ff bb 07 00 00 00 49 89 c7 e9 20 fb ff ff 48 83 ea 01 e9 66 fc ff ff 48 c7 c6 d8 3f 0a b9 48 89 df e8 0a 82 fa ff <0f> 0b 31 c9 4c 89 fa 48 89 de 4c 89 f7 e8 58 f1 fd ff e9 2e fa 
Feb 19 10:59:10 lap kernel: RIP: khugepaged+0x1af6/0x2130 RSP: ffffacacc1b4bdc0
Feb 19 10:59:10 lap kernel: ---[ end trace a734c2f4d682e3bd ]---
Feb 19 10:59:11 lap abrt-dump-journal-oops[1080]: abrt-dump-journal-oops: Found oopses: 1
Feb 19 10:59:11 lap abrt-dump-journal-oops[1080]: abrt-dump-journal-oops: Creating problem directories
Feb 19 10:59:11 lap abrt-server[17302]: Deleting problem directory oops-2018-02-19-10:59:11-1080-0 (dup of oops-2018-02-13-14:17:42-1042-0)
Feb 19 10:59:12 lap abrt-notification[17309]: System encountered a non-fatal error in finish_wait()
Feb 19 10:59:12 lap abrt-dump-journal-oops[1080]: Reported 1 kernel oopses to Abrt

Comment 1 vt 2018-02-22 02:43:35 UTC
*** Bug 1538411 has been marked as a duplicate of this bug. ***

Comment 2 Laura Abbott 2018-02-22 17:31:07 UTC
Upstream provided a fix. Can you try this scratch build when it finishes? https://koji.fedoraproject.org/koji/taskinfo?taskID=25238698

Comment 3 vt 2018-03-13 01:06:10 UTC
I have been running the scratch build kernel (kernel-4.15.4-301.rhbz1546709.fc27.x86_64) for the past 2 weeks and have not seen the original kernel call trace.

Comment 4 Fedora Update System 2018-03-16 00:35:47 UTC
kernel-4.15.10-300.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-959aac67a3

Comment 5 Fedora Update System 2018-03-16 00:37:13 UTC
kernel-4.15.10-200.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-296bf0c332

Comment 6 Fedora Update System 2018-03-16 17:25:11 UTC
kernel-4.15.10-200.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-296bf0c332

Comment 7 Fedora Update System 2018-03-16 17:56:08 UTC
kernel-4.15.10-300.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-959aac67a3

Comment 8 Fedora Update System 2018-03-20 17:35:37 UTC
kernel-4.15.10-200.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 9 Fedora Update System 2018-03-20 18:21:55 UTC
kernel-4.15.10-300.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
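
To check whether a given kernel release string is at or beyond the fixed builds named in this bug, something like the following Python sketch could be used. It is only an illustration: the helper names are hypothetical, it compares just the leading numeric components rather than implementing full RPM version comparison, and it does not recognise scratch-build names such as the rhbz1546709 build from comment 3.

```python
import platform
import re

# Fixed builds as stated in this bug report, keyed by Fedora dist tag.
FIXED_BUILDS = {
    "fc27": "4.15.10-300.fc27",
    "fc26": "4.15.10-200.fc26",
}

# Matches e.g. "4.15.3-300.fc27.x86_64" -> ("4", "15", "3", "300", "fc27")
RELEASE_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(fc\d+)")


def parse_release(release):
    """Split a Fedora kernel release string into (numeric tuple, dist tag)."""
    match = RELEASE_RE.match(release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    numbers = tuple(int(part) for part in match.groups()[:4])
    return numbers, match.group(5)


def has_fix(release):
    """Return True if release is at or beyond the fixed build for its dist tag."""
    numbers, dist = parse_release(release)
    if dist not in FIXED_BUILDS:
        raise ValueError("no fixed build recorded for %s" % dist)
    fixed_numbers, _ = parse_release(FIXED_BUILDS[dist])
    return numbers >= fixed_numbers


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    try:
        print(running, "has fix:", has_fix(running))
    except ValueError as err:
        print("cannot decide:", err)
```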

