Bug 1635060 - kernel BUG at mm/page_alloc.c:2019!
Summary: kernel BUG at mm/page_alloc.c:2019!
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-01 23:00 UTC by Ken Booth
Modified: 2018-10-22 15:11 UTC
CC List: 33 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1598989
Environment:
Last Closed: 2018-10-22 15:11:18 UTC
Type: Bug
Embargoed:



Description Ken Booth 2018-10-01 23:00:48 UTC
+++ This bug was initially created as a clone of Bug #1598989 +++

Description of problem:
While compiling an Android ROM, the kernel suddenly dies with the message in the summary.
Next line is:
invalid opcode: 0000 [#1] SMP PTI

The system halts and I can't do anything except a hard reset.
I uploaded a screenshot (photo) to my server:
https://rpm.jenslody.de/kernel-bug/IMG_20180707_150721.jpg

It does not seem to happen with 4.17.2; I have not yet tested 4.17.3.

Version-Release number of selected component (if applicable):
4.17.4-200.fc28

How reproducible:
It happens when compiling large sources, which needs nearly half of the memory on my laptop (32 GB) and keeps all eight virtual cores at nearly 100%. It happened at least twice, with the same error message, after some time of compiling.
GNOME Shell was not started (only gdm was running).
I also had system halts while compiling without the shell open (and with gnome-shell running), so the same bug may have occurred in those cases too, but I cannot be sure.
A hard reset was also necessary in those cases.

Steps to Reproduce:
See above

Actual results:
Completely halts system

Expected results:
Just work

Additional info:
I also swapped the RAM modules (first bank with second), but the error still occurred at the same line in "mm/page_alloc.c".

--- Additional comment from Didier G on 2018-07-10 18:40:08 EDT ---

See https://bugzilla.redhat.com/show_bug.cgi?id=1598462

--- Additional comment from Edgar Hoch on 2018-07-22 13:27:02 EDT ---

The error also occurred on kernel-4.17.6-200.fc28.x86_64. I did nothing special; it happens suddenly.

abrt-cli tells me that the crash is not reportable because it contains too little data.

:kernel BUG at mm/page_alloc.c:2019!
:invalid opcode: 0000 [#1] SMP PTI
:Modules linked in: nfnetlink_queue nfnetlink_log bluetooth ecdh_generic xt_set xt_multiport ip_set_hash_ip rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack devlink ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables toshiba_acpi sparse_keymap industrialio rfkill video toshiba_haps hp_accel lis3lv02d input_polldev joydev intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
: kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore ipmi_ssif intel_rapl_perf ses enclosure scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me lpc_ich mei ioatdma i2c_i801 wmi shpchp nfsd ipmi_si ipmi_devintf auth_rpcgss ipmi_msghandler nfs_acl lockd ecryptfs grace acpi_power_meter vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) sunrpc acpi_pad vboxdrv(OE) binfmt_misc ast i2c_algo_bit drm_kms_helper ttm drm ixgbe crc32c_intel megaraid_sas mdio dca
:CPU: 0 PID: 18128 Comm: dsmc Tainted: G           OE     4.17.6-200.fc28.x86_64 #1
:Hardware name: Supermicro X10DRi/X10DRI-T, BIOS 1.1 04/14/2015
:RIP: 0010:move_freepages_block+0x167/0x2e0
:RSP: 0018:ffffa123c6b1f650 EFLAGS: 00010002
:RAX: ffff8ac87ffd5000 RBX: 0000000000100000 RCX: ffff8ac87ffd5680
:RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000
:RBP: 0000000000000002 R08: 0000000000000000 R09: ffff8ac87ffd5888
:R10: ffffffff95e28844 R11: ffff8ad87ffd0000 R12: ffffda8d01e6b200
:R13: ffff8ac87ffd5680 R14: ffffa123c6b1f6b4 R15: ffffda8d01e68000
:FS:  00007f56affff700(0000) GS:ffff8ac83f800000(0000) knlGS:0000000000000000
:CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
:CR2: 00007f55d23cb000 CR3: 00000007b0ec0001 CR4: 00000000001606f0
:Call Trace:



From journalctl:

Jul 22 18:41:47 kernel: ------------[ cut here ]------------
Jul 22 18:41:47 kernel: kernel BUG at mm/page_alloc.c:2019!
Jul 22 18:41:47 kernel: invalid opcode: 0000 [#1] SMP PTI
Jul 22 18:41:47 kernel: Modules linked in: nfnetlink_queue nfnetlink_log bluetooth ecdh_generic xt_set xt_multiport ip_set_hash_ip rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_>
Jul 22 18:41:47 kernel:  kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore ipmi_ssif intel_rapl_perf ses enclosure scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me lpc_ich mei ioatdma i2c_i80>
Jul 22 18:41:47 kernel: CPU: 0 PID: 18128 Comm: dsmc Tainted: G           OE     4.17.6-200.fc28.x86_64 #1
Jul 22 18:41:47 kernel: Hardware name: Supermicro X10DRi/X10DRI-T, BIOS 1.1 04/14/2015
Jul 22 18:41:47 kernel: RIP: 0010:move_freepages_block+0x167/0x2e0
Jul 22 18:41:47 kernel: RSP: 0018:ffffa123c6b1f650 EFLAGS: 00010002
Jul 22 18:41:47 kernel: RAX: ffff8ac87ffd5000 RBX: 0000000000100000 RCX: ffff8ac87ffd5680
Jul 22 18:41:47 kernel: RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000
Jul 22 18:41:47 kernel: RBP: 0000000000000002 R08: 0000000000000000 R09: ffff8ac87ffd5888
Jul 22 18:41:47 kernel: R10: ffffffff95e28844 R11: ffff8ad87ffd0000 R12: ffffda8d01e6b200
Jul 22 18:41:47 kernel: R13: ffff8ac87ffd5680 R14: ffffa123c6b1f6b4 R15: ffffda8d01e68000
Jul 22 18:41:47 kernel: FS:  00007f56affff700(0000) GS:ffff8ac83f800000(0000) knlGS:0000000000000000
Jul 22 18:41:47 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 22 18:41:47 kernel: CR2: 00007f55d23cb000 CR3: 00000007b0ec0001 CR4: 00000000001606f0
Jul 22 18:41:47 kernel: Call Trace:
Jul 22 18:41:47 kernel:  steal_suitable_fallback+0x119/0x1b0
Jul 22 18:41:47 kernel:  get_page_from_freelist+0xfb8/0x16a0
Jul 22 18:41:47 kernel:  __alloc_pages_slowpath+0x176/0xd20
Jul 22 18:41:47 kernel:  ? ext4_es_lookup_extent+0x24/0x1a0
Jul 22 18:41:47 kernel:  ? __check_block_validity.constprop.83+0x28/0x70
Jul 22 18:41:47 kernel:  __alloc_pages_nodemask+0x28e/0x2b0
Jul 22 18:41:47 kernel:  new_slab+0x293/0x740
Jul 22 18:41:47 kernel:  ___slab_alloc+0x3b4/0x550
Jul 22 18:41:47 kernel:  ? ext4_alloc_inode+0x17/0x160
Jul 22 18:41:47 kernel:  ? ext4_alloc_inode+0x17/0x160
Jul 22 18:41:47 kernel:  __slab_alloc+0x1c/0x30
Jul 22 18:41:47 kernel:  kmem_cache_alloc+0x19d/0x1d0
Jul 22 18:41:47 kernel:  ext4_alloc_inode+0x17/0x160
Jul 22 18:41:47 kernel:  alloc_inode+0x1b/0x80
Jul 22 18:41:47 kernel:  iget_locked+0xd2/0x180
Jul 22 18:41:47 kernel:  ext4_iget+0x3c/0xbc0
Jul 22 18:41:47 kernel:  ? d_alloc_parallel+0x9d/0x490
Jul 22 18:41:47 kernel:  ext4_lookup+0x109/0x200
Jul 22 18:41:47 kernel:  __lookup_slow+0x97/0x150
Jul 22 18:41:47 kernel:  lookup_slow+0x35/0x50
Jul 22 18:41:47 kernel:  walk_component+0x1bf/0x490
Jul 22 18:41:47 kernel:  path_lookupat.isra.50+0x75/0x200
Jul 22 18:41:47 kernel:  filename_lookup.part.64+0xa0/0x170
Jul 22 18:41:47 kernel:  ? __check_object_size+0x9c/0x171
Jul 22 18:41:47 kernel:  ? strncpy_from_user+0x4a/0x170
Jul 22 18:41:47 kernel:  vfs_statx+0x73/0xe0
Jul 22 18:41:47 kernel:  ? strncpy_from_user+0x4a/0x170
Jul 22 18:41:47 kernel:  __do_sys_newlstat+0x39/0x70
Jul 22 18:41:47 kernel:  ? _cond_resched+0x15/0x30
Jul 22 18:41:47 kernel:  ? dput.part.33+0x20/0x100
Jul 22 18:41:47 kernel:  ? path_getxattr+0x75/0xb0
Jul 22 18:41:47 kernel:  do_syscall_64+0x5b/0x160
Jul 22 18:41:47 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 22 18:41:47 kernel: RIP: 0033:0x7f56c4996825
Jul 22 18:41:47 kernel: RSP: 002b:00007f56affe5928 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
Jul 22 18:41:47 kernel: RAX: ffffffffffffffda RBX: 00007f56affe5970 RCX: 00007f56c4996825
Jul 22 18:41:47 kernel: RDX: 00007f56affe5988 RSI: 00007f56affe5988 RDI: 00007f55d23cb128
Jul 22 18:41:47 kernel: RBP: 00007f569bfe2dc0 R08: 00007f56affe5c2c R09: 000000000000005c
Jul 22 18:41:47 kernel: R10: 00007f55d23cb128 R11: 0000000000000246 R12: 00007f56afff109c
Jul 22 18:41:47 kernel: R13: 00007f56afff1080 R14: 00007f563a7a2cb8 R15: 00007f563a7a2cb8
Jul 22 18:41:47 kernel: Code: 80 58 38 96 48 89 c6 48 c1 e8 33 83 e0 07 48 c1 ee 36 48 8d 3c 40 48 8d 04 b8 48 c1 e0 07 48 03 04 f5 80 58 38 96 48 39 c1 74 17 <0f> 0b 45 31 e4 48 83 c4 28 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 
Jul 22 18:41:47 kernel: RIP: move_freepages_block+0x167/0x2e0 RSP: ffffa123c6b1f650
Jul 22 18:41:47 kernel: ---[ end trace c4b1b0d4162aef66 ]---
Jul 22 18:41:48 abrt-dump-journal-oops[2516]: abrt-dump-journal-oops: Found oopses: 1
Jul 22 18:41:48 abrt-dump-journal-oops[2516]: abrt-dump-journal-oops: Creating problem directories
Jul 22 18:42:20 kernel: NMI watchdog: Watchdog detected hard LOCKUP on cpu 18
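
For triage, the two fields that identify this crash are the "kernel BUG at" location and the kernel-mode RIP line. The sketch below pulls both out of a journal dump like the one above. It is only an illustration: the function name `summarize_oops` and the regular expressions are assumptions made for this example, not part of abrt or any kernel tooling.

```python
import re

# Matches e.g. "kernel BUG at mm/page_alloc.c:2019!"
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
# Matches the kernel-mode RIP line, e.g. "RIP: 0010:move_freepages_block+0x167/0x2e0"
RIP_RE = re.compile(
    r"RIP: 0010:(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def summarize_oops(log_text):
    """Return the BUG location and faulting symbol, or None if either is missing."""
    bug = BUG_RE.search(log_text)
    rip = RIP_RE.search(log_text)
    if bug is None or rip is None:
        return None
    return {
        "file": bug.group("file"),
        "line": int(bug.group("line")),
        "symbol": rip.group("symbol"),
        "offset": int(rip.group("offset"), 16),  # offset into the function
        "size": int(rip.group("size"), 16),      # total function size
    }


sample = """\
Jul 22 18:41:47 kernel: kernel BUG at mm/page_alloc.c:2019!
Jul 22 18:41:47 kernel: RIP: 0010:move_freepages_block+0x167/0x2e0
"""
print(summarize_oops(sample))
```

Comparing the `file`, `line` and `symbol` fields across reports is a quick way to check whether two crashes hit the same BUG; the offset can differ between kernel builds.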

--- Additional comment from H.J. Lu on 2018-07-23 10:48:55 EDT ---



--- Additional comment from Edgar Hoch on 2018-07-25 19:16:43 EDT ---

I have tried kernel-4.17.9-200.fc28.x86_64 - the crashes still occur.

--- Additional comment from Edgar Hoch on 2018-07-25 19:34:58 EDT ---

(In reply to Edgar Hoch from comment #4)
> I have tried kernel-4.17.9-200.fc28.x86_64 - the crashes still occur.

Sorry, my current crashes are a better match for bug 1600482; see there.

--- Additional comment from Ken Booth on 2018-07-31 05:22:02 EDT ---

My Lenovo T510 ThinkPad running F27 with kernel 4.17.6-100 also exhibits this error. I can reproduce it 25% of the time by hibernating the laptop. The system hangs and I can only perform a SysRq-c or a hard power off.

[19781.710871] ------------[ cut here ]------------
[19781.710874] kernel BUG at mm/page_alloc.c:2019!
[19781.710884] invalid opcode: 0000 [#1] SMP PTI
[19781.710885] Modules linked in: snd_usb_audio snd_usbmidi_lib snd_rawmidi hid_plantronics ppp_deflate bsd_comp ppp_async ppp_generic slhc rfcomm ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bnep binfmt_misc sunrpc dm_crypt fuse uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_hda_codec_hdmi videodev media snd_hda_codec_conexant intel_powerclamp coretemp arc4 kvm_intel
[19781.710920]  snd_hda_codec_generic kvm iTCO_wdt iTCO_vendor_support irqbypass gpio_ich mei_wdt wmi_bmof crct10dif_pclmul btusb mxm_wmi crc32_pclmul btrtl ghash_clmulni_intel btbcm btintel bluetooth intel_cstate iwldvm mac80211 ecdh_generic intel_uncore joydev snd_hda_intel snd_hda_codec iwlwifi thinkpad_acpi intel_ips i2c_i801 snd_hda_core snd_hwdep snd_seq cfg80211 snd_seq_device snd_pcm rfkill mei_me snd_timer mei snd lpc_ich wmi shpchp soundcore acpi_cpufreq i915 i2c_algo_bit drm_kms_helper crc32c_intel serio_raw sdhci_pci cqhci firewire_ohci sdhci drm mmc_core e1000e firewire_core crc_itu_t video
[19781.710956] CPU: 3 PID: 7983 Comm: systemd-sleep Kdump: loaded Not tainted 4.17.6-100.fc27.x86_64 #1
[19781.710957] Hardware name: LENOVO 4384BR2/4384BR2, BIOS 6MET92WW (1.52 ) 09/26/2012
[19781.710964] RIP: 0010:move_freepages_block+0x2d4/0x2e0
[19781.710965] RSP: 0018:ffffb5d2c1c17af8 EFLAGS: 00010002
[19781.710967] RAX: ffff8d6c3bfd1000 RBX: 0000000000100000 RCX: ffff8d6c3bfd1680
[19781.710968] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[19781.710970] RBP: ffffb5d2c1c17b5c R08: 0000000000000024 R09: ffff8d6c3bfd19c0
[19781.710971] R10: ffffffff81e28028 R11: ffffde2340000000 R12: 0000000000000000
[19781.710972] R13: ffff8d6c3bfd1680 R14: ffffde2342ec8000 R15: ffff8d6c3bfd1680
[19781.710974] FS:  00007fb47fcf7940(0000) GS:ffff8d6c3bd80000(0000) knlGS:0000000000000000
[19781.710975] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19781.710976] CR2: 000055bc2d3e0326 CR3: 00000001964b6006 CR4: 00000000000206e0
[19781.710978] Call Trace:
[19781.710985]  steal_suitable_fallback+0x14c/0x1a0
[19781.710987]  get_page_from_freelist+0x639/0x16f0
[19781.710990]  __alloc_pages_nodemask+0x11e/0x2a0
[19781.710996]  alloc_image_page+0xd/0x60
[19781.710998]  preallocate_image_pages.constprop.39+0x4c/0x60
[19781.711000]  hibernate_preallocate_memory+0x22d/0x440
[19781.711003]  hibernation_snapshot+0x60/0x4b0
[19781.711005]  hibernate+0x146/0x2a8
[19781.711007]  state_store+0xd5/0xe0
[19781.711011]  ? kernfs_fop_write+0xbe/0x190
[19781.711013]  kernfs_fop_write+0x10f/0x190
[19781.711018]  __vfs_write+0x36/0x180
[19781.711021]  ? selinux_file_permission+0x11d/0x130
[19781.711026]  ? security_file_permission+0x2a/0xb0
[19781.711027]  vfs_write+0xad/0x1a0
[19781.711029]  ksys_write+0x52/0xc0
[19781.711034]  do_syscall_64+0x5b/0x160
[19781.711039]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[19781.711042] RIP: 0033:0x7fb47f8327a4
[19781.711043] RSP: 002b:00007ffd5c572108 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[19781.711045] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb47f8327a4
[19781.711046] RDX: 0000000000000005 RSI: 0000562fa8260140 RDI: 0000000000000004
[19781.711047] RBP: 0000562fa8260140 R08: 0000562fa825e390 R09: 00007fb47fcf7940
[19781.711048] R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000005
[19781.711049] R13: 0000000000000001 R14: 0000562fa825e2b0 R15: 0000000000000005
[19781.711051] Code: 48 89 c6 48 c1 e8 33 48 c1 ee 36 83 e0 07 48 8d 3c 40 48 8d 04 b8 48 c1 e0 07 48 03 04 f5 c0 59 38 82 48 39 c1 0f 84 fe fd ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 41 57 41 56
[19781.711072] RIP: move_freepages_block+0x2d4/0x2e0 RSP: ffffb5d2c1c17af8
[19781.711074] ---[ end trace 484af4af0d00ac3e ]---
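
The "invalid opcode" line that follows "kernel BUG at" is expected rather than a second fault: on x86, BUG() is implemented with the `ud2` instruction (bytes `0f 0b`), and the `<..>` marker in the Code: line points at the byte where the CPU trapped. The sketch below checks that for a Code: line. The helper name `faulting_bytes` is made up for this example, and the sample is a shortened copy of the Code: line from the trace above.

```python
UD2 = bytes([0x0F, 0x0B])  # x86 ud2, the instruction BUG() traps on


def faulting_bytes(code_line, count=2):
    """Return `count` bytes starting at the <..>-marked byte of a Code: line."""
    tokens = code_line.split("Code:", 1)[-1].split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            # The marked byte plus the bytes that follow it
            marked = [tok.strip("<>")] + tokens[i + 1 : i + count]
            return bytes(int(b, 16) for b in marked)
    return None  # no marker found


# Shortened copy of the Code: line from the hibernation trace
code = "Code: 48 39 c1 0f 84 fe fd ff ff <0f> 0b 66 2e 0f 1f 84 00"
print(faulting_bytes(code) == UD2)
```

If the marked bytes are not `0f 0b`, the oops is more likely a genuine bad instruction fetch than a deliberate BUG() assertion.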

--- Additional comment from Laura Abbott on 2018-10-01 17:22:35 EDT ---

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs.
 
Fedora 28 has now been rebased to 4.18.10-300.fc28.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 29, and are still experiencing this issue, please change the version to Fedora 29.
 
If you experience different issues, please open a new bug report for those.

Comment 1 Laura Abbott 2018-10-02 14:27:03 UTC
I can't understand why this report was duped, was this tested on the 4.18 series?

Comment 2 Ken Booth 2018-10-22 12:06:02 UTC
(In reply to Laura Abbott from comment #1)
> I can't understand why this report was duped, was this tested on the 4.18
> series?

I was told that there should be a separate bug for each version of RHEL. Does this not apply to Fedora releases? (My bug was for F27, not F28.)

Meanwhile, I tested with the new kernel version and can no longer replicate the issue. (I also tested with kernel patches, but when I booted the old kernel I could not reproduce it in that environment either, so I am not sure how conclusive the test is.)

Comment 3 Laura Abbott 2018-10-22 15:11:18 UTC
It's not necessary to dupe on Fedora since we keep similar kernels between versions. If this bug isn't reproduced I think it's okay to close for now.

