Urgent, as this is a production server.

Description:
The Java process in question was a Tomcat web server 7.0.82 with ATN; the problem occurred a few seconds after it had initialized itself. The process did not die, but it stopped responding and had to be restarted via systemctl.

INFO: Loaded APR based Apache Tomcat Native library 1.2.14 using APR version 1.6.3.

Tomcat did not log any error message; the errors came via a wall message from syslogd (see below). The server had been upgraded to fc26 a few minutes earlier, so all packages are up to date.

kernel-headers-4.14.11-200.fc26.i686
kernel-PAE-4.13.16-100.fc25.i686
kernel-PAE-4.14.11-200.fc26.i686
kernel-PAE-core-4.13.16-100.fc25.i686
kernel-PAE-core-4.14.11-200.fc26.i686
kernel-PAE-core-4.8.13-100.fc23.i686
kernel-PAE-modules-4.13.16-100.fc25.i686
kernel-PAE-modules-4.14.11-200.fc26.i686
kernel-PAE-modules-4.8.13-100.fc23.i686
kernel-PAE-modules-extra-4.13.16-100.fc25.i686
kernel-PAE-modules-extra-4.14.11-200.fc26.i686
kernel-PAE-modules-extra-4.8.13-100.fc23.i686

Kernel OOPS:
Jan 5 22:35:34 xx kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008
Jan 5 22:35:34 xx kernel: IP: __radix_tree_lookup+0xe/0xa0
Jan 5 22:35:34 xx kernel: *pdpt = 000000002894e027 *pde = 0000000000000000
Jan 5 22:35:34 xx kernel: Oops: 0000 [#1] SMP
Jan 5 22:35:34 xx kernel: Modules linked in: rmd160 ip_vti ip_tunnel af_key ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm6_tunnel tunnel6 xfrm_ipcomp chacha20poly1305 cmac camellia_generic cast6_generic cast5_generic cast_common deflate ccm serpent_sse2_i586 serpent_generic glue_helper ablk_helper blowfish_generic blowfish_common cls_u32 twofish_generic twofish_i586 twofish_common sch_htb xcbc sha512_generic des_generic geode_aes ip6table_filter ip6_tables xt_owner xt_multiport xenfs xen_privcmd coretemp xen_netfront nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack libcrc32c
crc32c_intel xen_blkfront Jan 5 22:35:34 xx kernel: CPU: 1 PID: 2038 Comm: java Not tainted 4.14.11-200.fc26.i686+PAE #1 Jan 5 22:35:34 xx kernel: task: e67ce780 task.stack: e4c8a000 Jan 5 22:35:34 xx kernel: EIP: __radix_tree_lookup+0xe/0xa0 Jan 5 22:35:34 xx kernel: EFLAGS: 00010282 CPU: 1 Jan 5 22:35:34 xx kernel: EAX: 00000004 EBX: 4864d000 ECX: 00000000 EDX: 00000000 Jan 5 22:35:34 xx kernel: ESI: 00000000 EDI: 00000004 EBP: e4c8bda8 ESP: e4c8bd98 Jan 5 22:35:34 xx kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069 Jan 5 22:35:34 xx kernel: CR0: 80050033 CR2: 00000008 CR3: 26728000 CR4: 00002660 Jan 5 22:35:34 xx kernel: Call Trace: Jan 5 22:35:34 xx kernel: radix_tree_lookup_slot+0x1d/0x40 Jan 5 22:35:34 xx kernel: ? pvclock_clocksource_read+0x9b/0x150 Jan 5 22:35:35 xx kernel: find_get_entry+0x20/0x160 Jan 5 22:35:35 xx kernel: pagecache_get_page+0x24/0x290 Jan 5 22:35:35 xx kernel: lookup_swap_cache+0x3a/0x100 Jan 5 22:35:35 xx kernel: swap_readahead_detect+0x55/0x280 Jan 5 22:35:36 xx kernel: do_swap_page+0x22a/0x990 Jan 5 22:35:36 xx kernel: ? update_load_avg+0x4a4/0x5e0 Jan 5 22:35:36 xx kernel: ? kmap_atomic_prot+0x3e/0x130 Jan 5 22:35:36 xx kernel: handle_mm_fault+0x498/0xc90 Jan 5 22:35:36 xx kernel: __do_page_fault+0x202/0x4d0 Jan 5 22:35:36 xx kernel: ? __do_page_fault+0x4d0/0x4d0 Jan 5 22:35:37 xx kernel: do_page_fault+0x27/0xd0 Jan 5 22:35:37 xx kernel: ? exit_to_usermode_loop+0x2a/0xa0 Jan 5 22:35:37 xx kernel: ? 
__do_page_fault+0x4d0/0x4d0 Jan 5 22:35:37 xx kernel: common_exception+0x6f/0x76 Jan 5 22:35:37 xx kernel: EIP: 0x1f776211 Jan 5 22:35:37 xx kernel: EFLAGS: 00010202 CPU: 1 Jan 5 22:35:37 xx kernel: EAX: 4864d1a0 EBX: 772d9328 ECX: 00000084 EDX: 48645a74 Jan 5 22:35:37 xx kernel: ESI: 772d9328 EDI: 4864d1a0 EBP: 1b8feef8 ESP: 1b8feea0 Jan 5 22:35:37 xx kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Jan 5 22:35:37 xx kernel: Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 34 73 8b ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 Jan 5 22:35:37 xx kernel: EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0069:e4c8bd98 Jan 5 22:35:37 xx kernel: CR2: 0000000000000008 Jan 5 22:35:37 xx kernel: ---[ end trace d8df2e7399a1cc87 ]--- Syslogd : Message from syslogd@xx at Jan 5 22:35:34 ... kernel:CPU: 1 PID: 2038 Comm: java Not tainted 4.14.11-200.fc26.i686+PAE #1 Message from syslogd@xx at Jan 5 22:35:34 ... kernel:task: e67ce780 task.stack: e4c8a000 Message from syslogd@xx at Jan 5 22:35:34 ... kernel:Call Trace: Message from syslogd@xx at Jan 5 22:35:34 ... kernel: radix_tree_lookup_slot+0x1d/0x40 Message from syslogd@xx at Jan 5 22:35:34 ... kernel: ? pvclock_clocksource_read+0x9b/0x150 Message from syslogd@xx at Jan 5 22:35:35 ... kernel: find_get_entry+0x20/0x160 Message from syslogd@xx at Jan 5 22:35:35 ... kernel: pagecache_get_page+0x24/0x290 Message from syslogd@xx at Jan 5 22:35:35 ... kernel: lookup_swap_cache+0x3a/0x100 Message from syslogd@xx at Jan 5 22:35:35 ... kernel: swap_readahead_detect+0x55/0x280 Message from syslogd@xx at Jan 5 22:35:36 ... kernel: do_swap_page+0x22a/0x990 Message from syslogd@xx at Jan 5 22:35:36 ... kernel: ? update_load_avg+0x4a4/0x5e0 Message from syslogd@xx at Jan 5 22:35:36 ... kernel: ? kmap_atomic_prot+0x3e/0x130 Message from syslogd@xx at Jan 5 22:35:36 ... 
kernel: handle_mm_fault+0x498/0xc90 Message from syslogd@xx at Jan 5 22:35:36 ... kernel: __do_page_fault+0x202/0x4d0 Message from syslogd@xx at Jan 5 22:35:36 ... kernel: ? __do_page_fault+0x4d0/0x4d0 Message from syslogd@xx at Jan 5 22:35:37 ... kernel: do_page_fault+0x27/0xd0 Message from syslogd@xx at Jan 5 22:35:37 ... kernel: ? exit_to_usermode_loop+0x2a/0xa0 Message from syslogd@xx at Jan 5 22:35:37 ... kernel: ? __do_page_fault+0x4d0/0x4d0 Message from syslogd@xx at Jan 5 22:35:37 ... kernel: common_exception+0x6f/0x76 Message from syslogd@xx at Jan 5 22:35:37 ... kernel:Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 34 73 8b ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 Message from syslogd@xx at Jan 5 22:35:37 ... kernel:EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0069:e4c8bd98
Update: I had to reboot the server after several other processes oopsed as well. While those oopses were happening, the server was unresponsive.
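The faulting address in the 32-bit oops above can be cross-checked against the register dump. The following is only a minimal sketch: it assumes the three bytes starting at the marked `<8b>` in the `Code:` line (`8b 5f 04`) form one complete instruction, and it handles nothing but the `mov r32, [r32 + disp8]` encoding. The helper name `decode_mov_disp8` is made up for illustration.

```python
def decode_mov_disp8(code):
    """Decode a 32-bit `mov r32, [r32 + disp8]` (opcode 0x8b, ModRM mod=01)."""
    regs = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not a mov r32, r/m32 instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the [reg + disp8] form without SIB is handled")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return regs[reg], regs[rm], disp


# Bytes taken from the Code: line, starting at the marked faulting byte.
dest, base, disp = decode_mov_disp8(bytes.fromhex("8b5f04"))

# Register value taken from the oops register dump (EDI: 00000004).
registers = {"edi": 0x00000004}
fault_address = (registers[base] + disp) & 0xFFFFFFFF

print(f"mov {dest}, [{base}{disp:+#x}] -> fault at {fault_address:#010x}")
```

With EDI = 00000004 the computed address is 00000008, which agrees with the CR2 value in the report. That suggests the pointer handed to `__radix_tree_lookup` was itself nearly NULL, rather than a valid structure with a bad member.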
New Report from another 32bit server: Jan 6 16:42:02 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008 Jan 6 16:42:02 localhost kernel: IP: __radix_tree_lookup+0xe/0xa0 Jan 6 16:42:02 localhost kernel: *pdpt = 00000000029fd027 *pde = 0000000000000000 Jan 6 16:42:02 localhost kernel: Oops: 0000 [#3] SMP Jan 6 16:42:02 localhost kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace sunrpc fscache xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd coretemp crc32c_intel xen_netfront xen_blkfront loop Jan 6 16:42:02 localhost kernel: CPU: 0 PID: 4210 Comm: java Tainted: G D 4.14.8-200.fc26.i686+PAE #1 Jan 6 16:42:02 localhost kernel: task: dbbdb3c0 task.stack: dc244000 Jan 6 16:42:02 localhost kernel: EIP: __radix_tree_lookup+0xe/0xa0 Jan 6 16:42:02 localhost kernel: EFLAGS: 00010282 CPU: 0 Jan 6 16:42:02 localhost kernel: EAX: 00000004 EBX: 0c928000 ECX: 00000000 EDX: 00000000 Jan 6 16:42:02 localhost kernel: ESI: 00000000 EDI: 00000004 EBP: dc245da8 ESP: dc245d98 Jan 6 16:42:02 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069 Jan 6 16:42:02 localhost kernel: CR0: 80050033 CR2: 00000008 CR3: 11cdd000 CR4: 00002660 Jan 6 16:42:02 localhost kernel: Call Trace: Jan 6 16:42:02 localhost kernel: radix_tree_lookup_slot+0x1d/0x40 Jan 6 16:42:02 localhost kernel: find_get_entry+0x20/0x160 Jan 6 16:42:02 localhost kernel: pagecache_get_page+0x24/0x290 Jan 6 16:42:02 localhost kernel: lookup_swap_cache+0x3a/0x100 Jan 6 16:42:02 localhost kernel: swap_readahead_detect+0x55/0x280 Jan 6 16:42:02 localhost kernel: do_swap_page+0x22a/0x990 Jan 6 16:42:02 localhost kernel: ? pvclock_clocksource_read+0x9b/0x150 Jan 6 16:42:02 localhost kernel: ? update_load_avg+0x4a4/0x5e0 Jan 6 16:42:02 localhost kernel: ? kmap_atomic_prot+0x3e/0x130 Jan 6 16:42:02 localhost kernel: handle_mm_fault+0x498/0xc90 Jan 6 16:42:02 localhost kernel: ? 
load_TLS_descriptor+0x6a/0xf0
Jan 6 16:42:02 localhost kernel: __do_page_fault+0x202/0x4d0
Jan 6 16:42:02 localhost kernel: ? __do_page_fault+0x4d0/0x4d0
Jan 6 16:42:02 localhost kernel: do_page_fault+0x27/0xd0
Jan 6 16:42:02 localhost kernel: ? exit_to_usermode_loop+0x2a/0xa0
Jan 6 16:42:02 localhost kernel: ? __do_page_fault+0x4d0/0x4d0
Jan 6 16:42:02 localhost kernel: common_exception+0x6f/0x76
Jan 6 16:42:02 localhost kernel: EIP: 0x1f4c4506
Jan 6 16:42:02 localhost kernel: EFLAGS: 00010246 CPU: 0
Jan 6 16:42:02 localhost kernel: EAX: 00000000 EBX: 1f958000 ECX: 0000004c EDX: 00000000
Jan 6 16:42:02 localhost kernel: ESI: 0c928058 EDI: 0c928058 EBP: 0c008c78 ESP: 0c008c30
Jan 6 16:42:02 localhost kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
Jan 6 16:42:02 localhost kernel: ? insn_init+0x10/0x70
Jan 6 16:42:02 localhost kernel: Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 04 7a 8b ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08
Jan 6 16:42:02 localhost kernel: EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0069:dc245d98
Jan 6 16:42:02 localhost kernel: CR2: 0000000000000008
Jan 6 16:42:02 localhost kernel: ---[ end trace a9a1edbe8e361481 ]---

That Java process was started by a cron script.

Result: 1086 dead TOP processes over a period of 16 hours (started by a cron script), load > 800.
Escalation to 64-bit:
Jan 6 22:09:26 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000000c3
Jan 6 22:09:26 localhost kernel: IP: 0xffdcc150
Jan 6 22:09:26 localhost kernel: PGD 19b868067 P4D 19b868067 PUD 136661067 PMD 0
Jan 6 22:09:26 localhost kernel: Oops: 0002 [#11] SMP NOPTI
Jan 6 22:09:26 localhost kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc coretemp xen_netfront xen_blkfront crc32c_intel
Jan 6 22:09:26 localhost kernel: CPU: 1 PID: 20141 Comm: xenstore Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 6 22:09:26 localhost kernel: task: ffff8800bc830000 task.stack: ffffc900027f0000
Jan 6 22:09:26 localhost kernel: RIP: e030:0xffdcc150
Jan 6 22:09:26 localhost kernel: RSP: e02b:ffffc900027f3fd0 EFLAGS: 00010206
Jan 6 22:09:26 localhost kernel: RAX: 00000000000000c3 RBX: 0000000008052397 RCX: 00000000ffdcc150
Jan 6 22:09:26 localhost kernel: RDX: 000000000805406c RSI: 0000000000000004 RDI: 0000000000000000
Jan 6 22:09:26 localhost kernel: RBP: 00000000ffdcc1b8 R08: 0000000000000000 R09: 0000000000000000
Jan 6 22:09:26 localhost kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Jan 6 22:09:26 localhost kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jan 6 22:09:26 localhost kernel: FS: 00007f6278341b40(0000) GS:ffff8801f5c80000(0000) knlGS:0000000000000000
Jan 6 22:09:26 localhost kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033
Jan 6 22:09:26 localhost kernel: CR2: 00000000000000c3 CR3: 00000000fe3a8000 CR4: 0000000000002660
Jan 6 22:09:26 localhost kernel: Call Trace:
Jan 6 22:09:26 localhost kernel: ? switch_to_thread_stack+0x21/0x40
Jan 6 22:09:26 localhost kernel: Code: Bad RIP value.
Jan 6 22:09:26 localhost kernel: RIP: 0xffdcc150 RSP: ffffc900027f3fd0
Jan 6 22:09:26 localhost kernel: CR2: 00000000000000c3
Jan 6 22:09:26 localhost kernel: ---[ end trace bdd3e92a33cbabaa ]---
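The hexadecimal number after `Oops:` in the page-fault reports above (`0000` on the 32-bit servers, `0002` here) is the x86 page-fault error code. Below is a small sketch for decoding it; the function name `decode_oops_code` is made up for illustration, and it applies only to page-fault oopses, not to other exception types such as an `invalid opcode` report.

```python
# x86 page-fault error code bits: (mask, meaning if set, meaning if clear)
PF_FLAGS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code):
    """Return a human-readable list of what a page-fault error code says."""
    parts = []
    for mask, if_set, if_clear in PF_FLAGS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


for text in ("0000", "0002"):
    print(text, "->", ", ".join(decode_oops_code(int(text, 16))))
```

Read this way, `0000` is a kernel-mode read of a non-present page and `0002` is a kernel-mode write to a non-present page.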
Found a pattern: older software (Java 1.6, xenstore), as in:
Jan 4 23:16:54 localhost kernel: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.14.11-200.fc26.x86_64 #1
Jan 5 00:21:28 localhost kernel: CPU: 1 PID: 25717 Comm: xenstore Tainted: G W 4.14.11-200.fc26.x86_64 #1
Jan 5 08:19:20 localhost kernel: CPU: 2 PID: 29274 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 5 13:58:55 localhost kernel: CPU: 1 PID: 27507 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 5 15:59:07 localhost kernel: CPU: 1 PID: 9614 Comm: xenstore Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 5 18:36:41 localhost kernel: CPU: 3 PID: 1238 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 5 19:57:48 localhost kernel: CPU: 0 PID: 30208 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 6 06:49:53 localhost kernel: CPU: 1 PID: 19549 Comm: xenstore Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 6 07:25:58 localhost kernel: CPU: 2 PID: 6082 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 6 07:26:58 localhost kernel: CPU: 2 PID: 6277 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 6 17:11:53 localhost kernel: CPU: 3 PID: 5299 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1
Jan 6 22:09:26 localhost kernel: CPU: 1 PID: 20141 Comm: xenstore Tainted: G D W 4.14.11-200.fc26.x86_64 #1

What worries me is the "swapper/0" entry directly after starting the system:
Jan 4 23:16:54 localhost kernel: xenbus_probe_frontend: Device with no driver: device/vbd/51712
Jan 4 23:16:54 localhost kernel: xenbus_probe_frontend: Device with no driver: device/vif/0
Jan 4 23:16:54 localhost kernel: Magic number: 1:252:3141
Jan 4 23:16:54 localhost kernel: hctosys: unable to open rtc device (rtc0)
Jan 4 23:16:54 localhost kernel: Freeing unused kernel memory: 2084K
Jan 4 23:16:54 localhost kernel: Write protecting the kernel read-only data: 18432k
Jan 4 23:16:54 localhost kernel: Freeing unused kernel memory:
2012K Jan 4 23:16:54 localhost kernel: Freeing unused kernel memory: 372K Jan 4 23:16:54 localhost kernel: x86/mm: Found insecure W+X mapping at address ffff880000000000/0xffff880000000000 Jan 4 23:16:54 localhost kernel: ------------[ cut here ]------------ Jan 4 23:16:54 localhost kernel: WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:266 note_page+0x622/0x810 Jan 4 23:16:54 localhost kernel: Modules linked in: Jan 4 23:16:54 localhost kernel: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.14.11-200.fc26.x86_64 #1 Jan 4 23:16:54 localhost kernel: task: ffff8801f4995a00 task.stack: ffffc90001c4c000 Jan 4 23:16:54 localhost kernel: RIP: e030:note_page+0x622/0x810 Jan 4 23:16:54 localhost kernel: RSP: e02b:ffffc90001c4fe40 EFLAGS: 00010286 Jan 4 23:16:54 localhost kernel: RAX: 0000000000000051 RBX: ffffc90001c4fec8 RCX: ffffffff82253008 Jan 4 23:16:54 localhost kernel: RDX: 0000000000000000 RSI: 00000000000000e4 RDI: 0000000000000202 Jan 4 23:16:54 localhost kernel: RBP: 8010000000000065 R08: 00000000000000e4 R09: 0000000000000004 Jan 4 23:16:54 localhost kernel: R10: ffff8801ff00a7a0 R11: 0000000000000001 R12: 0000000000000005 Jan 4 23:16:54 localhost kernel: R13: 0000000000000000 R14: 00003ffffffff000 R15: 0000000000000000 Jan 4 23:16:54 localhost kernel: FS: 00007f8368f28700(0000) GS:ffff8801f5d00000(0000) knlGS:0000000000000000 Jan 4 23:16:54 localhost kernel: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 4 23:16:54 localhost kernel: CR2: 00007f8368ae4000 CR3: 000000000220a000 CR4: 0000000000002660 Jan 4 23:16:54 localhost kernel: Call Trace: Jan 4 23:16:54 localhost kernel: ptdump_walk_pgd_level_core+0x3c0/0x4a0 Jan 4 23:16:54 localhost kernel: ? 
rest_init+0xb0/0xb0 Jan 4 23:16:54 localhost kernel: ptdump_walk_pgd_level_checkwx+0x15/0x50 Jan 4 23:16:54 localhost kernel: kernel_init+0x2b/0xfc Jan 4 23:16:54 localhost kernel: ret_from_fork+0x1f/0x30 Jan 4 23:16:54 localhost kernel: Code: fa 22 00 41 f7 c7 00 01 00 00 0f 85 50 fe ff ff e9 c0 fc ff ff 48 89 f2 48 c7 c7 70 f5 07 82 c6 05 3e 34 2c 01 01 e8 29 4f 09 00 <0f> ff 48 8b 73 10 e9 67 fa ff ff 4d 89 c8 b9 10 00 00 00 be 10 Jan 4 23:16:54 localhost kernel: ---[ end trace bdd3e92a33cbab9f ]--- Jan 4 23:16:54 localhost kernel: x86/mm: Checked W+X mappings: FAILED, 8544 W+X pages found. Jan 4 23:16:54 localhost kernel: rodata_test: all tests were successful Jan 4 23:16:54 localhost dracut-cmdline[155]: Using kernel command line parameters: root=UUID=dca7eea1-687e-476a-a9a0-c41ef0329113 ro rhgb quiet audit=0 LANG=de_DE.UTF-8 Jan 4 23:16:54 localhost kernel: xen_netfront: Initialising Xen virtual ethernet driver Jan 4 23:16:54 localhost kernel: blkfront: xvda: barrier: enabled; persistent grants: disabled; indirect descriptors: disabled; Jan 4 23:16:54 localhost kernel: xvda: xvda1 xvda2 Jan 4 23:16:54 localhost kernel: blkfront: xvda: empty barrier op failed Jan 4 23:16:54 localhost kernel: blkfront: xvda: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: disabled; Jan 4 23:16:55 localhost kernel: EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null)
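The per-process pattern listed earlier (which `Comm:` names show up in oops and warning headers) can be pulled out of a flattened log automatically. This is only a sketch, assuming the header format seen in this report; the regular expression and the name `tally_comms` are illustrative and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Matches headers such as:
#   CPU: 1 PID: 25717 Comm: xenstore Tainted: G W 4.14.11-200.fc26.x86_64 #1
HEADER = re.compile(
    r"CPU: \d+ PID: \d+ Comm: (?P<comm>\S+) "
    r"(?P<taint>Not tainted|Tainted: [A-Z ]+?) "
    r"(?P<kernel>\d\S*) #\d+"
)


def tally_comms(log_text):
    """Count oops/warning headers per process name in a (flattened) log."""
    return Counter(m.group("comm") for m in HEADER.finditer(log_text))


# Three header lines copied from the report, joined with spaces to mimic
# the flattened form in which the logs were pasted.
sample = " ".join([
    "Jan 4 23:16:54 localhost kernel: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.14.11-200.fc26.x86_64 #1",
    "Jan 5 00:21:28 localhost kernel: CPU: 1 PID: 25717 Comm: xenstore Tainted: G W 4.14.11-200.fc26.x86_64 #1",
    "Jan 5 08:19:20 localhost kernel: CPU: 2 PID: 29274 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1",
])

counts = tally_comms(sample)
for comm, n in counts.most_common():
    print(comm, n)
```

Because the pattern is applied with `finditer` over the whole text, it works the same whether the log has one entry per line or has been collapsed into a single line.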
Next Server 64bit : with some new bug messages.. Jan 7 04:33:36 s129 kernel: BUG: unable to handle kernel NULL pointer dereference at 000000000000002d Jan 7 04:33:36 s129 kernel: IP: 0x805406c Jan 7 04:33:36 s129 kernel: PGD 4c4f8067 P4D 4c4f8067 PUD 1d1f067 PMD 0 Jan 7 04:33:36 s129 kernel: Oops: 0002 [#6] SMP NOPTI Jan 7 04:33:36 s129 kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc intel_rapl x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf xen_blkfront xen_netfront crc32c_intel Jan 7 04:33:36 s129 kernel: CPU: 0 PID: 18010 Comm: xenstore Tainted: G D W 4.14.11-200.fc26.x86_64 #1 Jan 7 04:33:36 s129 kernel: task: ffff88000072bc00 task.stack: ffffc90000f14000 Jan 7 04:33:36 s129 kernel: RIP: e030:0x805406c Jan 7 04:33:36 s129 kernel: RSP: e02b:ffffc90000f17fd0 EFLAGS: 00010206 Jan 7 04:33:36 s129 kernel: RAX: 000000000000002d RBX: 0000000009843000 RCX: 000000000805406c Jan 7 04:33:36 s129 kernel: RDX: 000000000000002d RSI: 0000000009843000 RDI: 0000000000000000 Jan 7 04:33:36 s129 kernel: RBP: 00000000fff98d60 R08: 0000000000000000 R09: 0000000000000000 Jan 7 04:33:36 s129 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 7 04:33:36 s129 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 7 04:33:36 s129 kernel: FS: 00007f540d996b40(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 Jan 7 04:33:36 s129 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 7 04:33:36 s129 kernel: CR2: 000000000000002d CR3: 000000007bd2a000 CR4: 0000000000002660 Jan 7 04:33:36 s129 kernel: Call Trace: Jan 7 04:33:36 s129 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 7 04:33:36 s129 kernel: Code: Bad RIP value. 
Jan 7 04:33:36 s129 kernel: RIP: 0x805406c RSP: ffffc90000f17fd0 Jan 7 04:33:36 s129 kernel: CR2: 000000000000002d Jan 7 04:33:36 s129 kernel: ---[ end trace 1f7ca9d98c190cd4 ]--- Jan 7 04:56:38 s129 kernel: invalid opcode: 0000 [#7] SMP NOPTI Jan 7 04:56:38 s129 kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc intel_rapl x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf xen_blkfront xen_netfront crc32c_intel Jan 7 04:56:38 s129 kernel: CPU: 1 PID: 23934 Comm: xenstore-read Tainted: G D W 4.14.11-200.fc26.x86_64 #1 Jan 7 04:56:38 s129 kernel: task: ffff88003cfe8000 task.stack: ffffc90001654000 Jan 7 04:56:38 s129 kernel: RIP: e030:0x805838c Jan 7 04:56:38 s129 kernel: RSP: e02b:ffffc90001657fd0 EFLAGS: 00010206 Jan 7 04:56:38 s129 kernel: RAX: 0000000000000001 RBX: 0000000000000000 RCX: 000000000805838c Jan 7 04:56:38 s129 kernel: RDX: 0000000008054258 RSI: 00000000080542dc RDI: 0000000000000000 Jan 7 04:56:38 s129 kernel: RBP: 00000000ffde9e28 R08: 0000000000000000 R09: 0000000000000000 Jan 7 04:56:38 s129 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 7 04:56:38 s129 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 7 04:56:38 s129 kernel: FS: 00007f16d663ab40(0000) GS:ffff88007f500000(0000) knlGS:0000000000000000 Jan 7 04:56:38 s129 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 7 04:56:38 s129 kernel: CR2: 000000000805838c CR3: 0000000007d62000 CR4: 0000000000002660 Jan 7 04:56:38 s129 kernel: Call Trace: Jan 7 04:56:38 s129 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 7 04:56:38 s129 kernel: Code: Bad RIP value. Jan 7 04:56:38 s129 kernel: RIP: 0x805838c RSP: ffffc90001657fd0 Jan 7 04:56:38 s129 kernel: ---[ end trace 1f7ca9d98c190cd5 ]---
Next Server "swapper OOPS" again Jan 7 12:48:15 localhost kernel: Loaded X.509 cert 'Fedora kernel signing key: 26c078a8c4b7e5c4b4f910577c77b04d08538a96' Jan 7 12:48:15 localhost kernel: zswap: loaded using pool lzo/zbud Jan 7 12:48:15 localhost kernel: Key type big_key registered Jan 7 12:48:15 localhost kernel: Key type encrypted registered Jan 7 12:48:15 localhost kernel: xenbus_probe_frontend: Device with no driver: device/vbd/51712 Jan 7 12:48:15 localhost kernel: xenbus_probe_frontend: Device with no driver: device/vbd/51728 Jan 7 12:48:15 localhost kernel: xenbus_probe_frontend: Device with no driver: device/vif/0 Jan 7 12:48:15 localhost kernel: Magic number: 1:252:3141 Jan 7 12:48:15 localhost kernel: hctosys: unable to open rtc device (rtc0) Jan 7 12:48:15 localhost kernel: Freeing unused kernel memory: 2084K Jan 7 12:48:15 localhost kernel: Write protecting the kernel read-only data: 18432k Jan 7 12:48:15 localhost kernel: Freeing unused kernel memory: 2012K Jan 7 12:48:15 localhost kernel: Freeing unused kernel memory: 372K Jan 7 12:48:15 localhost kernel: x86/mm: Found insecure W+X mapping at address ffff880000000000/0xffff880000000000 Jan 7 12:48:15 localhost kernel: ------------[ cut here ]------------ Jan 7 12:48:15 localhost kernel: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:266 note_page+0x622/0x810 Jan 7 12:48:15 localhost kernel: Modules linked in: Jan 7 12:48:15 localhost kernel: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.11-200.fc26.x86_64 #1 Jan 7 12:48:15 localhost kernel: task: ffff88007c450000 task.stack: ffffc90000714000 Jan 7 12:48:15 localhost kernel: RIP: e030:note_page+0x622/0x810 Jan 7 12:48:15 localhost kernel: RSP: e02b:ffffc90000717e40 EFLAGS: 00010286 Jan 7 12:48:15 localhost kernel: RAX: 0000000000000051 RBX: ffffc90000717ec8 RCX: ffffffff82253008 Jan 7 12:48:15 localhost kernel: RDX: 0000000000000000 RSI: 00000000000000dc RDI: 0000000000000200 Jan 7 12:48:15 localhost kernel: RBP: 8010000000000065 R08: 
00000000000000dc R09: 0000000000000004 Jan 7 12:48:15 localhost kernel: R10: ffff88007fc007a0 R11: 0000000000000001 R12: 0000000000000005 Jan 7 12:48:15 localhost kernel: R13: 0000000000000000 R14: 00003ffffffff000 R15: 0000000000000000 Jan 7 12:48:15 localhost kernel: FS: 00007f0f65dfb700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 Jan 7 12:48:15 localhost kernel: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 7 12:48:15 localhost kernel: CR2: 00007f0f659ae000 CR3: 000000000220a000 CR4: 0000000000000660 Jan 7 12:48:15 localhost kernel: Call Trace: Jan 7 12:48:15 localhost kernel: ptdump_walk_pgd_level_core+0x3c0/0x4a0 Jan 7 12:48:15 localhost kernel: ? rest_init+0xb0/0xb0 Jan 7 12:48:15 localhost kernel: ptdump_walk_pgd_level_checkwx+0x15/0x50 Jan 7 12:48:15 localhost kernel: kernel_init+0x2b/0xfc Jan 7 12:48:15 localhost kernel: ret_from_fork+0x1f/0x30 Jan 7 12:48:15 localhost kernel: Code: fa 22 00 41 f7 c7 00 01 00 00 0f 85 50 fe ff ff e9 c0 fc ff ff 48 89 f2 48 c7 c7 70 f5 07 82 c6 05 3e 34 2c 01 01 e8 29 4f 09 00 <0f> ff 48 8b 73 10 e9 67 fa ff ff 4d 89 c8 b9 10 00 00 00 be 10 Jan 7 12:48:15 localhost kernel: ---[ end trace 343ef9ee42b93c33 ]--- Jan 7 12:48:15 localhost kernel: x86/mm: Checked W+X mappings: FAILED, 5277 W+X pages found. Jan 7 12:48:15 localhost kernel: rodata_test: all tests were successful Jan 7 12:48:15 localhost dracut-cmdline[175]: Using kernel command line parameters: root=UUID=dca7eea1-687e-476a-a9a0-c41ef0329113 ro rhgb quiet audit=0 LANG=en_US.UTF-8 Jan 7 12:48:15 localhost kernel: xen_netfront: Initialising Xen virtual ethernet driver Jan 7 12:48:15 localhost kernel: Invalid max_queues (4), will use default max: 1. 
Jan 7 12:48:15 localhost kernel: blkfront: xvda: barrier: enabled; persistent grants: disabled; indirect descriptors: disabled; Jan 7 12:48:15 localhost kernel: xvda: xvda1 xvda2 Jan 7 12:48:15 localhost kernel: blkfront: xvdb: barrier: enabled; persistent grants: disabled; indirect descriptors: disabled; Jan 7 12:48:16 localhost kernel: EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null) Jan 7 12:48:17 localhost systemd[1]: Stopped Create list of required static device nodes for the current kernel. Jan 7 12:48:28 localhost kernel: systemd: 18 output lines suppressed due to ratelimiting Jan 7 12:48:28 localhost kernel: SELinux: Disabled at runtime.
UPDATE: Same problem with current Java (1.8), so scratch "old" as a pattern; there is no pattern anymore.
Jan 8 15:15:43 xx kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008
Jan 8 15:15:43 xx kernel: IP: __radix_tree_lookup+0xe/0xa0
Jan 8 15:15:43 xx kernel: *pdpt = 0000000019977027 *pde = 0000000000000000
Jan 8 15:15:43 xx kernel: Oops: 0000 [#1] SMP
Jan 8 15:15:43 xx kernel: Modules linked in: nfsv3 nfs_acl nfs lockd grace sunrpc fscache rmd160 ip_vti ip_tunnel af_key ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm6_tunnel tunnel6 xfrm_ipcomp chacha20poly1305 cmac camellia_generic cast6_generic cast5_generic cast_common deflate ccm serpent_sse2_i586 serpent_generic glue_helper ablk_helper blowfish_generic cls_u32 blowfish_common twofish_generic sch_htb twofish_i586 twofish_common xcbc sha512_generic des_generic geode_aes xt_owner xt_multiport ip6table_filter ip6_tables xenfs xen_privcmd coretemp xen_netfront nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack libcrc32c crc32c_intel xen_blkfront
Jan 8 15:15:43 xx kernel: CPU: 0 PID: 1740 Comm: java Not tainted 4.14.11-200.fc26.i686+PAE #1
Jan 8 15:15:43 xx kernel: task: d750c500 task.stack: d7f96000
Jan 8 15:15:43 xx kernel: EIP: __radix_tree_lookup+0xe/0xa0
Jan 8 15:15:43 xx kernel: EFLAGS: 00010282 CPU: 0
Jan 8 15:15:43 xx kernel: EAX: 00000004 EBX: 0de6d000 ECX: 00000000 EDX: 00000000
Jan 8 15:15:43 xx kernel: ESI: 00000000 EDI: 00000004 EBP: d7f97da8 ESP: d7f97d98
Jan 8 15:15:43 xx kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
Jan 8 15:15:43 xx kernel: CR0: 80050033 CR2: 00000008 CR3: 01ee4000 CR4: 00002660
Jan 8 15:15:43 xx kernel: Call Trace:
Jan 8 15:15:43 xx kernel: radix_tree_lookup_slot+0x1d/0x40
Jan 8 15:15:43 xx kernel: find_get_entry+0x20/0x160
Jan 8 15:15:43 xx kernel: pagecache_get_page+0x24/0x290
Jan 8 15:15:43 xx kernel:
lookup_swap_cache+0x3a/0x100 Jan 8 15:15:43 xx kernel: swap_readahead_detect+0x55/0x280 Jan 8 15:15:43 xx kernel: ? xen_set_pte_at+0x81/0x140 Jan 8 15:15:43 xx kernel: do_swap_page+0x22a/0x990 Jan 8 15:15:43 xx kernel: ? wp_page_copy+0x361/0x6f0 Jan 8 15:15:43 xx kernel: ? kmap_atomic_prot+0x3e/0x130 Jan 8 15:15:43 xx kernel: handle_mm_fault+0x498/0xc90 Jan 8 15:15:43 xx kernel: ? xen_timer_interrupt+0x17/0x30 Jan 8 15:15:43 xx kernel: __do_page_fault+0x202/0x4d0 Jan 8 15:15:43 xx kernel: ? __do_page_fault+0x4d0/0x4d0 Jan 8 15:15:43 xx kernel: do_page_fault+0x27/0xd0 Jan 8 15:15:43 xx kernel: ? __do_page_fault+0x4d0/0x4d0 Jan 8 15:15:43 xx kernel: common_exception+0x6f/0x76 Jan 8 15:15:43 xx kernel: EIP: 0x1f51b404 Jan 8 15:15:43 xx kernel: EFLAGS: 00010202 CPU: 0 Jan 8 15:15:43 xx kernel: EAX: 00000000 EBX: 0de6df60 ECX: 00000000 EDX: 40000000 Jan 8 15:15:43 xx kernel: ESI: 00000001 EDI: 40000000 EBP: 0f1f40b8 ESP: 0f1f409c Jan 8 15:15:43 xx kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Jan 8 15:15:43 xx kernel: Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 34 73 8b ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 Jan 8 15:15:43 xx kernel: EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0069:d7f97d98 Jan 8 15:15:43 xx kernel: CR2: 0000000000000008 Jan 8 15:15:43 xx kernel: ---[ end trace 91253bf32b64ee98 ]---
New Kernel keeps oopsing ? : 4.14.13-200.fc26.x86_64 #1 Jan 15 17:10:52 s113 kernel: Loaded X.509 cert 'Fedora kernel signing key: 576c47169253a79ded122c9a10ad1a036dd36fc6' Jan 15 17:10:52 s113 kernel: zswap: loaded using pool lzo/zbud Jan 15 17:10:52 s113 kernel: Key type big_key registered Jan 15 17:10:52 s113 kernel: Key type encrypted registered Jan 15 17:10:52 s113 kernel: xenbus_probe_frontend: Device with no driver: device/vbd/51712 Jan 15 17:10:52 s113 kernel: xenbus_probe_frontend: Device with no driver: device/vbd/51728 Jan 15 17:10:52 s113 kernel: xenbus_probe_frontend: Device with no driver: device/vif/0 Jan 15 17:10:52 s113 kernel: Magic number: 1:252:3141 Jan 15 17:10:52 s113 kernel: hctosys: unable to open rtc device (rtc0) Jan 15 17:10:52 s113 kernel: Freeing unused kernel memory: 2108K Jan 15 17:10:52 s113 kernel: Write protecting the kernel read-only data: 18432k Jan 15 17:10:52 s113 kernel: Freeing unused kernel memory: 2012K Jan 15 17:10:52 s113 kernel: Freeing unused kernel memory: 368K Jan 15 17:10:52 s113 kernel: x86/mm: Found insecure W+X mapping at address ffff880000000000/0xffff880000000000 Jan 15 17:10:52 s113 kernel: ------------[ cut here ]------------ Jan 15 17:10:52 s113 kernel: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:266 note_page+0x646/0x830 Jan 15 17:10:52 s113 kernel: Modules linked in: Jan 15 17:10:52 s113 kernel: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.13-200.fc26.x86_64 #1 Jan 15 17:10:52 s113 kernel: task: ffff88007c453c00 task.stack: ffffc90000714000 Jan 15 17:10:52 s113 kernel: RIP: e030:note_page+0x646/0x830 Jan 15 17:10:52 s113 kernel: RSP: e02b:ffffc90000717df8 EFLAGS: 00010282 Jan 15 17:10:52 s113 kernel: RAX: 0000000000000051 RBX: ffffc90000717e90 RCX: ffffffff82252fc8 Jan 15 17:10:52 s113 kernel: RDX: 0000000000000000 RSI: 00000000000000e0 RDI: 0000000000000200 Jan 15 17:10:52 s113 kernel: RBP: ffffc90000717e30 R08: 00000000000000e0 R09: 0000000000000004 Jan 15 17:10:52 s113 kernel: R10: 
0000000000080000 R11: 0000000000000001 R12: 8010000000000065
Jan 15 17:10:52 s113 kernel: R13: 0000000000000005 R14: 0000000000000000 R15: ffffc90000717e90
Jan 15 17:10:52 s113 kernel: FS: 00007f163e764700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000
Jan 15 17:10:52 s113 kernel: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 15 17:10:52 s113 kernel: CR2: 00007f163e320000 CR3: 000000000220a000 CR4: 0000000000002660
Jan 15 17:10:52 s113 kernel: Call Trace:
Jan 15 17:10:52 s113 kernel: ptdump_walk_pgd_level_core+0x402/0x4e0
Jan 15 17:10:52 s113 kernel: ? 0xffffffff81000000
Jan 15 17:10:52 s113 kernel: ptdump_walk_pgd_level_checkwx+0x19/0x50
Jan 15 17:10:52 s113 kernel: mark_rodata_ro+0xf4/0x100
Jan 15 17:10:52 s113 kernel: ? rest_init+0xb0/0xb0
Jan 15 17:10:52 s113 kernel: kernel_init+0x2f/0x101
Jan 15 17:10:52 s113 kernel: ret_from_fork+0x32/0x40
Jan 15 17:10:52 s113 kernel: Code: 19 24 00 41 f7 c7 00 01 00 00 0f 85 33 fe ff ff e9 9f fc ff ff 48 89 f2 48 c7 c7 40 fa 07 82 c6 05 2a fd 2b 01 01 e8 ab 91 09 00 <0f> ff 48 8b 73 10 e9 50 fa ff ff 4d 89 c8 b9 10 00 00 00 be 10
Jan 15 17:10:52 s113 kernel: ---[ end trace 6134594275218b79 ]---
Jan 15 17:10:52 s113 kernel: x86/mm: Checked W+X mappings: FAILED, 5310 W+X pages found.

I don't have a long-term log yet; I will recheck it tomorrow.
New Kernel Bugs : 3 in 9 hours Jan 16 00:22:03 s113 kernel: BUG: unable to handle kernel paging request at 0000000000005401 Jan 16 00:22:03 s113 kernel: IP: 0x5401 Jan 16 00:22:03 s113 kernel: PGD 25ede067 P4D 25ede067 PUD 79a6d067 PMD 0 Jan 16 00:22:03 s113 kernel: Oops: 0010 [#1] SMP NOPTI Jan 16 00:22:03 s113 kernel: Modules linked in: fuse xt_owner xt_multiport ip6table_filter ip6_tables cls_u32 sch_htb cfg80211 rfkill xenfs xen_privcmd coretemp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront crc32c_intel xen_netfront Jan 16 00:22:03 s113 kernel: CPU: 0 PID: 28840 Comm: xenstore Tainted: G W 4.14.13-200.fc26.x86_64 #1 Jan 16 00:22:03 s113 kernel: task: ffff880021735a00 task.stack: ffffc90000b78000 Jan 16 00:22:03 s113 kernel: RIP: e030:0x5401 Jan 16 00:22:03 s113 kernel: RSP: e02b:ffffc90000b7bfd0 EFLAGS: 00010216 Jan 16 00:22:03 s113 kernel: RAX: 0000000000000036 RBX: 0000000000000001 RCX: 0000000000005401 Jan 16 00:22:03 s113 kernel: RDX: 00000000ffa97f1c RSI: 00000000ffa97f58 RDI: 0000000000000000 Jan 16 00:22:03 s113 kernel: RBP: 00000000ffa97f08 R08: 0000000000000000 R09: 0000000000000000 Jan 16 00:22:03 s113 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 16 00:22:03 s113 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 16 00:22:03 s113 kernel: FS: 00007f94cf320b40(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 Jan 16 00:22:03 s113 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 16 00:22:03 s113 kernel: CR2: 0000000000005401 CR3: 0000000073142000 CR4: 0000000000002660 Jan 16 00:22:03 s113 kernel: Call Trace: Jan 16 00:22:03 s113 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 16 00:22:03 s113 kernel: Code: Bad RIP value. 
Jan 16 00:22:03 s113 kernel: RIP: 0x5401 RSP: ffffc90000b7bfd0 Jan 16 00:22:03 s113 kernel: CR2: 0000000000005401 Jan 16 00:22:03 s113 kernel: ---[ end trace 6134594275218b7a ]--- Jan 16 02:01:04 s113 kernel: FS-Cache: Loaded Jan 16 02:01:04 s113 kernel: FS-Cache: Netfs 'nfs' registered for caching Jan 16 07:50:48 s113 kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000000c3 Jan 16 07:50:48 s113 kernel: IP: 0xffa100d0 Jan 16 07:50:48 s113 kernel: PGD 2cca0067 P4D 2cca0067 PUD 730e1067 PMD 0 Jan 16 07:50:48 s113 kernel: Oops: 0002 [#2] SMP NOPTI Jan 16 07:50:48 s113 kernel: Modules linked in: nfsv3 nfs fscache fuse xt_owner xt_multiport ip6table_filter ip6_tables cls_u32 sch_htb cfg80211 rfkill xenfs xen_privcmd coretemp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront crc32c_intel xen_netfront Jan 16 07:50:48 s113 kernel: CPU: 0 PID: 27826 Comm: xenstore Tainted: G D W 4.14.13-200.fc26.x86_64 #1 Jan 16 07:50:48 s113 kernel: task: ffff88002fbada00 task.stack: ffffc90000bf8000 Jan 16 07:50:48 s113 kernel: RIP: e030:0xffa100d0 Jan 16 07:50:48 s113 kernel: RSP: e02b:ffffc90000bfbfd0 EFLAGS: 00010206 Jan 16 07:50:48 s113 kernel: RAX: 00000000000000c3 RBX: 0000000008052397 RCX: 00000000ffa100d0 Jan 16 07:50:48 s113 kernel: RDX: 000000000805406c RSI: 0000000000000004 RDI: 0000000000000000 Jan 16 07:50:48 s113 kernel: RBP: 00000000ffa10138 R08: 0000000000000000 R09: 0000000000000000 Jan 16 07:50:48 s113 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 16 07:50:48 s113 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 16 07:50:48 s113 kernel: FS: 00007f1c68889b40(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 Jan 16 07:50:48 s113 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 16 07:50:48 s113 kernel: CR2: 00000000000000c3 CR3: 000000007a036000 CR4: 0000000000002660 Jan 16 07:50:48 s113 kernel: Call Trace: Jan 16 07:50:48 s113 kernel: ? 
switch_to_thread_stack+0x21/0x40 Jan 16 07:50:48 s113 kernel: Code: Bad RIP value. Jan 16 07:50:48 s113 kernel: RIP: 0xffa100d0 RSP: ffffc90000bfbfd0 Jan 16 07:50:48 s113 kernel: CR2: 00000000000000c3 Jan 16 07:50:48 s113 kernel: ---[ end trace 6134594275218b7b ]--- Jan 16 09:34:56 s113 kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000000c3 Jan 16 09:34:56 s113 kernel: IP: 0xffaa37d0 Jan 16 09:34:56 s113 kernel: PGD 7a521067 P4D 7a521067 PUD 4e4cf067 PMD 0 Jan 16 09:34:56 s113 kernel: Oops: 0002 [#3] SMP NOPTI Jan 16 09:34:56 s113 kernel: Modules linked in: nfsv3 nfs fscache fuse xt_owner xt_multiport ip6table_filter ip6_tables cls_u32 sch_htb cfg80211 rfkill xenfs xen_privcmd coretemp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront crc32c_intel xen_netfront Jan 16 09:34:56 s113 kernel: CPU: 0 PID: 18183 Comm: xenstore Tainted: G D W 4.14.13-200.fc26.x86_64 #1 Jan 16 09:34:56 s113 kernel: task: ffff88007a615a00 task.stack: ffffc900019e0000 Jan 16 09:34:56 s113 kernel: RIP: e030:0xffaa37d0 Jan 16 09:34:56 s113 kernel: RSP: e02b:ffffc900019e3fd0 EFLAGS: 00010206 Jan 16 09:34:56 s113 kernel: RAX: 00000000000000c3 RBX: 0000000008052397 RCX: 00000000ffaa37d0 Jan 16 09:34:56 s113 kernel: RDX: 000000000805406c RSI: 0000000000000004 RDI: 0000000000000000 Jan 16 09:34:56 s113 kernel: RBP: 00000000ffaa3838 R08: 0000000000000000 R09: 0000000000000000 Jan 16 09:34:56 s113 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 16 09:34:56 s113 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 16 09:34:56 s113 kernel: FS: 00007fdecd4a9b40(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 Jan 16 09:34:56 s113 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 16 09:34:56 s113 kernel: CR2: 00000000000000c3 CR3: 0000000079a86000 CR4: 0000000000002660 Jan 16 09:34:56 s113 kernel: Call Trace: Jan 16 09:34:56 s113 kernel: ? 
switch_to_thread_stack+0x21/0x40 Jan 16 09:34:56 s113 kernel: Code: Bad RIP value. Jan 16 09:34:56 s113 kernel: RIP: 0xffaa37d0 RSP: ffffc900019e3fd0 Jan 16 09:34:56 s113 kernel: CR2: 00000000000000c3 Jan 16 09:34:56 s113 kernel: ---[ end trace 6134594275218b7c ]---
Same here on 4.14.13-200.fc26.i686+PAE. Dmesg output below: [ 6300.104631] BUG: unable to handle kernel NULL pointer dereference at 00000008 [ 6300.104653] IP: __radix_tree_lookup+0xe/0xa0 [ 6300.104655] *pdpt = 000000000ea5c001 *pde = 0000000000000000 [ 6300.104660] Oops: 0000 [#1] SMP [ 6300.104663] Modules linked in: fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ccm nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc intel_powerclamp arc4 coretemp snd_hda_codec_hdmi kvm_intel snd_hda_codec_realtek snd_hda_codec_generic ath9k kvm ath9k_common ath9k_hw snd_hda_intel uvcvideo iTCO_wdt iTCO_vendor_support snd_hda_codec videobuf2_vmalloc mac80211 irqbypass videobuf2_memops [ 6300.104711] videobuf2_v4l2 intel_cstate intel_uncore videobuf2_core snd_hda_core videodev snd_hwdep snd_seq snd_seq_device ath media snd_pcm cfg80211 joydev snd_timer mei_me snd mei i2c_i801 intel_ips soundcore lpc_ich acer_wmi sparse_keymap rfkill mxm_wmi wmi acpi_cpufreq i915 i2c_algo_bit crc32c_intel drm_kms_helper broadcom bcm_phy_lib serio_raw drm tg3 ata_generic ptp pata_acpi pps_core video [ 6300.104742] CPU: 1 PID: 12127 Comm: java Not tainted 4.14.13-200.fc26.i686+PAE #1 [ 6300.104744] Hardware name: Acer Aspire 4740 /Aspire 4740 , BIOS V1.05 02/01/2010 [ 6300.104746] task: ef710000 task.stack: e51de000 [ 6300.104749] EIP: __radix_tree_lookup+0xe/0xa0 [ 6300.104750] EFLAGS: 00010282 CPU: 1 [ 6300.104752] EAX: 00000004 EBX: 6d135000 ECX: 00000000 EDX: 00000000 [ 6300.104753] ESI: 00000000 EDI: 00000004 EBP: e51dfda8 ESP: e51dfd98 [ 6300.104755] DS: 007b ES: 007b FS: 00d8 GS: 
00e0 SS: 0068 [ 6300.104757] CR0: 80050033 CR2: 00000008 CR3: 36233540 CR4: 000006f0 [ 6300.104759] Call Trace: [ 6300.104766] radix_tree_lookup_slot+0x1d/0x40 [ 6300.104773] find_get_entry+0x20/0x160 [ 6300.104776] pagecache_get_page+0x24/0x290 [ 6300.104780] lookup_swap_cache+0x3a/0x100 [ 6300.104783] swap_readahead_detect+0x55/0x280 [ 6300.104788] do_swap_page+0x22a/0x990 [ 6300.104794] ? __update_load_avg_se.isra.25+0x214/0x220 [ 6300.104799] ? kmap_atomic_prot+0x103/0x130 [ 6300.104802] handle_mm_fault+0x498/0xc90 [ 6300.104806] __do_page_fault+0x202/0x4d0 [ 6300.104809] ? __do_page_fault+0x4d0/0x4d0 [ 6300.104811] do_page_fault+0x27/0xd0 [ 6300.104815] ? irq_exit+0x68/0xd0 [ 6300.104817] ? __do_page_fault+0x4d0/0x4d0 [ 6300.104822] common_exception+0x81/0x86 [ 6300.104824] EIP: 0xb7572ef2 [ 6300.104825] EFLAGS: 00010206 CPU: 1 [ 6300.104827] EAX: 00000000 EBX: b790b000 ECX: 00000056 EDX: 6d134f80 [ 6300.104829] ESI: 949eebe0 EDI: 6d135000 EBP: 6c4ff178 ESP: 6c4ff120 [ 6300.104830] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [ 6300.104832] Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 64 6c 8b ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 [ 6300.104863] EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0068:e51dfd98 [ 6300.104865] CR2: 0000000000000008 [ 6300.104868] ---[ end trace 8fba1c28cc11c45f ]--- [ 6300.109095] BUG: unable to handle kernel NULL pointer dereference at 00000008 [ 6300.109110] IP: __radix_tree_lookup+0xe/0xa0 [ 6300.109112] *pdpt = 000000000ea5c001 *pde = 0000000000000000 [ 6300.109116] Oops: 0000 [#2] SMP [ 6300.109118] Modules linked in: fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ccm nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw 
ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc intel_powerclamp arc4 coretemp snd_hda_codec_hdmi kvm_intel snd_hda_codec_realtek snd_hda_codec_generic ath9k kvm ath9k_common ath9k_hw snd_hda_intel uvcvideo iTCO_wdt iTCO_vendor_support snd_hda_codec videobuf2_vmalloc mac80211 irqbypass videobuf2_memops [ 6300.109155] videobuf2_v4l2 intel_cstate intel_uncore videobuf2_core snd_hda_core videodev snd_hwdep snd_seq snd_seq_device ath media snd_pcm cfg80211 joydev snd_timer mei_me snd mei i2c_i801 intel_ips soundcore lpc_ich acer_wmi sparse_keymap rfkill mxm_wmi wmi acpi_cpufreq i915 i2c_algo_bit crc32c_intel drm_kms_helper broadcom bcm_phy_lib serio_raw drm tg3 ata_generic ptp pata_acpi pps_core video [ 6300.109181] CPU: 0 PID: 12126 Comm: java Tainted: G D 4.14.13-200.fc26.i686+PAE #1 [ 6300.109182] Hardware name: Acer Aspire 4740 /Aspire 4740 , BIOS V1.05 02/01/2010 [ 6300.109184] task: ef715640 task.stack: cc47e000 [ 6300.109186] EIP: __radix_tree_lookup+0xe/0xa0 [ 6300.109187] EFLAGS: 00010282 CPU: 0 [ 6300.109189] EAX: 00000004 EBX: 6d3c0000 ECX: 00000000 EDX: 00000000 [ 6300.109190] ESI: 00000000 EDI: 00000004 EBP: cc47fda8 ESP: cc47fd98 [ 6300.109192] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 6300.109193] CR0: 80050033 CR2: 00000008 CR3: 36233540 CR4: 000006f0 [ 6300.109195] Call Trace: [ 6300.109199] radix_tree_lookup_slot+0x1d/0x40 [ 6300.109205] find_get_entry+0x20/0x160 [ 6300.109208] pagecache_get_page+0x24/0x290 [ 6300.109212] lookup_swap_cache+0x3a/0x100 [ 6300.109214] swap_readahead_detect+0x55/0x280 [ 6300.109217] do_swap_page+0x22a/0x990 [ 6300.109222] ? irq_exit+0x68/0xd0 [ 6300.109227] ? kmap_atomic_prot+0x103/0x130 [ 6300.109229] handle_mm_fault+0x498/0xc90 [ 6300.109232] __do_page_fault+0x202/0x4d0 [ 6300.109234] ? 
__do_page_fault+0x4d0/0x4d0 [ 6300.109235] do_page_fault+0x27/0xd0 [ 6300.109239] ? do_IRQ+0x70/0xc0 [ 6300.109241] ? __do_page_fault+0x4d0/0x4d0 [ 6300.109243] common_exception+0x81/0x86 [ 6300.109245] EIP: 0xb7572ef2 [ 6300.109246] EFLAGS: 00010202 CPU: 0 [ 6300.109247] EAX: 00000000 EBX: b790b000 ECX: 0000000c EDX: 6d3bff90 [ 6300.109249] ESI: 9447df30 EDI: 6d3c0000 EBP: 6c6fa178 ESP: 6c6fa120 [ 6300.109250] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [ 6300.109252] Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 64 6c 8b ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 [ 6300.109275] EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0068:cc47fd98 [ 6300.109276] CR2: 0000000000000008 [ 6300.109279] ---[ end trace 8fba1c28cc11c460 ]---
In my case I tried to start Arduino IDE: RPM -> arduino-1.6.6-3.fc26.noarch
Java version: java-1.8.0-openjdk-1.8.0.151-1.b12.fc26.i386
Someone else reported an instance upstream, and one of the developers provided a debugging patch. Can you test a scratch build with the debug patch? https://koji.fedoraproject.org/koji/taskinfo?taskID=24303894
When will it be compiled (estimated)?
Which package? _debug or the normal one?
Normal one should be fine. The debugging patch will spit out a warning "Bad swp_entry:" if it finds anything.
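A minimal sketch of how the kernel log could be checked for that warning once the scratch build is running. It only matches the "Bad swp_entry:" prefix quoted above; the rest of the message format is not shown in this thread, so nothing else is assumed about it. The function name is hypothetical.

```python
from typing import Iterable, List

# Prefix quoted in the comment above; the remainder of the message is
# unknown here, so only this prefix is matched.
MARKER = "Bad swp_entry:"


def find_bad_swp_entries(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the debug patch's warning prefix."""
    return [line.rstrip("\n") for line in lines if MARKER in line]
```

One possible use is to feed it the saved output of `journalctl -k` or the contents of /var/log/messages, for example `find_bad_swp_entries(open("/var/log/messages"))`, and attach any hits to this report.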
Kernel booted. Let's wait for a bug...
and here are the results: Jan 21 06:04:21 s145 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000003 Jan 21 06:04:21 s145 kernel: IP: 0x99cf048 Jan 21 06:04:21 s145 kernel: PGD 461b067 P4D 461b067 PUD 57fd067 PMD 0 Jan 21 06:04:21 s145 kernel: Oops: 0002 [#3] SMP NOPTI Jan 21 06:04:21 s145 kernel: Modules linked in: nfsv3 nfs fscache fuse nfsd auth_rpcgss nfs_acl lockd grace xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc edac_mce_amd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xen_netfront xen_blkfront crc32c_intel Jan 21 06:04:21 s145 kernel: CPU: 0 PID: 6020 Comm: xenstore-read Tainted: G D W 4.14.14-201.fc26.x86_64 #1 Jan 21 06:04:21 s145 kernel: task: ffff880005708000 task.stack: ffffc90000d18000 Jan 21 06:04:21 s145 kernel: RIP: e030:0x99cf048 Jan 21 06:04:21 s145 kernel: RSP: e02b:ffffc90000d1bfd0 EFLAGS: 00010206 Jan 21 06:04:21 s145 kernel: RAX: 0000000000000003 RBX: 0000000000000003 RCX: 00000000099cf048 Jan 21 06:04:21 s145 kernel: RDX: 0000000000000002 RSI: 00000000099cf048 RDI: 0000000000000000 Jan 21 06:04:21 s145 kernel: RBP: 00000000ffddc0a8 R08: 0000000000000000 R09: 0000000000000000 Jan 21 06:04:21 s145 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 21 06:04:21 s145 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 21 06:04:21 s145 kernel: FS: 00007f8761262b40(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 Jan 21 06:04:21 s145 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 21 06:04:21 s145 kernel: CR2: 0000000000000003 CR3: 0000000003556000 CR4: 0000000000000660 Jan 21 06:04:21 s145 kernel: Call Trace: Jan 21 06:04:21 s145 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 21 06:04:21 s145 kernel: Code: Bad RIP value. 
Jan 21 06:04:21 s145 kernel: RIP: 0x99cf048 RSP: ffffc90000d1bfd0 Jan 21 06:04:21 s145 kernel: CR2: 0000000000000003 Jan 21 06:04:21 s145 kernel: ---[ end trace 50c257ff957ddb5b ]--- Jan 21 08:17:31 s145 kernel: invalid opcode: 0000 [#4] SMP NOPTI Jan 21 08:17:31 s145 kernel: Modules linked in: nfsv3 nfs fscache fuse nfsd auth_rpcgss nfs_acl lockd grace xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc edac_mce_amd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xen_netfront xen_blkfront crc32c_intel Jan 21 08:17:31 s145 kernel: CPU: 0 PID: 7087 Comm: xenstore-read Tainted: G D W 4.14.14-201.fc26.x86_64 #1 Jan 21 08:17:31 s145 kernel: task: ffff88007a873c00 task.stack: ffffc900008c8000 Jan 21 08:17:31 s145 kernel: RIP: e030:0xffd9af11 Jan 21 08:17:31 s145 kernel: RSP: e02b:ffffc900008cbfd0 EFLAGS: 00010206 Jan 21 08:17:31 s145 kernel: RAX: 0000000000000004 RBX: 0000000000000003 RCX: 00000000ffd9af11 Jan 21 08:17:31 s145 kernel: RDX: 0000000000000022 RSI: 00000000ffd9af11 RDI: 0000000000000000 Jan 21 08:17:31 s145 kernel: RBP: 00000000ffd99b58 R08: 0000000000000000 R09: 0000000000000000 Jan 21 08:17:31 s145 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 21 08:17:31 s145 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 21 08:17:31 s145 kernel: FS: 00007f2cc2baeb40(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 Jan 21 08:17:31 s145 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 21 08:17:31 s145 kernel: CR2: 00000000088bb004 CR3: 0000000079d66000 CR4: 0000000000000660 Jan 21 08:17:31 s145 kernel: Call Trace: Jan 21 08:17:31 s145 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 21 08:17:31 s145 kernel: Code: Bad RIP value. Jan 21 08:17:31 s145 kernel: RIP: 0xffd9af11 RSP: ffffc900008cbfd0 Jan 21 08:17:31 s145 kernel: ---[ end trace 50c257ff957ddb5c ]---
(In reply to customercare from comment #4)
> what worries me, is that "swapper/0" entry, directly after starting the
> system :

The "swapper/0" entry was printed because the "x86/mm: Found insecure W+X mapping at address ffff880000000000/0xffff880000000000" line was printed.

----------------------------------------
/*
 * This function gets called on a break in a continuous series
 * of PTE entries; the next one is different so we need to
 * print what we collected so far.
 */
static void note_page(struct seq_file *m, struct pg_state *st,
		      pgprot_t new_prot, int level)
{
	pgprotval_t prot, cur;
	static const char units[] = "BKMGTPE";

	/*
	 * If we have a "break" in the series, we need to flush the state that
	 * we have now. "break" is either changing perms, levels or
	 * address space marker.
	 */
	prot = pgprot_val(new_prot);
	cur = pgprot_val(st->current_prot);

	if (!st->level) {
		/* First entry */
		st->current_prot = new_prot;
		st->level = level;
		st->marker = address_markers;
		st->lines = 0;
		pt_dump_seq_printf(m, st->to_dmesg, "---[ %s ]---\n",
				   st->marker->name);
	} else if (prot != cur || level != st->level ||
		   st->current_address >= st->marker[1].start_address) {
		const char *unit = units;
		unsigned long delta;
		int width = sizeof(unsigned long) * 2;
		pgprotval_t pr = pgprot_val(st->current_prot);

		if (st->check_wx && (pr & _PAGE_RW) && !(pr & _PAGE_NX)) {
			WARN_ONCE(1,
				  "x86/mm: Found insecure W+X mapping at address %p/%pS\n",
				  (void *)st->start_address,
				  (void *)st->start_address);
			st->wx_pages += (st->current_address -
					 st->start_address) / PAGE_SIZE;
		}
----------------------------------------

This warning in note_page() exists in older kernels such as 4.14.0 and 4.13.0. Did you see "x86/mm: Found insecure W+X mapping at address" line before you started using a kernel version with this problem (e.g. 4.14.11-200.fc26)? If you did not see the line before, trying to confirm since which kernel version the line started printing might give you some clue.
(Doing "git bisect" for identifying the exact commit is nice. But doing rough testing using kernels available as binary rpms would be fine as the first step.)
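To make the condition in the quoted note_page() excerpt concrete, here is a small sketch of the same bit test and page count outside the kernel. The constants are assumptions based on the usual x86 page-table layout (RW is bit 1, NX is bit 63, 4 KiB pages) and are not taken from this thread; the function names are hypothetical.

```python
PAGE_SIZE = 4096       # assumed 4 KiB base page size
_PAGE_RW = 1 << 1      # assumed x86 PTE bit 1: writable
_PAGE_NX = 1 << 63     # assumed x86 PTE bit 63: no-execute


def is_wx(prot: int) -> bool:
    """Same test as the excerpt: writable and not marked no-execute."""
    return bool(prot & _PAGE_RW) and not (prot & _PAGE_NX)


def wx_pages(start_address: int, current_address: int) -> int:
    """Pages one contiguous range adds to st->wx_pages in the excerpt."""
    return (current_address - start_address) // PAGE_SIZE
```

With these assumptions, a protection value with the RW bit set and the NX bit clear is reported as W+X, and the "N W+X pages found" total is the sum of `wx_pages()` over every such range.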
Info: on a freshly booted 32bit system with 4.14.14-200.fc26.i686+PAE, the swapper/0 oops seems to be gone.
Jan 25 03:14:32 s113 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 Jan 25 03:14:32 s113 kernel: IP: switch_to_thread_stack+0x24/0x40 Jan 25 03:14:32 s113 kernel: PGD 7bdad067 P4D 7bdad067 PUD 4d9d067 PMD 0 Jan 25 03:14:32 s113 kernel: Oops: 0000 [#29] SMP NOPTI Jan 25 03:14:32 s113 kernel: Modules linked in: nfsv3 nfs fscache fuse xt_owner xt_multiport ip6table_filter ip6_tables cls_u32 sch_htb cfg80211 rfkill xenfs xen_privcmd coretemp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront crc32c_intel xen_netfront Jan 25 03:14:32 s113 kernel: CPU: 0 PID: 32713 Comm: xenstore-read Tainted: G D W 4.14.13-200.fc26.x86_64 #1 Jan 25 03:14:32 s113 kernel: task: ffff880002b7da00 task.stack: ffffc900009d0000 Jan 25 03:14:32 s113 kernel: RIP: e030:switch_to_thread_stack+0x24/0x40 Jan 25 03:14:33 s113 kernel: RSP: e02b:ffffc900009d3fd0 EFLAGS: 00010216 Jan 25 03:14:33 s113 kernel: RAX: 0000000000000036 RBX: 0000000000000000 RCX: 0000000000005401 Jan 25 03:14:33 s113 kernel: RDX: 00000000ffc54020 RSI: 00000000ffc5405c RDI: 0000000000000000 Jan 25 03:14:33 s113 kernel: RBP: 00000000ffc5400c R08: 0000000000000000 R09: 0000000000000000 Jan 25 03:14:33 s113 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 25 03:14:33 s113 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 25 03:14:33 s113 kernel: FS: 00007fcf716c0700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 Jan 25 03:14:33 s113 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 25 03:14:33 s113 kernel: CR2: 0000000000000030 CR3: 000000007ac10000 CR4: 0000000000002660 Jan 25 03:14:33 s113 kernel: Call Trace: Jan 25 03:14:33 s113 kernel: Code: e9 5b fa ff ff 66 90 57 eb 12 0f 20 df 66 66 90 66 90 48 81 e7 ff e7 ff ff 0f 22 df 48 89 e7 65 48 8b 24 25 0c 50 00 00 ff 77 38 <ff> 77 30 ff 77 28 ff 77 20 ff 77 18 ff 77 10 ff 77 08 48 8b 3f Jan 25 03:14:33 s113 kernel: RIP: switch_to_thread_stack+0x24/0x40 RSP: 
ffffc900009d3fd0 Jan 25 03:14:33 s113 kernel: CR2: 0000000000000030 Jan 25 03:14:33 s113 kernel: ---[ end trace 6134594275218b96 ]--- Jan 25 06:11:02 s113 kernel: BUG: unable to handle kernel paging request at 0000000000005401 Jan 25 06:11:02 s113 kernel: IP: 0x5401 Jan 25 06:11:02 s113 kernel: PGD 5944067 P4D 5944067 PUD 4935067 PMD 0 Jan 25 06:11:02 s113 kernel: Oops: 0010 [#30] SMP NOPTI Jan 25 06:11:02 s113 kernel: Modules linked in: nfsv3 nfs fscache fuse xt_owner xt_multiport ip6table_filter ip6_tables cls_u32 sch_htb cfg80211 rfkill xenfs xen_privcmd coretemp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront crc32c_intel xen_netfront Jan 25 06:11:02 s113 kernel: CPU: 0 PID: 7317 Comm: xenstore-read Tainted: G D W 4.14.13-200.fc26.x86_64 #1 Jan 25 06:11:02 s113 kernel: task: ffff8800169d8000 task.stack: ffffc90001098000 Jan 25 06:11:02 s113 kernel: RIP: e030:0x5401 Jan 25 06:11:02 s113 kernel: RSP: e02b:ffffc9000109bfd0 EFLAGS: 00010216 Jan 25 06:11:02 s113 kernel: RAX: 0000000000000036 RBX: 0000000000000001 RCX: 0000000000005401 Jan 25 06:11:02 s113 kernel: RDX: 00000000ffe5fcfc RSI: 00000000ffe5fd38 RDI: 0000000000000000 Jan 25 06:11:02 s113 kernel: RBP: 00000000ffe5fce8 R08: 0000000000000000 R09: 0000000000000000 Jan 25 06:11:02 s113 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 25 06:11:02 s113 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 25 06:11:02 s113 kernel: FS: 00007f665364e700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 Jan 25 06:11:02 s113 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 25 06:11:02 s113 kernel: CR2: 0000000000005401 CR3: 00000000058fe000 CR4: 0000000000002660 Jan 25 06:11:02 s113 kernel: Call Trace: Jan 25 06:11:02 s113 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 25 06:11:02 s113 kernel: Code: Bad RIP value. 
Jan 25 06:11:02 s113 kernel: RIP: 0x5401 RSP: ffffc9000109bfd0 Jan 25 06:11:02 s113 kernel: CR2: 0000000000005401 Jan 25 06:11:02 s113 kernel: ---[ end trace 6134594275218b97 ]--- Jan 25 07:07:07 s113 kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000000ae Jan 25 07:07:07 s113 kernel: IP: 0xffd70320 Jan 25 07:07:07 s113 kernel: PGD 561c067 P4D 561c067 PUD 2b93067 PMD 0 Jan 25 07:07:07 s113 kernel: Oops: 0002 [#31] SMP NOPTI Jan 25 07:07:07 s113 kernel: Modules linked in: nfsv3 nfs fscache fuse xt_owner xt_multiport ip6table_filter ip6_tables cls_u32 sch_htb cfg80211 rfkill xenfs xen_privcmd coretemp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront crc32c_intel xen_netfront Jan 25 07:07:07 s113 kernel: CPU: 0 PID: 20186 Comm: xenstore-read Tainted: G D W 4.14.13-200.fc26.x86_64 #1 Jan 25 07:07:07 s113 kernel: task: ffff880037089e00 task.stack: ffffc90001048000 Jan 25 07:07:07 s113 kernel: RIP: e030:0xffd70320 Jan 25 07:07:07 s113 kernel: RSP: e02b:ffffc9000104bfd0 EFLAGS: 00010206 Jan 25 07:07:07 s113 kernel: RAX: 00000000000000ae RBX: 000000000000000d RCX: 00000000ffd70320 Jan 25 07:07:07 s113 kernel: RDX: 00000000ffd70294 RSI: 0000000000000008 RDI: 0000000000000000 Jan 25 07:07:07 s113 kernel: RBP: 00000000ffd7027c R08: 0000000000000000 R09: 0000000000000000 Jan 25 07:07:07 s113 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 25 07:07:07 s113 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 25 07:07:07 s113 kernel: FS: 00007fa627107700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000 Jan 25 07:07:07 s113 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 25 07:07:07 s113 kernel: CR2: 00000000000000ae CR3: 0000000006284000 CR4: 0000000000002660 Jan 25 07:07:07 s113 kernel: Call Trace: Jan 25 07:07:07 s113 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 25 07:07:07 s113 kernel: Code: Bad RIP value. 
Jan 25 07:07:07 s113 kernel: RIP: 0xffd70320 RSP: ffffc9000104bfd0 Jan 25 07:07:07 s113 kernel: CR2: 00000000000000ae Jan 25 07:07:07 s113 kernel: ---[ end trace 6134594275218b98 ]---
4.14.14-201 64bit : Jan 25 00:27:17 s145 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000003 Jan 25 00:27:17 s145 kernel: IP: 0x8c37060 Jan 25 00:27:17 s145 kernel: PGD 92ab067 P4D 92ab067 PUD 283a1067 PMD 0 Jan 25 00:27:17 s145 kernel: Oops: 0002 [#15] SMP NOPTI Jan 25 00:27:17 s145 kernel: Modules linked in: nfsv3 nfs fscache fuse nfsd auth_rpcgss nfs_acl lockd grace xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc edac_mce_amd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xen_netfront xen_blkfront crc32c_intel Jan 25 00:27:17 s145 kernel: CPU: 0 PID: 14191 Comm: xenstore Tainted: G D W 4.14.14-201.fc26.x86_64 #1 Jan 25 00:27:17 s145 kernel: task: ffff880079cf8000 task.stack: ffffc90000e68000 Jan 25 00:27:17 s145 kernel: RIP: e030:0x8c37060 Jan 25 00:27:17 s145 kernel: RSP: e02b:ffffc90000e6bfd0 EFLAGS: 00010206 Jan 25 00:27:17 s145 kernel: RAX: 0000000000000003 RBX: 0000000000000003 RCX: 0000000008c37060 Jan 25 00:27:17 s145 kernel: RDX: 0000000000000010 RSI: 0000000008c37060 RDI: 0000000000000000 Jan 25 00:27:17 s145 kernel: RBP: 00000000fffe7d28 R08: 0000000000000000 R09: 0000000000000000 Jan 25 00:27:17 s145 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 25 00:27:17 s145 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 25 00:27:17 s145 kernel: FS: 00007f93a6605700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 Jan 25 00:27:17 s145 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 25 00:27:17 s145 kernel: CR2: 0000000000000003 CR3: 0000000002f10000 CR4: 0000000000000660 Jan 25 00:27:17 s145 kernel: Call Trace: Jan 25 00:27:17 s145 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 25 00:27:17 s145 kernel: Code: Bad RIP value. 
Jan 25 00:27:17 s145 kernel: RIP: 0x8c37060 RSP: ffffc90000e6bfd0 Jan 25 00:27:17 s145 kernel: CR2: 0000000000000003 Jan 25 00:27:17 s145 kernel: ---[ end trace 50c257ff957ddb67 ]--- Jan 25 04:25:38 s145 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000003 Jan 25 04:25:38 s145 kernel: IP: 0x8792048 Jan 25 04:25:38 s145 kernel: PGD 79d28067 P4D 79d28067 PUD 48259067 PMD 0 Jan 25 04:25:38 s145 kernel: Oops: 0002 [#16] SMP NOPTI Jan 25 04:25:38 s145 kernel: Modules linked in: nfsv3 nfs fscache fuse nfsd auth_rpcgss nfs_acl lockd grace xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc edac_mce_amd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xen_netfront xen_blkfront crc32c_intel Jan 25 04:25:38 s145 kernel: CPU: 0 PID: 2625 Comm: xenstore-read Tainted: G D W 4.14.14-201.fc26.x86_64 #1 Jan 25 04:25:38 s145 kernel: task: ffff880006790000 task.stack: ffffc900007f4000 Jan 25 04:25:38 s145 kernel: RIP: e030:0x8792048 Jan 25 04:25:38 s145 kernel: RSP: e02b:ffffc900007f7fd0 EFLAGS: 00010206 Jan 25 04:25:38 s145 kernel: RAX: 0000000000000003 RBX: 0000000000000003 RCX: 0000000008792048 Jan 25 04:25:38 s145 kernel: RDX: 0000000000000002 RSI: 0000000008792048 RDI: 0000000000000000 Jan 25 04:25:38 s145 kernel: RBP: 00000000ffc26798 R08: 0000000000000000 R09: 0000000000000000 Jan 25 04:25:38 s145 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 25 04:25:38 s145 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 25 04:25:38 s145 kernel: FS: 00007f433136d700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 Jan 25 04:25:38 s145 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 25 04:25:38 s145 kernel: CR2: 0000000000000003 CR3: 000000000697c000 CR4: 0000000000000660 Jan 25 04:25:38 s145 kernel: Call Trace: Jan 25 04:25:38 s145 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 25 04:25:38 s145 kernel: Code: Bad RIP value. 
Jan 25 04:25:38 s145 kernel: RIP: 0x8792048 RSP: ffffc900007f7fd0 Jan 25 04:25:38 s145 kernel: CR2: 0000000000000003 Jan 25 04:25:38 s145 kernel: ---[ end trace 50c257ff957ddb68 ]--- Jan 25 08:45:57 s145 kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000000c3 Jan 25 08:45:57 s145 kernel: IP: 0xffde5f00 Jan 25 08:45:57 s145 kernel: PGD 5615067 P4D 5615067 PUD 2822f067 PMD 0 Jan 25 08:45:57 s145 kernel: Oops: 0002 [#17] SMP NOPTI Jan 25 08:45:57 s145 kernel: Modules linked in: nfsv3 nfs fscache fuse nfsd auth_rpcgss nfs_acl lockd grace xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd sunrpc edac_mce_amd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xen_netfront xen_blkfront crc32c_intel Jan 25 08:45:57 s145 kernel: CPU: 0 PID: 22490 Comm: xenstore-read Tainted: G D W 4.14.14-201.fc26.x86_64 #1 Jan 25 08:45:57 s145 kernel: task: ffff880005ac9e00 task.stack: ffffc90000f70000 Jan 25 08:45:57 s145 kernel: RIP: e030:0xffde5f00 Jan 25 08:45:57 s145 kernel: RSP: e02b:ffffc90000f73fd0 EFLAGS: 00010206 Jan 25 08:45:57 s145 kernel: RAX: 00000000000000c3 RBX: 000000000805239f RCX: 00000000ffde5f00 Jan 25 08:45:57 s145 kernel: RDX: 000000000805406c RSI: 0000000000000004 RDI: 0000000000000000 Jan 25 08:45:57 s145 kernel: RBP: 00000000ffde5f68 R08: 0000000000000000 R09: 0000000000000000 Jan 25 08:45:57 s145 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 Jan 25 08:45:57 s145 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 25 08:45:57 s145 kernel: FS: 00007fb94595d700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 Jan 25 08:45:57 s145 kernel: CS: e033 DS: 002b ES: 002b CR0: 0000000080050033 Jan 25 08:45:57 s145 kernel: CR2: 00000000000000c3 CR3: 0000000005a8a000 CR4: 0000000000000660 Jan 25 08:45:57 s145 kernel: Call Trace: Jan 25 08:45:57 s145 kernel: ? switch_to_thread_stack+0x21/0x40 Jan 25 08:45:57 s145 kernel: Code: Bad RIP value. 
Jan 25 08:45:57 s145 kernel: RIP: 0xffde5f00 RSP: ffffc90000f73fd0 Jan 25 08:45:57 s145 kernel: CR2: 00000000000000c3 Jan 25 08:45:57 s145 kernel: ---[ end trace 50c257ff957ddb69 ]---
hmm.. no kernel oops on 32bit with 4.14.14-200 anymore..
(In reply to customercare from comment #20)
> Info:
>
> on fresh booted 32bit system with 4.14.14-200.fc26.i686+PAE, the swapper/0
> oops seems to be gone.

(In reply to customercare from comment #23)
> hmm.. no kernel oops on 32bit with 4.14.14-200 anymore..

Does that mean "BUG: unable to handle kernel NULL pointer dereference" lines do not appear unless "x86/mm: Found insecure W+X mapping at address" line appears?
I confirm: 32bit 4.14.14-200.fc26.i686+PAE NO KERNEL OOPS since reboot with this kernel. NO KERNEL OOPS @x86/mm Jan 24 15:46:00 x kernel: Write protecting the kernel text: 8032k Jan 24 15:46:00 x kernel: Write protecting the kernel read-only data: 3212k Jan 24 15:46:00 x kernel: NX-protecting the kernel data: 6304k Jan 24 15:46:00 x kernel: x86/mm: Checked W+X mappings: passed, no W+X pages found. Jan 24 15:46:00 x kernel: rodata_test: all tests were successful Jan 24 15:46:00 x kernel: loop: module loaded Jan 24 15:46:00 x kernel: Invalid max_queues (4), will use default max: 2. # uptime 11:53:25 up 1 day, 20:06, 1 user, load average: 0,09, 0,15, 0,11 Note: After 1 day we had at least one OOPS, mostly 3 oops per day on 32 bit. A longtime test will show if it's true.. I will keep you updated.
64bit: I wonder why this message is gone on 32bit systems but still appears on 64bit:
[ 1.431283] x86/mm: Found insecure W+X mapping at address ffff880000000000/0xffff880000000000 [ 1.431286] ------------[ cut here ]------------ [ 1.431290] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:266 note_page+0x622/0x810 [ 1.431290] Modules linked in: [ 1.431293] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.14-200.fc26.x86_64 #1 [ 1.431294] task: ffff88007c458000 task.stack: ffffc90000714000 [ 1.431295] RIP: e030:note_page+0x622/0x810 [ 1.431296] RSP: e02b:ffffc90000717e40 EFLAGS: 00010286 [ 1.431297] RAX: 0000000000000051 RBX: ffffc90000717ec8 RCX: ffffffff82253008 [ 1.431297] RDX: 0000000000000000 RSI: 00000000000000e0 RDI: 0000000000000200 [ 1.431298] RBP: 8010000000000065 R08: 00000000000000e0 R09: 0000000000000004 [ 1.431298] R10: ffff88007fc007a0 R11: 0000000000000001 R12: 0000000000000005 [ 1.431299] R13: 0000000000000000 R14: 00003ffffffff000 R15: 0000000000000000 [ 1.431302] FS: 00007fbb9e317040(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000 [ 1.431303] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.431303] CR2: 00007fbb9ded0000 CR3: 000000000220a000 CR4: 0000000000002660 [ 1.431304] Call Trace: [ 1.431308] ptdump_walk_pgd_level_core+0x3c0/0x4a0 [ 1.431312] ? rest_init+0xb0/0xb0 [ 1.431314] ptdump_walk_pgd_level_checkwx+0x15/0x50 [ 1.431315] kernel_init+0x2b/0xfc [ 1.431316] ret_from_fork+0x32/0x40 [ 1.431317] Code: 19 23 00 41 f7 c7 00 01 00 00 0f 85 50 fe ff ff e9 c0 fc ff ff 48 89 f2 48 c7 c7 78 00 08 82 c6 05 fe 23 2c 01 01 e8 69 54 09 00 <0f> ff 48 8b 73 10 e9 67 fa ff ff 4d 89 c8 b9 10 00 00 00 be 10 [ 1.431333] ---[ end trace 35d3e0be8d307dea ]--- [ 1.436095] x86/mm: Checked W+X mappings: FAILED, 5264 W+X pages found.
Funny side note: the boot was logged neither in /var/log/boot.log nor in /var/log/messages (as it should be).
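Since the 32bit and 64bit boots differ in the "Checked W+X mappings" summary line, a small parser may help when comparing many boot logs across kernel versions. This is only a sketch built around the two message forms quoted in this thread ("passed, no W+X pages found" and "FAILED, N W+X pages found"); the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the two summary forms that appear in the logs quoted in this thread.
_SUMMARY = re.compile(
    r"x86/mm: Checked W\+X mappings: "
    r"(?:(passed), no W\+X pages found|(FAILED), (\d+) W\+X pages found)"
)


def parse_wx_summary(line: str) -> Optional[Tuple[bool, int]]:
    """Return (passed, wx_page_count), or None if the line is not a summary."""
    m = _SUMMARY.search(line)
    if m is None:
        return None
    if m.group(1):
        return (True, 0)
    return (False, int(m.group(3)))
```

Running this over the boot messages of each installed kernel would give a quick table of which versions pass the check and how many W+X pages the failing ones report.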
Should we ask at https://bugzilla.kernel.org/show_bug.cgi?id=198497 whether they are seeing "x86/mm: Found insecure W+X mapping at address" before hitting "BUG: unable to handle kernel NULL pointer dereference"? If they are, I suspect something might be going wrong during early boot initialization (e.g. some memory regions are mistakenly shared with the host or other guests).
here we go again: 32bit 4.14.14-200 Jan 28 06:23:54 XXX kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008 Jan 28 06:23:54 XXX kernel: IP: __radix_tree_lookup+0xe/0xa0 Jan 28 06:23:54 XXX kernel: *pdpt = 00000000296a3027 *pde = 0000000000000000 Jan 28 06:23:54 XXX kernel: Oops: 0000 [#1] SMP Jan 28 06:23:54 XXX kernel: Modules linked in: nfsv3 nfs fscache xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd xen_netfront crc32_pclmul crc32c_intel nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront loop Jan 28 06:23:54 XXX kernel: CPU: 1 PID: 5403 Comm: java Not tainted 4.14.14-200.fc26.i686+PAE #1 Jan 28 06:23:54 XXX kernel: task: d66cd640 task.stack: e5170000 Jan 28 06:23:54 XXX kernel: EIP: __radix_tree_lookup+0xe/0xa0 Jan 28 06:23:54 XXX kernel: EFLAGS: 00010282 CPU: 1 Jan 28 06:23:54 XXX kernel: EAX: 00000004 EBX: 86f37000 ECX: 00000000 EDX: 00000000 Jan 28 06:23:54 XXX kernel: ESI: 00000000 EDI: 00000004 EBP: e5171da4 ESP: e5171d94 Jan 28 06:23:54 XXX kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069 Jan 28 06:23:54 XXX kernel: CR0: 80050033 CR2: 00000008 CR3: 28a77000 CR4: 00000660 Jan 28 06:23:54 XXX kernel: Call Trace: Jan 28 06:23:54 XXX kernel: radix_tree_lookup_slot+0x1d/0x40 Jan 28 06:23:54 XXX kernel: find_get_entry+0x20/0x160 Jan 28 06:23:54 XXX kernel: pagecache_get_page+0x24/0x290 Jan 28 06:23:54 XXX kernel: lookup_swap_cache+0x3a/0x100 Jan 28 06:23:54 XXX kernel: swap_readahead_detect+0x55/0x280 Jan 28 06:23:54 XXX kernel: ? xen_pte_clear+0x80/0x80 Jan 28 06:23:54 XXX kernel: do_swap_page+0x232/0x9b0 Jan 28 06:23:54 XXX kernel: ? wp_page_copy+0x361/0x6f0 Jan 28 06:23:54 XXX kernel: ? kmap_atomic_prot+0x3e/0x130 Jan 28 06:23:54 XXX kernel: handle_mm_fault+0x498/0xc90 Jan 28 06:23:54 XXX kernel: __do_page_fault+0x202/0x4d0 Jan 28 06:23:54 XXX kernel: ? __do_page_fault+0x4d0/0x4d0 Jan 28 06:23:54 XXX kernel: do_page_fault+0x27/0xe0 Jan 28 06:23:54 XXX kernel: ? 
__do_page_fault+0x4d0/0x4d0 Jan 28 06:23:54 XXX kernel: common_exception+0x81/0x86 Jan 28 06:23:54 XXX kernel: EIP: 0xb74a8404 Jan 28 06:23:54 XXX kernel: EFLAGS: 00010202 CPU: 1 Jan 28 06:23:54 XXX kernel: EAX: 00000000 EBX: 86f37530 ECX: 00000000 EDX: 00100000 Jan 28 06:23:54 XXX kernel: ESI: 00000001 EDI: 00100000 EBP: 87b56fa8 ESP: 87b56f8c Jan 28 06:23:54 XXX kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Jan 28 06:23:54 XXX kernel: Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 b4 e2 8a ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 Jan 28 06:23:54 XXX kernel: EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0069:e5171d94 Jan 28 06:23:54 XXX kernel: CR2: 0000000000000008 Jan 28 06:23:54 XXX kernel: ---[ end trace cbd4a849fe9db73a ]--- Jan 28 23:45:09 XXX kernel: nfsd: last server has exited, flushing export cache Jan 28 23:45:10 XXX kernel: NFSD: starting 90-second grace period (net c0f86b40)
I hope I'm not too late to the party. I'm seeing the "insecure W+X mapping" warning on *bare metal* with the latest F26 kernel: [ 1.380839] Freeing unused kernel memory: 952K [ 1.380882] Write protecting the kernel text: 8032k [ 1.380904] Write protecting the kernel read-only data: 3212k [ 1.380905] NX-protecting the kernel data: 6304k [ 1.380964] x86/mm: Found insecure W+X mapping at address d0000000/0xd0000000 [ 1.380967] ------------[ cut here ]------------ [ 1.380971] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:266 note_page+0x670/0x860 [ 1.380971] Modules linked in: [ 1.380974] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.14-200.fc26.i686+PAE #1 [ 1.381029] Hardware name: LENOVO 20BV001BUK/20BV001BUK, BIOS JBET53WW (1.18 ) 09/14/2015 [ 1.381080] task: efd28000 task.stack: efd30000 [ 1.381126] EIP: note_page+0x670/0x860 [ 1.381127] EFLAGS: 00010246 CPU: 0 [ 1.381128] EAX: 00000041 EBX: efd31f4c ECX: 00000001 EDX: d98e2ce8 [ 1.381129] ESI: 80000000 EDI: 00000000 EBP: efd31f18 ESP: efd31eec [ 1.381130] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 1.381131] CR0: 80050033 CR2: b7ee65ac CR3: 198cf000 CR4: 003406f0 [ 1.381132] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 1.381133] DR6: fffe0ff0 DR7: 00000400 [ 1.381134] Call Trace: [ 1.381178] ptdump_walk_pgd_level_core+0x204/0x2e0 [ 1.381225] ptdump_walk_pgd_level_checkwx+0x18/0x20 [ 1.381273] mark_rodata_ro+0xd5/0xf7 [ 1.381318] ? rest_init+0xa0/0xa0 [ 1.381362] kernel_init+0x2e/0xea [ 1.381406] ret_from_fork+0x2b/0x38 [ 1.381450] Code: d9 e9 0c fb ff ff f7 c6 00 10 00 00 74 8c 68 35 b0 5f d9 e9 16 fe ff ff 52 52 68 e4 b0 5f d9 c6 05 66 7f 79 d9 01 e8 ed 57 06 00 <0f> ff 8b 53 0c 83 c4 0c e9 38 fa ff ff 50 6a 08 52 6a 08 68 e5 [ 1.381545] ---[ end trace 653c3292342498e7 ]--- [ 1.381608] x86/mm: Checked W+X mappings: FAILED, 11 W+X pages found. [ 1.381620] rodata_test: all tests were successful As for the radix_tree_lookup_slot oops, I see that too. 
Unfortunately my kernel is tainted at that point (I run VirtualBox), but that probably doesn't matter since other users with untainted kernels see the exact same problem. [14827.171765] BUG: unable to handle kernel NULL pointer dereference at 00000008 [14827.171773] IP: __radix_tree_lookup+0xe/0xa0 [14827.171774] *pdpt = 0000000023a33001 *pde = 0000000000000000 [14827.171777] Oops: 0000 [#1] SMP [14827.171779] Modules linked in: tun ccm fuse ipip tunnel4 ip_tunnel bridge stp llc vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ip6table_filter ip6_tables iptable_mangle bnep sunrpc dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 kvm_intel iTCO_wdt iTCO_vendor_support mei_wdt wmi_bmof kvm btusb btrtl btbcm uvcvideo snd_usb_audio btintel iwlmvm snd_usbmidi_lib videobuf2_vmalloc videobuf2_memops mac80211 bluetooth videobuf2_v4l2 snd_rawmidi videobuf2_core cdc_acm snd_hda_codec_hdmi snd_hda_codec_realtek irqbypass snd_hda_codec_generic videodev crc32_pclmul iwlwifi snd_hda_intel intel_cstate media intel_uncore ecdh_generic intel_rapl_perf snd_hda_codec [14827.171813] snd_hda_core snd_hwdep snd_seq cfg80211 snd_seq_device snd_pcm snd_timer thinkpad_acpi joydev rtsx_pci_ms intel_pch_thermal lpc_ich memstick i2c_i801 snd mei_me tpm_tis mei soundcore tpm_tis_core wmi tpm rfkill binfmt_misc cdc_mbim cdc_wdm cdc_ncm usbnet hid_plantronics mii i915 rtsx_pci_sdmmc mmc_core crc32c_intel i2c_algo_bit e1000e drm_kms_helper drm serio_raw ptp rtsx_pci pps_core video [14827.171834] CPU: 0 PID: 2905 Comm: CompositorTileW Tainted: G W OE 4.14.14-200.fc26.i686+PAE #1 [14827.171835] Hardware name: LENOVO 20BV001BUK/20BV001BUK, BIOS JBET53WW (1.18 ) 09/14/2015 [14827.171836] task: e3aedc80 task.stack: e3bbc000 [14827.171838] EIP: __radix_tree_lookup+0xe/0xa0 [14827.171839] EFLAGS: 00210282 CPU: 0 [14827.171841] EAX: 00000004 EBX: b3271000 
ECX: 00000000 EDX: 00000000 [14827.171842] ESI: 00000000 EDI: 00000004 EBP: e3bbdda4 ESP: e3bbdd94 [14827.171843] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [14827.171845] CR0: 80050033 CR2: 00000008 CR3: 25be8780 CR4: 003426f0 [14827.171846] Call Trace: [14827.171849] radix_tree_lookup_slot+0x1d/0x40 [14827.171852] find_get_entry+0x20/0x160 [14827.171855] pagecache_get_page+0x24/0x290 [14827.171857] lookup_swap_cache+0x3a/0x100 [14827.171859] swap_readahead_detect+0x55/0x280 [14827.171861] ? page_cache_tree_insert+0xa0/0xa0 [14827.171864] do_swap_page+0x232/0x9b0 [14827.171865] ? alloc_set_pte+0xdd/0x360 [14827.171868] ? filemap_map_pages+0x154/0x310 [14827.171871] ? kmap_atomic_prot+0x103/0x130 [14827.171875] handle_mm_fault+0x498/0xc90 [14827.171877] __do_page_fault+0x202/0x4d0 [14827.171879] ? __do_page_fault+0x4d0/0x4d0 [14827.171880] do_page_fault+0x27/0xe0 [14827.171882] ? exit_to_usermode_loop+0x4a/0xa0 [14827.171884] ? __do_page_fault+0x4d0/0x4d0 [14827.171887] common_exception+0x81/0x86 [14827.171888] EIP: 0xb7f070ca [14827.171889] EFLAGS: 00210206 CPU: 0 [14827.171890] EAX: 00000095 EBX: bac212a0 ECX: b3271148 EDX: b36ced07 [14827.171892] ESI: 00000041 EDI: aa1c4480 EBP: b3410000 ESP: a23d32d8 [14827.171893] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [14827.171894] Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 b4 e2 8a ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 [14827.171918] EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0068:e3bbdd94 [14827.171919] CR2: 0000000000000008 [14827.171921] ---[ end trace 653c3292342498e8 ]--- This is on my work laptop, so I have an additional use case: put the system to sleep. 
That reveals another interesting problem: [18901.418012] Freezing of tasks failed after 20.007 seconds (4 tasks refusing to freeze, wq_busy=0): [18901.418235] chromium-browse D 0 2894 2424 0x00000104 [18901.418244] Call Trace: [18901.418259] __schedule+0x242/0x8c0 [18901.418267] schedule+0x26/0x70 [18901.418274] rwsem_down_read_failed+0xc5/0x110 [18901.418285] call_rwsem_down_read_failed+0xa/0x10 [18901.418292] down_read+0x1c/0x30 [18901.418300] __do_page_fault+0x429/0x4d0 [18901.418311] ? sys_gettid+0x17/0x20 [18901.418317] ? __do_page_fault+0x4d0/0x4d0 [18901.418323] do_page_fault+0x27/0xe0 [18901.418329] ? __do_page_fault+0x4d0/0x4d0 [18901.418337] common_exception+0x81/0x86 [18901.418343] EIP: 0x54ae2e00 [18901.418347] EFLAGS: 00210206 CPU: 0 [18901.418352] EAX: 20955004 EBX: 046bae00 ECX: 5068412d EDX: 50684185 [18901.418357] ESI: 00000000 EDI: 20954ff5 EBP: bfbab7dc ESP: bfbab760 [18901.418363] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [18901.418370] Chrome_ChildIOT D 0 2900 2424 0x00000104 [18901.418377] Call Trace: [18901.418384] __schedule+0x242/0x8c0 [18901.418391] schedule+0x26/0x70 [18901.418398] rwsem_down_read_failed+0xc5/0x110 [18901.418408] call_rwsem_down_read_failed+0xa/0x10 [18901.418415] down_read+0x1c/0x30 [18901.418420] __do_page_fault+0x429/0x4d0 [18901.418427] ? __do_page_fault+0x4d0/0x4d0 [18901.418433] do_page_fault+0x27/0xe0 [18901.418439] ? 
__do_page_fault+0x4d0/0x4d0 [18901.418447] common_exception+0x81/0x86 [18901.418451] EIP: 0xb7d8e5e8 [18901.418455] EFLAGS: 00210206 CPU: 0 [18901.418460] EAX: b7db5180 EBX: 00000000 ECX: 00000000 EDX: 05b17000 [18901.418465] ESI: 05b17000 EDI: fffa4135 EBP: b7e39508 ESP: a4dd86d0 [18901.418470] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [18901.418477] CompositorTileW D 0 2904 2424 0x00000104 [18901.418483] Call Trace: [18901.418489] __schedule+0x242/0x8c0 [18901.418497] schedule+0x26/0x70 [18901.418526] rwsem_down_write_failed_killable+0x18d/0x370 [18901.418540] call_rwsem_down_write_failed_killable+0x9/0xc [18901.418549] down_write_killable+0x25/0x50 [18901.418559] vm_mmap_pgoff+0x7d/0xd0 [18901.418567] ? syscall_trace_enter+0x15c/0x260 [18901.418576] SyS_mmap_pgoff+0x17e/0x1f0 [18901.418584] do_fast_syscall_32+0x76/0x1a0 [18901.418593] entry_SYSENTER_32+0x4e/0x7c [18901.418597] EIP: 0xb7efbcf9 [18901.418601] EFLAGS: 00200286 CPU: 0 [18901.418606] EAX: ffffffda EBX: 00000000 ECX: 00040000 EDX: 00000003 [18901.418611] ESI: 00000001 EDI: 0000015b EBP: 00000000 ESP: a2bd4b5c [18901.418616] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [18901.418622] TaskSchedulerRe D 0 11204 2424 0x00000104 [18901.418628] Call Trace: [18901.418635] __schedule+0x242/0x8c0 [18901.418641] schedule+0x26/0x70 [18901.418648] rwsem_down_read_failed+0xc5/0x110 [18901.418657] call_rwsem_down_read_failed+0xa/0x10 [18901.418664] down_read+0x1c/0x30 [18901.418672] SyS_madvise+0x934/0xa10 [18901.418682] ? __secure_computing+0x35/0xc0 [18901.418688] ? 
syscall_trace_enter+0x15c/0x260 [18901.418696] do_fast_syscall_32+0x76/0x1a0 [18901.418705] entry_SYSENTER_32+0x4e/0x7c [18901.418709] EIP: 0xb7efbcf9 [18901.418712] EFLAGS: 00200206 CPU: 0 [18901.418717] EAX: ffffffda EBX: a0922000 ECX: 007fc000 EDX: 00000004 [18901.418722] ESI: a1122b40 EDI: a1122b40 EBP: a11221a8 ESP: a11220c8 [18901.418727] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b I suspect the two problems are related, since they are both in the mm/page_fault area (although I don't see any obvious connection). After the radix_tree_lookup_slot oops happens (but not immediately after), I also see another strange issue where an apparently random task gets frozen (waiting on a futex, in kernel mode) and trying to read from that task's /proc/pid/cmdline hangs. I have been able to debug this down to the task's mm->mmap_sem being in a weird state where 'owner' is 1 (RWSEM_READER_OWNED) but 'count' is -65535, which, according to __down_write in arch/x86/include/asm/rwsem.h, suggests the rwsem is actually write locked. The task that tries to read from /proc/pid/cmdline hangs trying to read lock the hung task's mm->mmap_sem. Unfortunately I don't have the stack traces anymore (for the initial hung task and the task that reads /proc/pid/cmdline), but I will post them in a separate comment as soon as the problem appears again.
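For anyone who wants to check the same mmap_sem state on their own machine, here is a minimal sketch of how the 'count' value above can be decoded. It assumes the 32-bit x86 constants from arch/x86/include/asm/rwsem.h in 4.14 (quoted from memory, so please verify them against your tree); the function name is made up for illustration. Note that, as far as I understand the xadd rwsem scheme, -65535 is what a single writer produces, but one active reader plus queued waiters would yield the same number, so the count alone may not be conclusive.

```python
# Assumed 32-bit x86 rwsem constants (4.14 arch/x86/include/asm/rwsem.h);
# verify against the kernel tree you are actually debugging.
RWSEM_ACTIVE_MASK = 0x0000FFFF                  # low 16 bits: active lockers
RWSEM_ACTIVE_BIAS = 1
RWSEM_WAITING_BIAS = -RWSEM_ACTIVE_MASK - 1     # -65536
RWSEM_ACTIVE_WRITE_BIAS = RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS  # -65535


def decode_rwsem_count(count: int) -> dict:
    """Split a 32-bit rwsem count into the parts the kernel looks at.

    Hypothetical helper for illustration; not a kernel API.
    """
    raw = count & 0xFFFFFFFF                    # view as unsigned 32-bit
    signed = raw - (1 << 32) if raw & 0x80000000 else raw
    return {
        "raw_hex": f"{raw:#010x}",
        "active_lockers": raw & RWSEM_ACTIVE_MASK,
        "negative": signed < 0,                 # writer and/or waiters present
        "equals_write_bias": signed == RWSEM_ACTIVE_WRITE_BIAS,
    }


print(decode_rwsem_count(-65535))
```

A count that is negative with exactly one active locker is consistent with the write-locked reading described above, which is why it looks odd next to an 'owner' value of RWSEM_READER_OWNED.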
Adding task stack traces, as per comment #29: The initial hung task: [root@rrendec ~]# ls -l /proc/1925/exe lrwxrwxrwx 1 rrendec rrendec 0 Jan 31 15:15 /proc/1925/exe -> /usr/libexec/evolution-calendar-factory-subprocess [root@rrendec ~]# cat /proc/1925/stack [<d93cb3ea>] call_rwsem_down_read_failed+0xa/0x10 [<d8c5b549>] __do_page_fault+0x429/0x4d0 [<d8c5b617>] do_page_fault+0x27/0xe0 [<d93d5c27>] common_exception+0x81/0x86 [<ffffffff>] 0xffffffff The "ps" task that hangs while trying to read from /proc/pid/cmdline of the other task: [root@rrendec ~]# ls -l /proc/25153/exe lrwxrwxrwx 1 root root 0 Jan 31 15:17 /proc/25153/exe -> /usr/bin/ps [root@rrendec ~]# cat /proc/25153/wchan call_rwsem_down_read_failed [root@rrendec ~]# cat /proc/25153/stack [<d93cb3ea>] call_rwsem_down_read_failed+0xa/0x10 [<d8e6a3a5>] proc_pid_cmdline_read+0xa5/0x4c0 [<d8dfd880>] __vfs_read+0x30/0x160 [<d8dfda3c>] vfs_read+0x8c/0x160 [<d8dfdff7>] SyS_read+0x47/0xb0 [<d8c03706>] do_fast_syscall_32+0x76/0x1a0 [<d93d4da6>] entry_SYSENTER_32+0x4e/0x7c [<ffffffff>] 0xffffffff
I have seen the same oops at IP: __radix_tree_lookup+0xe, followed by the same symptom of tasks getting hung reading /proc/pid/cmdline. I can test any patches available.
4.14.16-200 : Feb 12 06:44:32 xXx kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008 Feb 12 06:44:32 xXx kernel: IP: __radix_tree_lookup+0xe/0xa0 Feb 12 06:44:32 xXx kernel: *pdpt = 00000000024e1027 *pde = 0000000000000000 Feb 12 06:44:32 xXx kernel: Oops: 0000 [#2] SMP Feb 12 06:44:32 xXx kernel: Modules linked in: nfsv3 nfs fscache xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd xen_netfront crc32_pclmul crc32c_intel nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront loop Feb 12 06:44:32 xXx kernel: CPU: 0 PID: 8504 Comm: java Tainted: G D 4.14.16-200.fc26.i686+PAE #1 Feb 12 06:44:32 xXx kernel: task: cf029140 task.stack: e9b36000 Feb 12 06:44:32 xXx kernel: EIP: __radix_tree_lookup+0xe/0xa0 Feb 12 06:44:32 xXx kernel: EFLAGS: 00010282 CPU: 0 Feb 12 06:44:32 xXx kernel: EAX: 00000004 EBX: 87032000 ECX: 00000000 EDX: 00000000 Feb 12 06:44:32 xXx kernel: ESI: 00000000 EDI: 00000004 EBP: e9b37da4 ESP: e9b37d94 Feb 12 06:44:32 xXx kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069 Feb 12 06:44:32 xXx kernel: CR0: 80050033 CR2: 00000008 CR3: 2736e000 CR4: 00000660 Feb 12 06:44:32 xXx kernel: Call Trace: Feb 12 06:44:32 xXx kernel: radix_tree_lookup_slot+0x1d/0x40 Feb 12 06:44:32 xXx kernel: find_get_entry+0x20/0x160 Feb 12 06:44:32 xXx kernel: pagecache_get_page+0x24/0x290 Feb 12 06:44:32 xXx kernel: lookup_swap_cache+0x3a/0x100 Feb 12 06:44:32 xXx kernel: swap_readahead_detect+0x55/0x280 Feb 12 06:44:32 xXx kernel: ? xen_pte_clear+0x80/0x80 Feb 12 06:44:32 xXx kernel: do_swap_page+0x232/0x9b0 Feb 12 06:44:32 xXx kernel: ? wp_page_copy+0x361/0x6f0 Feb 12 06:44:32 xXx kernel: ? kmap_atomic_prot+0x3e/0x130 Feb 12 06:44:32 xXx kernel: handle_mm_fault+0x498/0xc90 Feb 12 06:44:32 xXx kernel: __do_page_fault+0x202/0x4d0 Feb 12 06:44:32 xXx kernel: ? __do_page_fault+0x4d0/0x4d0 Feb 12 06:44:32 xXx kernel: do_page_fault+0x27/0xe0 Feb 12 06:44:32 xXx kernel: ? 
__do_page_fault+0x4d0/0x4d0 Feb 12 06:44:32 xXx kernel: common_exception+0x84/0x8a Feb 12 06:44:32 xXx kernel: EIP: 0xb758a3b4 Feb 12 06:44:32 xXx kernel: EFLAGS: 00010202 CPU: 0 Feb 12 06:44:32 xXx kernel: EAX: 00000000 EBX: 8703207c ECX: 00000000 EDX: 00000800 Feb 12 06:44:32 xXx kernel: ESI: 00000001 EDI: 00000800 EBP: 87c5a0b8 ESP: 87c5a09c Feb 12 06:44:32 xXx kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Feb 12 06:44:32 xXx kernel: Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 04 da 8a ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 Feb 12 06:44:32 xXx kernel: EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0069:e9b37d94 Feb 12 06:44:32 xXx kernel: CR2: 0000000000000008 Feb 12 06:44:32 xXx kernel: ---[ end trace 57cd32b1cf1b587c ]---
Feb 15 10:03:01 xXx kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008 Feb 15 10:03:01 xXx kernel: IP: __radix_tree_lookup+0xe/0xa0 Feb 15 10:03:01 xXx kernel: *pdpt = 000000002918a027 *pde = 0000000000000000 Feb 15 10:03:01 xXx kernel: Oops: 0000 [#4] SMP Feb 15 10:03:01 xXx kernel: Modules linked in: nfsv3 nfs fscache xt_owner xt_multiport ip6table_filter ip6_tables cfg80211 rfkill xenfs xen_privcmd xen_netfront crc32_pclmul crc32c_intel nfsd auth_rpcgss nfs_acl lockd grace sunrpc xen_blkfront loop Feb 15 10:03:01 xXx kernel: CPU: 1 PID: 10812 Comm: java Tainted: G D 4.14.16-200.fc26.i686+PAE #1 Feb 15 10:03:01 xXx kernel: task: c3649140 task.stack: e9cec000 Feb 15 10:03:01 xXx kernel: EIP: __radix_tree_lookup+0xe/0xa0 Feb 15 10:03:01 xXx kernel: EFLAGS: 00010282 CPU: 1 Feb 15 10:03:01 xXx kernel: EAX: 00000004 EBX: 0dbc4000 ECX: 00000000 EDX: 00000000 Feb 15 10:03:01 xXx kernel: ESI: 00000000 EDI: 00000004 EBP: e9cedda4 ESP: e9cedd94 Feb 15 10:03:01 xXx kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069 Feb 15 10:03:01 xXx kernel: CR0: 80050033 CR2: 00000008 CR3: 01663000 CR4: 00000660 Feb 15 10:03:01 xXx kernel: Call Trace: Feb 15 10:03:01 xXx kernel: radix_tree_lookup_slot+0x1d/0x40 Feb 15 10:03:01 xXx kernel: ? get_page_from_freelist+0x7ba/0x910 Feb 15 10:03:01 xXx kernel: find_get_entry+0x20/0x160 Feb 15 10:03:01 xXx kernel: pagecache_get_page+0x24/0x290 Feb 15 10:03:01 xXx kernel: lookup_swap_cache+0x3a/0x100 Feb 15 10:03:01 xXx kernel: swap_readahead_detect+0x55/0x280 Feb 15 10:03:01 xXx kernel: ? hrtimer_cancel+0x19/0x20 Feb 15 10:03:01 xXx kernel: do_swap_page+0x232/0x9b0 Feb 15 10:03:01 xXx kernel: ? try_charge+0x28e/0x610 Feb 15 10:03:01 xXx kernel: ? xen_batched_set_pte+0x1b/0x1b0 Feb 15 10:03:01 xXx kernel: ? xen_pte_clear+0x80/0x80 Feb 15 10:03:01 xXx kernel: ? xen_set_pte_at+0x81/0x150 Feb 15 10:03:01 xXx kernel: ? 
kmap_atomic_prot+0x3e/0x130 Feb 15 10:03:01 xXx kernel: handle_mm_fault+0x498/0xc90 Feb 15 10:03:01 xXx kernel: __do_page_fault+0x202/0x4d0 Feb 15 10:03:01 xXx kernel: ? __do_page_fault+0x4d0/0x4d0 Feb 15 10:03:01 xXx kernel: do_page_fault+0x27/0xe0 Feb 15 10:03:01 xXx kernel: ? __do_page_fault+0x4d0/0x4d0 Feb 15 10:03:01 xXx kernel: common_exception+0x84/0x8a Feb 15 10:03:01 xXx kernel: EIP: 0x1f487b66 Feb 15 10:03:01 xXx kernel: EFLAGS: 00010246 CPU: 1 Feb 15 10:03:01 xXx kernel: EAX: 00000000 EBX: 1f91d000 ECX: 0000003a EDX: 00000000 Feb 15 10:03:01 xXx kernel: ESI: 0dbc4028 EDI: 0dbc4028 EBP: 0d35a4f8 ESP: 0d35a4b0 Feb 15 10:03:01 xXx kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Feb 15 10:03:01 xXx kernel: Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 04 da 8a ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08 Feb 15 10:03:01 xXx kernel: EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0069:e9cedd94 Feb 15 10:03:01 xXx kernel: CR2: 0000000000000008 Feb 15 10:03:01 xXx kernel: ---[ end trace 57cd32b1cf1b587e ]--- After this, already running java processes hung together with all top processes. So some process structures did not get friend correctly after this exception.
"freed" not "friend" :)
Hello, same issue on 4.14.14 Kernel on Debian. I use this server for a dvb-s/s2 recording (software: VDR (Video Disk Recorder)). Uname: Linux ******** 4.14.14 #1 SMP Thu Jan 18 11:31:08 CET 2018 i686 GNU/Linux dmesg: [1293619.141469] BUG: unable to handle kernel NULL pointer dereference at 00000008 [1293619.141527] IP: __radix_tree_lookup+0x11/0x84 [1293619.141554] *pdpt = 000000002f2c8001 *pde = 0000000000000000 [1293619.141585] Oops: 0000 [#1] SMP [1293619.141610] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache nfsd lockd grace nfs_acl auth_rpcgss sunrpc oid_registry joydev ftdi_sio usbserial hid_generic usbhid hid igb i2c_algo_bit loop stv6111 lnbh25 stv0910 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass pcbc snd_pcm ddbridge ehci_pci snd_timer ehci_hcd snd dvb_core aesni_intel usbcore soundcore i2c_i801 crypto_simd ipmi_si cryptd i2c_core ipmi_msghandler usb_common lpc_ich pcspkr mfd_core aes_i586 evdev ext4 jbd2 mbcache crc16 sd_mod ahci libahci crc32c_intel e1000e libata scsi_mod ptp pps_core [1293619.141861] CPU: 3 PID: 26810 Comm: section handler Not tainted 4.14.14-xeatre-4.14.14 #1 [1293619.141907] Hardware name: Thomas-Krenn.AG X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0b 09/17/2012 [1293619.141954] task: f12cb380 task.stack: ef484000 [1293619.141983] EIP: __radix_tree_lookup+0x11/0x84 [1293619.142008] EFLAGS: 00210296 CPU: 3 [1293619.142033] EAX: 00000004 EBX: f7099e38 ECX: 00000000 EDX: 00000000 [1293619.142062] ESI: 00000000 EDI: 00000008 EBP: ef485d5c ESP: ef485d44 [1293619.142092] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [1293619.142120] CR0: 80050033 CR2: 00000008 CR3: 37106b60 CR4: 001406f0 [1293619.142149] Call Trace: [1293619.142176] radix_tree_lookup_slot+0x11/0x1e [1293619.142205] find_get_entry+0x1d/0x87 [1293619.142232] pagecache_get_page+0x1f/0x15a [1293619.142261] ? cpumask_next+0x12/0x14 [1293619.142287] ? 
flush_tlb_mm_range+0xc0/0xc0 [1293619.142315] lookup_swap_cache+0x33/0xc4 [1293619.142342] swap_readahead_detect+0x82/0x254 [1293619.142368] ? flush_tlb_mm_range+0xb8/0xc0 [1293619.142396] do_swap_page+0x49/0x70f [1293619.142424] ? lru_cache_add_active_or_unevictable+0x32/0x9e [1293619.142454] ? wp_page_copy+0x3f7/0x420 [1293619.142482] ? _copy_to_user+0x29/0x32 [1293619.142511] ? dvb_ringbuffer_read_user+0x5f/0x82 [dvb_core] [1293619.142542] ? dvb_dmxdev_buffer_read+0xec/0x113 [dvb_core] [1293619.142572] handle_mm_fault+0x8cb/0x9b2 [1293619.142600] ? find_vma+0x46/0x4d [1293619.142626] __do_page_fault+0x3c4/0x3e2 [1293619.142653] do_page_fault+0xbd/0xc5 [1293619.142679] ? __do_page_fault+0x3e2/0x3e2 [1293619.142706] common_exception+0x74/0x7a [1293619.142731] EIP: 0xb7f55826 [1293619.142754] EFLAGS: 00210246 CPU: 3 [1293619.142779] EAX: 00000000 EBX: 0000000a ECX: 00000001 EDX: 00000001 [1293619.142808] ESI: 081bbb10 EDI: b10f71d8 EBP: 081bbb10 ESP: b10f7194 [1293619.142838] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [1293619.142867] ? __do_page_fault+0x3e2/0x3e2 [1293619.142892] Code: 5b 5e 5f 5d c3 55 89 e5 8d 4a 1a 53 bb 01 00 00 00 d3 e3 23 18 89 d8 5b 5d c3 55 89 e5 57 56 89 ce 53 83 ec 0c 8d 78 04 89 55 e8 <8b> 07 31 c9 89 c2 83 e2 03 4a 89 45 ec 89 45 f0 75 10 83 e0 fe [1293619.143010] EIP: __radix_tree_lookup+0x11/0x84 SS:ESP: 0068:ef485d44 [1293619.143039] CR2: 0000000000000008 [1293619.143430] ---[ end trace 13d1f261c770964e ]---
Created attachment 1399552 [details] Debugging patch Please try this patch if you are seeing a NULL pointer dereference of 0x00000008 in __radix_tree_lookup. There are half a dozen other bugs on this thread which are completely unrelated crashes, and I'm not trying to address those. (Comments 1, 2, 7, 10, 28, 29b, 32, 33, 35 are the same bug; comments 3, 4, 5, 6, 8, 9, 18, 21, 22, 26, 29a, 29c are different.) You will no longer see a BUG splat! You will instead see the messages:
Bad swp_entry: ...
do_swap_page:2896: bad pte ...
I'm particularly interested in those two lines, but please include a few lines around them unless you can tell they're not interesting. Thanks!
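If it helps anyone collect the requested output, the two messages can be pulled out of a saved kernel log together with a few lines of context. This is only a sketch: the marker strings are taken from the comment above, while the function name and the default of three context lines are my own assumptions.

```python
import re

# Marker strings as quoted in the patch description above.
MARKERS = re.compile(r"Bad swp_entry:|bad pte")


def extract_with_context(lines, context=3):
    """Return matching log lines plus `context` lines on either side.

    Hypothetical helper; overlapping context windows are merged so that
    no line is reported twice.
    """
    keep = set()
    for i, line in enumerate(lines):
        if MARKERS.search(line):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    return [lines[i] for i in sorted(keep)]
```

One way to use it would be to save the output of dmesg or journalctl -k to a file, read it with splitlines(), and print whatever the function returns.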
4.14.15 i686: Feb 9 14:31:27 cs01 kernel: Bad swp_entry: 2000000 Feb 9 14:31:27 cs01 kernel: mm/swap_state.c:683: bad pte ef3a3f38(8000000100000000) Feb 9 15:35:19 cs01 kernel: Bad swp_entry: 2000000 Feb 9 15:35:19 cs01 kernel: mm/swap_state.c:683: bad pte eee17f38(8000000100000000)
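The two numbers in these messages appear to be consistent with each other, which can be checked with a small sketch. It assumes the debugging patch prints both values in hex, and it uses the pre-L1TF PAE swap encoding from 4.14 as I remember it (arch type in the low 5 bits of pte_high, offset above that; generic swp_entry with the type shifted left by 25 on 32-bit). All constants and the function name are assumptions to verify against the source, not something stated in this thread.

```python
# Assumed 4.14 constants for a 32-bit kernel (include/linux/swapops.h).
MAX_SWAPFILES_SHIFT = 5
RADIX_TREE_EXCEPTIONAL_SHIFT = 2
SWP_TYPE_SHIFT = 32 - (MAX_SWAPFILES_SHIFT + RADIX_TREE_EXCEPTIONAL_SHIFT)  # 25
SWP_OFFSET_MASK = (1 << SWP_TYPE_SHIFT) - 1


def decode_pae_swap_pte(pte: int) -> dict:
    """Decode a 64-bit PAE pte the way pte_to_swp_entry() is assumed to.

    Hypothetical helper for illustration only.
    """
    pte_high = (pte >> 32) & 0xFFFFFFFF
    pte_low = pte & 0xFFFFFFFF
    arch_type = pte_high & 0x1F          # assumed __swp_type()
    arch_offset = pte_high >> 5          # assumed __swp_offset()
    swp_val = (arch_type << SWP_TYPE_SHIFT) | (arch_offset & SWP_OFFSET_MASK)
    return {
        "pte_low": pte_low,
        "pte_high": pte_high,
        "nx_bit": (pte >> 63) & 1,       # bit 63 is NX on PAE
        "swp_entry_val": swp_val,
        "swp_type": swp_val >> SWP_TYPE_SHIFT,
        "swp_offset": swp_val & SWP_OFFSET_MASK,
    }


d = decode_pae_swap_pte(0x8000000100000000)
print(hex(d["swp_entry_val"]), d["swp_type"], d["swp_offset"], d["nx_bit"])
```

Under those assumptions the pte 8000000100000000 decodes to swp_entry 2000000 (type 1, offset 0), matching the log. In other words the pte has only bit 63 and one low bit of the high word set, with the low word zero, so it does not look like a genuine swap entry; the sketch says nothing about how the pte got into that state.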
Thanks, David! Can I just confirm whether or not you're running on Xen?
Not Xen. I can help debug or run any further patches for you, just let me know. I joined the linux-mm mailing list but have not seen this topic pop up since, so haven't been able to reply.
Hi From My comment 35: [44520.280569] perf: interrupt took too long (3976 > 3921), lowering kernel.perf_event_max_sample_rate to 50250 [83534.559267] Bad swp_entry: 2000000 [83534.559300] mm/swap_state.c:685: bad pte ee9fdefc(8000000100000000)
[I already posted this on https://bugzilla.kernel.org/show_bug.cgi?id=198497, but this thread seems to be more alive.] The oops happens here on a Xen-x86-32bit-domU running kernel 4.14.19-gentoo when I try to emerge net-dns/libidn-1.33-r2. The oops leaves a javac hanging with full cpu-usage. I applied the patch in https://bugzilla.kernel.org/show_bug.cgi?id=198497#c12 which looks like the patch you proposed here. This lets me emerge net-dns/libidn-1.33-r2 without javac hanging. The bug can be easily reproduced by emerging libidn again. Feb 17 21:14:56 colin kernel: Bad swp_entry: 4000000 Feb 17 21:14:56 colin kernel: mm/swap_state.c:683: bad pte e2187f30(8000000200000000) Feb 17 21:14:56 colin kernel: Bad swp_entry: 4000000 Feb 17 21:14:56 colin kernel: mm/swap_state.c:683: bad pte e584bf30(8000000200000000) Feb 17 21:16:26 colin kernel: Bad swp_entry: 4000000 Feb 17 21:16:26 colin kernel: mm/swap_state.c:683: bad pte e210df30(8000000200000000) Feb 17 21:23:21 colin kernel: Bad swp_entry: 4000000 Feb 17 21:23:21 colin kernel: mm/swap_state.c:683: bad pte dd5b1f30(8000000200000000) oops (unpatched kernel): Feb 17 21:08:57 colin kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008 Feb 17 21:08:57 colin kernel: IP: __radix_tree_lookup+0xe/0xb0 Feb 17 21:08:57 colin kernel: *pdpt = 0000000021a03027 *pde = 0000000000000000 Feb 17 21:08:57 colin kernel: Oops: 0000 [#1] SMP Feb 17 21:08:57 colin kernel: Modules linked in: Feb 17 21:08:57 colin kernel: CPU: 5 PID: 10914 Comm: javac Not tainted 4.14.19-gentoo #4 Feb 17 21:08:57 colin kernel: task: e9fec0c0 task.stack: e1a9a000 Feb 17 21:08:57 colin kernel: EIP: __radix_tree_lookup+0xe/0xb0 Feb 17 21:08:57 colin kernel: EFLAGS: 00010296 CPU: 5 Feb 17 21:08:57 colin kernel: EAX: 00000004 EBX: 6612d000 ECX: 00000000 EDX: 00000000 Feb 17 21:08:57 colin kernel: ESI: 00000000 EDI: 00000004 EBP: e1a9bddc ESP: e1a9bdcc Feb 17 21:08:57 colin kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069 Feb 17 21:08:57 
colin kernel: CR0: 80050033 CR2: 00000008 CR3: 267e2000 CR4: 00042660 Feb 17 21:08:57 colin kernel: Call Trace: Feb 17 21:08:57 colin kernel: radix_tree_lookup_slot+0x11/0x30 Feb 17 21:08:57 colin kernel: find_get_entry+0x1d/0xe0 Feb 17 21:08:57 colin kernel: pagecache_get_page+0x1f/0x230 Feb 17 21:08:57 colin kernel: lookup_swap_cache+0x35/0xf0 Feb 17 21:08:57 colin kernel: swap_readahead_detect+0x4c/0x350 Feb 17 21:08:57 colin kernel: ? flush_tlb_mm_range+0x91/0xe0 Feb 17 21:08:57 colin kernel: do_swap_page+0x1ca/0x6d0 Feb 17 21:08:57 colin kernel: ? __raw_callee_save___pv_queued_spin_unlock+0x9/0x10 Feb 17 21:08:57 colin kernel: ? wp_page_copy+0x2af/0x520 Feb 17 21:08:57 colin kernel: ? xen_pmd_val+0x11/0x20 Feb 17 21:08:57 colin kernel: handle_mm_fault+0x3b8/0x940 Feb 17 21:08:57 colin kernel: __do_page_fault+0x178/0x400 Feb 17 21:08:57 colin kernel: ? vmalloc_sync_all+0x250/0x250 Feb 17 21:08:57 colin kernel: do_page_fault+0x1a/0x20 Feb 17 21:08:57 colin kernel: common_exception+0x84/0x8a Feb 17 21:08:57 colin kernel: EIP: 0xb744edf4 Feb 17 21:08:57 colin kernel: EFLAGS: 00010202 CPU: 5 Feb 17 21:08:57 colin kernel: EAX: 00000004 EBX: b77f6f28 ECX: 000284a2 EDX: 00001425 Feb 17 21:08:57 colin kernel: ESI: 6612d094 EDI: b7817148 EBP: 674ff0f8 ESP: 674ff0dc Feb 17 21:08:57 colin kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Feb 17 21:08:57 colin kernel: Code: 90 8d 74 26 00 80 41 03 01 eb a7 66 90 0f 0b 8d b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 55 89 e5 57 89 c7 56 53 83 ec 04 89 4d f0 <8b> 77 04 89 f0 83 e0 03 83 f8 01 75 75 89 f0 83 e0 fe 0f b6 08 Feb 17 21:08:57 colin kernel: EIP: __radix_tree_lookup+0xe/0xb0 SS:ESP: 0069:e1a9bdcc Feb 17 21:08:57 colin kernel: CR2: 0000000000000008 Feb 17 21:08:57 colin kernel: ---[ end trace f157259f300d3491 ]---
WARNING: Kernels 4.15.x <= 4.15.6 are FAILING TO BOOT on 32bit! A separate bug report has been opened.
Hi guys, what is the status of this ticket? I must fix this issue for my customers... Is there any chance this will be fixed soon?
The only valid solution to this is:
cd serverroot/
mkdir old_system
mv * old_system/
tar xzvf /new64bitsystem.tgz
adjust BLKIDs if needed
reboot server
It's easier than you might think. Set up a new 64bit server with all needed packages, shut it down, make a tarball, do the above on your 32bit server, boot the (now empty) 64bit system, adjust the config files, home, passwd/group, network, etc. (cp from old_system...) and you're good. I reworked my entire cluster from 32bit to 64bit that way .. all running.
Maybe it's just coincidence, but I haven't seen the __radix_tree_lookup NULL pointer dereference since I upgraded to 4.15.14-300.fc27.i686+PAE. Unfortunately my journald log was rotated, so I cannot tell which was the last kernel version I saw the bug in. However, my kernel message buffer is complete (I can see the first lines at timestamp 0, including the kernel version), so I know for sure the bug hasn't appeared since the kernel booted. Uptime is roughly 8 days (but this is my laptop and it stays suspended to RAM overnight).
The bug is still present in Fedora 26, using kernel kernel-PAE-core-4.15.14-200.fc26.i686. This is on bare metal. I can also confirm that my Java server applications (running the latest F26 Java version) are affected after this oops: the Java processes use up all CPU and do not respond. Message log: Apr 22 17:42:47 phoenix kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008 Apr 22 17:42:47 phoenix kernel: IP: __radix_tree_lookup+0xe/0xa0 Apr 22 17:42:47 phoenix kernel: *pdpt = 000000003604a001 *pde = 0000000000000000 Apr 22 17:42:48 phoenix kernel: Oops: 0000 [#1] SMP Apr 22 17:42:48 phoenix kernel: Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun xt_set xt_multiport nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set_hash_ip ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables nct6775 hwmon_vid rc_technisat_ts35 tda10023 tda10021 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi budget_av saa7146_vv videobuf_dma_sg videobuf_core snd_hda_intel budget_core kvm_amd ttpci_eeprom saa7146 snd_hda_codec Apr 22 17:42:48 phoenix kernel: mantis kvm mantis_core dvb_core snd_hda_core snd_hwdep joydev v4l2_common videodev irqbypass media rc_core snd_seq crc32_pclmul k10temp sp5100_tco snd_seq_device snd_pcm snd_timer i2c_piix4 snd soundcore video acpi_cpufreq ata_generic pata_acpi radeon hid_logitech_hidpp i2c_algo_bit crc32c_intel drm_kms_helper ttm alx mdio pata_atiixp drm hid_logitech_dj Apr 22 17:42:48 phoenix kernel: CPU: 1 PID: 1759 Comm: java Not tainted 4.15.17-200.fc26.i686+PAE #1 Apr 22 17:42:48 phoenix kernel: Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./FM2A88X Extreme4+, BIOS P3.30 05/13/2016 Apr 22 17:42:48 phoenix kernel: EIP: __radix_tree_lookup+0xe/0xa0 Apr 22 17:42:48 phoenix kernel: EFLAGS: 00010286 CPU: 1 Apr 22 17:42:48 phoenix kernel: EAX: 00000004 EBX: 8a290000 ECX: 00000000 EDX: 00000000 Apr 22 17:42:48 phoenix kernel: ESI: 00000000 EDI: 00000004 EBP: eade3da0 ESP: eade3d90 Apr 22 17:42:48 phoenix kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Apr 22 17:42:48 phoenix kernel: CR0: 80050033 CR2: 00000008 CR3: 2f4ae3a0 CR4: 000406f0 Apr 22 17:42:48 phoenix kernel: Call Trace: Apr 22 17:42:48 phoenix kernel: radix_tree_lookup_slot+0x1d/0x40 Apr 22 17:42:48 phoenix kernel: find_get_entry+0x20/0x160 Apr 22 17:42:48 phoenix kernel: pagecache_get_page+0x24/0x290 Apr 22 17:42:48 phoenix kernel: lookup_swap_cache+0x3a/0x100 ...
This message is a reminder that Fedora 26 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '26'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 26 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
(In reply to Fedora End Of Life from comment #47)
> This message is a reminder that Fedora 26 is nearing its end of life.
> Approximately 4 (four) weeks from now Fedora will stop maintaining
> and issuing updates for Fedora 26. It is Fedora's policy to close all
> bug reports from releases that are no longer maintained. At that time
> this bug will be closed as EOL if it remains open with a Fedora 'version'
> of '26'.

This bug is still present in Fedora 27 (4.15.17-300.fc27.i686+PAE - see below). Apparently I cannot change the product version of this bug. If anyone can, please do so. I would hate to see this bug closed without being fixed just because Fedora 26 is EOL.

May 02 19:49:21 bat kernel: BUG: unable to handle kernel NULL pointer dereference at 00000008
May 02 19:49:21 bat kernel: IP: __radix_tree_lookup+0xe/0xa0
May 02 19:49:21 bat kernel: *pdpt = 0000000026ede001 *pde = 0000000000000000
May 02 19:49:21 bat kernel: Oops: 0000 [#2] SMP
May 02 19:49:21 bat kernel: Modules linked in: vfat fat ses enclosure scsi_transport_sas ums_realtek uas usb_storage tun dm_crypt vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) bnep ccm fuse nf_conntrack_ftp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat ip6table_filter nf_conntrack libcrc32c ip6_tables sunrpc ath3k btusb btrtl btbcm btintel uvcvideo bluetooth videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media ecdh_generic arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp ath9k coretemp kvm_intel ath9k_common ath9k_hw kvm iTCO_wdt dell_wmi dell_smbios_wmi iTCO_vendor_support dell_wmi_descriptor wmi_bmof sparse_keymap dell_laptop mac80211 dell_smbios_smm snd_hda_codec_hdmi
May 02 19:49:21 bat kernel: irqbypass snd_hda_codec_cirrus snd_hda_codec_generic crc32_pclmul dell_smbios intel_uncore snd_hda_intel dcdbas snd_hda_codec intel_rapl_perf snd_hda_core dell_smm_hwmon ath snd_hwdep snd_seq cfg80211 snd_seq_device snd_pcm joydev mei_me snd_timer mei snd i2c_i801 rfkill lpc_ich soundcore wmi acpi_cpufreq binfmt_misc i915 i2c_algo_bit drm_kms_helper crc32c_intel drm serio_raw r8169 mii video [last unloaded: intel_cstate]
May 02 19:49:21 bat kernel: CPU: 3 PID: 7709 Comm: URL Classifier Tainted: G D OE 4.15.17-300.fc27.i686+PAE #1
May 02 19:49:21 bat kernel: Hardware name: Dell Inc. Inspiron 3520/0G8TPV, BIOS A07 05/18/2013
May 02 19:49:21 bat kernel: EIP: __radix_tree_lookup+0xe/0xa0
May 02 19:49:21 bat kernel: EFLAGS: 00210286 CPU: 3
May 02 19:49:21 bat kernel: EAX: 00000004 EBX: 96add000 ECX: 00000000 EDX: 00000000
May 02 19:49:21 bat kernel: ESI: 00000000 EDI: 00000004 EBP: ebe41da0 ESP: ebe41d90
May 02 19:49:21 bat kernel: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
May 02 19:49:21 bat kernel: CR0: 80050033 CR2: 00000008 CR3: 09784cc0 CR4: 001406f0
May 02 19:49:21 bat kernel: Call Trace:
May 02 19:49:21 bat kernel: radix_tree_lookup_slot+0x1d/0x40
May 02 19:49:21 bat kernel: find_get_entry+0x20/0x160
May 02 19:49:21 bat kernel: pagecache_get_page+0x24/0x290
May 02 19:49:21 bat kernel: lookup_swap_cache+0x3a/0x100
May 02 19:49:21 bat kernel: swap_readahead_detect+0x55/0x280
May 02 19:49:21 bat kernel: do_swap_page+0x23a/0xa50
May 02 19:49:21 bat kernel: ? lru_cache_add+0x33/0x70
May 02 19:49:21 bat kernel: ? __kunmap_atomic+0x9b/0xf0
May 02 19:49:21 bat kernel: ? native_pmd_clear+0x20/0x20
May 02 19:49:21 bat kernel: ? kmap_atomic_prot+0x103/0x130
May 02 19:49:21 bat kernel: handle_mm_fault+0x498/0xc90
May 02 19:49:21 bat kernel: __do_page_fault+0x1f9/0x4c0
May 02 19:49:21 bat kernel: ? __do_page_fault+0x4c0/0x4c0
May 02 19:49:21 bat kernel: do_page_fault+0x27/0xe0
May 02 19:49:21 bat kernel: ? __do_page_fault+0x4c0/0x4c0
May 02 19:49:21 bat kernel: common_exception+0x84/0x8a
May 02 19:49:21 bat kernel: EIP: 0xb09115b5
May 02 19:49:21 bat kernel: EFLAGS: 00210206 CPU: 3
May 02 19:49:21 bat kernel: EAX: 96add000 EBX: b5f22d28 ECX: 00000000 EDX: 00000000
May 02 19:49:21 bat kernel: ESI: 00000078 EDI: 96d34f2c EBP: b5f43148 ESP: 986fbce0
May 02 19:49:21 bat kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
May 02 19:49:21 bat kernel: Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 a4 b3 89 ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08
May 02 19:49:21 bat kernel: EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0068:ebe41d90
May 02 19:49:21 bat kernel: CR2: 0000000000000008
May 02 19:49:21 bat kernel: ---[ end trace 66367a967e24f40f ]---
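As a sanity check on the register dump, the faulting instruction can be decoded by hand. In the Code: line the byte in angle brackets marks where the faulting instruction starts, so it is `8b 5f 04`. The sketch below is my own reading of that encoding (x86 `8b /r`, MOV r32, r/m32, with an 8-bit displacement), not a root-cause analysis; the helper name is made up and it only handles this one addressing form.

```python
# x86 32-bit register numbering as used in the ModRM byte.
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_mov_load(code, regs):
    """Decode `8b /r` (MOV r32, r/m32) in the [base register + disp8] form.

    Returns (destination register, base register, displacement, address).
    Illustrative helper: other ModRM forms are rejected, not decoded.
    """
    opcode, modrm, disp8 = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not an 8b /r MOV")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [reg + disp8] without a SIB byte is handled")
    disp = disp8 - 256 if disp8 >= 128 else disp8  # disp8 is sign-extended
    base = REG32[rm]
    addr = (regs[base] + disp) & 0xFFFFFFFF
    return REG32[reg], base, disp, addr


# Bytes after the "<8b>" marker and EDI taken from the oops above.
dest, base, disp, addr = decode_mov_load(bytes.fromhex("8b5f04"), {"EDI": 0x00000004})
print(f"mov {dest}, [{base}+{disp:#x}] -> address {addr:#010x}")
```

With EDI: 00000004 from the dump this gives a load from address 0x8, which agrees with CR2: 00000008 and with the "NULL pointer dereference at 00000008" line. The same Code bytes around the marker and the same EDI value show up in the other oopses in this bug.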
I have just reproduced the bug (twice) on 4.16.5-200.fc27.i686+PAE. Kernel traces included below. The first time it happened, I traced the process and it was blocked in futex():

[root@bat ~]# strace -p 10803
strace: Process 10803 attached
futex(0x1e82bf4, FUTEX_WAIT_PRIVATE, 8, NULL

First kernel trace:

[26444.306921] BUG: unable to handle kernel NULL pointer dereference at 00000008
[26444.306939] IP: __radix_tree_lookup+0xe/0xa0
[26444.306942] *pdpt = 000000002076a001 *pde = 0000000000000000
[26444.306947] Oops: 0000 [#1] SMP
[26444.306950] Modules linked in: ums_realtek uas usb_storage tun dm_crypt bnep fuse ccm vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nf_conntrack_ftp nf_conntrack_pptp nf_conntrack_proto_gre iptable_nat nf_conntrack_ipv4 nf_conntrack_ipv6 nf_defrag_ipv4 nf_defrag_ipv6 nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack ip6table_filter ip6_tables libcrc32c sunrpc uvcvideo ath3k videobuf2_vmalloc btusb videobuf2_memops btrtl btbcm videobuf2_v4l2 videobuf2_common btintel bluetooth videodev intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel media iTCO_wdt dell_wmi iTCO_vendor_support wmi_bmof sparse_keymap kvm ecdh_generic arc4 ath9k snd_hda_codec_hdmi dell_laptop dell_smbios ath9k_common irqbypass snd_hda_codec_cirrus dell_wmi_descriptor snd_hda_codec_generic crc32_pclmul dcdbas ath9k_hw
[26444.307004] intel_cstate snd_hda_intel intel_uncore snd_hda_codec intel_rapl_perf mac80211 dell_smm_hwmon snd_hda_core snd_hwdep joydev snd_seq ath snd_seq_device snd_pcm i2c_i801 cfg80211 snd_timer snd mei_me mei lpc_ich soundcore rfkill wmi acpi_cpufreq binfmt_misc i915 i2c_algo_bit drm_kms_helper crc32c_intel drm serio_raw r8169 mii video
[26444.307036] CPU: 2 PID: 10806 Comm: convert Tainted: G OE 4.16.5-200.fc27.i686+PAE #1
[26444.307038] Hardware name: Dell Inc. Inspiron 3520/0G8TPV, BIOS A07 05/18/2013
[26444.307043] EIP: __radix_tree_lookup+0xe/0xa0
[26444.307045] EFLAGS: 00210286 CPU: 2
[26444.307048] EAX: 00000004 EBX: b225d000 ECX: 00000000 EDX: 00000000
[26444.307050] ESI: 00000000 EDI: 00000004 EBP: e077dda0 ESP: e077dd90
[26444.307053] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[26444.307056] CR0: 80050033 CR2: 00000008 CR3: 29df9b60 CR4: 001406f0
[26444.307058] Call Trace:
[26444.307064] radix_tree_lookup_slot+0x1d/0x40
[26444.307070] find_get_entry+0x20/0x160
[26444.307074] pagecache_get_page+0x24/0x280
[26444.307078] lookup_swap_cache+0x3a/0x100
[26444.307082] swap_readahead_detect+0x55/0x280
[26444.307086] do_swap_page+0x23a/0xa90
[26444.307093] ? mem_cgroup_commit_charge+0x81/0x140
[26444.307096] ? kmap_atomic_prot+0x103/0x130
[26444.307100] handle_mm_fault+0x4c8/0xcd0
[26444.307105] __do_page_fault+0x1f9/0x4c0
[26444.307109] do_page_fault+0x25/0xf0
[26444.307112] ? __do_page_fault+0x4c0/0x4c0
[26444.307117] common_exception+0x84/0x8a
[26444.307120] EIP: 0xb7dc262e
[26444.307122] EFLAGS: 00210287 CPU: 2
[26444.307124] EAX: 00000025 EBX: b7eefa50 ECX: 00000195 EDX: b225ced8
[26444.307127] ESI: b5a0f320 EDI: ffffffff EBP: 0000ffff ESP: b1104210
[26444.307130] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[26444.307132] Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 b4 55 88 ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08
[26444.307178] EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0068:e077dd90
[26444.307180] CR2: 0000000000000008
[26444.307183] ---[ end trace 4794206b9961d619 ]---

And again:

[26709.205853] BUG: unable to handle kernel NULL pointer dereference at 00000008
[26709.205870] IP: __radix_tree_lookup+0xe/0xa0
[26709.205872] *pdpt = 000000002da44001 *pde = 0000000000000000
[26709.205877] Oops: 0000 [#2] SMP
[26709.205880] Modules linked in: ums_realtek uas usb_storage tun dm_crypt bnep fuse ccm vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nf_conntrack_ftp nf_conntrack_pptp nf_conntrack_proto_gre iptable_nat nf_conntrack_ipv4 nf_conntrack_ipv6 nf_defrag_ipv4 nf_defrag_ipv6 nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack ip6table_filter ip6_tables libcrc32c sunrpc uvcvideo ath3k videobuf2_vmalloc btusb videobuf2_memops btrtl btbcm videobuf2_v4l2 videobuf2_common btintel bluetooth videodev intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel media iTCO_wdt dell_wmi iTCO_vendor_support wmi_bmof sparse_keymap kvm ecdh_generic arc4 ath9k snd_hda_codec_hdmi dell_laptop dell_smbios ath9k_common irqbypass snd_hda_codec_cirrus dell_wmi_descriptor snd_hda_codec_generic crc32_pclmul dcdbas ath9k_hw
[26709.205935] intel_cstate snd_hda_intel intel_uncore snd_hda_codec intel_rapl_perf mac80211 dell_smm_hwmon snd_hda_core snd_hwdep joydev snd_seq ath snd_seq_device snd_pcm i2c_i801 cfg80211 snd_timer snd mei_me mei lpc_ich soundcore rfkill wmi acpi_cpufreq binfmt_misc i915 i2c_algo_bit drm_kms_helper crc32c_intel drm serio_raw r8169 mii video
[26709.205966] CPU: 2 PID: 10997 Comm: convert Tainted: G D OE 4.16.5-200.fc27.i686+PAE #1
[26709.205969] Hardware name: Dell Inc. Inspiron 3520/0G8TPV, BIOS A07 05/18/2013
[26709.205976] EIP: __radix_tree_lookup+0xe/0xa0
[26709.205978] EFLAGS: 00210286 CPU: 2
[26709.205981] EAX: 00000004 EBX: b3134000 ECX: 00000000 EDX: 00000000
[26709.205983] ESI: 00000000 EDI: 00000004 EBP: e0697da0 ESP: e0697d90
[26709.205986] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[26709.205988] CR0: 80050033 CR2: 00000008 CR3: 35cf4e60 CR4: 001406f0
[26709.205991] Call Trace:
[26709.205999] radix_tree_lookup_slot+0x1d/0x40
[26709.206005] find_get_entry+0x20/0x160
[26709.206010] pagecache_get_page+0x24/0x280
[26709.206015] lookup_swap_cache+0x3a/0x100
[26709.206019] swap_readahead_detect+0x55/0x280
[26709.206023] do_swap_page+0x23a/0xa90
[26709.206030] ? mem_cgroup_commit_charge+0x81/0x140
[26709.206034] ? kmap_atomic_prot+0x103/0x130
[26709.206037] handle_mm_fault+0x4c8/0xcd0
[26709.206043] __do_page_fault+0x1f9/0x4c0
[26709.206047] do_page_fault+0x25/0xf0
[26709.206050] ? __do_page_fault+0x4c0/0x4c0
[26709.206056] common_exception+0x84/0x8a
[26709.206060] EIP: 0xb7dd562e
[26709.206062] EFLAGS: 00210206 CPU: 2
[26709.206064] EAX: 00000000 EBX: b7f02a50 ECX: 00000195 EDX: b3134720
[26709.206066] ESI: b50ecc40 EDI: ffffffff EBP: 0000ffff ESP: b30f0210
[26709.206069] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[26709.206072] Code: b6 00 00 00 00 0f 0b 8d b6 00 00 00 00 b8 ef ff ff ff eb a4 e8 b4 55 88 ff 8d 74 26 00 55 89 e5 57 56 53 89 c7 83 ec 04 89 4d f0 <8b> 5f 04 89 d8 83 e0 03 83 f8 01 75 6d 89 d8 83 e0 fe 0f b6 08
[26709.206118] EIP: __radix_tree_lookup+0xe/0xa0 SS:ESP: 0068:e0697d90
[26709.206120] CR2: 0000000000000008
[26709.206123] ---[ end trace 4794206b9961d61a ]---
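To check that the 4.15.17 and 4.16.5 reports really take the same path into __radix_tree_lookup, one can strip the offsets and the "?" entries (which the kernel prints for addresses it found on the stack but could not confirm as real frames) and compare what is left. A minimal sketch, assuming one log line per frame; the function name is made up and the two samples are the call traces posted in this bug.

```python
import re

# A frame looks like "symbol+0xoff/0xsize", optionally prefixed by "? ".
FRAME = re.compile(r"(?P<q>\? )?(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$")


def reliable_frames(oops_text):
    """Return the symbols after 'Call Trace:', skipping '?' (unreliable) entries."""
    frames = []
    in_trace = False
    for line in oops_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        m = FRAME.search(line)
        if m is None:
            break  # the first non-frame line (user-mode EIP) ends the trace
        if not m["q"]:
            frames.append(m["sym"])
    return frames


TRACE_4_15_17 = """\
May 02 19:49:21 bat kernel: Call Trace:
May 02 19:49:21 bat kernel: radix_tree_lookup_slot+0x1d/0x40
May 02 19:49:21 bat kernel: find_get_entry+0x20/0x160
May 02 19:49:21 bat kernel: pagecache_get_page+0x24/0x290
May 02 19:49:21 bat kernel: lookup_swap_cache+0x3a/0x100
May 02 19:49:21 bat kernel: swap_readahead_detect+0x55/0x280
May 02 19:49:21 bat kernel: do_swap_page+0x23a/0xa50
May 02 19:49:21 bat kernel: ? lru_cache_add+0x33/0x70
May 02 19:49:21 bat kernel: ? __kunmap_atomic+0x9b/0xf0
May 02 19:49:21 bat kernel: ? native_pmd_clear+0x20/0x20
May 02 19:49:21 bat kernel: ? kmap_atomic_prot+0x103/0x130
May 02 19:49:21 bat kernel: handle_mm_fault+0x498/0xc90
May 02 19:49:21 bat kernel: __do_page_fault+0x1f9/0x4c0
May 02 19:49:21 bat kernel: ? __do_page_fault+0x4c0/0x4c0
May 02 19:49:21 bat kernel: do_page_fault+0x27/0xe0
May 02 19:49:21 bat kernel: ? __do_page_fault+0x4c0/0x4c0
May 02 19:49:21 bat kernel: common_exception+0x84/0x8a
May 02 19:49:21 bat kernel: EIP: 0xb09115b5
"""

TRACE_4_16_5 = """\
[26444.307058] Call Trace:
[26444.307064] radix_tree_lookup_slot+0x1d/0x40
[26444.307070] find_get_entry+0x20/0x160
[26444.307074] pagecache_get_page+0x24/0x280
[26444.307078] lookup_swap_cache+0x3a/0x100
[26444.307082] swap_readahead_detect+0x55/0x280
[26444.307086] do_swap_page+0x23a/0xa90
[26444.307093] ? mem_cgroup_commit_charge+0x81/0x140
[26444.307096] ? kmap_atomic_prot+0x103/0x130
[26444.307100] handle_mm_fault+0x4c8/0xcd0
[26444.307105] __do_page_fault+0x1f9/0x4c0
[26444.307109] do_page_fault+0x25/0xf0
[26444.307112] ? __do_page_fault+0x4c0/0x4c0
[26444.307117] common_exception+0x84/0x8a
[26444.307120] EIP: 0xb7dc262e
"""

same_path = reliable_frames(TRACE_4_15_17) == reliable_frames(TRACE_4_16_5)
print("same call path:", same_path)
```

In both cases the confirmed frames run from the page fault entry through do_swap_page and lookup_swap_cache into the radix tree lookup, so these look like the same swap-in path on both kernels.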
The biggest problem is that I can't upgrade to a newer Fedora release, because none of them ships a 4.11.x kernel, which is the last series I have been able to use without problems.
The issue is still present in FC26 with kernel 4.16.7-PAE.
Fedora 26 changed to end-of-life (EOL) status on 2018-05-29. Fedora 26 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.