Created attachment 1367022 [details]
output from journalctl -b -1 (occurred previous boot) but does not show all kernel messages

Removed extra boot options to confirm this issue. Occurs on perhaps 50% of reboots. Exact triggers not yet identified. Using Ryzen 1600X on B350M.
Created attachment 1367024 [details]
Extra kernel output not entirely present in log

Pressed ctrl-alt-del at the command prompt to obtain extra kernel output. The freeze also occurs when rebooting from the GUI.
After the panic, the reset button has no effect; an ACPI hard shutdown (holding the power button) is required.
Bug still present in 4.14.6
Bug still present in 4.14.8
Created attachment 1371095 [details]
new kernel console dump

Newer kernel point releases 4.14.6 and 4.14.8 now dump even more to the console, and I believe (though it's hard to tell because it scrolls by so fast) that the original console dump still applies.
Created attachment 1372607 [details]
Another kernel panic example

I think this is another example of the same issue. It happens on reboot on 4.14.5-300 and 4.14.8-300. Note that this only happens when rebooting via systemd. If I do a reboot --force, it behaves normally.
Created attachment 1374213 [details]
Upstream patch to fix the issue up to 4.14.9

Problem happens on Fedora 27 with 4.14.9, too: the kernel crashes on reboot on three workstations at work.

Fix is available here:
https://marc.info/?l=linux-cgroups&m=151378282108794&w=2

I'll also attach the upstream patch to this bug report.
Additional partial kernel oops backtraces so search engines can find this bugzilla entry:

**************************************
[ 1164.913034] NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
[ 1164.913034] Modules linked in: netconsole vhost_net vhost tap xt_CHECKSUM tun ebtable_filter ebtables rpcsec_gss_krb5 auth_rpcgs
[ 1164.913058] ghash_clmulni_intel r8169 mii sunrpc scsi_transport_iscsi [last unloaded: libcrc32c]
[ 1164.913062] CPU: 3 PID: 519 Comm: kworker/dying Not tainted 4.14.9-300.fc27.x86_64 #1
[ 1164.913062] Hardware name: ASUS All Series/B85M-E, BIOS 2306 11/09/2015
[ 1164.913063] task: ffff8ac2f061be80 task.stack: ffffae6c839cc000
[ 1164.913068] RIP: 0010:queued_spin_lock_slowpath+0x12d/0x190
[ 1164.913068] RSP: 0018:ffffae6c839cfe68 EFLAGS: 00000002
[ 1164.913069] RAX: 00000000001c0101 RBX: ffffffffb7e6c200 RCX: 0000000000000001
[ 1164.913070] RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffffffffb856f3a0
[ 1164.913070] RBP: ffff8ac2f061be80 R08: 0000000000000101 R09: 0000000000000000
[ 1164.913071] R10: 0000000000000000 R11: 0000000000000300 R12: 0000000000000000
[ 1164.913071] R13: ffff8ac2f061be01 R14: 0000000000000000 R15: ffffffffb70bdc80
[ 1164.913072] FS: 0000000000000000(0000) GS:ffff8ac31dcc0000(0000) knlGS:0000000000000000
[ 1164.913073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1164.913073] CR2: 00000000b7746000 CR3: 0000000373e09003 CR4: 00000000001626e0
[ 1164.913074] Call Trace:
[ 1164.913077] cgroup_exit+0x4a/0xf0
[ 1164.913081] do_exit+0x2f8/0xba0
[ 1164.913084] ? worker_thread+0x252/0x380
[ 1164.913085] ? process_one_work+0x3a0/0x3a0
[ 1164.913086] kthread+0xe7/0x130
[ 1164.913087] ? kthread_park+0x60/0x60
[ 1164.913089] ? do_syscall_64+0x61/0x170
[ 1164.913090] ? SyS_exit_group+0x10/0x10
[ 1164.913092] ret_from_fork+0x1f/0x30
..
[ 1165.126398] NMI watchdog: Watchdog detected hard LOCKUP on cpu 2
[ 1165.126399] Modules linked in: netconsole vhost_net vhost tap xt_CHECKSUM tun ebtable_filter ebtables rpcsec_gss_krb5 auth_rpcgs
[ 1165.126417] ghash_clmulni_intel r8169 mii sunrpc scsi_transport_iscsi [last unloaded: libcrc32c]
[ 1165.126420] CPU: 2 PID: 1 Comm: systemd Not tainted 4.14.9-300.fc27.x86_64 #1
[ 1165.126420] Hardware name: ASUS All Series/B85M-E, BIOS 2306 11/09/2015
[ 1165.126421] task: ffff8ac2f9ed3e80 task.stack: ffffae6c83144000
[ 1165.126423] RIP: 0010:css_task_iter_advance+0x22/0x70
[ 1165.126424] RSP: 0018:ffffae6c83147da8 EFLAGS: 00000002
[ 1165.126424] RAX: ffff8ac2ef1ca090 RBX: ffff8ac2f4711c90 RCX: dead000000000200
[ 1165.126425] RDX: ffff8ac2ef1ca0a0 RSI: ffff8ac2ef1ca0a0 RDI: ffff8ac2ef1ef4e0
[ 1165.126426] RBP: ffff8ac2ef1ef480 R08: ffff8ac2ef1ca0a0 R09: ffff8ac2ef1ef480
[ 1165.126426] R10: 00007f2edf875b38 R11: 0000000000003000 R12: ffff8ac2ef1ef300
[ 1165.126427] R13: ffffae6c83147e18 R14: ffff8ac2f413f100 R15: ffff8ac2ef1ef300
[ 1165.126428] FS: 00007f2ee0f0fa00(0000) GS:ffff8ac31dc80000(0000) knlGS:0000000000000000
[ 1165.126428] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1165.126429] CR2: 000055a4c2fcc068 CR3: 00000007f85d7004 CR4: 00000000001626e0
[ 1165.126430] Call Trace:
[ 1165.126432] css_task_iter_next+0x4f/0x70
[ 1165.126435] kernfs_seq_start+0x4a/0x80
[ 1165.126438] seq_read+0xa9/0x440
[ 1165.126439] __vfs_read+0x33/0x160
[ 1165.126441] vfs_read+0x89/0x130
[ 1165.126442] SyS_read+0x52/0xc0
[ 1165.126444] entry_SYSCALL_64_fastpath+0x1a/0x7d
**************************************

From a second machine:

**************************************
[ 266.102397] WARNING: CPU: 6 PID: 1 at kernel/fork.c:414 __put_task_struct+0xeb/0x150
[ 266.102407] Modules linked in: netconsole rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache vhost_net vhost
[ 266.102438] e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ptp pps_core hid_microsoft sunrpc scsi_transp
[ 266.102447] CPU: 6 PID: 1 Comm: systemd Not tainted 4.14.9-300.fc27.x86_64 #1
[ 266.102452] Hardware name: /DH87MC, BIOS MCH8710H.86A.0157.2014.0530.1830 05/30/2014
[ 266.102456] task: ffff95aa7bed1f40 task.stack: ffffaaac83144000
[ 266.102461] RIP: 0010:__put_task_struct+0xeb/0x150
[ 266.102466] RSP: 0018:ffffaaac83147db8 EFLAGS: 00010246
[ 266.102472] RAX: 0000000000000000 RBX: ffff95aa58161110 RCX: 0000000000000001
[ 266.102477] RDX: ffffaaac83147e20 RSI: ffff95aa58161110 RDI: ffff95aa58161110
[ 266.102481] RBP: ffffaaac83147f20 R08: 0000000000001000 R09: 0000000000000007
[ 266.102486] R10: ffff95aa793c2f38 R11: ffff95aa4c768006 R12: ffff95aa793c2f00
[ 266.102490] R13: 00000000ffffffff R14: ffff95aa58161110 R15: ffff95aa74f48f00
[ 266.102495] FS: 00007f3f9ff93a00(0000) GS:ffff95aa9fb80000(0000) knlGS:0000000000000000
[ 266.102499] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 266.102504] CR2: 000055a318166620 CR3: 00000007f2781004 CR4: 00000000001626e0
[ 266.102508] Call Trace:
[ 266.102514] css_task_iter_next+0x68/0x70
[ 266.102521] kernfs_seq_next+0x23/0x50
[ 266.102528] ? cgroup_procs_show+0x26/0x30
[ 266.102534] seq_read+0x313/0x440
[ 266.102539] __vfs_read+0x33/0x160
[ 266.102543] vfs_read+0x89/0x130
[ 266.102549] SyS_read+0x52/0xc0
[ 266.102557] entry_SYSCALL_64_fastpath+0x1a/0x7d
**************************************
**************************************
[ 266.102663] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
[ 266.102671] IP: pids_free+0x11/0x40
[ 266.102674] PGD 0 P4D 0
[ 266.102682] Oops: 0000 [#1] SMP
[ 266.102686] Modules linked in: netconsole rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache vhost_net vhost
[ 266.102716] e1000e crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ptp pps_core hid_microsoft sunrpc scsi_transp
[ 266.102726] CPU: 6 PID: 1 Comm: systemd Tainted: G W 4.14.9-300.fc27.x86_64 #1
[ 266.102730] Hardware name: /DH87MC, BIOS MCH8710H.86A.0157.2014.0530.1830 05/30/2014
[ 266.102736] task: ffff95aa7bed1f40 task.stack: ffffaaac83144000
[ 266.102741] RIP: 0010:pids_free+0x11/0x40
[ 266.102745] RSP: 0018:ffffaaac83147d70 EFLAGS: 00010246
[ 266.102748] RAX: ffff95a93f6828c0 RBX: 0000000000000000 RCX: 000000000000000b
[ 266.102751] RDX: 000000000000000b RSI: 000000000000000c RDI: ffff95aa58161110
[ 266.102755] RBP: ffff95aa58161110 R08: 0000000000001000 R09: 0000000000000007
[ 266.102759] R10: ffff95aa793c2f38 R11: ffff95aa4c768006 R12: ffffffff82e6c640
[ 266.102763] R13: ffff95a93f6828c0 R14: ffff95aa58161110 R15: ffff95aa74f48f00
[ 266.102769] FS: 00007f3f9ff93a00(0000) GS:ffff95aa9fb80000(0000) knlGS:0000000000000000
[ 266.102774] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 266.102778] CR2: 00000000000000b0 CR3: 00000007f2781004 CR4: 00000000001626e0
[ 266.102783] Call Trace:
[ 266.102790] cgroup_free+0x5c/0xd0
[ 266.102795] __put_task_struct+0x3d/0x150
[ 266.102801] css_task_iter_next+0x68/0x70
[ 266.102809] kernfs_seq_next+0x23/0x50
[ 266.102816] ? cgroup_procs_show+0x26/0x30
[ 266.102825] seq_read+0x313/0x440
[ 266.102832] __vfs_read+0x33/0x160
[ 266.102838] vfs_read+0x89/0x130
[ 266.102845] SyS_read+0x52/0xc0
[ 266.102853] entry_SYSCALL_64_fastpath+0x1a/0x7d
**************************************
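For anyone comparing their own console dump against the traces above, here is a rough sketch that pulls the symbol names out of a pasted oops and checks whether any css_task_iter_* function shows up. The helper names frame_symbols and mentions_css_task_iter are made up for this comment, and a hit is only a hint that the trace resembles the ones here, not proof that it is the same bug.

```python
import re

# Matches frames such as "css_task_iter_next+0x4f/0x70" and captures the symbol.
# This also picks up the RIP line and unreliable "? symbol+0x.../0x..." frames.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frame_symbols(log_text):
    """Return every symbol+offset/size symbol in the log, in order of appearance."""
    return FRAME_RE.findall(log_text)


def mentions_css_task_iter(log_text):
    """Heuristic (an assumption, not a diagnosis): True if any frame is a css_task_iter_* function."""
    return any(sym.startswith("css_task_iter_") for sym in frame_symbols(log_text))
```

This works on both line-per-entry logs and logs flattened onto one line, since it does not depend on line breaks.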
Just to sum this up: The upstream patch fixes the issue on all three affected workstations at work. Two of them crashed on every reboot.
*** Bug 1530318 has been marked as a duplicate of this bug. ***
The patch seems to be included in current Fedora kernel builds with

Patch631: cgroup-for-4.15-fixes-cgroup-fix-css_task_iter-crash-on-CSS_TASK_ITER_PROC.patch

It came with

kernel-4.14.11-300.fc27
kernel-4.14.11-200.fc26

but without mentioning this rhbz entry.
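A quick way to check whether a given machine is at or past those builds is to compare the upstream part of its kernel release string against 4.14.11. This is only a sketch: the function names are invented here, and it assumes that every Fedora build from 4.14.11 onward carries the patch, which this report only confirms for the two builds listed.

```python
import re

# First upstream version reported in this bug as carrying the patch (Fedora builds).
FIXED_VERSION = (4, 14, 11)


def upstream_version(release):
    """Parse the leading 'X.Y.Z' of a release string such as '4.14.9-300.fc27.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def carries_fix(release):
    """Assumption: any build at or above FIXED_VERSION includes the patch."""
    return upstream_version(release) >= FIXED_VERSION
```

The comparison uses integer tuples on purpose: comparing the strings directly would sort 4.14.9 after 4.14.11. On a running system the release string can be obtained with platform.release() or uname -r.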
Yes, we missed this when doing the updates. Thanks for pointing that out.