Created attachment 1330070 [details]
gzip'd flat file from /var/log/messages

Description of problem:
This problem happened right after logging in as root on a freshly booted system. The boot occurred because of a shutdown that failed to complete... with similar messages produced. I have, however, seen this message recently while removing a large number of files (> 300k files) with a script containing "rm -fv <filename>&". The fairly massive number of parallel removes made the filesystem very busy and caused a CPU to produce these "soft lockup" messages. It may have been with a different kernel. These messages came more frequently than the "23s" the message states. See the enclosed log file.

Version-Release number of selected component (if applicable):
4.14.0-0.rc1.git3.1.fc28.x86_64

How reproducible:
Rebooted this system and have not seen these messages... yet.

Steps to Reproduce:
1. unknown
2.
3.

Actual results:

Expected results:

Additional info:
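For anyone trying to reproduce the heavy-removal case under a controlled load, here is a minimal sketch of the same job with a cap on concurrency, instead of one backgrounded rm per file. It assumes Python 3; `remove_files` and `max_workers` are hypothetical names chosen for illustration and are not part of the reporter's script. It only varies the I/O pressure and is not a fix for the lockup itself.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def remove_files(paths, max_workers=4):
    """Remove each path using at most `max_workers` concurrent threads.

    Returns a list of (path, exception) tuples for removals that failed.
    """
    def _remove(path):
        try:
            os.remove(path)
            return None
        except OSError as exc:
            # Record the failure instead of aborting the whole batch.
            return (path, exc)

    # The pool bounds how many removals are in flight at once, unlike
    # "rm -fv <filename>&", which starts one process per file.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(_remove, paths))

    return [r for r in results if r is not None]
```

Raising or lowering `max_workers` between runs may help show whether the lockup messages track the amount of parallel filesystem activity.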
This has nothing to do with the watchdog package. The errors come from the kernel. In any case, you can normally ignore "soft lockup" errors. Here they probably indicate that your hard disk is slow to respond or that the system is generally overloaded.
Richard,

Thanks for your response. I believe you are wrong because this problem FREEZES the system. The system NEVER recovers from the problem. Yesterday I did a "shutdown now" at approximately 9am... When I returned to the system around 7pm, it was STILL stuck "close to" finalizing the shutdown that I had issued 10 hours earlier.

This is NOT the first time I've seen this problem. The first time was when the system drive was VERY busy. This time was just after a reboot, which I doubt would make the drive as busy. The hard drive is a fairly new ST1000LX015-1U71 hybrid drive. I suppose the problem could be in the relationship between the kernel and the hard drive's firmware.

It's my perception that there's a deadly embrace (deadlock) situation here, as if two components are trying to update data structures on the file system (ext4) and each holds a lock that the other needs.

If this is not a watchdog bug, since the watchdog may just be observing one or more CPUs in a stalled/soft lockup condition, to whom should this bug be transferred? What do the stack traces indicate?

Again, thanks for your response.
George...
I have a similar issue with a similar message. During normal use of my laptop, the screen freezes and mouse/keyboard input does nothing. I am forced to power off by pressing and holding the power button.

Here is my setup:
HP Envy x360 laptop with AMD Ryzen 5 2500U APU (4 cores/8 threads)
Fedora 28 x64 KDE Spin
Kernel 4.18.9-200.fc28.x86_64
16 GB RAM

Journalctl entry:
Sep 28 16:32:24 hostname.localdomain kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [libvirtd:867]
-- Reboot --

I do have virtualization enabled in the BIOS. The freezing issue has happened before with older kernel versions, but I didn't check whether the message was the same.
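Since several reporters are pasting the same kind of message, a small sketch for pulling the CPU number, stuck time, and task out of saved journalctl text may help compare reports. It assumes the message format quoted in this thread ("watchdog: BUG: soft lockup - CPU#N stuck for Ns! [task:pid]"); `parse_lockups` and `count_by_task` are hypothetical helper names, not an existing tool.

```python
import re
from collections import Counter

# Matches the kernel message as it appears in the logs quoted in this report.
# The task name may itself contain colons (e.g. "kworker/0:0"), so the PID is
# taken as the digits after the last colon inside the brackets.
LOCKUP_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>.+):(?P<pid>\d+)\]"
)


def parse_lockups(lines):
    """Return one dict per soft lockup message found in `lines`."""
    events = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            events.append({
                "cpu": int(m.group("cpu")),
                "seconds": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            })
    return events


def count_by_task(events):
    """Count how many lockup messages name each task."""
    return Counter(e["comm"] for e in events)
```

Feeding it the lines of a saved journal (for example, a file read with `open(path).read().splitlines()`) gives a quick view of which CPUs and tasks recur across freezes.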
I experienced another freeze, with cpu lockup errors. Here's what shows up in journalctl:

Oct 03 10:54:36 hostname.localdomain kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:3235]
Oct 03 10:54:36 hostname.localdomain kernel: Modules linked in: fuse ccm rfcomm xt_CHECKSUM ipt_MASQUERADE tun ip6t_rpfilter devlink ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6t>
Oct 03 10:54:36 hostname.localdomain kernel: bluetooth hid_sensor_accel_3d hid_sensor_magn_3d snd_hwdep hid_sensor_incl_3d hid_sensor_rotation videobuf2_memops snd_seq hid_sensor_gyro_3d videobuf2_v4l2 hid_sensor_trigger videobuf2_comm>
Oct 03 10:54:36 hostname.localdomain kernel: CPU: 0 PID: 3235 Comm: kworker/0:0 Tainted: G C 4.18.10-200.fc28.x86_64 #1
Oct 03 10:54:36 hostname.localdomain kernel: Hardware name: HP HP ENVY x360 Convertible 15-bq1xx/83C6, BIOS F.17 03/29/2018
Oct 03 10:54:36 hostname.localdomain kernel: Workqueue: events netstamp_clear
Oct 03 10:54:36 hostname.localdomain kernel: RIP: 0010:smp_call_function_many+0x1ec/0x250
Oct 03 10:54:36 hostname.localdomain kernel: Code: c7 e8 38 bd 7a 00 3b 05 a6 6d 24 01 0f 83 99 fe ff ff 48 63 d0 48 8b 0b 48 03 0c d5 00 c7 18 9f 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 48 c7 c2 20 96 38 9f 4c 89 >
Oct 03 10:54:36 hostname.localdomain kernel: RSP: 0018:ffffbc26890a7d80 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Oct 03 10:54:36 hostname.localdomain kernel: RAX: 0000000000000002 RBX: ffff980f0ea21fc0 RCX: ffff980f0eaa7280
Oct 03 10:54:36 hostname.localdomain kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff980f0e429170
Oct 03 10:54:36 hostname.localdomain kernel: RBP: ffffffff9e02db40 R08: 0000000000026040 R09: ffffffff9e05008a
Oct 03 10:54:36 hostname.localdomain kernel: R10: ffffdf5b501a1600 R11: 0000000000000fc8 R12: 0000000000000000
Oct 03 10:54:36 hostname.localdomain kernel: R13: 0000000000000001 R14: 0000000000000010 R15: 0000000000000001
Oct 03 10:54:36 hostname.localdomain kernel: FS: 0000000000000000(0000) GS:ffff980f0ea00000(0000) knlGS:0000000000000000
Oct 03 10:54:36 hostname.localdomain kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 03 10:54:36 hostname.localdomain kernel: CR2: 00007f9d3400d510 CR3: 00000003f7420000 CR4: 00000000003406f0
Oct 03 10:54:36 hostname.localdomain kernel: Call Trace:
Oct 03 10:54:36 hostname.localdomain kernel: ? netif_receive_skb_internal+0x1d/0xf0
Oct 03 10:54:36 hostname.localdomain kernel: ? cpumask_weight+0x10/0x10
Oct 03 10:54:36 hostname.localdomain kernel: ? netif_receive_skb_internal+0x1e/0xf0
Oct 03 10:54:36 hostname.localdomain kernel: on_each_cpu+0x28/0x60
Oct 03 10:54:36 hostname.localdomain kernel: ? netif_receive_skb_internal+0x1d/0xf0
Oct 03 10:54:36 hostname.localdomain kernel: text_poke_bp+0x68/0xdf
Oct 03 10:54:36 hostname.localdomain kernel: __jump_label_transform.isra.0+0xf2/0x140
Oct 03 10:54:36 hostname.localdomain kernel: arch_jump_label_transform+0x2b/0x40
Oct 03 10:54:36 hostname.localdomain kernel: __jump_label_update+0x7d/0xb0
Oct 03 10:54:36 hostname.localdomain kernel: static_key_enable_cpuslocked+0x52/0x80
Oct 03 10:54:36 hostname.localdomain kernel: static_key_enable+0x16/0x20
Oct 03 10:54:36 hostname.localdomain kernel: process_one_work+0x1a1/0x350
Oct 03 10:54:36 hostname.localdomain kernel: worker_thread+0x30/0x380
Oct 03 10:54:36 hostname.localdomain kernel: ? pwq_unbound_release_workfn+0xd0/0xd0
Oct 03 10:54:36 hostname.localdomain kernel: kthread+0x112/0x130
Oct 03 10:54:36 hostname.localdomain kernel: ? kthread_create_worker_on_cpu+0x70/0x70
Oct 03 10:54:36 hostname.localdomain kernel: ret_from_fork+0x22/0x40
Oct 03 10:54:40 hostname.localdomain kernel: r8822be: AP off, try to reconnect now
Oct 03 10:54:50 hostname.localdomain kernel: r8822be: AP off, try to reconnect now
Oct 03 10:55:00 hostname.localdomain kernel: r8822be: AP off, try to reconnect now
Oct 03 10:55:04 hostname.localdomain kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:3235]
Oct 03 10:55:04 hostname.localdomain kernel: Modules linked in: fuse ccm rfcomm xt_CHECKSUM ipt_MASQUERADE tun ip6t_rpfilter devlink ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6t>
Oct 03 10:55:04 hostname.localdomain kernel: bluetooth hid_sensor_accel_3d hid_sensor_magn_3d snd_hwdep hid_sensor_incl_3d hid_sensor_rotation videobuf2_memops snd_seq hid_sensor_gyro_3d videobuf2_v4l2 hid_sensor_trigger videobuf2_comm>
I have seen this problem a few times since reporting this bug. This bugzilla doesn't seem to tell us which Fedora release it's reported under, though. In my case I would guess it's F28 or F29... Also, there doesn't seem to be any indication as to whether it occurs on a virtual system or on the "native" system (host).

George...
My setup is an HP Envy x360 with AMD Ryzen 5 2500U APU. The issue happened in F28, and now in F29. Here are the journalctl entries from before I rebooted:

Nov 01 12:47:49 hostname.localdomain rtkit-daemon[697]: Demoting known real-time threads.
Nov 01 12:47:49 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1475 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:47:49 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1469 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:47:49 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1391 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:47:49 hostname.localdomain rtkit-daemon[697]: Demoted 3 threads.
Nov 01 12:47:51 hostname.localdomain kernel: r8822be: AP off, try to reconnect now
Nov 01 12:47:51 hostname.localdomain kernel: wlp3s0: Connection to AP 9c:3d:cf:42:b0:4d lost
Nov 01 12:47:59 hostname.localdomain rtkit-daemon[697]: The canary thread is apparently starving. Taking action.
Nov 01 12:47:59 hostname.localdomain rtkit-daemon[697]: Demoting known real-time threads.
Nov 01 12:47:59 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1475 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:47:59 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1469 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:47:59 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1391 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:47:59 hostname.localdomain rtkit-daemon[697]: Demoted 3 threads.
Nov 01 12:48:01 hostname.localdomain kernel: r8822be: AP off, try to reconnect now
Nov 01 12:48:09 hostname.localdomain rtkit-daemon[697]: The canary thread is apparently starving. Taking action.
Nov 01 12:48:09 hostname.localdomain rtkit-daemon[697]: Demoting known real-time threads.
Nov 01 12:48:09 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1475 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:09 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1469 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:09 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1391 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:09 hostname.localdomain rtkit-daemon[697]: Demoted 3 threads.
Nov 01 12:48:11 hostname.localdomain kernel: r8822be: AP off, try to reconnect now
Nov 01 12:48:19 hostname.localdomain kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:4:5227]
Nov 01 12:48:19 hostname.localdomain kernel: Modules linked in: fuse nf_conntrack_sane nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ccm rfcomm xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_re>
Nov 01 12:48:19 hostname.localdomain kernel: uvcvideo ghash_clmulni_intel hid_sensor_gyro_3d hid_sensor_magn_3d hid_sensor_incl_3d hid_sensor_rotation snd_hda_core videobuf2_vmalloc snd_hwdep hid_sensor_accel_3d videobuf2_memops hid_se>
Nov 01 12:48:19 hostname.localdomain kernel: CPU: 0 PID: 5227 Comm: kworker/0:4 Tainted: G C 4.18.16-300.fc29.x86_64 #1
Nov 01 12:48:19 hostname.localdomain kernel: Hardware name: HP HP ENVY x360 Convertible 15-bq1xx/83C6, BIOS F.17 03/29/2018
Nov 01 12:48:19 hostname.localdomain kernel: Workqueue: events netstamp_clear
Nov 01 12:48:19 hostname.localdomain kernel: RIP: 0010:smp_call_function_many+0x1ec/0x250
Nov 01 12:48:19 hostname.localdomain kernel: Code: c7 e8 88 af 7a 00 3b 05 96 71 24 01 0f 83 99 fe ff ff 48 63 d0 48 8b 0b 48 03 0c d5 00 c7 18 92 8b 51 18 83 e2 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c7 48 c7 c2 e0 aa 38 92 4c 89 >
Nov 01 12:48:19 hostname.localdomain kernel: RSP: 0018:ffffbd488995fd80 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Nov 01 12:48:19 hostname.localdomain kernel: RAX: 0000000000000006 RBX: ffff9f7c9ec21fc0 RCX: ffff9f7c9eda7280
Nov 01 12:48:19 hostname.localdomain kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9f7c9e829f08
Nov 01 12:48:19 hostname.localdomain kernel: RBP: ffffffff9102dc70 R08: 0000000000026040 R09: ffffffff910501ba
Nov 01 12:48:19 hostname.localdomain kernel: R10: ffffe7550fabe280 R11: 00000000000007d8 R12: 0000000000000000
Nov 01 12:48:19 hostname.localdomain kernel: R13: 0000000000000001 R14: 0000000000000010 R15: 0000000000000001
Nov 01 12:48:19 hostname.localdomain kernel: FS: 0000000000000000(0000) GS:ffff9f7c9ec00000(0000) knlGS:0000000000000000
Nov 01 12:48:19 hostname.localdomain kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 01 12:48:19 hostname.localdomain kernel: CR2: 00007fbaaa820000 CR3: 00000003f6654000 CR4: 00000000003406f0
Nov 01 12:48:19 hostname.localdomain kernel: Call Trace:
Nov 01 12:48:19 hostname.localdomain kernel: ? netif_receive_skb_internal+0x1d/0xf0
Nov 01 12:48:19 hostname.localdomain kernel: ? cpumask_weight+0x10/0x10
Nov 01 12:48:19 hostname.localdomain kernel: ? netif_receive_skb_internal+0x1e/0xf0
Nov 01 12:48:19 hostname.localdomain kernel: on_each_cpu+0x28/0x60
Nov 01 12:48:19 hostname.localdomain kernel: ? netif_receive_skb_internal+0x1d/0xf0
Nov 01 12:48:19 hostname.localdomain kernel: text_poke_bp+0x68/0xdf
Nov 01 12:48:19 hostname.localdomain kernel: __jump_label_transform.isra.0+0xf2/0x140
Nov 01 12:48:19 hostname.localdomain kernel: arch_jump_label_transform+0x2b/0x40
Nov 01 12:48:19 hostname.localdomain kernel: __jump_label_update+0x7d/0xb0
Nov 01 12:48:19 hostname.localdomain kernel: static_key_enable_cpuslocked+0x52/0x80
Nov 01 12:48:19 hostname.localdomain kernel: static_key_enable+0x16/0x20
Nov 01 12:48:19 hostname.localdomain kernel: process_one_work+0x1a1/0x350
Nov 01 12:48:19 hostname.localdomain kernel: worker_thread+0x30/0x380
Nov 01 12:48:19 hostname.localdomain kernel: ? pwq_unbound_release_workfn+0xd0/0xd0
Nov 01 12:48:19 hostname.localdomain kernel: kthread+0x112/0x130
Nov 01 12:48:19 hostname.localdomain kernel: ? kthread_create_worker_on_cpu+0x70/0x70
Nov 01 12:48:19 hostname.localdomain kernel: ret_from_fork+0x22/0x40
Nov 01 12:48:19 hostname.localdomain rtkit-daemon[697]: The canary thread is apparently starving. Taking action.
Nov 01 12:48:19 hostname.localdomain rtkit-daemon[697]: Demoting known real-time threads.
Nov 01 12:48:19 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1475 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:19 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1469 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:19 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1391 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:19 hostname.localdomain rtkit-daemon[697]: Demoted 3 threads.
Nov 01 12:48:21 hostname.localdomain kernel: r8822be: AP off, try to reconnect now
Nov 01 12:48:29 hostname.localdomain rtkit-daemon[697]: The canary thread is apparently starving. Taking action.
Nov 01 12:48:29 hostname.localdomain rtkit-daemon[697]: Demoting known real-time threads.
Nov 01 12:48:29 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1475 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:29 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1469 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:29 hostname.localdomain rtkit-daemon[697]: Successfully demoted thread 1391 of process 1391 (/usr/bin/pulseaudio).
Nov 01 12:48:29 hostname.localdomain rtkit-daemon[697]: Demoted 3 threads.
Nov 01 12:48:31 hostname.localdomain kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [QThread:6037]
CPU lockup happened again. I had a few applications open (Firefox, Kontact, Konsole, KeePassXC, Viber) and left the system idle for a little while. When I wanted to use my computer again, the screen didn't come back up, and neither moving the mouse nor pressing keys brought the system back. I had to shut it off by holding the power button. I will attach the journalctl output.
Created attachment 1553730 [details] Journalctl output for kernel 4.19.16-200.fc29.x86_64 during cpu lockup
Created attachment 1581535 [details] Journalctl output for kernel 5.1.8-300.fc30.x86_64: during cpu lockup Happened again on Fedora 30, kernel 5.1.8-300.fc30.x86_64, attached is the journalctl output.
Created attachment 1634145 [details] Journalctl output for F31 KDE kernel 5.3.8-300.fc31.x86_64 during cpu lockup Freeze encountered on F31 KDE, kernel 5.3.8-300.fc31.x86_64. Attached is the journalctl output.
Hello,

Something similar happened to me on Fedora 31 (kernel 5.3.11-300.fc31.x86_64). Examples of the errors are:

A) watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [thumbnail.so:1942]
B) watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kscreenlocker_g:9151]

In all cases the system freezes, the reporting tool marks the issue as non-reportable, and there is no way out other than manually turning the computer off.

The error manifests in a seemingly random fashion: on some occasions as I try to play a video with vlc, or as I try to edit a figure with Gimp, or as I attempt to edit tags on an mp3 using EasyTag, etc. The freeze is not immediate. Multiple error pop-ups show up, and as I attempt to close a window or click on the application launcher to select logout/restart/shut-down, the system freezes (some applications first, some later, until the mouse does not respond).

My computer runs on an AMD CPU (Athlon(tm) 64 X2 Dual Core Processor 4600+) with an NVidia GeForce 7300 GT and a WD Black hard drive. So the issue happens on both old and new computers.
Hi,

Additional info: Yesterday I added a kernel to this system, "5.4.0-0.rc7.git1.2.fc32.x86_64". I got this from the koji site referenced in the kernel bug procedures here on bugzilla. The system became unusable right after boot, with a flurry of soft lockup messages that appeared within approximately 2 seconds. Obviously something is wrong here. Each message complained that 23 seconds had elapsed, but this is impossible since less than 2 seconds of wall time had passed.

Is someone working on this bug? I'm currently running an OLD kernel which does NOT exhibit this behavior, 5.4.0-0.rc0.git2.1.fc32.x86_64.

Please help.
George...
Hi,

I have the same problem on an Intel i7-9750H with an NVIDIA 2060 RTX. I temporarily worked around it by downgrading to Fedora 30 and using the 5.0.9 kernel, which works in my case, but I need the latest updates, which this workaround does not allow. Please fix it soon.

Regards,
Lukas
Hi,

I have frozen my systems here at the kernel level just before the one (see above) where this problem started appearing. I have NO IDEA IF ANYONE IS WORKING ON THIS BUG. Anyone? Are you there?

Thanks,
George...
Hello,

Envy X360 user here. This bug disappeared somewhere during Linux 5.7 (on those kernels it never happened), but now, after an upgrade to 5.8, it has come back.

Not affected: 5.7.0-1 to 5.7.14-200 (at least)
Affected: 5.8.0-0.rc7.1 to 5.8.6-201 (at least)

I'm going to attach a log from kernel 5.8.6.
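For others checking their own kernel against the ranges in this comment, a rough sketch follows. The ranges are one reporter's observations on one machine, not a confirmed regression window, and `classify` and `base_version` are hypothetical names. The comparison looks only at the leading X.Y.Z of the release string, so release-candidate and package-release suffixes (such as "-0.rc7.1" or "-201") are deliberately ignored.

```python
import re

# Inclusive (low, high) bounds taken from the comment above, reduced to X.Y.Z.
REPORTED_NOT_AFFECTED = ((5, 7, 0), (5, 7, 14))
REPORTED_AFFECTED = ((5, 8, 0), (5, 8, 6))


def base_version(release):
    """Return the leading X.Y.Z of a kernel release string as a tuple of ints."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in m.groups())


def classify(release):
    """Classify a release against the ranges reported in this thread."""
    version = base_version(release)
    low, high = REPORTED_NOT_AFFECTED
    if low <= version <= high:
        return "reported not affected"
    low, high = REPORTED_AFFECTED
    if low <= version <= high:
        return "reported affected"
    # Anything outside both ranges was not covered by this comment.
    return "unknown"
```

On a running system the release string could be taken from `platform.release()`; the result only says what this one report observed for that version.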
Created attachment 1713885 [details] System freeze on 5.8.6