Bug 1266691 - soft lockup when booting in any release after Linux 4.1.3-201.fc22.x86_64
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 22
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
Depends On:
Blocks:
Reported: 2015-09-26 11:09 EDT by User3.14
Modified: 2015-11-01 17:21 EST (History)
Fixed In Version: kernel-4.1.10-200.fc22 kernel-4.1.10-100.fc21
Doc Type: Bug Fix
Last Closed: 2015-10-21 14:37:03 EDT
Type: Bug


Attachments
output of dmesg (68.97 KB, text/plain), 2015-09-28 16:25 EDT, User3.14
output of jornalctl -b -l (670.98 KB, text/plain), 2015-09-28 16:29 EDT, User3.14
screenshot when booting into more recent update (495.24 KB, image/jpeg), 2015-09-28 16:39 EDT, User3.14
results of journalctl -b -1 (720.14 KB, text/plain), 2015-09-30 14:01 EDT, User3.14
screenshot when booting into 4.1.10-200.fc22.x86_64 (12.98 KB, image/jpeg), 2015-10-07 13:47 EDT, User3.14
screenshot when booting into 4.1.10-200.fc22.x86_64 (554.84 KB, image/jpeg), 2015-10-07 14:15 EDT, User3.14
another screenshot when booting into 4.1.10-200.fc22.x86_64 (565.48 KB, image/jpeg), 2015-10-07 14:19 EDT, User3.14
output of lspci -nnvv (19.20 KB, text/plain), 2015-10-09 13:51 EDT, User3.14

Description User3.14 2015-09-26 11:09:33 EDT
Description of problem:
After updating all software using dnf, the computer will not boot Fedora and instead shows "NMI watchdog: BUG: soft lockup - CPU0 stuck for 23s! [migration/0:11]", with a similar message repeated for each CPU. After waiting a while, I have had to shut down. Linux 4.1.3-201.fc22.x86_64 still boots fine, but no update since then has worked (I have just tried 4.1.7).

I have selected to store 10 versions on GRUB - not sure what the limit is.

Version-Release number of selected component (if applicable):
Linux 4.1.3-201.fc22.x86_64

How reproducible:
Select any version newer than Linux 4.1.3-201.fc22.x86_64 from GRUB and wait.

Steps to Reproduce:
1. Select any version newer than Linux 4.1.3-201.fc22.x86_64 from GRUB and wait.

Actual results:
Soft lockup error.
"NMI watchdog: BUG: soft lockup - CPU0 stuck for 23s! [migration/0:11]" and similarly repeated for each CPU.
Also,
"rcu_sched kthread self-detected stall on CPU {2} (t=60000 jiffies g=-137 c=130 q=0)"

Expected results:
Loads and shows login screen

Additional info:
Not sure what info will be required
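
For anyone triaging a saved log with messages like the two quoted above, a minimal sketch of how they could be summarized follows. The function names and the log file argument are hypothetical, and the default of HZ=1000 used to convert the "t=... jiffies" value into seconds is an assumption about the kernel configuration that this report does not state.

```shell
#!/bin/sh
# Sketch only: summarize watchdog messages found in a saved log file.
# Function names and the file argument are illustrative, not from the report.

# Print the CPU numbers that reported a soft lockup, one per line,
# sorted numerically with duplicates removed.
lockup_cpus() {
    grep -o 'soft lockup - CPU[0-9]*' "$1" | sed 's/.*CPU//' | sort -n -u
}

# Convert an RCU stall "t=<jiffies>" value into whole seconds.
# The second argument is HZ; the default of 1000 is an assumption.
stall_seconds() {
    hz="${2:-1000}"
    echo $(( $1 / hz ))
}
```

Under that HZ assumption, `stall_seconds 60000` prints 60, so the rcu_sched stall above would correspond to roughly a minute without progress on that CPU.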
Comment 1 Josh Boyer 2015-09-28 09:01:56 EDT
Please attach the output of 'journalctl -b -1' after you boot into one of the bad kernels and boot back into a working kernel.  Please also attach the output of dmesg from the working kernel.  If you can capture a picture of the screen on a bad boot, that might be helpful as well.
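
A minimal sketch of one way to capture both requested logs as text files, run from the working kernel after a failed boot attempt. The function name and output file names are made up for illustration; only `journalctl -b -1`, `--no-pager`, and `dmesg` are real commands and options.

```shell
#!/bin/sh
# Sketch only: save the requested logs into a directory for attaching.
# The function name and file names are illustrative assumptions.
collect_boot_logs() {
    out="${1:-.}"
    mkdir -p "$out" || return 1
    # Journal of the previous boot (-1 is the digit one, not the letter l).
    # --no-pager keeps journalctl from starting a pager.
    journalctl -b -1 --no-pager > "$out/journal-previous-boot.txt"
    # Kernel ring buffer of the currently running (working) kernel.
    dmesg > "$out/dmesg-working-kernel.txt"
}
```

Calling `collect_boot_logs ./bug-logs` would leave two text files in `./bug-logs` that can be attached to the bug.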
Comment 2 User3.14 2015-09-28 16:16:58 EDT
output of 'journalctl -b -l'

-- Logs begin at Wed 2014-05-07 19:26:03 BST, end at Mon 2015-09-28 21:13:00 BST. --
Sep 28 21:11:44 localhost.localdomain systemd-journal[153]: Runtime journal (/run/log/journal/) is currently using 8.0M.
                                                            Maximum allowed usage is set to 397.2M.
                                                            Leaving at least 595.9M free (of currently available 3.8G of space).
                                                            Enforced usage limit is thus 397.2M.
Sep 28 21:11:44 localhost.localdomain systemd-journal[153]: Runtime journal (/run/log/journal/) is currently using 8.0M.
                                                            Maximum allowed usage is set to 397.2M.
                                                            Leaving at least 595.9M free (of currently available 3.8G of space).
                                                            Enforced usage limit is thus 397.2M.
Sep 28 21:11:44 localhost.localdomain kernel: Initializing cgroup subsys cpuset
Sep 28 21:11:44 localhost.localdomain kernel: Initializing cgroup subsys cpu
Sep 28 21:11:44 localhost.localdomain kernel: Initializing cgroup subsys cpuacct
Sep 28 21:11:44 localhost.localdomain kernel: Linux version 4.1.3-201.fc22.x86_64 (mockbuild@bkernel02.phx2.fedoraproject.org) (gcc version 5.1.1 20150618 (Red Hat 5.1.1-4) (GCC) ) #1 SMP We
Sep 28 21:11:44 localhost.localdomain kernel: Command line: BOOT_IMAGE=/vmlinuz-4.1.3-201.fc22.x86_64 root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/swap vconsole.font=latarcyrheb-sun16 rd
Sep 28 21:11:44 localhost.localdomain kernel: tseg: 00af800000
Sep 28 21:11:44 localhost.localdomain kernel: e820: BIOS-provided physical RAM map:
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009e7ff] usable
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x000000000009e800-0x000000000009ffff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000ae47efff] usable
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000ae47f000-0x00000000aea3cfff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000aea3d000-0x00000000aee36fff] ACPI NVS
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000aee37000-0x00000000af156fff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000af157000-0x00000000af157fff] usable
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000af158000-0x00000000af35dfff] ACPI NVS
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000af35e000-0x00000000af7fffff] usable
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000fec20000-0x00000000fec20fff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000fed61000-0x00000000fed70fff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000fed80000-0x00000000fed8ffff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x00000000fef00000-0x00000000ffffffff] reserved
Sep 28 21:11:44 localhost.localdomain kernel: BIOS-e820: [mem 0x0000000100001000-0x000000024effffff] usable
Sep 28 21:11:44 localhost.localdomain kernel: NX (Execute Disable) protection: active
Sep 28 21:11:44 localhost.localdomain kernel: SMBIOS 2.7 present.
Sep 28 21:11:44 localhost.localdomain kernel: DMI: Gigabyte Technology Co., Ltd. To be filled by O.E.M./990XA-UD3, BIOS FD 02/04/2013
Sep 28 21:11:44 localhost.localdomain kernel: e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
Sep 28 21:11:44 localhost.localdomain kernel: e820: remove [mem 0x000a0000-0x000fffff] usable
Sep 28 21:11:44 localhost.localdomain kernel: e820: last_pfn = 0x24f000 max_arch_pfn = 0x400000000
Sep 28 21:11:44 localhost.localdomain kernel: MTRR default type: uncachable
Sep 28 21:11:44 localhost.localdomain kernel: MTRR fixed ranges enabled:
Sep 28 21:11:44 localhost.localdomain kernel:   00000-9FFFF write-back
Sep 28 21:11:44 localhost.localdomain kernel:   A0000-BFFFF write-through
Sep 28 21:11:44 localhost.localdomain kernel:   C0000-CEFFF write-protect
Sep 28 21:11:44 localhost.localdomain kernel:   CF000-EBFFF uncachable
Sep 28 21:11:44 localhost.localdomain kernel:   EC000-FFFFF write-protect
Sep 28 21:11:44 localhost.localdomain kernel: MTRR variable ranges enabled:
Sep 28 21:11:44 localhost.localdomain kernel:   0 base 000000000000 mask FFFF80000000 write-back
Sep 28 21:11:44 localhost.localdomain kernel:   1 base 000080000000 mask FFFFC0000000 write-back
Sep 28 21:11:44 localhost.localdomain kernel:   2 base 0000AF800000 mask FFFFFF800000 uncachable
Comment 3 User3.14 2015-09-28 16:25 EDT
Created attachment 1078051 [details]
output of dmesg
Comment 4 User3.14 2015-09-28 16:29 EDT
Created attachment 1078052 [details]
output of jornalctl -b -l
Comment 5 User3.14 2015-09-28 16:39 EDT
Created attachment 1078053 [details]
screenshot when booting into more recent update
Comment 6 Josh Boyer 2015-09-29 08:19:34 EDT
(In reply to comic_engineer from comment #4)
> Created attachment 1078052 [details]
> output of jornalctl -b -l

This isn't quite what was asked for.  Let me make it clearer, as you seem to have used -l (lowercase L) instead of -1 (negative one):

journalctl --boot=-1
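
To make the difference concrete: `-l` is the short form of `--full`, so `journalctl -b -l` prints the current boot, while `--boot=-1` selects the boot before it. The sketch below wraps both steps in two small helpers whose names are hypothetical. Listing the recorded boots first is a precaution: if a failed boot never got as far as writing its journal to disk (an assumption, not something confirmed in this report), offset -1 may refer to an earlier boot than expected.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative assumptions.

# List the boots the journal has recorded, with their offsets.
show_boot_offsets() {
    journalctl --list-boots --no-pager
}

# Print the journal for one boot offset (0 = current, -1 = previous).
# With no argument, the previous boot is used.
boot_log() {
    journalctl --boot="${1:--1}" --no-pager
}
```

For example, `boot_log -2 > journal-boot-minus-2.txt` would save the boot before the previous one.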
Comment 7 User3.14 2015-09-30 14:01 EDT
Created attachment 1078759 [details]
results of journalctl -b -1

I'm not sure whether this is what was expected. It seems to contain details of the last successful boot rather than the unsuccessful attempt in between.
Comment 8 Sammy 2015-09-30 17:30:21 EDT
This started happening to me sometime after boot when I upgraded from 4.1.8
to 4.1.9 (from koji). Then I looked at comments on lwn.net announcement of the
new kernel and other people are reporting the same problem on servers. Going
back to 4.1.8 is stable for me and others....perhaps another data point to see
what change in 4.1.9 can influence this.
Comment 9 Sammy 2015-10-04 17:13:39 EDT
4.1.10 still has the bug because a fix was too late to get in (see the lwn.net
kernel announcement). Could we put this into the 4.1.10 build? Thanks.
Comment 10 Fedora Update System 2015-10-05 14:33:41 EDT
kernel-4.1.10-200.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2015-dcc260f2f2
Comment 11 Fedora Update System 2015-10-05 14:35:45 EDT
kernel-4.1.10-100.fc21 has been submitted as an update to Fedora 21. https://bodhi.fedoraproject.org/updates/FEDORA-2015-d7e074ba30
Comment 12 Fedora Update System 2015-10-07 11:23:28 EDT
kernel-4.1.10-100.fc21 has been pushed to the Fedora 21 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
$ su -c 'dnf --enablerepo=updates-testing update kernel'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-d7e074ba30
Comment 13 Fedora Update System 2015-10-07 12:26:32 EDT
kernel-4.1.10-200.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
$ su -c 'dnf --enablerepo=updates-testing update kernel'
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-dcc260f2f2
Comment 14 User3.14 2015-10-07 13:47 EDT
Created attachment 1080753 [details]
screenshot when booting into 4.1.10-200.fc22.x86_64
Comment 15 Josh Boyer 2015-10-07 13:57:51 EDT
It seems Sammy co-opted your bug with a "me too" that wasn't actually the same problem.  However, your screenshot is so small that I cannot read any of the text.
Comment 16 Sammy 2015-10-07 14:00:45 EDT
Sorry if this is not the same bug but the new 4.1.10 solved the other bug then.
It is working fine in my case.
Comment 17 User3.14 2015-10-07 14:13:16 EDT
Yeah, I didn't think that would be considered the same bug, since my issue has lasted since 4.1.3. This bug is not solved. Sorry about the screenshot - let me try to put a better image up - I should have checked first.
Comment 18 User3.14 2015-10-07 14:15 EDT
Created attachment 1080754 [details]
screenshot when booting into 4.1.10-200.fc22.x86_64
Comment 19 User3.14 2015-10-07 14:19 EDT
Created attachment 1080756 [details]
another screenshot when booting into 4.1.10-200.fc22.x86_64

This is a screenshot from just now with different info
Comment 20 Josh Boyer 2015-10-08 10:05:22 EDT
What kind of machine is this?  The attached dmesg output seems to show an nvidia card but the screenshots show the radeon module is loaded.  Can you provide the output of lsusb and lspci -nnvv as text attachments please?
Comment 21 Fedora Update System 2015-10-09 06:26:57 EDT
kernel-4.1.10-200.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
Comment 22 User3.14 2015-10-09 13:49:16 EDT
lsusb

Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 011 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 010 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 009 Device 002: ID 046d:c03d Logitech, Inc. M-BT96a Pilot Optical Mouse
Bus 009 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Comment 23 User3.14 2015-10-09 13:51 EDT
Created attachment 1081398 [details]
output of lspci -nnvv
Comment 24 Justin M. Forbes 2015-10-20 15:25:49 EDT
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 22 kernel bugs.

Fedora 22 has now been rebased to 4.2.3-200.fc22.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 23, and are still experiencing this issue, please change the version to Fedora 23.

If you experience different issues, please open a new bug report for those.
Comment 25 User3.14 2015-10-21 14:37:03 EDT
Bug resolved in 4.2.3-200.fc22
Comment 26 Fedora Update System 2015-11-01 17:21:20 EST
kernel-4.1.10-100.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
