Bug 1648366 - System locks up when booting any 4.19 or 4.20 kernel
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-11-09 14:30 UTC by Steven Usdansky
Modified: 2019-03-08 08:12 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
system characteristics - inxi (1.76 KB, text/plain)
2018-11-09 14:30 UTC, Steven Usdansky
system characteristics - lspci (9.15 KB, text/plain)
2018-11-09 14:32 UTC, Steven Usdansky
log of failed boot (217.37 KB, text/x-vhdl)
2018-11-09 21:14 UTC, Steven Usdansky
bootlog Fedora-Xfce-Live-x86_64-Rawhide-20181111.n.0.iso (256.50 KB, text/x-vhdl)
2018-11-12 02:25 UTC, Steven Usdansky

Description Steven Usdansky 2018-11-09 14:30:09 UTC
Created attachment 1503667 [details]
system characteristics - inxi

Description of problem:
Any attempt to boot a 4.19 or 4.20 Rawhide kernel on bare metal fails with the system locking up, sometimes before the appearance of the display manager, other times as late as the appearance of the GUI 

Version-Release number of selected component (if applicable):
Any kernel 4.19 or higher. Latest attempt with Fedora-Xfce-Live-x86_64-Rawhide-20181109.n.0.iso which contains kernel 4.20.0-0.rc1.git2.1.fc30.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create bootable USB flash drive from iso file using dd
2. Reboot from the USB flash drive
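
For anyone repeating step 1, here is a minimal sketch of writing the image with dd and then reading it back to rule out a bad write. The function name is made up for illustration. On real hardware the target would be the whole USB device (shown below as the placeholder /dev/sdX), and the command would need root. It assumes GNU dd, cmp and stat.

```shell
# Hypothetical helper: write an image to a target, then verify the copy.
# $1: source image (the .iso file)
# $2: target (a whole device such as /dev/sdX on real hardware; placeholder)
write_and_verify() {
    local src=$1
    local dst=$2
    # conv=fsync makes dd flush the data to the target before it exits.
    dd if="$src" of="$dst" bs=4M conv=fsync status=none || return 1
    # Compare only as many bytes as the image holds, because a flash
    # drive is normally larger than the image written to it.
    cmp -n "$(stat -c %s "$src")" "$src" "$dst"
}
```

Example call (destroys whatever is on the target): write_and_verify Fedora-Xfce-Live-x86_64-Rawhide-20181109.n.0.iso /dev/sdX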

Alternative steps to reproduce:
1. Install any 4.19 or 4.20 kernel (including no-debug kernels) to an existing Rawhide installation (which uses kernel 4.18.13-300.fc29.x86_64)
2. Reboot to new kernel

Actual results:
System locks up; sometimes as early as pre-display manager, sometimes as late as upon appearance of GUI. Have tried this with Gnome, LXQt, and now Xfce; same thing.

Expected results:
System boots and runs normally


Additional info:
Works fine in a VM, so must be something about the hardware. Booting into runlevel 3 does not solve the problem.

Comment 1 Steven Usdansky 2018-11-09 14:32:11 UTC
Created attachment 1503668 [details]
system characteristics - lspci

Comment 2 Jeremy Cline 2018-11-09 16:44:25 UTC
Hi Steven,

Can you attach the complete kernel logs when a hang occurs? You can get logs from previous boots with the "-b" option on journalctl. There are some troubleshooting tips[0] on hangs and freezes that may be helpful as well.

[0] https://docs.fedoraproject.org/en-US/quick-docs/kernel/troubleshooting/index.html
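
As a rough sketch of the request above, the kernel messages of an earlier boot can be saved to a file and attached. The function name and default file name are placeholders, and this assumes the journal is kept across reboots (a live USB session normally does not keep it).

```shell
# Hypothetical helper: save the kernel messages of one boot to a file.
# $1: boot offset as understood by journalctl --boot
#     (-1 = previous boot, 0 = current boot); defaults to -1
# $2: output file; the default name is only a placeholder
save_kernel_log() {
    local boot=${1:--1}
    local out=${2:-kernel-boot.log}
    # -k restricts the output to kernel messages;
    # --no-pager writes straight to stdout instead of opening a pager.
    journalctl -k --boot="$boot" --no-pager > "$out"
}
```

Running journalctl --list-boots first shows which offsets are available, so the boot that actually hung can be picked.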

Comment 3 Steven Usdansky 2018-11-09 21:14:41 UTC
Created attachment 1503856 [details]
log of failed boot

Comment 4 Steven Usdansky 2018-11-09 22:47:15 UTC
Wondering if it's something in the Fedora kernels. Currently running manjaro-xfce-18.0-stable-x86_64.iso with kernel-4.19.0-3-MANJARO off a live USB. Been up 23 minutes; no problems.

Comment 5 Steven Usdansky 2018-11-12 02:25:40 UTC
Created attachment 1504536 [details]
bootlog Fedora-Xfce-Live-x86_64-Rawhide-20181111.n.0.iso

Fedora-Xfce-Live-x86_64-Rawhide-20181111.n.0.iso with 
   kernel 4.20.0-0.rc1.git4.1.fc30.x86_64

1. dd to USB flash drive and boot. Computer hangs.

2. Add nolapic kernel parameter. Computer boots and doesn't hang. Log attached in case it's useful. Not an acceptable solution; I really want to be able to utilize both cores and all four threads rather than just one core of my i3 6100u.
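
When comparing boots with and without the workaround, it helps to confirm which parameters the running kernel actually received. A small sketch, with a made-up function name, that checks a kernel command line string for a parameter:

```shell
# Hypothetical helper: succeed if a kernel command line contains a parameter.
# $1: the command line text (for example the contents of /proc/cmdline)
# $2: parameter name, matched as a bare flag (nolapic) or as key=value (root=...)
has_kernel_param() {
    local cmdline=$1
    local param=$2
    # Pad with spaces so only whole words match, not substrings.
    case " $cmdline " in
        *" $param "*|*" $param="*) return 0 ;;
    esac
    return 1
}
```

For example, has_kernel_param "$(cat /proc/cmdline)" nolapic && nproc prints the number of usable CPUs only when the workaround is active, which makes the lost cores described above easy to see.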

Comment 6 Steve 2018-12-17 05:19:26 UTC
(In reply to Steven Usdansky from comment #5)
> Created attachment 1504536 [details]
> bootlog Fedora-Xfce-Live-x86_64-Rawhide-20181111.n.0.iso
> 
> Fedora-Xfce-Live-x86_64-Rawhide-20181111.n.0.iso with 
>    kernel 4.20.0-0.rc1.git4.1.fc30.x86_64
...

That log file has kernel messages starting with:

Nov 11 19:54:59 localhost kernel: WARNING: possible circular locking dependency detected
Nov 11 19:54:59 localhost kernel: 4.20.0-0.rc1.git4.1.fc30.x86_64 #1 Tainted: G        W        
Nov 11 19:54:59 localhost kernel: ------------------------------------------------------
Nov 11 19:54:59 localhost kernel: kworker/u4:1/31 is trying to acquire lock:
Nov 11 19:54:59 localhost kernel: 000000005672d046 (&obj_hash[i].lock){-.-.}, at: debug_object_activate+0xb5/0x240
...
Nov 11 19:54:59 localhost kernel: Call Trace:
...
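
To find reports like the one above in a saved boot log, something along these lines may be enough. The function name is invented, and the pattern only covers the two markers quoted in this comment ("WARNING:" and "Call Trace:").

```shell
# Hypothetical helper: count kernel warning and call-trace markers in a log.
# $1: path to a saved log file (for example one of the attachments)
count_kernel_warnings() {
    local log=$1
    # -c prints the number of matching lines; -E enables the alternation.
    grep -c -E 'kernel: (WARNING:|Call Trace:)' "$log"
}
```

Replacing -c with -n in the grep call prints the matching lines with their line numbers instead, which gives a starting point for reading the surrounding trace.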

Comment 7 Steve 2018-12-17 06:27:39 UTC
(In reply to Steven Usdansky from comment #5)
> Created attachment 1504536 [details]
> bootlog Fedora-Xfce-Live-x86_64-Rawhide-20181111.n.0.iso
> 
> Fedora-Xfce-Live-x86_64-Rawhide-20181111.n.0.iso with 
>    kernel 4.20.0-0.rc1.git4.1.fc30.x86_64

Could you try that again with a newer version?

I tried this on a USB flash drive, and I was able to boot and login:

Fedora-Xfce-Live-x86_64-Rawhide-20181214.n.0.iso
kernel: 4.20.0-0.rc6.git2.1.fc30.x86_64

> 1. dd to USB flash drive and boot. Computer hangs.

"Fedora Media Writer" is a graphical tool for writing to USB flash drives:

# dnf install mediawriter

It should appear in the "System" menu after installation.

Comment 8 Steven Usdansky 2018-12-17 13:28:30 UTC
Tried Fedora-MATE_Compiz-Live-x86_64-Rawhide-20181216.n.1.iso. Created the bootable flash drive my usual way, using dd. Appears to be similar enough to Fedora-Xfce-Live-x86_64-Rawhide-20181214.n.0.iso to be relevant, as it does have the same kernel. Boot seems to proceed normally until system is ready to switch to the lightdm login screen. At that point, the system reboots. 

However...  I can boot the live iso on my problem computer from the flash drive if I add the nolapic kernel parameter.

This appears to be the same bug as Bug 1658623 

Prior to reading comment #7, I performed a full installation of Fedora-MATE_Compiz-Live-x86_64-Rawhide-20181212.n.1.iso 
(kernel-4.20.0-0.rc6.git1.1.fc30.x86_64) to a flash drive using my old HP desktop PC. That full install boots properly on both my old HP and my wife's old Dell desktop. It also requires nolapic to boot on my current desktop PC.

Comment 9 Steven Usdansky 2018-12-17 13:33:04 UTC
My bad - it's not the login screen (no such screen when booting from the iso), it's the MATE desktop itself. Fedora-MATE_Compiz-Live-x86_64-Rawhide-20181216.n.1.iso on flash drive boots properly on my wife's old Dell; so not a dd vs. mediawriter problem.

Comment 10 Steve 2018-12-17 15:25:31 UTC
(In reply to Steven Usdansky from comment #8)
...
> However...  I can boot the live iso on my problem computer from the flash
> drive if I add the nolapic kernel parameter.
...

Thanks for your tests. I attempted to replicate that using:

Fedora-MATE_Compiz-Live-x86_64-Rawhide-20181216.n.1.iso

copied to a USB flash drive with "Fedora Media Writer".

1. There is a login screen, but no password is required.
2. With "nolapic" on the kernel command line, there are some "Call Trace" "<IRQ>" entries in the log. That may be normal, because I also see them with my F28 system.*
3. Look for other "Call Trace" entries, such as the ones in Attachment 1504536 [details].

* Snippet:
$ journalctl -b -1 --no-hostname
...
Dec 17 06:54:31 kernel: Command line: BOOT_IMAGE=/vmlinuz-4.19.8-200.fc28.x86_64 root=[removed] ro rd.lvm.lv=[removed] nolapic
...
Dec 17 06:54:47 kernel: Call Trace:
Dec 17 06:54:47 kernel:  <IRQ>
Dec 17 06:54:47 kernel:  dump_stack+0x5c/0x80
Dec 17 06:54:47 kernel:  __report_bad_irq+0x37/0xae
Dec 17 06:54:47 kernel:  note_interrupt.cold.9+0xa/0x69
Dec 17 06:54:47 kernel:  handle_irq_event_percpu+0x6a/0x80
Dec 17 06:54:47 kernel:  handle_irq_event+0x27/0x44
Dec 17 06:54:47 kernel:  handle_level_irq+0x79/0xf0
Dec 17 06:54:47 kernel:  handle_irq+0xbf/0x100
Dec 17 06:54:47 kernel:  do_IRQ+0x49/0xd0
Dec 17 06:54:47 kernel:  common_interrupt+0xf/0xf
...

Comment 11 Wiktor Wandachowicz 2019-03-08 08:12:54 UTC
Yesterday I upgraded the kernel in Fedora 29 from 4.20.8-200.fc29.x86_64 to the newest 4.20.13-200.fc29.x86_64, and after the restart I immediately started seeing a similar issue.
The boot sequence has a moment when the gray Fedora logo outline is gradually filled in on the screen, then the logo becomes colorful.
At this moment the system stops loading, nothing more is read from or written to disk (the green/red indicator is not flashing), and everything just hangs.

After cold reboot I selected previous 4.20.8 kernel and everything worked flawlessly.

So I rebooted once more using the 4.20.13 kernel. While the graphical logo was being displayed, I simply pressed [Esc] to see the boot messages and find the last line printed before the hang.
However, this time Xfce loaded without any problems. I'm running this Fedora inside VirtualBox 6.0.4 r128413 and this is 100% repeatable.

