Bug 1657065 - f29 installation failure -- kernel lockups with installation media and installed system
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 29
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2018-12-07 01:57 UTC by karl kleinpaste
Modified: 2019-02-21 21:07 UTC

Last Closed: 2019-02-21 21:07:14 UTC
Type: Bug

Description karl kleinpaste 2018-12-07 01:57:00 UTC
Description of problem:

system lockups easily induced, both with installation media and in installed system.

$ dmesg | grep -i envy
[    0.000000] DMI: HP HP ENVY x360 Convertible 15-bp1xx/83C9, BIOS F.38 02/26/2018

quad-core i7-8550U.
machine is recent, acquired ~8 months ago. runs F27 w/4.18.19-100 just fine (how i'm using it now).
installation flash drive created with F29 workstation ISO using Fedora Writer.
tested PASS on boot.

Version-Release number of selected component (if applicable):
vmlinuz-4.18.16-300

How reproducible:
every time in all circumstances

Steps to Reproduce:

several flavors.

[A] booting from install media on flash drive:

1. boot.
2. immediately ask for shutdown.
3. some time passes, window system exits to black screen.
4. long pause.
5. "NMI watchdog: Watchdog detected hard LOCKUP on cpu N"
6. dead machine, needs power switch hold to power down.
'N' varies, unsurprisingly. i did this a dozen times, saw N random from 1 to 7, but never 0.
happens every time, whether i go through installation process or not.
happens regardless of whether i install to internal SSD or external USB hard drive.

Actual results:
failure to reboot or halt.

Expected results:
laptop powered down.

[B] after installation, booting installed system:

1. boot.
2. walk through first-run dialogs to identify user, set password, etc.
3. hit final completion button.
4. window system exits. apparently trying to reboot?
5. hang/dead, but no NMI watchdog message as before.

[C] after installation, booting installed system, logging in, not rebooting:

1. boot.
2. login.
3. (guessing) gdm login screen disappears, screen goes black. starting xorg?
4. mouse cursor re-appears.
5. hang/dead/lockup/whatever. mouse unresponsive. machine is gone.
again, happens every time. i've booted this poor machine probably 3 dozen times in the last 2 days.

in all cases, machine is hung and needs power switch hold to make it power down.

the external USB disc remains with f29 installed; if someone needs me to fiddle with this installation, i can.

Comment 1 Steve 2018-12-08 11:50:13 UTC
(In reply to karl kleinpaste from comment #0)
...
> [    0.000000] DMI: HP HP ENVY x360 Convertible 15-bp1xx/83C9, BIOS F.38
> 02/26/2018
...

That sounds like:
Bug 1649067 - Fedora 29: Shut down with hard LOCKUP on CPU

That report links to this one, which has a diagnostic command and a possible workaround:
Bug 1649822 - Fedora 29: Screen hangs; Problem with NVIDIA GeForce MX150 (e.g. HP Envy x360 15-cn0008ng) 

Could you post the output from schaefi's diagnostic command?

$ lspci -k | grep -EA3 'VGA|3D|Display'

For the record, the current F29 kernel is:

# dnf repoquery kernel -q --latest-limit 1 --releasever=29
kernel-0:4.19.6-300.fc29.x86_64

Comment 2 karl kleinpaste 2018-12-08 13:32:00 UTC
yes, it's an envy w/ geforce mx150, 10de:1d10.
it's hybrid graphics. in principle i thought it should prefer the integrated UHD 620 as base and ignore nvidia until specifically requested/enabled, but what do i know. in f27, i use bumblebee and favor nvidia.

$ lspci -k | grep -EA3 'VGA|3D|Display'
00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (rev 07)
	Subsystem: Hewlett-Packard Company Device 83c9
	Kernel driver in use: i915
	Kernel modules: i915
--
00:13.0 Non-VGA unclassified device: Intel Corporation Sunrise Point-LP Integrated Sensor Hub (rev 21)
	Subsystem: Hewlett-Packard Company Device 83c9
	Kernel driver in use: intel_ish_ipc
	Kernel modules: intel_ish_ipc
--
01:00.0 3D controller: NVIDIA Corporation GP108M [GeForce MX150] (rev a1)
	Subsystem: Hewlett-Packard Company Device 83c9
	Kernel driver in use: nvidia
	Kernel modules: nouveau, nvidia_drm, nvidia
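
Note that the 'VGA' pattern in the grep above also matches the unrelated "Non-VGA unclassified device" entry at 00:13.0. A minimal sketch of a stricter filter follows, assuming the usual `lspci -k` layout (device header starting in column 0, indented detail lines). The function name gpu_drivers is hypothetical and does not come from either bug report.

```shell
# Read `lspci -k` output on stdin and print "<slot> <driver in use>"
# for graphics-class devices only. gpu_drivers is a hypothetical helper name.
gpu_drivers() {
    awk '
        # Device header lines start in column 0; detail lines are indented.
        /^[^ \t]/ {
            slot = ""
            # Remember the slot only for the three graphics device classes.
            if ($0 ~ /(VGA compatible controller|3D controller|Display controller):/)
                slot = $1
            next
        }
        # Inside a graphics device, report the bound kernel driver.
        slot != "" && /Kernel driver in use:/ {
            print slot, $NF
        }
    '
}
```

Usage: `lspci -k | gpu_drivers`. For the output posted above this prints `00:02.0 i915` and `01:00.0 nvidia`, and skips the sensor hub.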

as for workaround... well, no. i can't get far enough into boot & login to apply any updates, it has never once succeeded. as a result, f29 is already a dead letter to me. i file and follow this bug report to help the community, not expecting to run f29 ever.

(it figures that there are other bugs reporting this. i *did* search before reporting, but i didn't look specifically for "lockup," i searched on e.g. watchdog and hang and so forth. ohwell.)

Comment 3 Steve 2018-12-08 14:42:13 UTC
(In reply to karl kleinpaste from comment #2)
> yes, it's an envy w/ geforce mx150, 10de:1d10.
...

Thanks for your report. The only difference I can see is:

Yours:     Subsystem: Hewlett-Packard Company Device 83c9
schaefi's: Subsystem: Hewlett-Packard Company Device 8484

> as for workaround... well, no. i can't get far enough into boot & login to
> apply any updates, it has never once succeeded. as a result, f29 is already
> a dead letter to me. i file and follow this bug report to help the
> community, not expecting to run f29 ever.

Here are two things to try:

1. schaefi's workaround (Bug 1649822), which is to add this to the kernel command-line from grub:

nouveau.modeset=0

schaefi also says that "secure boot must be always disabled".

2. If you have a Fedora net install image on a CD or a USB drive, you can boot into "Troubleshooting" mode, select "Rescue a Fedora system", do chroot, and run "dnf update" from there:

Howto Rescue a Fedora System?
https://ask.fedoraproject.org/en/question/23472/howto-rescue-a-fedora-system/

It may take a few tries to figure out how it all works, but "Rescue" mode is an incredibly powerful recovery method (although restricted to command-line tools):

Use this to "Rescue a Fedora system":
Fedora-Workstation-netinst-x86_64-29-1.2.iso
(earlier versions would probably work too)

NB: The Workstation install image doesn't have the "Rescue a Fedora system" option.
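
For option 1, one way to confirm that the parameter actually reached the kernel after editing the GRUB entry is to look at /proc/cmdline once the system is up. A minimal sketch follows; has_karg is a hypothetical helper name. The grubby command in the comment is an assumption about how one might make the setting persistent on Fedora, and it is not something either bug report suggests.

```shell
# Succeed if the kernel command line read from stdin contains the exact
# parameter given as $1. has_karg is a hypothetical helper name.
has_karg() {
    # Put each space-separated parameter on its own line, then require an
    # exact whole-line, fixed-string match so substrings do not count.
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Assumption, not from the bug reports: to keep the parameter across reboots
# one could run, as root:
#   grubby --update-kernel=ALL --args="nouveau.modeset=0"
```

Usage: `has_karg nouveau.modeset=0 < /proc/cmdline && echo "workaround active"`.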

> (it figures that there are other bugs reporting this. i *did* search before
> reporting, but i didn't look specifically for "lockup," i searched on e.g.
> watchdog and hang and so forth. ohwell.)

I searched for the error message you reported: "NMI watchdog: Watchdog detected hard LOCKUP on cpu".

Bugzilla lets you search comments from the "Advanced Search" tab (click "Detailed Bug Information"). The "Simple Search" rarely works for me.

Comment 4 Steve 2018-12-08 15:41:27 UTC
(In reply to Steve from comment #3)
...
> Here are two things to try:
...

One more thing:

3. Boot into runlevel 3 by appending "3" to the kernel command-line from grub. That will give you a login prompt for a console. Networking is started, so you should be able to run "dnf update" from the console.

Comment 5 karl kleinpaste 2018-12-08 21:57:05 UTC
thanx for the attention. i will try the modeset solution next weekend; this machine is my workhorse and i can't afford the time to experiment until then. but basically the solution is to get nouveau out of the way -- as soon as i can get it far enough to install proprietary nvidia, the problem will evaporate.

not sure what etiquette is involved, in whether this should be left open but marked dup with the others, or closed with reference to dup. your call.

Comment 6 karl kleinpaste 2018-12-15 02:47:02 UTC
fyi

nouveau.modeset=0 let it move forward. however, the result was a screen that appeared to be using 1920x1080 in an 800x600 motif. gigantic windows with fuzzed resolution detail, and not responding well to mouse.

so i rebooted to state 3 to avoid window system. did dnf update. took the other bug report's suggestion to install real nvidia driver immediately, getting nouveau out of the way. also installed @mate because that's my preference. rebooted, logged in requesting mate...

...and the screen simply went black. that's all. nothing.

ctrl-alt-DIGIT still worked, so i could move between tty1 with gdm, tty2 being black, and tty3 for a text login. ps(1) showed that Xorg was in fact running, the entire mate session environment was active, but the screen stayed black on tty2. useless.

this is clearly no longer a kernel -vs- mx150 problem, it's an X driver problem, so this report can be closed as successful for getting past the installation, and past post-install update. but i have other problems, and i'm just not sure i'm motivated to keep chasing them down. this is the first time in a decade that i've had trouble simply bringing a new fedora release to life. i would prefer to move forward to f29, but f27 is flawless including nvidia driver.

Comment 7 Justin M. Forbes 2019-01-29 16:13:46 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 29 kernel bugs.

Fedora 29 has now been rebased to 4.20.5-200.fc29.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
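
A quick way to check whether the running kernel is at least the rebased version is sketched below. It assumes GNU `sort -V` is available and compares only the upstream version (the part before the first "-"), ignoring the Fedora release suffix. kernel_at_least is a hypothetical helper name.

```shell
# Succeed if version $1 is the same as or newer than version $2.
# Only the part before the first "-" is compared, e.g. "4.20.5".
# kernel_at_least is a hypothetical helper name; needs GNU sort for -V.
kernel_at_least() {
    running=${1%%-*}
    minimum=${2%%-*}
    # Version-sort the two strings; the minimum must come out first (or tie).
    lowest=$(printf '%s\n%s\n' "$running" "$minimum" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}
```

Usage: `kernel_at_least "$(uname -r)" 4.20.5 && echo "kernel is new enough to retest"`.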

If you experience different issues, please open a new bug report for those.

Comment 8 Justin M. Forbes 2019-02-21 21:07:14 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.

