Bug 1462421 - kernel-4.11.5-200 x86_64 fail to boot on Thinkpad T510
Status: NEW
Product: Fedora
Classification: Fedora
Component: kernel
Version: 25
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance

Reported: 2017-06-17 08:40 EDT by jonathan baron
Modified: 2017-06-25 11:21 EDT
Type: Bug


Description jonathan baron 2017-06-17 08:40:19 EDT
Description of problem:

Thinkpad T510 fails to boot with kernel 4.11.4-200 or 4.11.5-200

Version-Release number of selected component (if applicable):

4.11.5-200

How reproducible:

always

Steps to Reproduce:
1. Boot the computer.

Actual results:

error messages on screen:

272.331196 NMI watchdog BUG: soft lockup CPU#0 stuck for 22s migration 1:17

The same message followed for CPU#1, #2 and #3, except that for #2 it ended with system udevd:352 instead of migration 1:17.

Expected results:

Normal boot (the machine still boots normally with the 4.10.17-200 kernel)

Additional info:

The error message is the one I got for 4.11.5-200. It kept repeating for the four CPUs. For 4.11.4-200 it just got stuck on the first one and did not repeat.
Comment 1 Laura Abbott 2017-06-19 13:09:31 EDT
Can you boot with quiet removed from the kernel command line and see if you can get a picture of the full backtrace?
Comment 2 jonathan baron 2017-06-19 19:24:48 EDT
(In reply to Laura Abbott from comment #1)
> Can you boot with quiet removed from the kernel command line and see if you
> can get a picture of the full backtrace?

I can't figure out how to do this. First I tried stopping the boot and editing the command line. This used to work, but now the editor shows many lines rather than one, and none of them contains the word "quiet".

I found the word in /etc/grub.conf, in the entry for this kernel (on a line that looked like what I used to see). I removed it and rebooted, but the screen looked the same as before, with no additional information.

Any hints appreciated.
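
For reference, removing "quiet" amounts to deleting one token from the kernel command line. The following is a minimal Python sketch of that edit, not a tool that changes the boot configuration. The function name remove_kernel_params and the sample command line are hypothetical and were not taken from this machine; splitting on whitespace also assumes that no parameter value contains quoted spaces.

```python
def remove_kernel_params(cmdline, names):
    """Return cmdline without any parameter whose name is in `names`.

    A parameter is either a bare flag ("quiet") or key=value ("pci=noacpi");
    in both cases the name is the part before the first "=".
    """
    kept = []
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        if key not in names:
            kept.append(token)
    return " ".join(kept)


# Hypothetical command line, for illustration only.
example = "ro root=/dev/mapper/fedora-root rhgb quiet"
print(remove_kernel_params(example, {"quiet"}))
```

The same edit can be made by hand in the boot menu editor: find the line that carries the kernel parameters and delete the word "quiet" from it.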
Comment 3 jonathan baron 2017-06-24 09:46:07 EDT
(In reply to Laura Abbott from comment #1)
> Can you boot with quiet removed from the kernel command line and see if you
> can get a picture of the full backtrace?

I figured out how to do this, I think. A LOT of output scrolled down the screen. Some of it looked like a backtrace, but even that was too long to fit on one screen. I had trouble taking photos with my phone. Would a video be better? I put a few of the legible ones (some barely legible) in http://finzi.psych.upenn.edu/~baron/bootpics.tar. (It is a very large file, so I didn't attach it here.) Maybe you can use this to tell me what to look for next time, and I can try to catch it.
Comment 4 jonathan baron 2017-06-25 10:41:59 EDT
I was able to boot the 4.11.6-201.fc25.x86_64 kernel by setting two options:
nouveau.noaccel=1
pci=noacpi

The first may not be necessary, but it is harmless on this old (but still extremely useful, and extensively used) computer. The second is necessary.

I also tried edd=off, but that was not necessary.
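
To confirm that options like these actually reached the running kernel, the command line the kernel booted with can be read back from /proc/cmdline. The sketch below is one hedged way to check it in Python. The function names parse_cmdline, options_active and read_running_cmdline are hypothetical, and the parsing assumes that no value contains quoted spaces.

```python
from pathlib import Path


def parse_cmdline(text):
    """Map each kernel parameter to its value; bare flags map to None."""
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def options_active(text, wanted):
    """Return True if every key in `wanted` is present with the given value."""
    params = parse_cmdline(text)
    return all(k in params and params[k] == v for k, v in wanted.items())


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel (Linux only)."""
    return Path(path).read_text()


# The two options reported above as sufficient for this machine to boot.
WANTED = {"nouveau.noaccel": "1", "pci": "noacpi"}
```

On the booted system, options_active(read_running_cmdline(), WANTED) would report whether both options are in effect.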

In the course of doing all this, on one of the occasions when this kernel did not boot, I got a few more pictures. The screen output had slowed down considerably, and it looked like it was providing more information than before.

However, I see that nobody has looked at my earlier pictures, so I won't bother posting these unless someone asks.
Comment 5 jonathan baron 2017-06-25 11:21:22 EDT
It occurs to me that this problem may result from the fact that I removed the CD-ROM drive. It was broken and spinning all the time for no purpose, and it is not needed anymore.

This would explain why nobody else has reported this problem or found this bug report.

The boot entry that I get by pressing "e" on the initial boot screen has an option like
search --no-floppy

Perhaps if I add one for "--no-cdrom" it will work, but I can't find any list of such options.
