Bug 1131765 - Soft hangs on an i686 machine with 3.17-rc1
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-08-20 04:00 UTC by Bruno Wolff III
Modified: 2014-08-22 04:12 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-08-22 04:12:33 UTC
Type: Bug
Embargoed:


Attachments
Picture of alt-sysrq-c traceback during hang (1.00 MB, image/jpeg)
2014-08-20 04:00 UTC, Bruno Wolff III

Description Bruno Wolff III 2014-08-20 04:00:24 UTC
Created attachment 928594
Picture of alt-sysrq-c traceback during hang

Description of problem:
I am getting a soft hang (ctrl-alt-del still starts a reboot) during boots with 3.17-rc1, but not with 3.16, even after rebuilding the initramfs to match 3.17-rc1.
This doesn't happen on my F21 based x86_64 machine.

I am still having trouble collecting netconsole output when the problem occurs in early boot, so I took a picture of the traceback from alt-sysrq-c while the hang was in progress. I don't know if that will tell you what is hanging.

Version-Release number of selected component (if applicable):
kernel-PAE-core-3.17.0-0.rc1.git0.1.fc22.i686

Comment 1 Josh Boyer 2014-08-20 11:53:17 UTC
Yeah, that picture doesn't really tell us anything other than you forced a sysrq-c.

If you can get netconsole working, that would be helpful. The only other avenue is bisection, I suppose.

Comment 2 Bruno Wolff III 2014-08-20 12:02:58 UTC
I don't think netconsole is going to provide much help. The boot process appeared fairly normal up until where it hung. It was before asking for the luks passwords. USB devices were being detected.
I'll plan on going the bisect route.

The netconsole stuff is weird, because I don't get anything from the 3.17-rc1 boot, but then I seem to get everything from the following 3.16 boot. It looks like the output gets queued up and then all gets sent once netconsole is up.

I do see netconsole errors during early boot. Typically it is because the network device isn't valid yet. (I tried using eth0 and eth1 instead of p7p1 in the kernel parameters, but that didn't get me any output when the system hangs.)

I have another system I can test on, which I have been holding off on, but I'll do a quick test. It has a different USB setup, and if it boots, that might point to some sort of USB problem. I have seen issues in the past with USB 2 devices connected to USB 1.1 on the motherboard.
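
For anyone reproducing the netconsole setup discussed above, the boot parameter follows the format given in the kernel's netconsole documentation: netconsole=[+][src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]. The sketch below only assembles that string. The helper name and the target address are hypothetical and are not taken from this report; the interface name is the field that was varied here (p7p1, eth0, eth1).

```python
def netconsole_param(dev, target_ip, target_mac="",
                     src_port=6665, src_ip="", target_port=6666):
    """Assemble a netconsole= kernel boot parameter.

    Hypothetical helper for illustration. Field order follows the kernel
    netconsole documentation; 6665 and 6666 are the documented default
    ports. An empty src_ip or target_mac leaves that field blank so the
    kernel applies its own default.
    """
    return (f"netconsole={src_port}@{src_ip}/{dev},"
            f"{target_port}@{target_ip}/{target_mac}")


# 192.0.2.10 is a placeholder from the documentation address range,
# not the reporter's actual log host.
for dev in ("p7p1", "eth0", "eth1"):
    print(netconsole_param(dev, "192.0.2.10"))
```

Building the string this way makes it easy to generate one boot entry per candidate interface name, which matters when the name seen by netconsole in early boot may differ from the name the interface has later.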

Comment 3 Bruno Wolff III 2014-08-20 12:08:28 UTC
My other i686 machine, which is F21 with rawhide nodebug kernels, did boot normally.
So I wouldn't be surprised if the issue was USB related, but it's too early to tell for sure. I'll start going down the bisect route.

Comment 4 Bruno Wolff III 2014-08-21 13:01:06 UTC
I verified the vanilla kernels work the same way (v3.16 is OK and v3.17-rc1 is broken), so I should be able to bisect this. It will probably take a week.

I'll also be testing the new Fedora kernels in case the problem gets fixed independently.
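
As a rough illustration of why a bisect between v3.16 and v3.17-rc1 could take about a week: git bisect halves the candidate range at each step, so the number of kernels to build and boot is about log2 of the commit count. The commit count and the builds-per-day pace below are assumptions chosen for illustration, not figures from this report.

```python
import math


def bisect_steps(n_commits):
    """Approximate number of build-and-boot rounds for a bisect: ceil(log2(n))."""
    return math.ceil(math.log2(n_commits))


# Assumed values, for illustration only.
ASSUMED_COMMITS = 11000      # rough size of a kernel merge window
ASSUMED_BUILDS_PER_DAY = 2   # hypothetical pace on a slow i686 machine

steps = bisect_steps(ASSUMED_COMMITS)
days = math.ceil(steps / ASSUMED_BUILDS_PER_DAY)
print(steps, days)
```

Under these assumptions the bisect needs 14 rounds, or about 7 days at two kernel builds per day.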

Comment 5 Bruno Wolff III 2014-08-22 04:12:33 UTC
This appears to be fixed in 3.17.0-0.rc1.git1.2.fc22.1.i686+PAE, which is going to save me a lot of trouble.

