Bug 806559 - kernel 3.3.0-0.rc1.i686 and later will not boot on Toshiba Satellite A85-S107
Summary: kernel 3.3.0-0.rc1.i686 and later will not boot on Toshiba Satellite A85-S107
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: i686
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-03-24 19:45 UTC by Bill Gianopoulos
Modified: 2012-04-23 23:23 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-04-23 23:14:04 UTC
Type: ---
Embargoed:



Description Bill Gianopoulos 2012-03-24 19:45:00 UTC
Description of problem: Hangs during boot


Version-Release number of selected component (if applicable): kernel-3.3.0-4.fc16.i686


The kernel-3.2.9-2.fc16.i686 kernel boots just fine.

I tried nomodeset and acpi=off, but neither helped.

Comment 1 Dave Jones 2012-03-26 15:55:23 UTC
try pcie_aspm=force

Comment 2 Bill Gianopoulos 2012-03-26 23:19:59 UTC
(In reply to comment #1)
> try pcie_aspm=force

I tried both pcie_aspm=force and pcie_aspm=off.  Neither one helped.  This system boots fine with the 3.2.9 kernel.

I should also mention that the last thing displayed on the screen is the "Loading initial ramdisk" message echoed by grub2.  The next thing that happens on a successful 3.2.9 boot is that the screen blanks and the display mode changes.  That is why I initially tried nomodeset.  The failing boot does not seem to get to the point where it writes any log files.

Comment 3 Josh Boyer 2012-03-27 12:47:55 UTC
Can you try adding 'nomodeset'?  If that works, what kind of graphics card is in the system?

Comment 4 H.J. Lu 2012-03-27 16:45:32 UTC
Please try the patch in

https://bugzilla.kernel.org/show_bug.cgi?id=42979

Comment 5 Josh Boyer 2012-03-27 16:59:41 UTC
Matthew posted a fix for that here:

https://lkml.org/lkml/2012/3/27/170

It seems better than the patch in kernel.org bug 42979.

Comment 6 Josh Boyer 2012-03-27 17:18:47 UTC
I've started a scratch build with the patch from comment #5 included here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=3937168

Please test that when it completes and let us know if it fixes your issue.

Comment 7 Bill Gianopoulos 2012-03-28 01:44:17 UTC
(In reply to comment #6)
> I've started a scratch build with the patch from comment #5 included here:
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=3937168
> 
> Please test that when it completes and let us know if it fixes your issue.

Sorry.  This kernel works no differently for me. :-(

Comment 8 Bill Gianopoulos 2012-03-28 22:18:42 UTC
Is there a kernel someplace that I can try with the patch from kernel.org 42979?

Comment 9 Bill Gianopoulos 2012-04-02 00:42:22 UTC
The version 3.3.0-8 kernel that was just released, and that supposedly contains a fix for the ASPM issue, also does NOT boot on this configuration.

Comment 10 Bill Gianopoulos 2012-04-04 22:34:49 UTC
I am not entirely sure that my issue here is ASPM related at all.

Comment 11 Bill Gianopoulos 2012-04-15 11:36:49 UTC
This still fails with kernel-3.3.1-5.fc16.i686.

By removing "quiet" and setting "earlyprintk=vga" I was able to get this output on the console:

found SMP MP-table at [c00f6290] f6290
BUG: Int 6: CR2   (null)
     EDI   (null)  ESI 00000001  EBP c0b45f20  ESP c0b45ee8
     EBX 0009fd70  EDX 00000006  ECX c0c60660  EAX 0009fd70
     err   (null)  EIP c0bf84ab   CS 00000060  flg 00010046
STACK: 0009fd70   (null)   (null) c0b45f20 c0bc7119   (null) c0b45f20 c00f6290
       00000001   (null) c0b45f40 c0bc71fa c0a84a84 c00f6290 000f6290   (null)
         (null)   (null) c0b45f48 c0bc7d4b c0b45fc0 c0bbdf04 00060000   (null)
Pid: 0, comm: swapper Not tainted 3.3.1-5.fc16.i686 #1
Call Trace:
 [<c0926d56>] ? printk+0x2d/0x2f
 [<c091ad89>] early_fault+0x2e/0x2e
 [<c0bf84ab>] ? memblock_reserve+0x51/0x70
 [<c0bc7119>] ? get_mpc_size+0x22/0x4b
 [<c0bc71fa>] smp_scan_config+0xb8/0xd1
 [<c0bc7d4b>] default_find_smp_config+0x35/0x51
 [<c0bbdf04>] setup_arch+0x5a6/0xaf2
 [<c04587d8>] ? __mutex_init+0x8/0x30
 [<c0bd1769>] ? cgroup_init_subsys+0x9f/0xb3
 [<c0bbb4a6>] start_kernel+0xb8/0x35d
 [<c0bbb078>] i386_start_kernel+0x78/0x7d
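
For anyone reproducing the output above: the change described in this comment amounts to dropping "quiet" from the kernel command line and adding "earlyprintk=vga". It can be made for a single boot by pressing "e" at the grub2 menu and editing the line that starts with "linux". The following is only a minimal sketch of that rewrite as a shell function; the function name and the sample command line are hypothetical and not taken from this report.

```shell
# Sketch: rewrite a kernel command line for early-boot debugging.
# Drops "quiet" and any existing earlyprintk= setting, then appends
# earlyprintk=vga. The function name and sample input are hypothetical.
debug_cmdline() {
    out=""
    for word in $1; do
        case "$word" in
            quiet|earlyprintk=*) ;;        # skip these words
            *) out="$out $word" ;;         # keep everything else, in order
        esac
    done
    echo "${out# } earlyprintk=vga"
}

debug_cmdline "ro root=/dev/sda1 quiet"
# prints: ro root=/dev/sda1 earlyprintk=vga
```

The result is what would be typed on the "linux" line; nothing is written to disk by the function itself.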

Comment 12 Bill Gianopoulos 2012-04-15 12:22:43 UTC
I should mention that I tried nosmp and maxcpus=1, and neither helped.

kernel-3.2.9-2 boots just fine.

Comment 13 Bill Gianopoulos 2012-04-15 12:43:48 UTC
Also, after looking at other posts in forums, I tried maxcpus=0.  I still get the same failure.

Comment 14 Bill Gianopoulos 2012-04-15 12:53:09 UTC
I should also note that the BIOS is Phoenix V1.30, which is the latest version for this model, according to the Toshiba website.

Comment 15 Bill Gianopoulos 2012-04-15 20:28:01 UTC
So it would seem that this issue is SMP-related, so please stop posting things to fix ASPM-related issues.  That does NOT seem to be related to the problem here.

Comment 16 Josh Boyer 2012-04-16 19:48:48 UTC
(In reply to comment #15)
> So it would seem that this issue is SMP-related, so please stop posting things
> to fix ASPM-related issues.  That does NOT seem to be related to the problem
> here.

It would be very helpful if you were able to do a git bisect on this machine to narrow down the commit that introduced the bug.  Alternatively, you could use the 3.3-rcX builds in koji to do something similar to narrow it down.

Would you be willing to do that?

Comment 17 Bill Gianopoulos 2012-04-16 23:09:40 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > So it would seem that this issue is SMP-related, so please stop posting things
> > to fix ASPM-related issues.  That does NOT seem to be related to the problem
> > here.
> 
> It would be very helpful if you were able to do a git bisect on this machine to
> narrow down the commit that introduced the bug.  Alternatively, you could use
> the 3.3-rcX builds in koji to do something similar to narrow it down.
> 
> Would you be willing to do that?

I am not sure a git bisect would be useful, as I have no idea whether I can build a working kernel with this configuration.  On the other hand, I would be perfectly willing to try different rcX builds from koji if you actually provide a link and don't just assume I know what that means.

Comment 18 Josh Boyer 2012-04-17 01:11:54 UTC
OK.  So since 3.2.9 works, make sure that stays installed on your system.  Then, simply download the kernel you need from each of the links below in order, install it, and let us know which one is the first to fail to boot with the same symptoms:

http://koji.fedoraproject.org/koji/buildinfo?buildID=294806 (rc1)
http://koji.fedoraproject.org/koji/buildinfo?buildID=296635 (rc2)
http://koji.fedoraproject.org/koji/buildinfo?buildID=298302 (rc3)
http://koji.fedoraproject.org/koji/buildinfo?buildID=300504 (rc4)
http://koji.fedoraproject.org/koji/buildinfo?buildID=301620 (rc5)
http://koji.fedoraproject.org/koji/buildinfo?buildID=306717 (rc7)

rc6 is skipped because there's no normal build left in koji.  Honestly, I would almost expect the first link to exhibit the issues you reported, as that is when the bulk of the changes go in.  However, maybe we'll get lucky and it's in a later RC.

If you hit it with rc1, we're into git bisect territory.  You should be able to build a kernel just fine on your machine while it's booted into 3.2.9 and I can help you through that if you'd like.
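
With only six builds, testing them in order as described above is perfectly reasonable. When there are more builds to try, halving the list cuts down the number of reboots. The following is only a sketch of that idea: the function names are hypothetical, and the `boots_ok` stub stands in for the manual "install, reboot, report" step, which cannot be scripted. It assumes the last build in the list is already known to fail.

```shell
# Sketch: find the first failing build in an ordered list by halving.
# $1 is a command that exits 0 if the given build boots, non-zero otherwise.
# Remaining arguments are build labels, oldest first; the last one is
# assumed to be known bad. All names here are hypothetical.
first_bad_build() {
    is_good=$1; shift
    lo=1          # lowest index that may still be the first bad build
    hi=$#         # index of a build known (or assumed) to be bad
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$is_good" "$candidate"; then
            lo=$((mid + 1))      # first bad build is after this one
        else
            hi=$mid              # this one is bad; first bad is here or earlier
        fi
    done
    eval "echo \${$lo}"
}

# Stub for the manual test: here only 3.2.9 is treated as booting.
boots_ok() { [ "$1" = "3.2.9" ]; }

first_bad_build boots_ok 3.2.9 rc1 rc2 rc3 rc4 rc5 rc7
# prints: rc1
```

In this example the search asks about rc3, then rc1, then 3.2.9, so three boots instead of up to six.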

Comment 19 Christopher 2012-04-17 10:56:24 UTC
The Toshiba A85-S107 that I am using is having the exact same issues. Kernel 3.2.9 loads, but 3.3 and above fail to initialize the ramdisk.

Comment 20 Bill Gianopoulos 2012-04-20 12:41:55 UTC
(In reply to comment #18)
> rc6 is skipped because there's no normal build left in koji.  Honestly, I would
> almost expect the first link to exhibit the issues you reported, as that is when
> the bulk of the changes go in.  However, maybe we'll get lucky and it's in
> a later RC.
> 
> If you hit it with rc1, we're into git bisect territory.  You should be able to
> build a kernel just fine on your machine while it's booted into 3.2.9 and I can
> help you through that if you'd like.

You are correct in your assumption.  It fails to boot on the rc1 build.  On a whim, thinking perhaps this was fixed in 3.4, I also tried a 3.4 rc3 kernel and it hits the same issue.

I would be very happy to try to bisect this but I have several issues with doing this:

1.  This is the simplest one, and I am sure I can figure it out myself.  I have no idea how to do a git bisect, but I am hopeful it is as simple as doing this with Mercurial and that git has a built-in bisect command that does all the heavy lifting for me.  If not, I used to do these manually back when the project I am more familiar with was using CVS.

2.  I did try building kernels on this system when this was thought to be an ASPM issue, and tried building kernels with different suggested fixes.  It takes over 8 hours to build a kernel on this machine.

3.  Related to problem #2, the instructions I have found on the web for building the Fedora kernel do NOT seem to do dependency-based (incremental) makes and always recompile the whole thing.

4.  I do have faster systems I could use to do the build, but they are all running 64-bit versions of Fedora.  I do not have known-working instructions for building a working 32-bit kernel on a 64-bit OS.

So, any help you can give on how to do this (particularly on issues 2 through 4) would be appreciated.  I have not looked into issue 1 because the other issues mean I really don't have time to do this bisect unless they can be resolved.
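
On issues 3 and 4, one common approach (not confirmed anywhere in this report) is to build directly from a kernel source tree with plain `make` instead of rebuilding the RPM. Passing `ARCH=i386` asks the kernel build system for a 32-bit x86 kernel even on a 64-bit host, and running `make` again in the same tree normally recompiles only what changed. The following is a dry-run sketch that just prints the commands; the function name is hypothetical, and it assumes a `.config` from the working 32-bit kernel has already been copied into the tree.

```shell
# Sketch: print (not run) the make commands for an incremental 32-bit
# kernel build from an existing source tree. The function name is
# hypothetical; the commands are printed so this is safe to run anywhere.
kernel_build_cmds() {
    arch=$1    # target architecture, e.g. i386 for a 32-bit x86 kernel
    jobs=$2    # number of parallel jobs, e.g. the number of CPU cores
    echo "make ARCH=$arch oldconfig"
    echo "make ARCH=$arch -j$jobs bzImage modules"
}

kernel_build_cmds i386 4
# prints:
#   make ARCH=i386 oldconfig
#   make ARCH=i386 -j4 bzImage modules
```

Changes to widely included headers can still trigger a large rebuild, so not every bisect step will be fast.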

Comment 21 Bill Gianopoulos 2012-04-21 13:30:58 UTC
OK I looked up git bisect and it works similarly to hg bisect so I can figure that part out.

So what I need help with is:

1.  How do I create a git repository corresponding to rc1?

2.  What do I specify as a revision that corresponds to the working version 3.2.9-2?

3.  What is the correct procedure to build the kernel from the git repository that will result in not recompiling the entire kernel at each bisect try?

The build instructions I followed previously resulted in recompiling everything after I made a one line change to one routine.

Comment 22 Josh Boyer 2012-04-23 13:22:53 UTC
Before we do the git bisect route, it might be worth trying the 3.3.2-6 kernel.  Bug 811225 has a very similar failure scenario and we fixed that with a backported patch from upstream.  It might fix the issue for you.

If that doesn't work, I'd be happy to help with the git bisect steps.

Comment 23 Bill Gianopoulos 2012-04-23 22:11:59 UTC
(In reply to comment #22)
> Before we do the git bisect route, it might be worth trying the 3.3.2-6 kernel.
>  Bug 811225 has a very similar failure scenario and we fixed that with a
> backported patch from upstream.  It might fix the issue for you.
> 
> If that doesn't work, I'd be happy to help with the git bisect steps.

The 3.3.2-6 kernel fixes this issue for me.  At some point though, I will probably be interested again in the bisect steps.

Comment 24 Josh Boyer 2012-04-23 23:14:04 UTC
(In reply to comment #23)
> (In reply to comment #22)
> > Before we do the git bisect route, it might be worth trying the 3.3.2-6 kernel.
> >  Bug 811225 has a very similar failure scenario and we fixed that with a
> > backported patch from upstream.  It might fix the issue for you.
> > 
> > If that doesn't work, I'd be happy to help with the git bisect steps.
> 
> The 3.3.2-6 kernel fixes this issue for me.  At some point though, I will
> probably be interested again in the bisect steps.

Excellent.  I'm going to close this bug out then.  Your help and willingness to test has been much appreciated.

As for the bisect steps, I think I'm going to spend some time this week writing up a wiki page describing how to do both 'koji bisect' and then a git bisect after narrowing down the koji builds.

Comment 25 Bill Gianopoulos 2012-04-23 23:23:36 UTC
OK great.  I think I did a koji bisect once before for an Atheros wireless bug I ran into.

