Bug 1045851 - Dell Vostro 3300 hangs at boot after update from linux kernel 3.11 to 3.12 (and even 3.13.0-0.rc4)
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 20
Hardware: Unspecified
OS: Linux
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-12-22 14:58 UTC by Rémi G.
Modified: 2015-06-29 13:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-29 13:43:22 UTC
Type: Bug


Attachments
dmesg output (101.54 KB, text/plain)
2013-12-22 14:58 UTC, Rémi G.
bisection log file (1.67 KB, text/x-log)
2014-01-02 16:44 UTC, Rémi G.
Bisection log file (first good commit in the linux kernel 3.14) (3.42 KB, text/plain)
2014-02-09 16:24 UTC, Rémi G.

Description Rémi G. 2013-12-22 14:58:51 UTC
Created attachment 840358 [details]
dmesg output

* Description of problem:

The system hangs during start-up. With the 3.12.5 kernel, GDM is usually displayed, but I can't log in or even switch to a virtual console.

* Version-Release number of selected component (if applicable):

kernel-3.12.5-302.fc20 and later
(I have tried kernel-3.13.0-0.rc4.git4.1.fc21 from rawhide too)

* Additional information:

Everything was working fine on linux kernel 3.11.10-301.fc20 and before.

I was able to log in to a virtual console once, and I have attached the dmesg output. The following lines recur very often:
[  122.636172] nouveau W[   PFIFO][0000:01:00.0] unknown intr 0x04200000, ch 2
[  122.636370] ALSA sound/pci/hda/hda_eld.c:338 HDMI: invalid ELD buf size -1
[  125.293200] nouveau W[   PFIFO][0000:01:00.0] unknown intr 0x04000000, ch 2
[  174.749043] i8042: Can't write CTR while closing AUX port
[  175.273866] i8042: Can't reactivate AUX port
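
To put a number on "very often", the PFIFO warnings can be counted from the dmesg output. This is a minimal sketch, assuming a POSIX shell and grep. The function name count_unknown_intr is hypothetical, and the pattern is simply taken from the nouveau lines above.

```shell
# Count nouveau "unknown intr" PFIFO warnings in kernel log text read from stdin.
# Usage (not run here): dmesg | count_unknown_intr
count_unknown_intr() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are none, so "|| true" keeps the function's exit status at 0.
    grep -c 'nouveau.*PFIFO.*unknown intr' || true
}
```

A count that keeps growing between two runs a few seconds apart would suggest that the interrupt is still firing.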

My laptop is a Dell Vostro 3300, which has Intel/nVidia hybrid graphics (but not Optimus, I think).

I'm not a developer, but I have time to help. Maybe I should have reported this bug against "nouveau", but since I don't really know, I'll leave that up to you.

Comment 1 joaquingutierrezgil 2013-12-28 23:48:07 UTC
I have the same problem with this kernel.

My video card is GeForce GT 525M/PCIe/SSE2 and I am using the official drivers.

Comment 2 Rémi G. 2013-12-30 12:32:26 UTC
I have tried older kernels available on koji to narrow down my problem:

 - kernel-3.12.0-0.rc0.git11.1.fc21 is "good" (version v3.11-3891-gae7a835 ?)
 - kernel-3.12.0-0.rc0.git12.1.fc21 is "bad" (version v3.11-4809-ga09e9a7 ?)
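
As a side note, these version strings appear to follow the "git describe" layout <tag>-<commits since tag>-g<abbreviated hash>, which git also accepts as a revision name. Below is a minimal shell sketch that splits such a string into its parts. The function name describe_parts is made up for illustration, and whether the koji builds really map to these revisions is still the open question marked with "?" above.

```shell
# Split a git-describe style string (<tag>-<count>-g<hash>) into its parts.
describe_parts() {
    desc=$1
    hash=${desc##*-g}     # everything after the last "-g"
    rest=${desc%-g*}      # drop the trailing "-g<hash>"
    count=${rest##*-}     # number of commits since the tag
    tag=${rest%-*}        # the tag itself
    printf '%s %s %s\n' "$tag" "$count" "$hash"
}

describe_parts v3.11-3891-gae7a835   # prints: v3.11 3891 ae7a835
```

The abbreviated hash (the last field) is enough on its own to name the commit for "git bisect good" or "git bisect bad".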

I'm now trying to bisect the Linux kernel following the guide below:

  https://fedoraproject.org/wiki/User:Ignatenkobrain/Kernel/Bisection

I faced some issues though:

 - I had to install a few other packages:
    GitPython rpmdevtools yum-utils
 - Since it seems to require more than 10 GB of disk space (wow...), I also added the line below to /etc/fstab to avoid running out of space, following an old mailing-list message (http://www.redhat.com/archives/fedora-maintainers/2005-July/msg00175.html):
    /home/myuser/mock /var/lib/mock auto bind 0 0

Here are the commands I used:

    git bisect start
    git bisect bad v3.11-4809-ga09e9a7
    git bisect good v3.11-3891-gae7a835
    ~/kernel-package/kernel-package.py
    mock -r fedora-20-x86_64 --rebuild sources/*.src.rpm --resultdir sources/rpms

Unfortunately, I encountered an error after ~30 minutes (found in linux/sources/rpms/build.log):

    + pushd tools/thermal/tmon/
    ~/build/BUILD/kernel-3.10.fc20/linux-3.11.0-0.rc3.git999.1.9c784855.fc20.x86_64
    /var/tmp/rpm-tmp.1KvKxt: line 295: pushd: tools/thermal/tmon/: No such file or directory
    error: Bad exit status from /var/tmp/rpm-tmp.1KvKxt (%build)
    RPM build errors:
        Bad exit status from /var/tmp/rpm-tmp.1KvKxt (%build)
    Child return code was: 1
    EXCEPTION: Command failed. See logs for output.

So I have a few questions:
1) Did I use the right kernel version numbers?
2) Did I do something wrong?
3) How can I avoid the build error?

I'll try to get help on some forums.
In the meantime, I hope that helps.

Comment 3 Michele Baldessari 2014-01-01 17:27:11 UTC
Hi Remi,

thanks for your work. So between good (ae7a835) and bad (a09e9a7) we have ~900
changes:
$ git lg ae7a835..a09e9a7 |wc -l
899

If this really is a nouveau-only issue, the set of changes is much smaller:
git lg ae7a835..a09e9a7 drivers/gpu/drm/nouveau/ |wc -l
27

The first commit in this list of 27 is:
4cb4ea3 - (2013-07-23 19:25:02 +1000)  drm/nouveau: drop DRIVER_PCI_DMA and DRIVER_SG <Daniel Vetter>

and the last one is:
c859074 - (2013-09-04 13:48:56 +1000)  drm/nouveau: fix command submission to use vmalloc for big allocations <Maarten Lankhorst>

So the best bet would be to:
1) Verify that the kernel at commit 4cb4ea3~1 is good and the one at
c859074 is bad.
2) Bisect from there. Given that there are only 27 commits, it should be pretty quick.
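
As a rough guide to how quick: bisection halves the candidate range at each step, so the number of kernels to build grows with the logarithm of the number of commits. Here is a small shell sketch of that estimate. The helper name bisect_steps is hypothetical, and git's own "roughly N steps" estimate may differ by one.

```shell
# Estimate how many test builds a bisection over N commits needs,
# by halving (rounding up) until a single commit remains.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 27    # nouveau-only range: prints 5
bisect_steps 899   # full range: prints 10
```

So restricting the bisection to drivers/gpu/drm/nouveau/ saves about half of the builds, provided the culprit really is in that directory.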

As to the exact steps to bisect, I personally just check out the linux-stable
tree, copy the Fedora /boot/config-3.X.Y... file to ".config" in the Linux source tree, run "make oldconfig" (accepting the default values) and then
do the compilation and installation runs and keep bisecting:
make -j4
sudo make modules_install install
reboot (into the newly built kernel)
If it works, mark it with "git bisect good"; otherwise reboot into the older kernel and mark it with "git bisect bad".

Doing it without rpm is a bit faster: at least in the last git bisect stages,
compilation goes much faster because only a few files need to be recompiled.

If the above bisection does not bring us closer to the culprit commit, then
the issue might be somewhere else (acpi layer, etc.)

hth,
Michele

Comment 4 Rémi G. 2014-01-02 16:44:35 UTC
Created attachment 844618 [details]
bisection log file

Hi Michele,

First, thanks for the details, you were very helpful!

I managed to bisect the kernel considering only drivers/gpu/drm/nouveau/ changes using the following commands:

    git bisect start -- drivers/gpu/drm/nouveau/
    git bisect good 4cb4ea3~1
    git bisect bad c859074
    ...

I found that the commit 5addcf0 is causing my issue (see attachment for the bisection log):

    # first bad commit: [5addcf0a5f0fadceba6bd562d0616a1c5d4c1a4d] nouveau: add runtime PM support (v0.9)

To be sure it was a nouveau issue, I also checked out the revision just before (13bb9cc) and compiled the kernel. Maybe that was unnecessary, but that kernel worked well.

Hope that helps,

Rémi.

Comment 5 Michele Baldessari 2014-01-02 18:33:54 UTC
Hi Rémi,

excellent job, thank you. If you boot with the parameter "nouveau.runpm=0" does it work?
This should bring the old behaviour back and let you boot.
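
A quick way to confirm that the parameter actually reached the running kernel is to look for it in /proc/cmdline. This is a minimal sketch, assuming a POSIX shell. has_boot_param is a hypothetical helper, and the commented grubby line is one way to make the setting persistent on Fedora, to be adapted as needed.

```shell
# Return success if a boot parameter appears as a whole word on the kernel
# command line. $1 = parameter, $2 = command-line text (optional; defaults
# to the running kernel's /proc/cmdline).
has_boot_param() {
    cmdline=${2:-$(cat /proc/cmdline 2>/dev/null)}
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example with an explicit, made-up command line:
if has_boot_param nouveau.runpm=0 "ro rhgb quiet nouveau.runpm=0"; then
    echo "runtime PM disabled for nouveau"
fi

# To add the parameter to all installed kernels on Fedora (not run here):
#   sudo grubby --update-kernel=ALL --args="nouveau.runpm=0"
```

For a one-off test, the parameter can instead be appended to the kernel line from the boot menu editor.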

regards,
Michele

P.S. I'll also go ahead and open a BZ upstream.

Comment 6 Rémi G. 2014-01-02 22:27:23 UTC
Indeed, adding the boot parameter "nouveau.runpm=0" works for me.

Thank you for the workaround and let me know if you need more information.

Rémi.

Comment 7 August Schwerdfeger 2014-01-08 20:09:29 UTC
I also experienced this issue with a Vostro 3700. The 'nouveau.runpm=0' workaround worked for me as well.

Comment 8 Rémi G. 2014-01-25 17:42:18 UTC
Well, good news for me: I have tried some early 3.14 kernels available on koji, and it seems that my issue was fixed somewhere between the following two kernels:
 - kernel-3.14.0-0.rc0.git2.1.fc21 (v3.13-3260-g03d11a0)
 - kernel-3.14.0-0.rc0.git3.1.fc21 (v3.13-2502-gec513b1)

Would it be useful if I found which commit solved my problem?
I was wondering whether it could be backported to the 3.12 and 3.13 kernels.

Comment 9 Rémi G. 2014-02-09 16:24:29 UTC
Created attachment 861067 [details]
Bisection log file (first good commit in the linux kernel 3.14)

I did a new bisection to find the first good commit in the 3.14 Linux kernel. If I did it correctly, it should be this one:

  commit 258753361534a40ad7180c742da813fc659e427b
  Merge: 315fba8 2aff4c9
  Author: Takashi Iwai <tiwai>
  Date:   Mon Jan 20 10:20:14 2014 +0100
  
      Merge branch 'for-next' into for-linus

Since it is a merge commit, I'm afraid it is not of any help.

Comment 10 Rémi G. 2014-04-23 14:47:52 UTC
As expected, the latest kernel from the updates-testing repository solves my issue:

  kernel-3.14.1-200.fc20 (https://admin.fedoraproject.org/updates/kernel-3.14.1-200.fc20)

I think the bug can be closed (I'll let you do it; I don't know which resolution to select).

Comment 11 Fedora End Of Life 2015-05-29 10:06:40 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Fedora End Of Life 2015-06-29 13:43:22 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

