Created attachment 604551
Description of problem:
At first I thought this was a plymouth bug, but it appears to be a kernel bug to me right now.
When using modesetting (aka default), the laptop screen remains off until Xorg starts, and if I hook up an external screen I can see the boot progress on the external screen only.
Furthermore, the screen remains off for ~3 minutes after waking up from suspend.
Version-Release number of selected component (if applicable):
Linux alfmobile 3.6.0-0.rc1.git6.1.fc18.x86_64 #1 SMP Tue Aug 14 12:13:12 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Steps to Reproduce:
[elad@alfmobile ~]$ lspci
00:00.0 Host bridge: Intel Corporation Core Processor DRAM Controller (rev 18)
00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 18)
00:16.0 Communication controller: Intel Corporation 5 Series/3400 Series Chipset HECI Controller (rev 06)
00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06)
00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 06)
00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 06)
00:1c.1 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 2 (rev 06)
00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 06)
00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev a6)
00:1f.0 ISA bridge: Intel Corporation Mobile 5 Series Chipset LPC Interface Controller (rev 06)
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 06)
00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 06)
00:1f.6 Signal processing controller: Intel Corporation 5 Series/3400 Series Chipset Thermal Subsystem (rev 06)
03:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
04:00.0 Ethernet controller: Atheros Communications AR8152 v1.1 Fast Ethernet (rev c1)
ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-core Registers (rev 05)
ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 05)
ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 05)
ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 05)
ff:02.2 Host bridge: Intel Corporation Core Processor Reserved (rev 05)
ff:02.3 Host bridge: Intel Corporation Core Processor Reserved (rev 05)
Created attachment 604553
dmesg output with drm.debug=0x04
Laptop model is Dell Inspiron N3010.
I would happily provide any more information required to understand the source of the problem and maybe even fix it.
Note that this bug was not present in Fedora 17.
This may be a combination of plymouth failing to deal with configuration changes and the kernel changing configuration shortly after plymouth is started.
I think this should be fixed by:
plymouth-0.8.7-1.fc18 has been submitted as an update for Fedora 18.
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing plymouth-0.8.7-1.fc18'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
Unfortunately, this update did not fix the problem: I installed it, rebuilt the initramfs, and rebooted, and the problem is still the same.
Created attachment 606060
dmesg output with drm.debug=4 with plymouth update installed and kernel 3.6.0-0.rc1.git6.1.fc18.x86_64
Will try with a more recent kernel from koji shortly
Fixed in kernel-3.6.0-0.rc2.git1.2.fc18
Re-opening as the update to plymouth isn't pushed stable. airlied and halfline say that the bug in plymouth - fixed in 0.8.6.2-1.2012.07.23 with the note "fix plymouth race at bootup breaking efi/vesa handoff" - is significant and can cause quite a lot of problems; as well as this one, for instance, it's probably the reason the 'cirrus' kernel module doesn't work right in Alpha and you get vesa, not modesetting, as the X driver in cirrus VMs. So, nominating this as NTH to get https://admin.fedoraproject.org/updates/FEDORA-2012-12334/plymouth-0.8.7-1.fc18 into Alpha.
Discussed at 2012-09-12 NTH review meeting. Accepted as NTH due to the possibility of the bug causing serious problems; airlied and halfline both believe it's the best course to pull the fix, as the race could affect all sorts of things. We already know it breaks the cirrus kernel module, for instance.
plymouth-0.8.7-1.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.
So some complete idiot decided to put this in RC3. On my desktop, it actually seems to make things *worse*.
When I boot the RC3 desktop live image, whether via UEFI or BIOS compat, it usually fails to start X cleanly. Often I just see a bunch of boot messages on tty1 - though systemctl status gdm.service and ps agree that gdm is running, on tty1. Sometimes I get a completely corrupted graphical display. Never do I get a fully working gdm, or shell. Sometimes I get a gdm which more or less works, but no cursor; if I try to log in, I get a Shell with completely corrupted graphics.
If I go to a console and restart gdm.service, it sometimes comes up, though it's often weird - sluggish cursor response. If it comes up and I can get into Shell, it'll often be corrupted as above.
By comparison, RC2 - which used the old plymouth - works perfectly when booted in BIOS mode. Booted in UEFI native mode, it behaves as described in this bug - the screen goes into power-saving mode until GDM comes up. But GDM *does* come up, and works perfectly. No failure, no weird corruption. Shell is likewise fine.
I built two live images with the sole difference being the version of plymouth included, just to confirm plymouth is the problem. It is. The 'oldplymouth' image behaves exactly as I described RC2 above, the 'newplymouth' image behaves as I described RC3. RC3/newplymouth behaviour can be pretty varied, but in every attempt it was broken _somehow_, and it was _always_ worse than _any_ RC2/oldplymouth boot.
This doesn't seem to be universal, though. RC3 is fine in a VM, for instance, and others have reported it's ok on their hardware.
My system has a GeForce 9600 GT graphics adapter. It's also quite fast - I know speed can matter to these bugs. The system drive is an SSD, so it boots in single-digit seconds.
Oh, forgot to mention - the one way I can actually get to a working desktop with RC3/newplymouth is to boot to 'runlevel 3' and then do 'systemctl start gdm.service'. Then gdm comes up working fine.
Looking at dmesg | grep nouveau, there are a *ton* of messages of the form:
[drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 2/2 Mthd 0xfoof Data 0xfoofoofo
The values for Mthd and Data change with each message; all the rest remains the same.
Oh, sometimes it's Ch 2/7, not 2/2.
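For anyone who wants to see how the messages split between channels, here is a minimal Python sketch that tallies them by the Ch field. The regular expression is an assumption based only on the message shape quoted above (real Mthd and Data values are taken to be hexadecimal), and `tally_pfifo_errors` is a hypothetical helper name, not part of any existing tool.

```python
import re
from collections import Counter

# Assumed message shape, taken from the line quoted in this report:
#   ... PFIFO_CACHE_ERROR - Ch 2/2 Mthd 0x.... Data 0x........
# Real logs may differ; adjust the pattern if they do.
PFIFO_RE = re.compile(
    r"PFIFO_CACHE_ERROR - Ch (?P<ch>\d+/\d+) "
    r"Mthd (?P<mthd>0x[0-9a-fA-F]+) Data (?P<data>0x[0-9a-fA-F]+)"
)


def tally_pfifo_errors(lines):
    """Count PFIFO_CACHE_ERROR messages per 'Ch' value (e.g. '2/2')."""
    counts = Counter()
    for line in lines:
        match = PFIFO_RE.search(line)
        if match:
            counts[match.group("ch")] += 1
    return counts
```

Passing it the lines of saved dmesg output (for example, an open file object) returns a Counter keyed by channel, such as `2/2` or `2/7`.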
kparal confirms this on a Quadro NVS 140M, and airlied on a GeForce 9300M, so it's looking like this affects a lot of NVIDIA hardware. Bumping up to proposed blocker.
Works correctly on Quadro NVS 285, Plymouth is ok, X starts correctly, Gnome Shell/Plasma Workspaces works accelerated.
Issues on nVidia Corporation Device 1057 - Plymouth is broken - I can see the Fedora logo four times. X starts, Gnome Shell/Plasma Workspaces works unaccelerated as expected.
Created attachment 612415
screen corruption with Quadro NVS 140M on Alpha RC3
I see the same error messages as Adam.
I only tested this once, but for me, removing 'rhgb' from the boot parameters seems to work around the issue - I get a working gdm and shell then.
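To be explicit about what the workaround changes, here is a small Python sketch of the edit to the kernel command line. It only illustrates the string change; `drop_boot_param` is a hypothetical helper, and in practice the parameter would be removed by hand at the boot menu.

```python
def drop_boot_param(cmdline, param="rhgb"):
    """Return cmdline with every standalone occurrence of param removed.

    Only exact tokens are dropped, so 'rhgb' is removed but a
    hypothetical 'rhgb=something' token would be left alone.
    """
    return " ".join(tok for tok in cmdline.split() if tok != param)
```

All other parameters are kept in their original order, separated by single spaces.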
I think I am seeing this in FC 16 on a Toshiba NB555D. Versions are:
This happens intermittently, about once in ten boots or resumes from hibernate.
CTRL-ALT-DEL reboots and fixes the problem. I haven't tried waiting the 3 minutes suggested in the description above. I will, next time the problem occurs.
I have not noticed the problem with the previous kernel, kernel-3.4.9-1.fc16.x86_64, I think.
I split off the NVIDIA regression as https://bugzilla.redhat.com/show_bug.cgi?id=857300 - please follow it up there, not here. Sorry for the confusion. I'll close this again. Peter, could you file a new bug for F16?