Bug 848305

Summary: When kernel modesetting is enabled, laptop screen remains off until Xorg starts (ironlake)
Product: [Fedora] Fedora
Reporter: Elad Alfassa <elad>
Component: plymouth
Assignee: Ray Strode [halfline] <rstrode>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 18
CC: ajax, awilliam, fedora, gansalmon, itamar, jonathan, jones, jreznik, kernel-maint, kparal, madhu.chinakonda, ptalbert, robatino, rstrode, xgl-maint
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: AcceptedNTH
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-09-14 00:40:21 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Bug Depends On:
Bug Blocks: 752654, 752662
Attachments:
Description | Flags
dmesg output | none
dmesg output with drm.debug=0x04 | none
dmesg output with drm.debug=4 with plymouth update installed and kernel 3.6.0-0.rc1.git6.1.fc18.x86_64 | none
screen corruption with Quadro NVS 140M on Alpha RC3 | none

Description Elad Alfassa 2012-08-15 04:47:30 EDT
Created attachment 604551 [details]
dmesg output

Description of problem:
http://doom.co.il/plymouthbug.webm
At first I thought this was a plymouth bug, but it appears to be a kernel bug to me right now.

When using kernel modesetting (the default), the laptop screen remains off until Xorg starts, and if I hook up an external screen I can see the boot progress on the external screen only.

Furthermore, the screen remains off for ~3 minutes after waking up from suspend.
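
One way to see which outputs the kernel considers connected and enabled in a situation like this is to read the DRM connector attributes under sysfs. The following is only a minimal sketch, assuming the usual /sys/class/drm/cardN-CONNECTOR layout with "status", "enabled" and "dpms" files. The function name connector_states is made up for illustration and is not part of any tool mentioned in this report.

```python
from pathlib import Path


def connector_states(drm_root="/sys/class/drm"):
    """Return {connector_name: {"status": ..., "enabled": ..., "dpms": ...}}.

    Assumes the conventional sysfs layout where each connector is a
    directory named like "card0-LVDS-1". Attributes that are missing
    are reported as None rather than raising.
    """
    states = {}
    for entry in sorted(Path(drm_root).glob("card*-*")):
        info = {}
        for attr in ("status", "enabled", "dpms"):
            attr_file = entry / attr
            info[attr] = attr_file.read_text().strip() if attr_file.is_file() else None
        states[entry.name] = info
    return states
```

Calling connector_states() during early boot (for example from a debug shell) and again after Xorg starts would show whether the internal panel's state changes between the two points.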

Version-Release number of selected component (if applicable):
Linux alfmobile 3.6.0-0.rc1.git6.1.fc18.x86_64 #1 SMP Tue Aug 14 12:13:12 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

plymouth-0.8.6.2-0.2012.07.23.fc18.x86_64

xorg-x11-server-Xorg-1.12.99.904-1.20120808.fc18.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot
  
Actual results:
Screen off

Expected results:
Screen on

Additional info:
[elad@alfmobile ~]$ lspci
00:00.0 Host bridge: Intel Corporation Core Processor DRAM Controller (rev 18)
00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 18)
00:16.0 Communication controller: Intel Corporation 5 Series/3400 Series Chipset HECI Controller (rev 06)
00:1a.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06)
00:1b.0 Audio device: Intel Corporation 5 Series/3400 Series Chipset High Definition Audio (rev 06)
00:1c.0 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 1 (rev 06)
00:1c.1 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 2 (rev 06)
00:1c.5 PCI bridge: Intel Corporation 5 Series/3400 Series Chipset PCI Express Root Port 6 (rev 06)
00:1d.0 USB Controller: Intel Corporation 5 Series/3400 Series Chipset USB2 Enhanced Host Controller (rev 06)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev a6)
00:1f.0 ISA bridge: Intel Corporation Mobile 5 Series Chipset LPC Interface Controller (rev 06)
00:1f.2 SATA controller: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller (rev 06)
00:1f.3 SMBus: Intel Corporation 5 Series/3400 Series Chipset SMBus Controller (rev 06)
00:1f.6 Signal processing controller: Intel Corporation 5 Series/3400 Series Chipset Thermal Subsystem (rev 06)
03:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
04:00.0 Ethernet controller: Atheros Communications AR8152 v1.1 Fast Ethernet (rev c1)
ff:00.0 Host bridge: Intel Corporation Core Processor QuickPath Architecture Generic Non-core Registers (rev 05)
ff:00.1 Host bridge: Intel Corporation Core Processor QuickPath Architecture System Address Decoder (rev 05)
ff:02.0 Host bridge: Intel Corporation Core Processor QPI Link 0 (rev 05)
ff:02.1 Host bridge: Intel Corporation Core Processor QPI Physical 0 (rev 05)
ff:02.2 Host bridge: Intel Corporation Core Processor Reserved (rev 05)
ff:02.3 Host bridge: Intel Corporation Core Processor Reserved (rev 05)
Comment 1 Elad Alfassa 2012-08-15 04:55:36 EDT
Created attachment 604553 [details]
dmesg output with drm.debug=0x04

Laptop model is Dell Inspiron N3010.
I would happily provide any more information required to understand the source of the problem and maybe even fix it.
Note that this bug was not present in Fedora 17.
Comment 2 Ray Strode [halfline] 2012-08-15 10:31:23 EDT
Maybe a combination of plymouth failing to deal with configuration changes and the kernel changing configuration shortly after plymouth is started.
Comment 3 Ray Strode [halfline] 2012-08-21 00:08:37 EDT
I think this should be fixed by:

https://admin.fedoraproject.org/updates/edit/plymouth-0.8.7-1.fc18
Comment 4 Fedora Update System 2012-08-21 00:08:45 EDT
plymouth-0.8.7-1.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/plymouth-0.8.7-1.fc18
Comment 5 Fedora Update System 2012-08-21 00:52:53 EDT
Package plymouth-0.8.7-1.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing plymouth-0.8.7-1.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-12334/plymouth-0.8.7-1.fc18
then log in and leave karma (feedback).
Comment 6 Elad Alfassa 2012-08-21 06:55:13 EDT
Unfortunately this update did not fix the problem (installed it and rebuilt the initramfs, rebooted, still same problem)



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 7 Elad Alfassa 2012-08-21 17:04:10 EDT
Created attachment 606060 [details]
dmesg output with drm.debug=4 with plymouth update installed and kernel 3.6.0-0.rc1.git6.1.fc18.x86_64

Will try with a more recent kernel from koji shortly
Comment 8 Elad Alfassa 2012-08-21 17:15:51 EDT
Fixed in kernel-3.6.0-0.rc2.git1.2.fc18
Comment 9 Adam Williamson 2012-09-11 18:35:47 EDT
Re-opening as the update to plymouth isn't pushed stable. airlied and halfline say that the bug in plymouth - fixed in 0.8.6.2-1.2012.07.23 with the note "fix plymouth race at bootup breaking efi/vesa handoff" - is a significant issue which can cause quite a lot of problems. As well as this bug, for instance, it's probably the reason the 'cirrus' kernel module doesn't work right in Alpha and you get vesa, not modesetting, as the X driver in cirrus VMs. So, nominating this as NTH to get https://admin.fedoraproject.org/updates/FEDORA-2012-12334/plymouth-0.8.7-1.fc18 into Alpha.
Comment 10 Adam Williamson 2012-09-12 15:00:20 EDT
Discussed at 2012-09-12 NTH review meeting. Accepted as NTH due to the possibility of the bug causing serious problems; airlied and halfline both believe it's the best course to pull the fix, as the race could affect all sorts of things. We already know it breaks the cirrus kernel module, for instance.
Comment 11 Fedora Update System 2012-09-12 19:50:57 EDT
plymouth-0.8.7-1.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 12 Adam Williamson 2012-09-13 05:37:57 EDT
So some complete idiot decided to put this in RC3. On my desktop, it actually seems to make things *worse*.

When I boot the RC3 desktop live image, whether via UEFI or BIOS compat, it usually fails to start X cleanly. Often I just see a bunch of boot messages on tty1 - though systemctl status gdm.service and ps agree that gdm is running, on tty1. Sometimes I get a completely corrupted graphical display. Never do I get a fully working gdm, or shell. Sometimes I get a gdm which more or less works, but no cursor; if I try to log in, I get a Shell with completely corrupted graphics.

If I go to a console and restart gdm.service, it sometimes comes up, though it's often weird - sluggish cursor response. If it comes up and I can get into Shell, it'll often be corrupted as above.

By comparison, RC2 - which used the old plymouth - works perfectly, booted in BIOS mode. Booted in UEFI native mode it behaves as described in this bug - the screen goes into power-saving mode until GDM comes up. But GDM *does* come up, and works perfectly. No failure, no weird corruption. Shell is likewise fine.

I built two live images with the sole difference being the version of plymouth included, just to confirm plymouth is the problem. It is. The 'oldplymouth' image behaves exactly as I described RC2 above, the 'newplymouth' image behaves as I described RC3. RC3/newplymouth behaviour can be pretty varied, but in every attempt it was broken _somehow_, and it was _always_ worse than _any_ RC2/oldplymouth boot.

This doesn't seem to be universal, though. RC3 is fine in a VM, for instance, and others have reported it's ok on their hardware.

My system has a GeForce 9600 GT graphics adapter. It's also quite fast - I know speed can matter to these bugs. The system drive is an SSD, so it boots in single-digit seconds.
Comment 13 Adam Williamson 2012-09-13 05:40:43 EDT
oh, forgot to mention - the one way I can actually get to a working desktop with RC3/newplymouth is to boot to 'runlevel 3' and then do 'systemctl start gdm.service'. then gdm comes up working fine.
Comment 14 Adam Williamson 2012-09-13 05:51:48 EDT
Looking at dmesg | grep nouveau, there are a *ton* of messages of the form:

[drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 2/2 Mthd 0xfoof Data 0xfoofoofo

the values for Mthd and Data change with each message, all the rest remains the same.
Comment 15 Adam Williamson 2012-09-13 05:53:09 EDT
oh, sometimes it's Ch 2/7 not 2/2.
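
To get a feel for how many of these messages appear per channel, the dmesg text can be tallied with a short script. This is a rough sketch, assuming the messages follow the form quoted above. The sample lines and their Mthd/Data values are invented for illustration and are not taken from a real log.

```python
import re
from collections import Counter

# Pattern modelled on the message form quoted in the comment above.
PFIFO_RE = re.compile(
    r"PFIFO_CACHE_ERROR - Ch (\d+)/(\d+) Mthd (0x[0-9a-fA-F]+) Data (0x[0-9a-fA-F]+)"
)


def count_pfifo_errors(log_text):
    """Count PFIFO_CACHE_ERROR messages, keyed by the "Ch" field (e.g. "2/2")."""
    counts = Counter()
    for match in PFIFO_RE.finditer(log_text):
        counts[f"{match.group(1)}/{match.group(2)}"] += 1
    return counts


# Hypothetical sample input; the hex values are placeholders.
SAMPLE = """\
[drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 2/2 Mthd 0x0100 Data 0x00000000
[drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 2/7 Mthd 0x0104 Data 0x00000001
[drm] nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 2/2 Mthd 0x0108 Data 0x00000002
"""
```

Feeding the saved output of dmesg to count_pfifo_errors() gives a per-channel count, which may make it easier to compare affected machines.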
Comment 16 Adam Williamson 2012-09-13 06:15:21 EDT
kparal confirms this on a Quadro NVS 140M, and airlied on a geforce 9300m, so it's looking like this affects a lot of NVIDIA hardware. Bumping up to proposed blocker.
Comment 17 Jaroslav Reznik 2012-09-13 07:08:29 EDT
Works correctly on Quadro NVS 285, Plymouth is ok, X starts correctly, Gnome Shell/Plasma Workspaces works accelerated.
Issues on nVidia Corporation Device 1057 - Plymouth is broken - I can see the Fedora logo four times. X starts, Gnome Shell/Plasma Workspaces works unaccelerated as expected.
Comment 18 Kamil Páral 2012-09-13 07:31:29 EDT
Created attachment 612415 [details]
screen corruption with Quadro NVS 140M on Alpha RC3
Comment 19 Kamil Páral 2012-09-13 07:32:08 EDT
I see the same error messages as Adam.
Comment 20 Adam Williamson 2012-09-13 15:06:29 EDT
I only tested this once, but for me, removing 'rhgb' from the boot parameters seems to work around the issue - I get a working gdm and shell then.
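
For anyone scripting this workaround across several test boots, dropping the parameter from a kernel command line string is a simple token filter. A minimal sketch follows. The helper name drop_boot_param and the example command line in the usage note are hypothetical.

```python
def drop_boot_param(cmdline, param="rhgb"):
    """Return cmdline with every exact occurrence of param removed.

    Only whole whitespace-separated tokens are matched, so a parameter
    such as "rhgb" is removed while "root=/dev/rhgb" would be kept.
    """
    return " ".join(token for token in cmdline.split() if token != param)
```

For example, drop_boot_param("ro root=/dev/sda1 rhgb quiet") returns "ro root=/dev/sda1 quiet".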
Comment 21 Peter H. Jones 2012-09-13 16:26:18 EDT
I think I am seeing this in FC 16 on a Toshiba NB555D. Versions are:
plymouth-0.8.4-0.20110822.5.fc16.x86_64
kernel-3.4.9-2.fc16.x86_64

This happens intermittently, about once in ten boots or resumes from hibernate.
CTRL-ALT-DEL reboots and fixes the problem. I haven't tried waiting the 3 minutes suggested in the description above. I will, next time the problem occurs.

I have not noticed the problem with the previous kernel, kernel-3.4.9-1.fc16.x86_64, I think.
Comment 22 Adam Williamson 2012-09-14 00:40:21 EDT
I split off the NVIDIA regression as https://bugzilla.redhat.com/show_bug.cgi?id=857300 - please follow it up there, not here. Sorry for the confusion. I'll close this again. Peter, could you file a new bug for F16?