Bug 597291

Summary: GUI doesn't start in stage2 / Xen PV guest
Product: Red Hat Enterprise Linux 6 Reporter: Alexander Todorov <atodorov>
Component: xorg-x11-drv-fbdev Assignee: Adam Jackson <ajax>
Status: CLOSED CURRENTRELEASE QA Contact: desktop-bugs <desktop-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: dgregor, jkoten, xen-maint
Target Milestone: rc Keywords: Rebase, Triaged
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: xorg-x11-drv-fbdev-0.4.2-1.el6 Doc Type: Rebase: Bug Fixes and Enhancements
Doc Text:
Story Points: ---
Clone Of:
: 609245 (view as bug list) Environment:
Last Closed: 2010-11-10 21:55:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 609245    
Attachments:
Description Flags
X.log from failed attempt ("no screens found")
none
dmesg output
none
anaconda.log
none
program.log
none
syslog
none
storage.log
none
Xorg.0.log from already running system
none

Description Alexander Todorov 2010-05-28 15:17:23 UTC
Description of problem:
I'm trying to install a Xen PV guest with latest snap #6 (0527.2). Loader starts and I can proceed to stage2. On tty1 I see "Starting anaconda the Red Hat Enterprise Linux installer" and then only a black screen. Switching to other consoles (tty2, tty3) shows nothing but a black screen.

Version-Release number of selected component (if applicable):
anaconda-13.21.48-1.el6 / 0527.2 tree

How reproducible:
Always (tried 2 different hosts)

Steps to Reproduce:
1. Prepare a Xen hypervisor with RHEL 5.5 GA / x86_64.
2. Using virt-manager start new PV guest. 
3. Select Linux/RHEL6 as OS type/version, supply URL to the tree
4. Configure memory to 1024 (startup and maximum memory)
  
Actual results:
Anaconda proceeds to stage2 and shows a black screen.

Expected results:
Anaconda shows the GUI in stage2.

Additional info:
- vnc and text mode start fine (add vnc or text on the command line)
- if I select os type/variant as Generic/Generic same thing - black screen
- if I increase the guest memory to 2048M same thing - black screen
- seen this on two hosts: sun-x4440-01.rhts.eng.bos.redhat.com and hp-z800-01.lab.eng.brq.redhat.com

- anaconda-13.21.45-1 / 0523.0 tree works fine on hp-z800-01.

Comment 1 Matěj Cepl 2010-05-29 16:57:11 UTC
(In reply to comment #0)
> I'm trying to install a Xen PV guest with latest snap #6 (0527.2). Loader
> starts and I can proceed to stage2.

I don't understand ... you mean you tried to start RHEL-6 as paravirtualized guest? Why do you think RHEL-6 should be able to run paravirtualized? Should it?

Comment 2 Matěj Cepl 2010-05-29 17:01:02 UTC
(In reply to comment #0)
> - vnc and text mode start fine (add vnc or text on the command line)

Whoops, you are right, RHEL-6 should run paravirtualized http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/html/Beta_Release_Notes/virtualization.html#id569407 and it apparently worked for you in text mode.

If the computer is not completely frozen when installation fails, switch to the console (Ctrl+Alt+F2) and copy /tmp/X* and /var/log/anaconda.xlog to some other place -- USB stick, some other computer via network, somewhere on the Internet, and please attach it to the bug report as individual uncompressed file attachments using the bugzilla file attachment link above.

If the computer is completely useless after installation fails, you can also install Fedora with a VESA mode driver (see https://fedoraproject.org/wiki/Documentation_Beats_Installer
for more information on that). Then after successful installation you can collect /var/log/anaconda.xlog, /var/log/Xorg.0.log, and the output of the program dmesg instead.

Or you can install Fedora in a text mode completely, and then start X after that. If it fails, still /var/log/Xorg.0.log and the output of dmesg program from the failed attempt to start X would be useful.

We will review this issue again once you've had a chance to attach this information.

Thank you very much in advance.

Comment 3 Alexander Todorov 2010-06-01 16:01:15 UTC
(In reply to comment #2)
> (In reply to comment #0)
> > - vnc and text mode start fine (add vnc or text on the command line)
> 
> Whoops, you are right RHEL-6 should run paravirtualized
> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/html/Beta_Release_Notes/virtualization.html#id569407
> and it apparently worked for you in text mode.
> 
> If the computer is not completely frozen when installation fails, switch to the
> console (Ctrl+Alt+F2) and copy /tmp/X* and /var/log/anaconda.xlog to some other
> place -- USB stick, some other computer via network, somewhere on the Internet,
> and please attach it to the bug report as individual uncompressed file
> attachments using the bugzilla file attachment link above.
> 

I can't switch ttys at all.

> If the computer is completely useless after installation fails, you can also
> install Fedora with a VESA mode driver (see
> https://fedoraproject.org/wiki/Documentation_Beats_Installer
> for more information on that). Then after successful installation you can
> collect /var/log/anaconda.xlog, /var/log/Xorg.0.log, and the output of the
> program dmesg instead.

Booting with xdriver=vesa on the command line failed to start X. The last lines in X.log are:

(II) VESA: driver for VESA chipsets: vesa
(WW) Falling back to old probe method for vesa
(EE) No devices detected.

Fatal server error:
no screens found


Will attach all logs from stage2 environment.

> 
> Or you can install Fedora in a text mode completely, and then start X after
> that. If it fails, still /var/log/Xorg.0.log and the output of dmesg program
> from the failed attempt to start X would be useful.

After the install, I installed X and the system was able to start in runlevel 5 and log in to GNOME. So no problems here.

Comment 4 Alexander Todorov 2010-06-01 16:02:15 UTC
Created attachment 418701 [details]
X.log from failed attempt ("no screens found")

Comment 5 Alexander Todorov 2010-06-01 16:02:47 UTC
Created attachment 418703 [details]
dmesg output

Comment 6 Alexander Todorov 2010-06-01 16:03:15 UTC
Created attachment 418704 [details]
anaconda.log

Comment 7 Alexander Todorov 2010-06-01 16:03:44 UTC
Created attachment 418706 [details]
program.log

Comment 8 Alexander Todorov 2010-06-01 16:04:14 UTC
Created attachment 418708 [details]
syslog

Comment 9 Alexander Todorov 2010-06-01 16:04:41 UTC
Created attachment 418710 [details]
storage.log

Comment 10 Adam Jackson 2010-06-03 14:46:42 UTC
The X log doesn't show any PCI devices right after the "using VT number" line, which indicates to me that you don't have any video devices in your Xen guest.  I'm very sorry about that, but I don't think it's something X can be expected to fix.

Check 'lspci' output for VGA or Multimedia devices, and if they're not there, you need to have added them to the guest before booting it.

Comment 11 Adam Jackson 2010-06-03 14:48:23 UTC
Actually, there's another case there, which is if we've got a /dev/fb0 for xenfb.  We should get that right, afaik, but forcing xdriver=vesa will prevent that.  Can you attach an X log from normal startup too?
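
The two checks discussed above (a VGA or Multimedia entry in 'lspci', or a /dev/fb0 node for xenfb) can be scripted from inside the guest. The following is only a minimal sketch: the helper names display_devices and has_framebuffer are made up for illustration, and the keyword match is a rough filter over 'lspci' text, not a complete device classification.

```python
import os


def display_devices(lspci_output):
    """Return the lspci lines that mention a VGA or Multimedia device.

    Hypothetical helper: it only does a case-insensitive keyword match
    on the text that 'lspci' printed.
    """
    keywords = ("vga", "multimedia")
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


def has_framebuffer(path="/dev/fb0"):
    """Return True if the framebuffer device node exists."""
    return os.path.exists(path)
```

An empty list from display_devices together with has_framebuffer() returning True would correspond to the case where X has to use the framebuffer device rather than a PCI video device.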

Comment 12 Alexander Todorov 2010-06-11 17:56:48 UTC
Hi Adam,
sorry for late reply. 

When I start the Xen guest so that it shows black screen I can't get any output out of it. I can't switch to tty2 and type commands or do anything else. 

If I start it with the vnc parameter, lspci shows nothing: the output is empty. There is, however, a /dev/fb0 device.


X log from normal operation will be attached shortly.

Comment 13 Alexander Todorov 2010-06-11 18:26:04 UTC
Created attachment 423364 [details]
Xorg.0.log from already running system

lspci on the running system also produced empty output.

Comment 14 Adam Jackson 2010-06-24 14:52:36 UTC
What exactly does "when [you] start the Xen guest so that it shows black screen" mean?  Does that mean "when I start with xdriver=vesa" ?  Because that's not expected to work for Xen guests.

Comment 15 Alexander Todorov 2010-06-24 16:44:56 UTC
I should have been more verbose. 

When I start the PV guest without any additional parameters I hit the black screen. When that happens I can't switch to tty2 or any other tty and I can't get you any log files.

When I start it with xdriver=vesa, X fails and I'm offered to start vnc or text mode. X.log says:
(EE) No devices detected.

Fatal server error:
no screens found


As you say above this is not expected to work. I can attach the X.log from this install if needed.
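
For triage, the relevant lines of an X log like the one quoted above can be pulled out with a small script. This is a minimal sketch that assumes the usual Xorg markers, (EE) for errors and (WW) for warnings, and assumes the reason for a fatal error is on the next non-empty line after "Fatal server error:", as it is in this log. The function name summarize_xlog is hypothetical.

```python
def summarize_xlog(text):
    """Return (errors, warnings, fatal) extracted from Xorg log text.

    errors and warnings are lists of message strings with the marker
    removed; fatal is the reason line of a fatal server error, or None.
    """
    errors, warnings, fatal = [], [], None
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines):
        if line.startswith("(EE)"):
            errors.append(line[4:].strip())
        elif line.startswith("(WW)"):
            warnings.append(line[4:].strip())
        elif line.startswith("Fatal server error:"):
            # Assumption: the reason follows on the next non-empty line.
            following = [rest for rest in lines[index + 1:] if rest]
            fatal = following[0] if following else ""
    return errors, warnings, fatal
```

Fed the excerpt from comment #3, this reports one error ("No devices detected."), one warning, and the fatal reason "no screens found".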

Comment 16 Adam Jackson 2010-06-24 17:09:19 UTC
Are you still trying with the 0527 snapshot?  I can't reproduce this with the 0616 workstation iso.

Comment 17 Alexander Todorov 2010-06-24 18:00:21 UTC
Nope, comment #15 was with 0622.1/Server/x86_64. Dom0 is RHEL5.5-Server GA (without updates)-x86_64

Comment 18 Adam Jackson 2010-06-25 19:19:53 UTC
Server, eh.  I've been trying with Workstation.  If that's the difference, I'm going to be quite upset.

Comment 19 Adam Jackson 2010-06-29 15:50:28 UTC
That _is_ the difference.  I am quite upset.

Comment 20 Adam Jackson 2010-06-29 17:15:15 UTC
Actually, I can repro it on Workstation too, which is good.  The problem is it's a cascade of two bugs.

Aside: to debug problems like this, do a network install of the Xen guest with 'sshd' on kcmdline.  Inspect the arp cache on the host with 'arp -n' to find the IP address of the guest, and then ssh in once stage2 starts.
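
Finding the guest in the host's arp cache can be scripted. The sketch below parses the text printed by 'arp -n' and keeps entries whose MAC address starts with 00:16:3e, the OUI normally used for Xen guest interfaces. It assumes the guest was created with a default generated MAC; a guest with a manually assigned address would not match. The function name xen_guest_ips is hypothetical.

```python
XEN_OUI = "00:16:3e"  # MAC prefix commonly assigned to Xen guest interfaces


def xen_guest_ips(arp_output, oui=XEN_OUI):
    """Return IP addresses from 'arp -n' output whose MAC matches oui.

    Expects the column layout: Address, HWtype, HWaddress, ...
    Header and incomplete entries are skipped because their third
    field is not a MAC address with the given prefix.
    """
    addresses = []
    for line in arp_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2].lower().startswith(oui):
            addresses.append(fields[0])
    return addresses
```

Each address returned is a candidate for the ssh login once stage2 has started.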

One of the bugs is a bug in the fbdev driver where it segfaults on server regeneration:

Backtrace:
0: Xorg (xorg_backtrace+0x28) [0x4adf38]
1: Xorg (0x400000+0x629e9) [0x4629e9]
2: /lib64/libpthread.so.0 (0x7f2b38f19000+0xf440) [0x7f2b38f28440]
3: Xorg (DamageUnregister+0x53) [0x4de543]
4: /usr/lib64/xorg/modules/libshadow.so (shadowRemove+0x33) [0x7f2b352e02c3]
5: /usr/lib64/xorg/modules/libshadow.so (0x7f2b352df000+0x1784) [0x7f2b352e0784]
6: Xorg (0x400000+0xafc29) [0x4afc29]
7: Xorg (0x400000+0x152f0c) [0x552f0c]
8: Xorg (0x400000+0x2209c) [0x42209c]
9: /lib64/libc.so.6 (__libc_start_main+0xfd) [0x7f2b37b6ac5d]
10: Xorg (0x400000+0x21bb9) [0x421bb9]
Segmentation fault at address (nil)

This is pretty straightforward to fix; we had a similar bug in the vesa code.
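
To pick the resolved symbols out of a backtrace in this format, a short parser is enough. This is a sketch that assumes each frame looks like "N: module (symbol+0xoff) [0xaddr]", with unresolved frames showing a load address in place of the symbol, as in the trace above. The names FRAME and parse_frames are hypothetical.

```python
import re

# One frame: index, module, "symbol+offset" or "base+offset", absolute address.
FRAME = re.compile(
    r"^(\d+):\s+(\S+)\s+\((.+?)\+(0x[0-9a-f]+)\)\s+\[(0x[0-9a-f]+)\]$"
)


def parse_frames(backtrace):
    """Return a list of (index, module, symbol_or_None, address) tuples."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME.match(line.strip())
        if not match:
            continue  # skips "Backtrace:" and the final fault line
        index, module, base, _offset, address = match.groups()
        # Unresolved frames carry a hex load address instead of a name.
        symbol = None if base.startswith("0x") else base
        frames.append((int(index), module, symbol, int(address, 16)))
    return frames
```

For the trace above, the named frames are xorg_backtrace, DamageUnregister, shadowRemove and __libc_start_main, which points at the shadow layer tearing down damage tracking.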

But the second bug is that we're getting to this at all; the X server should not be regenerating while anaconda is running.  I suspect this is a race condition introduced by the change to metacity from mini-wm; we used to wait for mini-wm to connect before running xrandr(1), but now since we don't, if xrandr beats metacity to the display (and it will, since metacity requires a lot of libraries to load), then when it disconnects the server will regenerate.  The server shouldn't fault when regenerating; but anaconda should also be careful to wait for metacity to initialize before proceeding.
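
The ordering described above (do not run xrandr until the window manager holds a connection to the display) can be sketched as a poll with a timeout. This is not anaconda's code: wait_until, start_display_setup, and the wm_is_ready and run_xrandr callables are hypothetical stand-ins, and how readiness is actually detected is left to the caller.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.1):
    """Poll predicate() until it returns True or timeout seconds pass.

    Returns True if the predicate succeeded, False on timeout. The
    predicate is always called at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def start_display_setup(wm_is_ready, run_xrandr, timeout=30.0):
    """Run run_xrandr() only after wm_is_ready() reports True.

    With the window manager already connected, a short-lived client
    such as xrandr is no longer the last client to disconnect, so its
    exit should not trigger a server regeneration.
    """
    if not wait_until(wm_is_ready, timeout):
        raise RuntimeError("window manager did not become ready in time")
    run_xrandr()
```

Raising on timeout rather than proceeding is a design choice for the sketch; an installer might prefer to fall back to text mode instead.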

Comment 21 Adam Jackson 2010-06-29 17:16:17 UTC
Probably the reason I couldn't reproduce it with Workstation before is that I was initially creating the guest with 1 VCPU, which would mitigate some races.  Live, learn.

Comment 22 Adam Jackson 2010-06-29 18:59:23 UTC
2560662 build (RHEL-6-candidate, /cvs/dist:rpms/xorg-x11-drv-fbdev/RHEL-6:xorg-x11-drv-fbdev-0_4_2-1_el6) completed successfully

MODIFIED

Comment 23 Alexander Todorov 2010-07-02 15:39:53 UTC
This now works for me with snapshot #7 which has xorg-x11-drv-fbdev-0.4.2-1.el6.

Comment 24 Alexander Todorov 2010-07-02 15:40:55 UTC
(In reply to comment #20)

> But the second bug is that we're getting to this at all; the X server should
> not be regenerating while anaconda is running.  I suspect this is a race
> condition introduced by the change to metacity from mini-wm; we used to wait
> for mini-wm to connect before running xrandr(1), but now since we don't, if
> xrandr beats metacity to the display (and it will, since metacity requires a
> lot of libraries to load), then when it disconnects the server will regenerate.
>  The server shouldn't fault when regenerating; but anaconda should also be
> careful to wait for metacity to initialize before proceeding.    


Hi Adam,
how can I reproduce this in anaconda now that X doesn't crash? I want to file a bug against anaconda for this behavior.

Comment 27 releng-rhel@redhat.com 2010-11-10 21:55:51 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.