1167103 – vnc based install gives grey blank screen only

Summary: vnc based install gives grey blank screen only

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	anaconda
Sub Component:
Version:	21
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	David Shea
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2014-11-23 16:24 UTC by Terje Røsten
Modified:	2015-01-12 18:43 UTC (History)
CC List:	6 users (show)
Fixed In Version:	anaconda-22.14-1
Clone Of:
Environment:
Last Closed:	2015-01-12 18:43:03 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
vncserver.log (10.47 KB, text/plain) 2014-12-01 18:44 UTC, Terje Røsten	no flags	Details
anaconda.log (8.83 KB, text/plain) 2014-12-05 16:47 UTC, David Shea	no flags	Details
ifcfg.log (4.07 KB, text/plain) 2014-12-05 16:48 UTC, David Shea	no flags	Details
packaging.log (4.28 KB, text/plain) 2014-12-05 16:48 UTC, David Shea	no flags	Details
program.log (18.81 KB, text/plain) 2014-12-05 16:49 UTC, David Shea	no flags	Details
storage.log (51.19 KB, text/plain) 2014-12-05 16:49 UTC, David Shea	no flags	Details
syslog (59.00 KB, text/plain) 2014-12-05 16:50 UTC, David Shea	no flags	Details
Hacky patch fixing issues (4.36 KB, patch) 2014-12-17 23:19 UTC, Jason Tibbitts	no flags	Details | Diff

Description Terje Røsten 2014-11-23 16:24:45 UTC

Description of problem:

Using vnc options to anaconda:

vnc vncconnect=example.com

On the client example.com (running fc20), 'vncviewer --listen' is running.
The host/VM doing the install connects and a window pops up on
example.com, but the window contains only a blank grey background and nothing
else.

The logs on the install VM don't show anything unusual, as far as I can tell.

Version-Release number of selected component (if applicable):

FC 21 TC3 install.

How reproducible:

Steps to Reproduce:

Create VM in virt-manager, choose network install and include vnc options in
Kernel Options box. Run 'vncviewer --listen' on client machine, start install
and wait for window to pop up.


Note: I don't need the VNC options to work for VM installs, but they are important when moving to real hardware later in the testing and deployment phase of FC21.

Comment 1 David Shea 2014-11-24 16:40:18 UTC

Please attach the log files from /tmp to this bug

Comment 2 Terje Røsten 2014-11-24 18:26:53 UTC

I think I found the issue, when removing the network line from kickstart file:


 network --bootproto=static --device=eth0 --gateway=$gw
 --ip=$ip --nameserver=$dns --netmask=$mask
 --hostname=$name --onboot=on

things start to work.

What's special about my setup is that the VM being installed first gets a random free IP from the DHCP server, and the kickstart then sets the correct, different IP info via the network directive above.

If the initial IP is identical to the IP given by the kickstart network line (or the line is removed), things work.

I would guess that changing the IP during the install confuses the VNC setup, or that the VNC connection is made from the old IP while the rest of the traffic comes from the new IP, creating trouble.
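
To illustrate why a mid-install address change could plausibly break an existing VNC session: an established TCP connection keeps the local address it had when it was opened. The following is a minimal sketch using only loopback sockets; the function name connection_endpoints is hypothetical and nothing here is taken from anaconda or the VNC server.

```python
import socket


def connection_endpoints():
    """Open a loopback TCP connection and report the addresses each side sees.

    Returns (client_local, client_remote, peer_seen_by_server).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        client.connect(listener.getsockname())
        server_side, peer = listener.accept()
        try:
            # The local address is chosen at connect() time and is part of
            # the connection's identity for as long as the session lives.
            return client.getsockname(), client.getpeername(), peer
        finally:
            server_side.close()
    finally:
        client.close()
        listener.close()
```

If the source address of such a connection is later removed from the interface, the peer still addresses its replies to the old IP, which would be consistent with a viewer window that opens but never receives screen updates.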


BTW: this special setup has worked for the past 10-12 releases at least.

Comment 3 David Shea 2014-11-24 20:59:36 UTC

Still need logs, though. The network should not be changing after the vnc connection has been established. The network lines from kickstart are parsed within the initramfs, before the stage2 is loaded.

Comment 4 Terje Røsten 2014-12-01 18:35:08 UTC

Did some more debugging: when I remove the network line from the kickstart file, things work.

Note: IP info in kickstart file is different than given by DHCP (this is key to reproduce).

Logs from failed test (dhcp ip address and ks address differs):

 http://web.phys.ntnu.no/~terjeros/ks-network.tar.gz

Logs from passed test (dhcp ip only, no ip info in kickstart file):

 http://web.phys.ntnu.no/~terjeros/non-ks-network.tar.gz


One more indication that the IP change is confusing VNC or something:

on the listening VNC client host, the VNC pop-up window has its title from the first
IP (coming from DHCP), not the hostname belonging to the IP given in the kickstart file.

Comment 5 Terje Røsten 2014-12-01 18:44:19 UTC

Something strange happened: the VNC connection timed out, then I tried to reconnect to the server:

 vncviewer <host-installing>:5901

and then it worked. Attached is vncserver.log from this session.

Comment 6 Terje Røsten 2014-12-01 18:44:52 UTC

Created attachment 963394 [details]
vncserver.log

Comment 7 Terje Røsten 2014-12-03 18:46:07 UTC

Still same issue with RC4.

Comment 8 David Shea 2014-12-05 16:47:56 UTC

Created attachment 965185 [details]
anaconda.log

Comment 9 David Shea 2014-12-05 16:48:19 UTC

Created attachment 965186 [details]
ifcfg.log

Comment 10 David Shea 2014-12-05 16:48:49 UTC

Created attachment 965187 [details]
packaging.log

Comment 11 David Shea 2014-12-05 16:49:19 UTC

Created attachment 965188 [details]
program.log

Comment 12 David Shea 2014-12-05 16:49:49 UTC

Created attachment 965189 [details]
storage.log

Comment 13 David Shea 2014-12-05 16:50:11 UTC

Created attachment 965190 [details]
syslog

Comment 14 Jason Tibbitts 2014-12-16 01:01:43 UTC

I am having the exact same problem in the same basic situation, though I am installing on real hardware instead of a VM.

Anaconda opens the vnc window, then changes the IP address and the vnc session hangs.  In F20 it would leave both IP addresses active on the interface so things still worked, even though the VNC session came up with the old IP address.  Maybe I'll poke through the code to see if I can change the ordering of changing the IP and starting VNC.

Comment 15 Jason Tibbitts 2014-12-16 01:28:56 UTC

So, in the main F20 anaconda executable, in __main__, setupDisplay is called to open the VNC server on line 1295.  networkInitialize isn't called until line 1352.

I haven't poked around that much, but it sort of looks like you could bump the necessary imports and call to networkInitialize, along with the thread setup to wait for anaconda to finish, up above the VNC call.  I could just add some strategic sleeps, too.  After all, setupDisplay always takes about 60 seconds for me for whatever reason.

The big problem?  I've no idea how to rebuild the installer images.  I know there's a trick to make an update.img containing only the updated files, but I can't remember it and I don't know if that applies to the main anaconda executable.

Comment 16 Jason Tibbitts 2014-12-16 01:30:03 UTC

Of course that's the F21 anaconda executable, sorry.

Comment 17 David Shea 2014-12-16 15:31:11 UTC

(In reply to Jason Tibbitts from comment #15)
> So, in the main F20 anaconda executable, in __main__, setupDisplay is called
> to open the VNC server on line 1295.  networkInitialize isn't called until
> line 1352.
> 
> I haven't poked around that much, but it sort of looks like you could bump
> the necessary imports and call to networkInitialize, along with the thread
> setup to wait for anaconda to finish, up above the VNC call.  I could just
> add some strategic sleeps, too.  After all, setupDisplay always takes about
> 60 seconds for me for whatever reason.

That seems reasonable. One concern I have at first glance is that network can take a real long time to set up sometimes, and, except in the case of VNC, we can go ahead and get the anaconda interface running so the user can pick a language and probe storage and everything while network is happening in the background. Maybe a better solution would be to start the thread that waits for NetworkManager before we call setupDisplay and then join that thread before we start VNC.
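
The ordering suggested above can be sketched with the standard threading module. This is only an illustration under stated assumptions: start_installer, wait_for_network and setup_display are hypothetical stand-ins, not the functions in the anaconda script, and the sleep merely simulates NetworkManager taking a while.

```python
import threading
import time


def start_installer(use_vnc, network_delay=0.05):
    """Return the order of startup events for a VNC or local-display install."""
    events = []

    def wait_for_network():
        # Stand-in for the thread that waits for NetworkManager to finish.
        time.sleep(network_delay)
        events.append("network ready")

    def setup_display(network_thread):
        if use_vnc:
            # Only the VNC path blocks until the final address is configured.
            network_thread.join()
            events.append("vnc started")
        else:
            # A local display does not depend on the network, so do not wait.
            events.append("local display started")

    network_thread = threading.Thread(target=wait_for_network)
    network_thread.start()  # started before the display is set up
    setup_display(network_thread)
    network_thread.join()  # later consumers may join the same thread again
    return events
```

With use_vnc=True the VNC server is only started after the network wait has finished, while the non-VNC path keeps the current behavior of bringing up the interface without waiting.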

And the other concern is that the anaconda script is kind of a minefield, as you've probably noticed.


> The big problem?  I've no idea how to rebuild the installer images.  I know
> there's a trick to make an update.img containing only the updated files, but
> I can't remember it and I don't know if that applies to the main anaconda
> executable.

There's a script in the git source, scripts/makeupdates, that will do that for you. Just run something like:

./scripts/makeupdates -t <tag>

where the tag is whatever you branched from (master, f21-branch, a specific version, whatever), and that'll create a file updates.img which you can then pass to the anaconda installer by specifying inst.updates=<location> on the boot command line. The easiest way is usually to put it on a web server and pass an http:// URL to anaconda.
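
If no web server is handy, one possible way to publish the image is Python's built-in http.server module. This is a rough sketch, not part of anaconda or makeupdates: the helper names updates_boot_arg and make_server, the port 8000 and the address 192.0.2.10 are all assumptions chosen for illustration.

```python
import functools
import http.server


def updates_boot_arg(host, port=8000, image_name="updates.img"):
    """Build the inst.updates= boot option for an image served over HTTP."""
    return f"inst.updates=http://{host}:{port}/{image_name}"


def make_server(directory, port=8000):
    """Create (but do not start) an HTTP server that serves `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("", port), handler)


# Example usage, run from the directory that contains updates.img:
#   server = make_server(".")
#   print(updates_boot_arg("192.0.2.10"))  # add this to the boot command line
#   server.serve_forever()                 # stop with Ctrl-C
```

The machine being installed has to be able to reach the serving host on the chosen port, so any firewall on that host needs to allow it.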

Also, if you have code, feel free to attach it here or send it to the anaconda-patches mailing list.

Comment 18 Jason Tibbitts 2014-12-16 21:25:53 UTC

You're right, joining the thread if VNC startup is needed would be a great idea, except that VNC startup is buried down in the display setup routine and I'm not familiar with the internals yet to know how best to get the thread details down in there.  And, yeah, the main anaconda script is kind of melting my brain at the moment.

I did dig up my old notes and found makeupdates; I had to generate an updates image in the F18 timeframe to alter the included yum.conf (to enable the groups-as-objects code for the initial installation).  I found that it does work well on the main anaconda script, to the point that I can now get plenty of crashes from the code that I introduced.

I haven't produced anything remotely useful at this point, though if you have hints I will be glad to try things out.

Comment 19 Jason Tibbitts 2014-12-16 21:30:16 UTC

Oh, and my workaround to actually get working installs at this point is simply to remove '--activate' from my network line.  I would really, really like the machine to take its assigned IP as early as possible so that the VNC window tells me the real name of the host I'm installing (and so I can ssh in if I need to) but I can certainly live with it this way as I have to deploy 140 machines by next Tuesday or I don't get to leave early for the holidays.

Comment 20 David Shea 2014-12-16 21:41:01 UTC

(In reply to Jason Tibbitts from comment #18)
> You're right, joining the thread if VNC startup is needed would be a great
> idea, except that VNC startup is buried down in the display setup routine
> and I'm not familiar with the internals yet to know how best to get the
> thread details down in there.

All of anaconda's threads go through the threadMgr object and use string constants for names. So something like:

from pyanaconda.threads import threadMgr
from pyanaconda.constants import THREAD_WAIT_FOR_CONNECTING_NM
threadMgr.wait(THREAD_WAIT_FOR_CONNECTING_NM)

will join the thread. And since I went ahead and looked up how pthread-y python's join is, I also bring good news:

   "A thread can be join()ed many times."

so a new threadMgr.wait will not interfere with the existing one in pyanaconda/packaging.
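
The repeated-join behavior is easy to check with the plain threading module. The sketch below does not use threadMgr; join_twice is a hypothetical helper that only demonstrates the underlying Python guarantee.

```python
import threading


def join_twice():
    """Join the same thread two times and report what happened."""
    done = []
    worker = threading.Thread(target=lambda: done.append("finished"))
    worker.start()
    worker.join()  # first waiter: blocks until the worker exits
    worker.join()  # second join on a finished thread returns immediately
    return done, worker.is_alive()
```

Both joins return without raising, and the worker's side effect happens exactly once.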

Comment 21 Jason Tibbitts 2014-12-17 00:43:06 UTC

Update: I finally got a chance to get back to this.  I just naively moved the block of text containing the imports, networkInitialize call and the thread stuff up above setupDisplay and it works almost great.  The VNC window opens immediately and has the proper kickstart-set hostname for the machine.  The connection comes from the kickstart-set IP address.

Unfortunately it looks like the repository setup can't resolve the hostname where my repos are.  I have to go into the installation source dialog and click done.  Then everything works.  I need to do more experimentation, and try playing with moving the thread join around (though I think that's only going to make the hostname resolution problem worse).  Is there something else on which I could wait?

Comment 22 Jason Tibbitts 2014-12-17 23:18:42 UTC

Attached is a patch (against 21.48.22) which works around all the problems I have run into so far.  I won't pretend that this is remotely acceptable for inclusion, but I'm willing to work on it.  If anyone needs an updates.img, let me know.

Comment 23 Jason Tibbitts 2014-12-17 23:19:13 UTC

Created attachment 970336 [details]
Hacky patch fixing issues

Comment 24 David Shea 2014-12-19 16:48:11 UTC

I wouldn't move the storage thread or rescue mode parts. Starting rescue mode before the display would probably cause problems.

I'm not really sure what's going on with your payload problems, but could you give https://dshea.fedorapeople.org/1167103.img a try as an updates image? All it does is patch the anaconda script to do networkInitialize and start the wait for NM thread before doing setupDisplay.

Comment 25 David Shea 2014-12-19 17:02:55 UTC

https://lists.fedorahosted.org/pipermail/anaconda-patches/2014-December/015213.html is the patch I tested with, and https://dshea.fedorapeople.org/1167103.ks as the kickstart, and that seemed to work fine. Could you test with the updates image, and if you still have repo setup troubles, post the logs from /tmp?

Comment 26 David Shea 2014-12-19 19:24:32 UTC

(In reply to Jason Tibbitts from comment #21)

> Unfortunately it looks like the repository setup can't resolve the hostname
> where my repos are.  

This might be the same thing as bug 1124590.

Comment 27 Jason Tibbitts 2014-12-19 19:36:21 UTC

I added a progressively increasing sleep when the repo hostname can't be resolved; that fixes the hostname resolution problem for me.  It's in the patch, but I'm pretty sure it's not something you could use.  The real issue is that NetworkManager doesn't immediately update resolv.conf.  I've talked to the NM devs about it and they say that they want to fix that, but it isn't in yet.  Maybe F22.  It will write out resolv.conf if you change any property of the interface directly using ip addr or the like; you can add a dummy address and then immediately remove it, but that's even more hacky.
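
For readers who have not opened the attachment, a progressively increasing sleep of this kind might look roughly like the sketch below. It is not the code from the patch: resolve_with_backoff, the doubling delay, the attempt count and the injectable resolver and sleep parameters are all assumptions made for illustration.

```python
import socket
import time


def resolve_with_backoff(hostname, attempts=5, first_delay=1.0,
                         resolver=socket.gethostbyname, sleep=time.sleep):
    """Resolve `hostname`, sleeping longer after each failed attempt.

    The delay doubles after every failure (an assumed policy); the last
    failure is re-raised so the caller can report it.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return resolver(hostname)
        except socket.gaierror:
            if attempt == attempts:
                raise  # give up after the final attempt
            sleep(delay)  # give resolv.conf time to be written
            delay *= 2
```

A retry loop like this only papers over the delayed resolv.conf update described above; it does not remove the underlying race.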

I will try your image in a bit.  I'm deep into my deployment now and since what I have works perfectly for me I don't want to go messing it up.  Guess I should join anaconda-patches instead of anaconda-devel because the latter seems to have very little activity.
