Bug 1137023

Summary: VNC does not paint the entire desktop during installation
Product: [Fedora] Fedora Reporter: Mark Hamzy <hamzy>
Component: tigervnc Assignee: Tim Waugh <twaugh>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 21 CC: awilliam, bphinz, gustavold, hamzy, sgallagh, twaugh, xgl-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: AcceptedFreezeException
Fixed In Version: tigervnc-1.3.1-11.fc21 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1140587 Environment:
Last Closed: 2014-09-23 02:41:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1043122, 1140359    
Attachments:
Description Flags
tcpdump capture of vnc traffic none

Description Mark Hamzy 2014-09-03 21:13:46 UTC
Created attachment 934220
tcpdump capture of vnc traffic

tigervnc-1.3.1-10.fc21
http://ppc.koji.fedoraproject.org/kojifiles/mash/branched-20140903/21-ppc/

I am trying to investigate a vnc problem during installation with Fedora 21 and ppc64le.  When I first connect, it does not paint the entire desktop.  And it seems to hang on me.  I cannot click any of the UI or get the installer screen to repaint.

I notice the following in the debug log when running vinagre:

...
gtk-vnc: Expose 0x0 @ 1049,787
gtk-vnc: Running main loop
gtk-vnc: FramebufferUpdate(-239, 1, 2, 12, 20)
gtk-vnc: FramebufferUpdate(7, 0, 0, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 64, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 128, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 192, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 256, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 320, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 384, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 448, 1024, 64)
gtk-vnc: Expose 12x9 @ 1025,449
...

Comment 1 Mark Hamzy 2014-09-03 21:15:08 UTC
And when I try and run tshark to print out the network sniff, I see the following:

http://paste.fedoraproject.org/130814/14097779/raw/

Comment 2 Tim Waugh 2014-09-04 14:17:56 UTC
I don't see anything in particular wrong.

Some of the more recent extensions to RFB are not understood by all network decoders, and wireshark gets confused by some of the finer points, but the basic structure of the conversation is shown by what gtk-vnc says about it. The server sends rectangles of 1024x64 striping across the display, but only gets to 448+64 (of 704+64).
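
The arithmetic behind that observation can be checked against the gtk-vnc debug output quoted in the description. This is only a minimal Python sketch: it assumes the five values in each FramebufferUpdate(...) line are encoding, x, y, width and height (in RFB, 7 is the Tight encoding and -239 is the Cursor pseudo-encoding), and it takes the screen height as 768, i.e. 704+64. The helper name painted_rows is made up for illustration.

```python
import re

# Debug lines copied from the vinagre/gtk-vnc log in the description.
LOG = """\
gtk-vnc: FramebufferUpdate(-239, 1, 2, 12, 20)
gtk-vnc: FramebufferUpdate(7, 0, 0, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 64, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 128, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 192, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 256, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 320, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 384, 1024, 64)
gtk-vnc: FramebufferUpdate(7, 0, 448, 1024, 64)
"""

CURSOR_PSEUDO_ENCODING = -239  # cursor shape update, not a screen rectangle
SCREEN_HEIGHT = 768            # assumed: 704 + 64, the last expected stripe

# Assumed field order: encoding, x, y, width, height.
UPDATE_RE = re.compile(
    r"FramebufferUpdate\((-?\d+), (\d+), (\d+), (\d+), (\d+)\)"
)


def painted_rows(log):
    """Return the set of screen rows covered by non-cursor rectangles."""
    rows = set()
    for match in UPDATE_RE.finditer(log):
        encoding, _x, y, _width, height = map(int, match.groups())
        if encoding == CURSOR_PSEUDO_ENCODING:
            continue
        rows.update(range(y, y + height))
    return rows


rows = painted_rows(LOG)
# prints: painted 512 of 768 rows; last row 511
print(f"painted {len(rows)} of {SCREEN_HEIGHT} rows; last row {max(rows)}")
```

So the client received eight full-width stripes and then nothing for the bottom 256 rows, which matches the partially painted desktop.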

I'm not sure why it gets stuck. The client hasn't said anything at that point, just ACKd the data. Maybe it needs gdb-server (on the ppc64le machine) to help us figure out what Xvnc is thinking at that point?

Comment 3 Tim Waugh 2014-09-04 14:18:33 UTC
Also, does this only happen for installation, or does it also fail when running a VNC server on an installed system?

Comment 4 Tim Waugh 2014-09-04 16:11:26 UTC
Playing around with a test system I see this sort of thing from the server:

Thu Sep  4 12:08:20 2014
 VNCSConnST:  Client pixel format depth 24 (32bpp) little-endian rgb888
 Connections: closed: 127.0.0.1::39981 (Internal error: inPF is not native
              endian)
 SMsgWriter:  framebuffer updates 0
 SMsgWriter:    raw bytes equivalent 0, compression ratio nan

Comment 5 Tim Waugh 2014-09-04 17:05:03 UTC
What's happening is that the pixel format of the framebuffer VNC uses internally is seen as not being in the native order, which doesn't make any sense.

The reason this happens is that the X server tells VNC it is so:

Breakpoint 3, vncGetPixelFormat (pScreen=0x10302740) at vncExtInit.cc:155
155	  bigEndian = (screenInfo.imageByteOrder == MSBFirst);
[...]
157	  for (i = 0; i < pScreen->numVisuals; i++) {
(gdb) p bigEndian 
$8 = 1
(gdb) p screenInfo
$9 = {imageByteOrder = 1, bitmapScanlineUnit = 32, bitmapScanlinePad = 32, 
  bitmapBitOrder = 1, numPixmapFormats = 6, formats = {{depth = 1 '\001', 
      bitsPerPixel = 1 '\001', scanlinePad = 32 ' '}, {depth = 4 '\004', 
      bitsPerPixel = 8 '\b', scanlinePad = 32 ' '}, {depth = 8 '\b', 
      bitsPerPixel = 8 '\b', scanlinePad = 32 ' '}, {depth = 16 '\020', 
      bitsPerPixel = 16 '\020', scanlinePad = 32 ' '}, {depth = 24 '\030', 
      bitsPerPixel = 32 ' ', scanlinePad = 32 ' '}, {depth = 32 ' ', 
      bitsPerPixel = 32 ' ', scanlinePad = 32 ' '}, {depth = 0 '\000', 
      bitsPerPixel = 0 '\000', scanlinePad = 0 '\000'}, {depth = 0 '\000', 
      bitsPerPixel = 0 '\000', scanlinePad = 0 '\000'}}, numScreens = 1, 
  screens = {0x10302740, 0x0 <repeats 15 times>}, numGPUScreens = 0, 
  gpuscreens = {0x0 <repeats 16 times>}, x = 0, y = 0, width = 1024, 
  height = 768}

Changing component to xorg-x11-server.
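
To illustrate the byte-order mismatch described in this comment, here is a minimal Python sketch. It assumes the X11 constants LSBFirst = 0 and MSBFirst = 1 from X.h and a little-endian host such as ppc64le. The first helper mirrors the comparison at vncExtInit.cc:155; the helper names are hypothetical and this is not the TigerVNC code itself.

```python
import struct
import sys

# Values of the X11 byte-order constants as defined in X.h.
LSB_FIRST = 0
MSB_FIRST = 1


def reported_big_endian(image_byte_order):
    """Mirror of `bigEndian = (screenInfo.imageByteOrder == MSBFirst)`."""
    return image_byte_order == MSB_FIRST


def is_native(image_byte_order, host_byteorder=sys.byteorder):
    """True when the reported byte order matches the host's byte order."""
    return reported_big_endian(image_byte_order) == (host_byteorder == "big")


# The same rgb888 pixel (R=0x11, G=0x22, B=0x33) stored in a 32-bit word.
pixel = (0x11 << 16) | (0x22 << 8) | 0x33
little = struct.pack("<I", pixel)  # least significant byte first
big = struct.pack(">I", pixel)     # most significant byte first

# prints: 33221100 00112233
print(little.hex(), big.hex())

# imageByteOrder = 1 (from the gdb dump) checked against a little-endian
# host (an assumption about the test system).
# prints: False
print(is_native(1, host_byteorder="little"))
```

The two byte strings show why a framebuffer labelled with the wrong order cannot simply be passed through: the colour channels end up at different offsets.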

Comment 6 Mark Hamzy 2014-09-04 17:44:38 UTC
Maybe we are seeing two different issues?

I am seeing no color corruption, just partial painting of the desktop.
VNC after the install worked. Only VNC during installation is impossible.

[anaconda root@ppc64lehamzytest2 ~]#  wget http://ppc.koji.fedoraproject.org/mash/branched-20140904/21-ppc/ppc64le/os/Packages/g/gdb-7.8-20.fc21.ppc64le.rpm; wget http://ppc.koji.fedoraproject.org/mash/branched-20140904/21-ppc/ppc64le/os/Packages/l/libbabeltrace-1.2.1-3.fc21.ppc64le.rpm; rpm -i --nodeps gdb-7.8-20.fc21.ppc64le.rpm libbabeltrace-1.2.1-3.fc21.ppc64le.rpm; /bin/rm *.rpm
[anaconda root@ppc64lehamzytest2 ~]# wget http://ppc.koji.fedoraproject.org/kojifiles/packages/tigervnc/1.3.1/10.fc21/ppc64le/tigervnc-debuginfo-1.3.1-10.fc21.ppc64le.rpm; rpm -i tigervnc-debuginfo-1.3.1-10.fc21.ppc64le.rpm; /bin/rm *.rpm
[anaconda root@ppc64lehamzytest2 ~]# wget http://ppc.koji.fedoraproject.org/kojifiles/packages/glibc/2.19.90/35.fc21/ppc64le/glibc-debuginfo-2.19.90-35.fc21.ppc64le.rpm; wget http://ppc.koji.fedoraproject.org/kojifiles/packages/glibc/2.19.90/35.fc21/ppc64le/glibc-debuginfo-common-2.19.90-35.fc21.ppc64le.rpm; rpm -i glibc-debuginfo-2.19.90-35.fc21.ppc64le.rpm glibc-debuginfo-common-2.19.90-35.fc21.ppc64le.rpm; /bin/rm *.rpm
[anaconda root@ppc64lehamzytest2 ~]# ps -efl | grep Xvnc
4 S root      2487  2387  0  80   0 -   764 poll_s Sep03 pts/0    00:00:00 Xvnc :1 -depth 16 -br IdleTimeout=0 -auth /dev/null -once DisconnectClients=false desktop=Fedora 21 installation on host ppc64lehamzytest2.rch.stglabs.ibm.com SecurityTypes=None rfbauth=0
0 S root     31762 31677  0  80   0 -    64 pipe_w 15:24 pts/6    00:00:00 grep Xvnc
[anaconda root@ppc64lehamzytest2 ~]# gdb --pid=2487
...
(gdb) bt
#0  0x00003fff85508a48 in ___newselect_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000039f7d540 in WaitForSomething (pClientsReady=0x10025d14f90) at WaitFor.c:233
#2  0x0000000039f19630 in Dispatch () at dispatch.c:361
#3  0x0000000039f1f1e0 in dix_main (argc=<optimized out>, argv=0x3fffd56fb6d8, envp=<optimized out>) at main.c:296
#4  0x0000000039d924f8 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at stubmain.c:34

Comment 7 Tim Waugh 2014-09-05 11:06:58 UTC
Can you try connecting to it using different encodings? e.g.:

vncviewer -PreferredEncoding=raw ...

Also try ZRLE, Hextile, Tight.

Let's try to work out if it seems to be encoding-specific.
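
A small Python sketch for generating those test invocations, in case it helps to run them one after another. It only builds the command lines using the -PreferredEncoding option shown above; the target "installer.example.com:1" is a placeholder and viewer_commands is a made-up helper name.

```python
import shlex

# Encodings named in this comment.
ENCODINGS = ["raw", "ZRLE", "Hextile", "Tight"]


def viewer_commands(target):
    """Build one vncviewer argument list per preferred encoding."""
    return [
        ["vncviewer", f"-PreferredEncoding={encoding}", target]
        for encoding in ENCODINGS
    ]


# "installer.example.com:1" is a placeholder for the real host and display.
for argv in viewer_commands("installer.example.com:1"):
    print(shlex.join(argv))
```

Each argument list could be passed to subprocess.run() to launch the viewer, noting by hand whether the desktop paints completely with that encoding.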

Comment 8 Mark Hamzy 2014-09-05 13:10:18 UTC
[hamzy@hamzy-tp-w510 ~]$ vncviewer -PreferredEncoding=raw ppc64lehamzytest2.rch.stglabs.ibm.com:1 &
WORKS
[hamzy@hamzy-tp-w510 ~]$ vncviewer -PreferredEncoding=ZRLE ppc64lehamzytest2.rch.stglabs.ibm.com:1 &
FAILS
[hamzy@hamzy-tp-w510 ~]$ vncviewer -PreferredEncoding=Hextile ppc64lehamzytest2.rch.stglabs.ibm.com:1 &
FAILS
[hamzy@hamzy-tp-w510 ~]$ vncviewer -PreferredEncoding=Tight ppc64lehamzytest2.rch.stglabs.ibm.com:1 &
FAILS

And by works, I mean that I could at least select "English (United Kingdom)." But it is so slow as to be unusable to continue further.

Comment 9 Tim Waugh 2014-09-05 16:29:59 UTC
There seems to be quite a lot of packet loss on this link judging by the retransmissions. Could it simply be that? Can you check what 'netstat -nto' says on the system running Xvnc at the point the desktop painting stops?

Was vnc-traffic.pcap captured on the server?

Comment 10 Mark Hamzy 2014-09-05 17:44:15 UTC
So packet loss means non-working VNC?  I am able to ssh into the system with no problems so TCP communication is not a problem. Does VNC use UDP?

[anaconda root@ppc64lehamzytest2 ~]# netstat -nto
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       Timer
tcp        0      0 9.5.178.25:5901         9.50.21.177:48971       ESTABLISHED off (0.00/0/0)
tcp        0    224 9.5.178.25:22           9.50.21.177:38160       ESTABLISHED on (0.34/0/0)

Yes. vnc-traffic.pcap captured on the server.

Comment 11 Tim Waugh 2014-09-08 10:08:43 UTC
No, it doesn't use UDP; I'm just trying to rule things out. OK, so it's not that the server is trying to send data and failing, it's just not sending it.
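
That reading of the netstat output can be sketched in Python as follows. The two connection lines are copied from comment 10; send_queues is a hypothetical helper and the column positions are assumed from the header line shown there. Port 5901 corresponds to VNC display :1.

```python
# Connection lines copied from the `netstat -nto` output in comment 10.
NETSTAT = """\
tcp        0      0 9.5.178.25:5901         9.50.21.177:48971       ESTABLISHED off (0.00/0/0)
tcp        0    224 9.5.178.25:22           9.50.21.177:38160       ESTABLISHED on (0.34/0/0)
"""


def send_queues(output):
    """Map local port -> (Send-Q bytes, timer state) for each tcp line."""
    result = {}
    for line in output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, timer, ...
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        send_q = int(fields[2])
        local_port = int(fields[3].rsplit(":", 1)[1])
        result[local_port] = (send_q, fields[6])
    return result


queues = send_queues(NETSTAT)
# prints: (0, 'off')
print(queues[5901])
```

An empty Send-Q with no timer running on the VNC connection means nothing is queued for retransmission, so packet loss on the link does not explain the stalled repaint.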

This thread on the Fedora test mailing list looks like it's describing the same problem:
https://lists.fedoraproject.org/pipermail/test/2014-September/122655.html

I haven't been able to reproduce it here. :-(

Comment 12 Mark Hamzy 2014-09-08 14:53:58 UTC
Can you build a scratch build on koji of tigervnc-1.3.0-7 (which was the version on Fedora 20 DVD) on f21? I've tried a couple of different versions around that release and they all fail to apply patches...

Comment 13 Tim Waugh 2014-09-09 12:41:01 UTC
I think I've spotted the problem. When I re-based xserver114.patch for a second time (16645cca41087a3e8df7c6c9c4151944fd8524b2), one hunk was missed.

Comment 14 Tim Waugh 2014-09-09 13:42:32 UTC
https://github.com/TigerVNC/tigervnc/pull/29

Comment 15 Fedora Update System 2014-09-09 13:44:33 UTC
tigervnc-1.3.1-11.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/FEDORA-2014-10143/tigervnc-1.3.1-11.fc21

Comment 16 Fedora Update System 2014-09-10 02:13:20 UTC
Package tigervnc-1.3.1-11.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing tigervnc-1.3.1-11.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-10143/tigervnc-1.3.1-11.fc21
then log in and leave karma (feedback).

Comment 17 Mark Hamzy 2014-09-10 15:20:23 UTC
(In reply to Tim Waugh from comment #13)
> I think I've spotted the problem. When I re-based xserver114.patch for a
> second time (16645cca41087a3e8df7c6c9c4151944fd8524b2), one hunk was missed.

That was indeed the problem. Good catch!

Comment 18 Tim Waugh 2014-09-10 16:05:29 UTC
Great, and thanks for reporting this early. Made it a lot easier to catch.

Comment 19 Tim Waugh 2014-09-11 09:43:40 UTC
Setting Hardware to All as this is not ppc64le-specific.

Comment 20 Stephen Gallagher 2014-09-11 19:26:30 UTC
Voting -1 Freeze Exception because while this is certainly annoying to users, it's not critical to the overall desktop experience. Furthermore, we've slipped Alpha so far that I'm hesitant to recommend any changes besides blockers to go in until we cut a release.

Comment 21 Adam Williamson 2014-09-11 22:27:08 UTC
This was discussed at today's go/no-go meeting and there was a strong +1 FE consensus. It may in fact be a blocker issue, but we decided to just go for FE for now, as the fix is available anyway.

If VNC is completely or substantially broken it'd be a blocker per Alpha criterion "When using a dedicated installer image, the installer must be able to complete an installation using the text, graphical and VNC installation interfaces." - https://fedoraproject.org/wiki/Fedora_21_Alpha_Release_Criteria#Installation_interfaces

Comment 22 Fedora Update System 2014-09-23 02:41:36 UTC
tigervnc-1.3.1-11.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.