Bug 490515

Summary: VNC display on ppc (power5) shows only black screen
Product: [Fedora] Fedora Reporter: James Laska <jlaska>
Component: iscsi-initiator-utils Assignee: Mike Christie <mchristi>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: low    
Version: rawhide CC: agrover, ak, atkac, dcantrell, hdegoede, jturner, kmcmartin, mchristi, pjones, rmaximo, vanmeeuwen+fedora
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 489698 Environment:
Last Closed: 2009-03-23 14:36:54 UTC Type: ---
Bug Depends On: 489698    
Bug Blocks: 476774    
Attachments:
Description Flags
anaconda-logs.tgz (anaconda.log, syslog, vncserver.log, program.log etc...) none
vncserver.log (F-11-Alpha-ppc) none

Description James Laska 2009-03-16 19:17:18 UTC
Created attachment 335401 [details]
anaconda-logs.tgz (anaconda.log, syslog, vncserver.log, program.log etc...)

+++ This bug was initially created as a clone of Bug #489698 +++

Description of problem:

Start a VNC install of rawhide on an IBM Power5 ppc system.  Once you connect to the VNC session (from another rawhide system), the display is all black.

Version-Release number of selected component (if applicable):

 * anaconda version 11.5.0.29 on ppc
 * Xvnc TigerVNC 0.0.90 - built Mar  3 2009 16:14:12

How reproducible:

 * Every time

Steps to Reproduce:
 1. Boot rawhide vmlinuz+ramdisk.image.gz on an IBM Power5 system (ibm-505-lp1.test.redhat.com)
 2. Select VNC install (no passwd)
 3. Connect to the vnc session with vncviewer from rawhide or F10
  
Actual results:

 * Launching vncviewer from a rawhide client displays an empty black window

Expected results:

 * I expect to see the installer on the vncviewer window

Additional info:

 * using vncviewer from vnc-4.1.3-1.fc10 also has this problem, so I suspect it's not the client

 * On anaconda, /tmp/vncserver.log shows the following upon connection.

sh-4.0# tail -fn0 /tmp/vncserver.log 
Wed Mar 11 13:37:30 2009
 Connections: accepted: 10.11.228.34::48267
 SConnection: Client needs protocol version 3.8
 SConnection: Client requests security type None(1)
 JpegEncoder: no hardware JPEG compressor available
 VNCSConnST:  Server default pixel format depth 16 (16bpp) big-endian rgb565
 VNCSConnST:  Client pixel format depth 24 (32bpp) little-endian rgb888

 * See attached anaconda-logs.tgz for all files in /tmp

 * originally this was thought to be the zlib-corruption issue (https://www.redhat.com/archives/fedora-devel-list/2009-March/msg00746.html), but that is now fixed and this problem remains.
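
For reference, the two VNCSConnST lines in the log above mean the server framebuffer is 16bpp big-endian rgb565 while the client asked for 32bpp little-endian rgb888, so the server has to translate every pixel it sends. A minimal Python sketch of that translation for a single pixel follows. It is only an illustration of what the log lines describe, not TigerVNC's code; the function name is made up, and the layout of the 32-bit pixel (red in bits 16-23, green in 8-15, blue in 0-7) is an assumption, since the log does not show the channel shifts.

```python
import struct

def rgb565_be_to_rgb888_le(pixel_bytes):
    """Convert one big-endian rgb565 pixel (2 bytes) into a little-endian
    32bpp rgb888 pixel (4 bytes, top byte unused).

    ASSUMPTION: red occupies bits 16-23 of the 32-bit value, green 8-15,
    blue 0-7. The vncserver.log excerpt does not state the shifts.
    """
    (value,) = struct.unpack(">H", pixel_bytes)
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    # Widen each channel to 8 bits by replicating its high bits into the
    # low bits, so full intensity maps to 255 rather than 248 or 252.
    r8 = (r5 << 3) | (r5 >> 2)
    g8 = (g6 << 2) | (g6 >> 4)
    b8 = (b5 << 3) | (b5 >> 2)
    return struct.pack("<I", (r8 << 16) | (g8 << 8) | b8)
```

An all-zero input maps to an all-zero output, which is what a black window looks like on the wire; that alone does not say whether the framebuffer or the conversion is at fault.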

Comment 1 James Laska 2009-03-17 13:58:13 UTC
Created attachment 335526 [details]
vncserver.log (F-11-Alpha-ppc)

Works in F-11-Alpha-ppc (which uses tightVNC).  Could this be related to tigerVNC server running in anaconda install environment?

Comment 2 Adam Tkac 2009-03-17 18:00:36 UTC
After some testing, the problem does not appear to be in Xorg/Xvnc, so I think it must be somewhere in anaconda.

Comment 3 Jesse Keating 2009-03-17 22:42:10 UTC
I can reproduce with a ppc mini with today's rawhide.  However, it works on the PS3, which is ppc64.

Comment 4 James Laska 2009-03-18 12:16:12 UTC
Odd, the system I'm unable to get a display on is an IBM power5 (ppc64).  I'm not able to see miniwm over X, or VNC ... nothing obvious jumps out from the logs so I'm somewhat perplexed on how to proceed.

Comment 5 James Laska 2009-03-18 13:21:37 UTC
Tested VNC, X server and text-mode install ... all hang while transitioning to stage#2 anaconda.  I've pasted sysrq-t output at http://fpaste.org/paste/6404 which shows a *ton* of mv88e6131_switch_driver+0xfea entries ... not sure what that is.

Comment 6 James Laska 2009-03-19 14:39:39 UTC
Still present in anaconda version 11.5.0.33.

= The Summary =

Unable to proceed to stage#2 anaconda.  The stage#2 UI methods tested are X, text-mode, and VNC.  With VNC and X, the display stays on a black screen with a cursor.  In text-mode, nothing happens at all.

Comment 7 Kyle McMartin 2009-03-19 17:08:08 UTC
For completeness, the mv88e6131_switch_driver is just detritus in the calltrace, accidentally chosen because it's likely the nearest symbol. It should probably just be ignored in the traces (in fact, it's not a function at all, but a static struct :).

(The mv88e6131 driver is, sadly, a boolean config option, so it is built in everywhere. Whether that's sensible is another bug, but none of its codepaths should be involved in what's going on.)

cheers, Kyle

Comment 8 Jesse Keating 2009-03-19 17:25:25 UTC
On my ppc mini where I see this, one of the anaconda processes is taking up all the CPU.  Stracing it shows a never-ending stream of:

 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
 rt_sigaction(SIGSEGV, {0xff33e00, [], 0}, {0xff33e00, [], 0}, 8) = 0

Comment 9 James Laska 2009-03-19 17:32:55 UTC
I see the same segfaults while strace'ing anaconda.

# ps aux | grep anaconda 
root      2049 99.7  0.8  67756 31120 hvc0     R    14:22 185:44 /usr/bin/python /usr/bin/anaconda --stage2 http://gromit.redhat.com/pub/fedora/linux/development/ppc/os/images/install.img --serial -T --selinux --virtpconsole /dev/hvc0 --lang en_US.UTF-8 --repo http://gromit.redhat.com/pub/fedora/linux/development/ppc/os

# /tmp/strace -f -p  2049

--- SIGSEGV (Segmentation fault) @ 0 (0) ---
rt_sigaction(SIGSEGV, {0xff33e00, [], 0}, {0xff33e00, [], 0}, 8) = 0
sigreturn()                             = ? (mask now [ILL ABRT FPE SEGV ALRM STKFLT CHLD CONT STOP TSTP TTIN TTOU URG XCPU XFSZ VTALRM PROF IO PWR UNUSED RTMIN])
^C--- SIGSEGV (Segmentation fault) @ 0 (0) ---
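
The repeating pattern in this trace (a SIGSEGV is delivered, the handler is registered again via rt_sigaction, sigreturn runs, and the next SIGSEGV arrives) can be counted mechanically when triaging a saved strace log. The Python sketch below assumes the log uses the line formats shown above; the function name is hypothetical and the sigreturn mask in the sample is shortened.

```python
def summarize_signal_loop(strace_text):
    """Count the three kinds of lines that make up the SIGSEGV loop."""
    counts = {"sigsegv": 0, "handler_installs": 0, "sigreturns": 0}
    for line in strace_text.splitlines():
        # Substring checks tolerate prefixes such as "^C" or "[pid N]".
        if "--- SIGSEGV" in line:
            counts["sigsegv"] += 1
        elif "rt_sigaction(SIGSEGV" in line:
            counts["handler_installs"] += 1
        elif "sigreturn(" in line:
            counts["sigreturns"] += 1
    return counts


# Sample taken from the strace output above (mask list shortened).
sample = """\
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
rt_sigaction(SIGSEGV, {0xff33e00, [], 0}, {0xff33e00, [], 0}, 8) = 0
sigreturn()                             = ? (mask now [ILL ABRT FPE SEGV])
^C--- SIGSEGV (Segmentation fault) @ 0 (0) ---
"""

print(summarize_signal_loop(sample))
```

Roughly equal counts of all three line types, with no other system calls in between, suggest the process is re-faulting on the same instruction after each handler return, which would match the 99.7% CPU shown by ps.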

Comment 10 Kyle McMartin 2009-03-19 18:21:09 UTC
Oddly, turning on print-fatal-signals (/proc/sys/kernel/print-fatal-signals) doesn't seem to display anything, but clumens tried gdb which produced this tombstone:

gdb/2248: potentially unexpected fatal signal 11.

NIP: 000000000fab85f8 LR: 000000000fab7e28 CTR: 000000000fabb160
REGS: c0000000e2c13ea0 TRAP: 0300   Not tainted  (2.6.29-0.258.rc8.git2.fc11.ppc64)
MSR: 000000000000f032 <EE,PR,FP,ME,IR,DR>  CR: 44044844  XER: 00000000
DAR: 00000000cccccce0, DSISR: 0000000042000000
TASK = c0000000e288cce0[2248] 'gdb' THREAD: c0000000e2c10000 CPU: 2
GPR00: 00000000cccccccc 00000000ffab44a0 00000000f80004a0 000000000fbe2028 
GPR04: 0000000000000048 0000000000000000 fffffffffefefeff 000000000fbe2250 
GPR08: 0000000000001000 000000000fbe20b0 00000000107e8980 00000000cccccccc 
GPR12: 0000000024044842 00000000104050b8 0000000010400000 0000000000000000 
GPR16: 000000000fbe10a8 000000000fb9a458 000000000000000a 000000000000000a 
GPR20: 0000000000000001 000000000fbe2060 0000000000000003 000000000fbe2058 
GPR24: 0000000067000060 0000000067000010 00000000107e8980 0000000000000048 
GPR28: 000000000fbe201c 0000000000000050 000000000fbe0ff4 000000000fbe2028 
NIP [000000000fab85f8] 0xfab85f8
LR [000000000fab7e28] 0xfab7e28
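
The tombstone above can be read fairly mechanically. TRAP 0300 is the PowerPC data storage interrupt, and the DAR (faulting address) 0xcccccce0 sits 0x14 bytes past the 0xcccccccc value held in GPR00 and GPR11, which looks like an access through a pointer that was overwritten with a 0xcc fill pattern. The Python sketch below checks that arithmetic and decodes two DSISR bits. The bit masks are the ones believed to be used by the Linux powerpc headers (DSISR_NOHPTE, DSISR_ISSTORE) and should be double-checked against the kernel source; the function name is hypothetical.

```python
# ASSUMPTION: mask values as defined in the Linux powerpc headers.
DSISR_NOHPTE = 0x40000000   # no translation found for the address
DSISR_ISSTORE = 0x02000000  # the faulting access was a store


def decode_fault(dar, dsisr, suspect_pointer):
    """Summarize a data storage fault from the values in a register dump."""
    return {
        "is_store": bool(dsisr & DSISR_ISSTORE),
        "no_translation": bool(dsisr & DSISR_NOHPTE),
        "offset_from_pointer": dar - suspect_pointer,
    }


# Values copied from the register dump above.
info = decode_fault(dar=0xCCCCCCE0, dsisr=0x42000000,
                    suspect_pointer=0xCCCCCCCC)
print(info)
```

Under those assumptions the dump describes a store to an unmapped address at offset 0x14 from the suspect pointer, which would be consistent with writing a field of a structure reached through a corrupted pointer.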

Comment 11 Chris Lumens 2009-03-19 19:56:06 UTC
Running from a shell on the machine in question:

>>> libiscsi.get_firmware_initiator_name()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: Unknown error
>>> try: 
...     libiscsi.get_firmware_initiator_name()
... except:
...     print "whoops"
... 
Segmentation fault

Comment 12 Chris Lumens 2009-03-19 20:09:16 UTC
And yet if you catch just the IOError:

>>> try:
...     libiscsi.get_firmware_initiator_name()
... except IOError:
...     print "whoops"
... 

whoops


And apparently this is not reproducible in userland.
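
The narrower handler from this comment can be wrapped in a small helper so callers get None when no firmware initiator name is available. The Python sketch below uses a stand-in function in place of libiscsi.get_firmware_initiator_name(), since the real binding needs the affected hardware; both helper names are hypothetical. It only illustrates catching IOError specifically and does not address the segfault itself.

```python
def get_firmware_initiator_name_stub():
    """Hypothetical stand-in for libiscsi.get_firmware_initiator_name().

    It raises the same exception type seen in the traceback above.
    """
    raise IOError("Unknown error")


def firmware_initiator_name(getter=get_firmware_initiator_name_stub):
    """Return the firmware initiator name, or None if it cannot be read."""
    try:
        return getter()
    except IOError:
        # Catch only the documented failure; let anything else propagate.
        return None


print(firmware_initiator_name())
```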

Comment 13 James Laska 2009-03-20 13:59:48 UTC
Testing anaconda-11.5.0.34 with http://people.atrpms.net/~hdegoede/updates.img resolves the issue.

Comment 14 Chris Lumens 2009-03-20 14:01:28 UTC
*** Bug 491306 has been marked as a duplicate of this bug. ***

Comment 15 Hans de Goede 2009-03-20 16:12:39 UTC
This is worked around in iscsi-initiator-utils-6.2.0.870-7.fc11, which is being tagged into the beta. I'm opening a new bug to track the real issue, as this workaround is supposed to be a temporary solution.

Comment 16 Hans de Goede 2009-03-20 16:16:44 UTC
Note that the workaround is to replace fw_get_entry() calls in libiscsi with fwparam_ibft_sysfs() calls, so that fwparam_ppc() no longer gets called; fwparam_ppc() evidently causes memory corruption on ppc machines.

This means we lose the capability to automatically discover firmware-configured iscsi disks on ppc, but that is way better than not installing :)
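
The shape of the workaround can be modelled in a few lines of Python. This is only a model: the real code is C inside iscsi-initiator-utils, the functions below are stand-ins that record when they are called, and the assumption that fw_get_entry() tries fwparam_ppc() before the iBFT sysfs probe is inferred from the description above rather than taken from the source.

```python
calls = []  # records which probes ran, in order


def fwparam_ppc_model():
    """Stand-in for fwparam_ppc(), the probe blamed for the corruption."""
    calls.append("fwparam_ppc")
    return None


def fwparam_ibft_sysfs_model():
    """Stand-in for fwparam_ibft_sysfs(); finds nothing in this model."""
    calls.append("fwparam_ibft_sysfs")
    return None


def fw_get_entry_model():
    # ASSUMPTION: the generic entry point tries each probe in turn.
    for probe in (fwparam_ppc_model, fwparam_ibft_sysfs_model):
        entry = probe()
        if entry is not None:
            return entry
    return None


def firmware_entry_before_workaround():
    return fw_get_entry_model()


def firmware_entry_after_workaround():
    # Call the iBFT sysfs probe directly, so the ppc probe never runs.
    return fwparam_ibft_sysfs_model()
```

Calling firmware_entry_after_workaround() leaves only "fwparam_ibft_sysfs" in calls, which is the trade-off described above: the risky probe is skipped, and with it any firmware discovery that only that probe could perform.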

Comment 17 Hans de Goede 2009-03-20 16:17:35 UTC
The new bug for tracking the fwparam_ppc() issue is bug 491363 .

Comment 18 Alex Kanavin 2009-03-22 13:16:52 UTC
Latest rawhide boot.iso gets through to the graphical welcome screen, thanks :)

Comment 19 James Laska 2009-03-23 14:36:54 UTC
Confirmed fixed as well on IBM power5 ppc64 with 2009-03-23 rawhide.