Red Hat Bugzilla – Bug 749604
boot stops with gnome3 "Something has gone wrong" (PowerMac G5)
Last modified: 2013-02-13 21:10:16 EST
Description of problem:
I installed F16 PPC alpha, and after the post-reboot installation steps (e.g. configure time), I briefly saw a desktop image followed immediately by an "Oh no! Something has gone wrong" screen.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install F16 PPC alpha
2. Reboot and follow post-install configuration steps
Actual results:
Black screen that says "Oh no! Something has gone wrong."
Expected results:
A GNOME login screen or desktop.
Subsequent reboots all fail in the same way. I briefly see a submarine desktop image, but never see a login screen.
Have you tried waiting a bit and then hitting Alt-F4 to see if that makes the screen go away? I know that's not a solution, but it might be a usable workaround to allow testing to proceed.
Alt-F4 has no effect.
nVidia Corporation NV43 [GeForce 6600] (rev a2)
Created attachment 530527 [details]
X.org is using the nouveau driver, and I don't see any errors from it.
If you log into another terminal (ssh/switch VTs) and do "DISPLAY=:0 glxinfo | grep renderer", what happens?
(Without the grep) glxinfo prints one line saying something about using display :0, then stops. It doesn't exit, nor does it display more information. It can be control-C'ed.
I am able to reproduce this problem.
I booted into init3, created a .xinitrc with gnome-session in it and I can see a little more of the error messages. Unfortunately they don't appear to be logged, only console output, so I am unable to post them unless you want a picture of them.
I pulled the source for gjs-1.30.0-1.fc16.ppc64 and rebuilt it. The rebuild was successful. I then edited the spec and uncommented the make check so it would run the tests. You will need to add an undocumented dependency on 'cairo-gobject-devel' for the tests.
Upon rebuild with tests enabled, I get the following error which is sufficiently similar to what is in the console errors:
Running with dbus: gtester --verbose gjs-tests gjs-unit with stderr to test_user_data/logs/stderr.log
TEST: gjs-tests... (pid=16980)
TEST: gjs-unit... (pid=17009)
GTester: last random seed: R02S52742d2b6b493b7f3b210c98b5f4c36f
killing message bus 16971
killing message bus 16976
./test/run-with-dbus: script "gtester" failed
make: *** [test] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.XTIpuU (%check)
I'll post more of the test details shortly.
What I really need for this is a stack trace.
Alright, stack trace from power04:
[root@power04 gjs-1.30.0]# gdb /root/rpmbuild/BUILD/gjs-1.30.0/.libs/lt-gjs-unit
GNU gdb (GDB) Fedora (220.127.116.1110722-9.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64-redhat-linux-gnu".
For bug reporting instructions, please see:
Reading symbols from /root/rpmbuild/BUILD/gjs-1.30.0/.libs/lt-gjs-unit...done.
Starting program: /root/rpmbuild/BUILD/gjs-1.30.0/.libs/lt-gjs-unit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
/js/0010basic: [New Thread 0xfffb178f130 (LWP 8382)]
/js/0020importer: [Thread 0xfffb178f130 (LWP 8382) exited]
[New Thread 0xfffb178f130 (LWP 8383)]
JS ERROR: !!! Exception was: TypeError: framedata is undefined
JS ERROR: !!! lineNumber = '282'
JS ERROR: !!! fileName = '"/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js"'
JS ERROR: !!! stack = '"parseErrorStack([object Error])@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:282
JsUnitException(2,"Expected 2 (number) but was undefined (undefined)")@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:307
_assert(2,false,"Expected 2 (number) but was undefined (undefined)")@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:113
JS ERROR: !!! message = '"framedata is undefined"'
** ERROR **: TypeError: framedata is undefined
Program received signal SIGTRAP, Trace/breakpoint trap.
0x000000801e427618 in raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#0 0x000000801e427618 in raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1 0x000000801e6959b8 in g_logv (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, format=0x10002448 "%s", args1=0xfffffffeab8 "") at gmessages.c:570
#2 0x000000801e695d94 in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>) at gmessages.c:591
#3 0x0000000010001ff4 in test (fix=<optimized out>, test_data=<optimized out>) at test/gjs-unit.c:97
#4 0x000000801e6bd02c in test_case_run (tc=0x1004d030) at gtestutils.c:1227
#5 g_test_run_suite_internal (suite=0x10030d00, path=0x801e75f318 "") at gtestutils.c:1280
#6 0x000000801e6bd1e8 in g_test_run_suite_internal (suite=<optimized out>, path=0x801e75f318 "") at gtestutils.c:1291
#7 0x000000801e6bd698 in g_test_run_suite (suite=0x10030ca0) at gtestutils.c:1336
#8 0x000000801e6bd6f8 in g_test_run () at gtestutils.c:887
#9 0x0000000010001a44 in main (argc=1, argv=0xffffffff408) at test/gjs-unit.c:286
If you need access to the machine to debug further, Colin, just let me know and I'll give you the details privately.
So, "for" loops in spidermonkey on PPC64 seem to be totally broken:
for (var i = 0; i < 10; i++)
Results in an infinite:
So grabbing the upstream https://developer.mozilla.org/en/SpiderMonkey/1.8.5 and adding the Fedora patches on top, and then looking at the output of "make" doesn't give me warm fuzzy feelings about how well tested spidermonkey is when the JIT code paths are disabled.
jsinterp.cpp:2356:10: warning: unused variable ‘useMethodJIT’ [-Wunused-variable]
jsinterp.cpp:2910:3: warning: label ‘jit_return’ defined but not used [-Wunused-label]
Running 'make check' in spidermonkey:
make: Entering directory `/home/test/js-1.8.5/js/src/jsapi-tests'
testCustomIterator.cpp:76:CHECK failed: JSVAL_TO_INT(result) == 100
TEST-UNEXPECTED-FAIL | testCustomIterator_bug612523 | CHECK failed: JSVAL_TO_INT(result) == 100
TEST-PASS | testXDR_bug516827 | ok
TEST-PASS | testXDR_bug506491 | ok
TEST-PASS | testEvalPropagatesOverride | ok
TEST-PASS | testReturnLosesOverride | ok
TEST-PASS | testEntryLosesOverride | ok
TEST-PASS | testOptionsAreUsedForVersionFlags | ok
TEST-PASS | testUTF8_bug589917 | ok
And then it appears to hang forever.
So I think it's related to this bug:
You can reproduce the problem pretty easily by just doing:
var i = 0;
The problem seems to be here in jsinterp.cpp (do_incop):
if (cs->format & JOF_POST)
ref.getInt32Ref() = tmp + incr;
getInt32Ref() returns the 32-bit member of a union that overlaps with the tag member. So writing to that member corrupts the tag information of the value.
The above bug adds padding so the tag member doesn't get overwritten. I haven't tried the patch yet to see if it helps.
To be clearer, ref is a jsval_layout union, which is defined as:
typedef union jsval_layout
JSValueTag tag : 17;
uint64 payload47 : 47;
getInt32Ref just returns the address of the u32 member. So by writing to that address, it overwrites the tag information for the value which occupies the same place in memory.
The patches in the upstream bug add padding to the s union so that the tag member overlaps with the padding instead of the u32 member. The patches also remove the word member, since it's 64-bit and wouldn't leave enough space for the padding without growing the struct.
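The overlap described above can be sketched in plain C. This is only an illustration, not SpiderMonkey code: the tag constant and every function name below are assumptions made for the sketch, and big-endian memory is simulated with a byte buffer so the behaviour is the same on any host. It assumes the layout described in this comment, with a 17-bit tag in the most significant bits of a 64-bit word and a 32-bit payload in the least significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 17-bit tag value; an assumption for this sketch. */
#define TAG_DEMO_INT32 0x1FFF1u

/* Store and load integers in big-endian byte order, mimicking PPC64
   memory regardless of the host this sketch is compiled on. */
static void tag_store_be64(unsigned char *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (56 - 8 * i));
}

static uint64_t tag_load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

static void tag_store_be32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (24 - 8 * i));
}

/* Box an int32: tag in the top 17 bits, payload in the low bits. */
static uint64_t tag_box_int32(int32_t i)
{
    return ((uint64_t)TAG_DEMO_INT32 << 47) | (uint32_t)i;
}

static uint32_t tag_of(uint64_t bits)      { return (uint32_t)(bits >> 47); }
static int32_t  tag_payload(uint64_t bits) { return (int32_t)(uint32_t)bits; }

/* Increment by writing a 32-bit value at byte `offset` of the 8-byte word.
   offset 0 models a u32 member with no padding in front of it;
   offset 4 models the same member after 4 bytes of padding. */
static uint64_t tag_increment_at(uint64_t bits, int offset)
{
    unsigned char mem[8];
    tag_store_be64(mem, bits);
    tag_store_be32(mem + offset, (uint32_t)bits + 1u);
    return tag_load_be64(mem);
}

/* Returns 0 when the sketch behaves as described. */
static int tag_demo(void)
{
    uint64_t v = tag_box_int32(5);

    /* Without padding the write lands on the tag bytes: the tag is
       destroyed and the real payload never changes. */
    uint64_t bad = tag_increment_at(v, 0);
    assert(tag_of(bad) != TAG_DEMO_INT32);
    assert(tag_payload(bad) == 5);

    /* With padding the write lands on the payload and the tag survives. */
    uint64_t good = tag_increment_at(v, 4);
    assert(tag_of(good) == TAG_DEMO_INT32);
    assert(tag_payload(good) == 6);
    return 0;
}
```

In this model an increment through the unpadded member leaves the counter's payload at its old value, which would be consistent with a loop counter that never advances.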
I just came off a small holiday break, but wanted to give an update here.
The patches mentioned in Comment 13 fix the original issue, but "make check" in the js package still fails later on when running the garbage collector after an object is "frozen" (some other test later).
Colin and I started to look into that more, but at some point I had to leave on Monday evening to make an appointment and then I was out until today.
So that's where we're at (as far as I know, I haven't talked to Colin recently), but we should have more news soon.
I talked to Colin, and he says that he's filed the upstream bug:
to track the follow-up failure I alluded to in comment 15.
So I looked at this a little more today.
The patches worked flawlessly against the js test suite today. I'm going to write off the pre-break testing as experimental error (probably from not installing the fixed library, only running it in tree).
We're still hitting issues with gjs, though. It uses ffi incorrectly. I have a patch that fixes a first round of issues, but there are still some problems with enumerations that I'm investigating.
Created attachment 537700 [details]
the preliminary, incomplete patch
Created attachment 537701 [details]
the upstream js patch
And just for completeness, here is the upstream js patch that fixes the js test suite regressions.
I have to go now, but I'll look into the remaining issues tomorrow.
Okay, I've done a little more digging.
There were a few more problems with gobject-introspection and gjs's handling of enumerations.
I have an update for the patch in comment 18 and a new patch against gobject-introspection.
With those patches the test suite passes, but there are probably other lingering issues.
In particular the public function gi_type_tag_get_ffi_type doesn't work for enum types and can't work, so there probably needs to be an audit of all callers of that function.
Will post two patches now.
Created attachment 538194 [details]
This fixes the g_type_info_get_ffi_type function to return the proper ffi type for enums
Created attachment 538196 [details]
This fixes gjs to convert between GArgument and ffi arguments/return values as appropriate.
The main problem is highlighted in the ffi_call man page:
rvalue must point to storage that is sizeof(long) or larger. For smaller return value sizes, the ffi_arg or ffi_sarg integral type must be used to hold the return value
The code passes a GArgument to ffi_call directly, but GArgument is a union of different types, some of them less than sizeof(long). That means once ffi_call() is finished, it's only okay to access members of the union that are sizeof(long).
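A rough sketch of why the direct access goes wrong on a big-endian 64-bit host, and of the conversion step the patch description implies. It does not call libffi: rv_fake_call is a hypothetical stand-in that only mimics a call writing a widened, word-sized return value, and the 8-byte buffer stands in for the start of a GArgument. Big-endian memory is simulated with explicit byte order so the result is the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Big-endian loads and stores, mimicking PPC64 memory on any host. */
static void rv_store_be64(unsigned char *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (56 - 8 * i));
}

static uint64_t rv_load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

static uint32_t rv_load_be32(const unsigned char *p)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical stand-in for a call that returns a 32-bit int: the result
   is sign-extended to a full 64-bit word and written to rvalue. */
static void rv_fake_call(unsigned char rvalue[8], int32_t result)
{
    rv_store_be64(rvalue, (uint64_t)(int64_t)result);
}

/* Problematic pattern: read a 4-byte union member that starts at offset 0.
   On big-endian those are the most significant bytes of the word. */
static int32_t rv_read_member_int32(const unsigned char rvalue[8])
{
    return (int32_t)rv_load_be32(rvalue);
}

/* Safer pattern: read the whole word, then narrow by value. */
static int32_t rv_read_widened_int32(const unsigned char rvalue[8])
{
    return (int32_t)(uint32_t)rv_load_be64(rvalue);
}

/* Returns 0 when the sketch behaves as described. */
static int rv_demo(void)
{
    unsigned char rvalue[8];

    rv_fake_call(rvalue, 42);
    assert(rv_read_member_int32(rvalue) == 0);    /* high bytes only */
    assert(rv_read_widened_int32(rvalue) == 42);  /* correct value */
    return 0;
}
```

Note that a small negative result such as -1 reads back correctly through both paths in this model, because sign extension fills the high bytes with ones, so this kind of bug can go unnoticed in tests that only return negative values.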
The fixed packages are js-1.8.5-7.kh.fc16.ppc64.rpm, gjs-1.30.0-1.kh.fc16.ppc64.rpm, and gobject-introspection-1.30.0-1.kh.fc16.ppc64.rpm.
Confirmed fixed. Thanks!
It looks like these fixes got dropped in fc17, I'm seeing this issue with:
(In reply to comment #26)
> It looks like these fixes got dropped in fc17, I'm seeing this issue with:
The fixes should be in those packages. Are you sure you aren't getting the "fail whale" due to an entirely different problem?
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.