Bug 749604 - boot stops with gnome3 "Something has gone wrong" (PowerMac G5)
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: js
Version: 16
Hardware: ppc64 Unspecified
Priority: unspecified
Severity: urgent
Assigned To: Pavel Alexeev
QA Contact: Fedora Extras Quality Assurance
Blocks: F16Betappc
Reported: 2011-10-27 12:27 EDT by Hollis Blanchard
Modified: 2013-02-13 21:10 EST (History)
Doc Type: Bug Fix
Last Closed: 2013-02-13 21:10:10 EST


Attachments
Xorg.0.log (31.47 KB, text/plain)
2011-10-27 12:46 EDT, Hollis Blanchard
the preliminary, incomplete patch (2.93 KB, patch)
2011-11-28 18:21 EST, Ray Strode [halfline]
the upstream js patch (1.34 KB, patch)
2011-11-28 18:24 EST, Ray Strode [halfline]
gobject-introspection patch (2.59 KB, patch)
2011-11-29 14:05 EST, Ray Strode [halfline]
gjs fixes (4.44 KB, patch)
2011-11-29 14:10 EST, Ray Strode [halfline]


External Trackers
Tracker ID Priority Status Summary Last Updated
GNOME Desktop 665150 None None None Never
GNOME Desktop 665152 None None None Never
Mozilla Foundation 674522 None None None Never
Mozilla Foundation 704302 None None None Never

Description Hollis Blanchard 2011-10-27 12:27:16 EDT
Description of problem:
I installed F16 PPC alpha, and after the post-reboot installation steps (e.g. configuring the time), I briefly saw a desktop image followed immediately by an "Oh no! Something has gone wrong" screen.

Version-Release number of selected component (if applicable):
gnome-session-3.2.0-1.fc16.ppc64
gnome-shell-3.2.0-2.fc16.ppc64

How reproducible:
Always.

Steps to Reproduce:
1. Install F16 PPC alpha
2. Reboot and follow post-install configuration steps

Actual results:
Black screen that says "Oh no! Something has gone wrong."

Expected results:
A Gnome login screen or desktop.

Additional info:
Subsequent reboots all fail in the same way. I briefly see a submarine desktop image, but never see a login screen.
Comment 1 Will Woods 2011-10-27 12:32:05 EDT
Have you tried waiting a bit and then hitting Alt-F4 to see if that makes the screen go away? I know that's not a solution, but it might be a usable workaround to allow testing to proceed.
Comment 2 Hollis Blanchard 2011-10-27 12:37:21 EDT
Alt-F4 has no effect.
Comment 3 Hollis Blanchard 2011-10-27 12:38:06 EDT
lspci reports:
nVidia Corporation NV43 [GeForce 6600] (rev a2)
Comment 4 Hollis Blanchard 2011-10-27 12:46:51 EDT
Created attachment 530527 [details]
Xorg.0.log

X.org is using the nouveau driver, and I don't see any errors from it.
Comment 5 Will Woods 2011-10-27 12:54:08 EDT
If you log into another terminal (ssh/switch VTs) and do "DISPLAY=:0 glxinfo | grep renderer", what happens?
Comment 6 Hollis Blanchard 2011-10-27 12:56:41 EDT
(Without the grep) glxinfo prints one line saying something about using display :0, then stops. It doesn't exit, nor does it display more information. It can be control-C'ed.
Comment 7 Brent Baude 2011-11-16 12:29:06 EST
I am able to reproduce this problem.  

I booted into init3, created a .xinitrc with gnome-session in it and I can see a little more of the error messages.  Unfortunately they don't appear to be logged, only console output, so I am unable to post them unless you want a picture of them.

The errors are JavaScript errors and reference a stack failure in '/usr/share/gjs-1.0/dbus.js'.

I pulled the source for gjs-1.30.0-1.fc16.ppc64 and rebuilt it.  The rebuild was successful.  I then edited the spec and uncommented the make check so it would run the tests.  You will need to add an undocumented build dependency on 'cairo-gobject-devel' for the tests.

Upon rebuild with tests enabled, I get the following error which is sufficiently similar to what is in the console errors:

Running with dbus: gtester --verbose gjs-tests gjs-unit with stderr to test_user_data/logs/stderr.log
TEST: gjs-tests... (pid=16980)
  /gjs/context/construct/destroy:                                      OK
  /gjs/context/construct/eval:                                         OK
  /gjs/jsapi/util/array:                                               OK
  /gjs/jsapi/util/error/throw:                                         OK
  /gjs/jsapi/util/string/js/string/utf8:                               OK
  /gjs/jsapi/util/string/get/ascii:                                    OK
  /gjs/jsapi/util/string/get/binary:                                   OK
  /gjs/stack/dump:                                                     OK
  /util/glib/strv/concat/null:                                         OK
  /util/glib/strv/concat/pointers:                                     OK
PASS: gjs-tests
TEST: gjs-unit... (pid=17009)
  /js/0010basic:                                                       OK
  /js/0020importer:                                                    FAIL
GTester: last random seed: R02S52742d2b6b493b7f3b210c98b5f4c36f
Terminated
killing message bus 16971
killing message bus 16976
./test/run-with-dbus: script "gtester" failed
make: *** [test] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.XTIpuU (%check)

I'll post more of the test details shortly.
Comment 8 Colin Walters 2011-11-21 11:19:13 EST
What I really need for this is a stack trace.
Comment 9 Phil Knirsch 2011-11-21 11:51:09 EST
Alright, stack trace from power04:

[root@power04 gjs-1.30.0]# gdb /root/rpmbuild/BUILD/gjs-1.30.0/.libs/lt-gjs-unit
GNU gdb (GDB) Fedora (7.3.50.20110722-9.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/rpmbuild/BUILD/gjs-1.30.0/.libs/lt-gjs-unit...done.
(gdb) run
Starting program: /root/rpmbuild/BUILD/gjs-1.30.0/.libs/lt-gjs-unit
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
/js/0010basic: [New Thread 0xfffb178f130 (LWP 8382)]
OK
/js/0020importer: [Thread 0xfffb178f130 (LWP 8382) exited]
[New Thread 0xfffb178f130 (LWP 8383)]
    JS ERROR: !!!   Exception was: TypeError: framedata is undefined
    JS ERROR: !!!     lineNumber = '282'
    JS ERROR: !!!     fileName = '"/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js"'
    JS ERROR: !!!     stack = '"parseErrorStack([object Error])@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:282
getStackTrace()@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:255
JsUnitException(2,"Expected 2 (number) but was undefined (undefined)")@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:307
_assert(2,false,"Expected 2 (number) but was undefined (undefined)")@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:113
assertEquals(2,2)@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:150
testImporter1()@/root/rpmbuild/BUILD/gjs-1.30.0/test/js/test0020importer.js:4
gjstestRun()@/root/rpmbuild/BUILD/gjs-1.30.0/modules/jsUnit.js:471
@/root/rpmbuild/BUILD/gjs-1.30.0/test/js/test0020importer.js:7
"'
    JS ERROR: !!!     message = '"framedata is undefined"'

** ERROR **: TypeError: framedata is undefined

Program received signal SIGTRAP, Trace/breakpoint trap.
0x000000801e427618 in raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
42                               sig);
(gdb) where
#0  0x000000801e427618 in raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x000000801e6959b8 in g_logv (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, format=0x10002448 "%s", args1=0xfffffffeab8 "") at gmessages.c:570
#2  0x000000801e695d94 in g_log (log_domain=<optimized out>, log_level=<optimized out>, format=<optimized out>) at gmessages.c:591
#3  0x0000000010001ff4 in test (fix=<optimized out>, test_data=<optimized out>) at test/gjs-unit.c:97
#4  0x000000801e6bd02c in test_case_run (tc=0x1004d030) at gtestutils.c:1227
#5  g_test_run_suite_internal (suite=0x10030d00, path=0x801e75f318 "") at gtestutils.c:1280
#6  0x000000801e6bd1e8 in g_test_run_suite_internal (suite=<optimized out>, path=0x801e75f318 "") at gtestutils.c:1291
#7  0x000000801e6bd698 in g_test_run_suite (suite=0x10030ca0) at gtestutils.c:1336
#8  0x000000801e6bd6f8 in g_test_run () at gtestutils.c:887
#9  0x0000000010001a44 in main (argc=1, argv=0xffffffff408) at test/gjs-unit.c:286

Colin, if you need access to the machine to debug further, just let me know and I'll give you the details privately.

Thanks!

Regards, Phil
Comment 10 Colin Walters 2011-11-21 13:36:14 EST
So, "for" loops in spidermonkey on PPC64 seem to be totally broken:

for (var i = 0; i < 10; i++)
  print(""+i);

This results in an infinite loop printing:

4.2439915824e-314
1
Comment 11 Colin Walters 2011-11-21 13:58:01 EST
So grabbing the upstream https://developer.mozilla.org/en/SpiderMonkey/1.8.5 and adding the Fedora patches on top, and then looking at the output of "make" doesn't give me warm fuzzy feelings about how well tested spidermonkey is when the JIT code paths are disabled.

jsinterp.cpp:2356:10: warning: unused variable ‘useMethodJIT’ [-Wunused-variable]
jsinterp.cpp:2910:3: warning: label ‘jit_return’ defined but not used [-Wunused-label]
Comment 12 Colin Walters 2011-11-21 14:04:10 EST
Running 'make check' in spidermonkey:

make[1]: Entering directory `/home/test/js-1.8.5/js/src/jsapi-tests'
../dist/bin/jsapi-tests
testCustomIterator_bug612523
testCustomIterator.cpp:76:CHECK failed: JSVAL_TO_INT(result) == 100
TEST-UNEXPECTED-FAIL | testCustomIterator_bug612523 | CHECK failed: JSVAL_TO_INT(result) == 100
testXDR_bug516827
TEST-PASS | testXDR_bug516827 | ok
testXDR_bug506491
TEST-PASS | testXDR_bug506491 | ok
testEvalPropagatesOverride
TEST-PASS | testEvalPropagatesOverride | ok
testReturnLosesOverride
TEST-PASS | testReturnLosesOverride | ok
testEntryLosesOverride
TEST-PASS | testEntryLosesOverride | ok
testOptionsAreUsedForVersionFlags
TEST-PASS | testOptionsAreUsedForVersionFlags | ok
testUTF8_bug589917
TEST-PASS | testUTF8_bug589917 | ok
testTrap_gc

And then it appears to hang forever.
Comment 13 Ray Strode [halfline] 2011-11-21 15:00:42 EST
So I think it's related to this bug:

https://bugzilla.mozilla.org/show_bug.cgi?id=674522

You can reproduce the problem pretty easily by just doing:

/usr/bin/js
var i = 0;
i++;
i++;

The problem seems to be here in jsinterp.cpp (do_incop):

        if (cs->format & JOF_POST)
            ref.getInt32Ref() = tmp + incr;

getInt32Ref() returns the 32-bit member of a union that overlaps with the tag member.  So writing to that member corrupts the tag information of the value.

The patch in the above bug adds padding so the tag member doesn't get overwritten.  I haven't tried the patch yet to see if it helps.
Comment 14 Ray Strode [halfline] 2011-11-21 15:06:02 EST
to be clearer, ref is a jsval_layout union which is defined as:

typedef union jsval_layout
{
    uint64 asBits;
    struct {
        JSValueTag         tag : 17;
        uint64             payload47 : 47;
    } debugView;
    struct {
        union {
            int32          i32;
            uint32         u32;
            JSWhyMagic     why;
            jsuword        word;
        } payload;
    } s;

    double asDouble;
    void *asPtr;
} jsval_layout;

getInt32Ref just returns a reference to the 32-bit payload member. On big-endian ppc64 that member sits at the start of the union, in the same place in memory as the tag, so writing through the reference overwrites the tag information for the value.

The patches in the upstream bug add padding to the s struct so that the tag member overlaps with the padding instead of the u32 member.  The patches also remove the word member since it's 64-bit and wouldn't allow enough space for the padding to be added without growing the struct.
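The overlap described above can be reproduced on any host by modelling the big-endian memory image as a byte array. The sketch below is only an illustration: the demo_* helper names are hypothetical, and the tag shift (47) and int32 tag value (0x1FFF1) are assumptions about the 64-bit value layout, not values taken from this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative constants, assumed for this sketch; check jsval.h for the real ones. */
#define DEMO_TAG_SHIFT 47
#define DEMO_TAG_INT32 UINT64_C(0x1FFF1)

/* Serialize a 64-bit word in big-endian byte order, as ppc64 lays it out in memory. */
static void store_be64(uint8_t buf[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read the 8-byte big-endian image back into a 64-bit word. */
static uint64_t load_be64(const uint8_t buf[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return v;
}

/* Box an int32: tag in the top 17 bits, payload in the low 32 bits. */
uint64_t demo_box_int32(uint32_t payload)
{
    return (DEMO_TAG_INT32 << DEMO_TAG_SHIFT) | payload;
}

/* Extract the 17 tag bits from a boxed value. */
uint64_t demo_tag_of(uint64_t bits)
{
    return bits >> DEMO_TAG_SHIFT;
}

/* Overwrite 4 bytes of the big-endian image at byte_offset with new_payload.
 * byte_offset 0 models the unpadded struct (the 32-bit member comes first);
 * byte_offset 4 models the struct with 4 bytes of padding in front of it. */
uint64_t demo_write_i32(uint64_t bits, size_t byte_offset, uint32_t new_payload)
{
    uint8_t buf[8];
    assert(byte_offset <= 4);
    store_be64(buf, bits);
    for (int i = 0; i < 4; i++)
        buf[byte_offset + i] = (uint8_t)(new_payload >> (24 - 8 * i));
    return load_be64(buf);
}

/* Reinterpret a bit pattern as a double, which is how a value may be read
 * once its tag bits no longer mark it as an int32. */
double demo_bits_as_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

With these helpers, demo_write_i32(demo_box_int32(1), 0, 2) yields the bit pattern 0x0000000200000001, whose tag bits are all zero. Reinterpreted as a double, that pattern is a denormal of roughly 4.24e-314, which appears to match the value printed in comment 10. Writing at offset 4 instead leaves the tag intact and produces a correctly boxed 2.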
Comment 15 Ray Strode [halfline] 2011-11-28 10:18:20 EST
I just came off a small holiday break, but wanted to give an update here.

The patches mentioned in Comment 13 fix the original issue, but "make check" in the js package still fails later on when running the garbage collector after an object is "frozen" (some other test later).

Colin and I started to look into that more, but at some point I had to leave on Monday evening to make an appointment and then I was out until today.

So that's where we're at (as far as I know, I haven't talked to Colin recently), but we should have more news soon.
Comment 16 Ray Strode [halfline] 2011-11-28 10:36:15 EST
I talked to Colin, and he says that he's filed the upstream bug:

https://bugzilla.mozilla.org/show_bug.cgi?id=704302

to track the follow up failure I alluded to in comment 15.
Comment 17 Ray Strode [halfline] 2011-11-28 18:17:49 EST
So I looked at this a little more today.

The patches worked flawlessly against the js test suite today. I'm going to write off the pre-break testing as experimental error (probably from not installing the fixed library, only running it in tree).

We're still hitting issues with gjs, though.  It uses ffi incorrectly.  I have a patch that fixes a first round of issues, but there are still some problems with enumerations that I'm investigating.
Comment 18 Ray Strode [halfline] 2011-11-28 18:21:02 EST
Created attachment 537700 [details]
the preliminary, incomplete patch
Comment 19 Ray Strode [halfline] 2011-11-28 18:24:27 EST
Created attachment 537701 [details]
the upstream js patch

and just for completeness, here is the upstream js patch that fixes the js test suite regressions.
Comment 20 Ray Strode [halfline] 2011-11-28 18:25:34 EST
I have to go now, but I'll look into the remaining issues tomorrow.
Comment 21 Ray Strode [halfline] 2011-11-29 14:01:06 EST
Okay, I've done a little more digging.

There were a few more problems with gobject-introspection and gjs's handling of enumerations.

I have an update for the patch in comment 18 and a new patch against gobject-introspection.

With those patches the test suite passes, but there are probably other lingering issues.

In particular the public function gi_type_tag_get_ffi_type doesn't work for enum types and can't work, so there probably needs to be an audit of all callers of that function.

Will post two patches now.
Comment 22 Ray Strode [halfline] 2011-11-29 14:05:16 EST
Created attachment 538194 [details]
gobject-introspection patch

This fixes the g_type_info_get_ffi_type function to return the proper ffi type for enums
Comment 23 Ray Strode [halfline] 2011-11-29 14:10:06 EST
Created attachment 538196 [details]
gjs fixes

This fixes gjs to convert between GArgument and ffi arguments/return values as appropriate.

The main problem is highlighted in the ffi_call man page:

rvalue must point to storage that is sizeof(long) or larger. For smaller return value sizes, the ffi_arg or ffi_sarg integral type must be used to hold the return value

The code passes a GArgument to ffi_call directly, but GArgument is a union of different types, some of them less than sizeof(long). That means once ffi_call() is finished, it's only okay to access members of the union that are sizeof(long).
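
The rule quoted from the man page can be illustrated without calling libffi at all. The following is a minimal sketch that assumes the callee side stores a small integer return value widened to a full 64-bit word, and that memory is big-endian as on ppc64. The demo_slot type and the demo_* functions are hypothetical stand-ins for GArgument and the gjs conversion code, not real gjs or libffi APIs.

```c
#include <stdint.h>

/* Simulated 8-byte return slot in big-endian byte order (ppc64 memory image). */
typedef struct {
    uint8_t bytes[8];
} demo_slot;

/* Models the callee side: a small integer return value is widened to a
 * full 64-bit word before it is stored into the caller's rvalue storage. */
demo_slot demo_store_widened(int32_t ret)
{
    demo_slot slot;
    uint64_t wide = (uint64_t)(int64_t)ret; /* sign-extend to 64 bits */
    for (int i = 0; i < 8; i++)
        slot.bytes[i] = (uint8_t)(wide >> (56 - 8 * i));
    return slot;
}

/* Buggy read: a 32-bit union member starts at byte offset 0, so on a
 * big-endian machine it sees the high half of the widened word. */
int32_t demo_read_union_member(demo_slot slot)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v = (v << 8) | slot.bytes[i];
    return (int32_t)v;
}

/* Safer read: load the whole word first, then narrow it to the declared
 * type, which is the role ffi_arg / ffi_sarg play in the man page text. */
int32_t demo_read_full_then_narrow(demo_slot slot)
{
    uint64_t wide = 0;
    for (int i = 0; i < 8; i++)
        wide = (wide << 8) | slot.bytes[i];
    return (int32_t)(uint32_t)wide;
}
```

In this model, storing 42 and reading it back through the 32-bit member yields 0, while reading the full word and then narrowing yields 42. On a little-endian machine the low-order bytes come first, so a direct union read can happen to work there, which may be why the problem only surfaced on ppc64.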
Comment 24 Karsten Hopp 2011-12-02 12:01:45 EST
Fixed packages are js-1.8.5-7.kh.fc16.ppc64.rpm, gjs-1.30.0-1.kh.fc16.ppc64.rpm, and gobject-introspection-1.30.0-1.kh.fc16.ppc64.rpm.
Comment 25 Hollis Blanchard 2011-12-09 00:33:55 EST
Confirmed fixed. Thanks!
Comment 26 David Aquilina 2012-05-10 11:07:58 EDT
It looks like these fixes got dropped in fc17, I'm seeing this issue with: 

js-1.8.5-9.fc17.ppc64
gjs-1.32.0-1.fc17.ppc64
gobject-introspection-1.32.1-1.fc17.ppc64
Comment 27 Colin Walters 2012-05-10 15:12:28 EDT
(In reply to comment #26)
> It looks like these fixes got dropped in fc17, I'm seeing this issue with: 
> 
> js-1.8.5-9.fc17.ppc64
> gjs-1.32.0-1.fc17.ppc64
> gobject-introspection-1.32.1-1.fc17.ppc64

The fixes should be in those packages.  Are you sure you aren't getting the "fail whale" due to an entirely different problem?
Comment 28 Fedora End Of Life 2013-02-13 21:10:16 EST
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
