Bug 1034467 (js-shape-finalize) - [abrt] gnome-shell-3.11.2-3.fc21: js::Shape::finalize: Process /usr/bin/gnome-shell was killed by signal 11 (SIGSEGV)
Keywords:
Status: CLOSED WORKSFORME
Alias: js-shape-finalize
Product: Fedora
Classification: Fedora
Component: gnome-shell
Version: 22
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Owen Taylor
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:ac6e677411b779a753c0f01979f...
Duplicates: 1034468 1035285 1037692 1052964
Depends On:
Blocks:
Reported: 2013-11-25 22:12 UTC by Nicolas Mailhot
Modified: 2016-03-20 10:17 UTC (History)
CC List: 19 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-20 10:17:48 UTC


Attachments
File: backtrace (45.99 KB, text/plain)
2013-11-25 22:12 UTC, Nicolas Mailhot
File: cgroup (172 bytes, text/plain)
2013-11-25 22:12 UTC, Nicolas Mailhot
File: core_backtrace (16.27 KB, text/plain)
2013-11-25 22:12 UTC, Nicolas Mailhot
File: dso_list (22.25 KB, text/plain)
2013-11-25 22:12 UTC, Nicolas Mailhot
File: environ (1.65 KB, text/plain)
2013-11-25 22:13 UTC, Nicolas Mailhot
File: exploitable (82 bytes, text/plain)
2013-11-25 22:13 UTC, Nicolas Mailhot
File: limits (1.29 KB, text/plain)
2013-11-25 22:13 UTC, Nicolas Mailhot
File: maps (191.86 KB, text/plain)
2013-11-25 22:13 UTC, Nicolas Mailhot
File: open_fds (2.94 KB, text/plain)
2013-11-25 22:13 UTC, Nicolas Mailhot
File: proc_pid_status (953 bytes, text/plain)
2013-11-25 22:13 UTC, Nicolas Mailhot
File: var_log_messages (13.70 KB, text/plain)
2013-11-25 22:13 UTC, Nicolas Mailhot
valgrind log on gnome-shell startup (3.35 KB, text/x-log)
2014-02-12 14:29 UTC, Seppo Yli-Olli
valgrind log of a crash (5.84 MB, text/plain)
2014-02-14 02:05 UTC, Adam Williamson
better valgrind log of a crash (6.72 MB, text/plain)
2014-02-14 05:20 UTC, Adam Williamson


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1028813 None None None Never
Red Hat Bugzilla 1052964 None None None Never

Internal Links: 1028813 1052964

Description Nicolas Mailhot 2013-11-25 22:12:35 UTC
Version-Release number of selected component:
gnome-shell-3.11.2-3.fc21

Additional info:
reporter:       libreport-2.1.9
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: js::Shape::finalize
executable:     /usr/bin/gnome-shell
kernel:         3.13.0-0.rc1.git0.1.fc21.x86_64
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (10 frames)
 #0 js::Shape::finalize at /usr/src/debug/mozjs17.0.0/js/src/jspropertytree.cpp:210
 #1 finalize<js::Shape> at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:355
 #2 FinalizeTypedArenas<js::Shape> at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:419
 #3 js::gc::FinalizeArenas at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:460
 #4 foregroundFinalize at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:3803
 #5 SweepPhase at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:3823
 #6 IncrementalCollectSlice at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:4245
 #7 GCCycle at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:4408
 #8 Collect at /usr/src/debug/mozjs17.0.0/js/src/jsgc.cpp:4516
 #9 js_InvokeOperationCallback at /usr/src/debug/mozjs17.0.0/js/src/jscntxt.cpp:1028

Potential duplicate: bug 1028813

Comment 1 Nicolas Mailhot 2013-11-25 22:12:45 UTC
Created attachment 828936
File: backtrace

Comment 2 Nicolas Mailhot 2013-11-25 22:12:50 UTC
Created attachment 828937
File: cgroup

Comment 3 Nicolas Mailhot 2013-11-25 22:12:54 UTC
Created attachment 828938
File: core_backtrace

Comment 4 Nicolas Mailhot 2013-11-25 22:12:58 UTC
Created attachment 828939
File: dso_list

Comment 5 Nicolas Mailhot 2013-11-25 22:13:01 UTC
Created attachment 828940
File: environ

Comment 6 Nicolas Mailhot 2013-11-25 22:13:07 UTC
Created attachment 828941
File: exploitable

Comment 7 Nicolas Mailhot 2013-11-25 22:13:12 UTC
Created attachment 828944
File: limits

Comment 8 Nicolas Mailhot 2013-11-25 22:13:17 UTC
Created attachment 828946
File: maps

Comment 9 Nicolas Mailhot 2013-11-25 22:13:21 UTC
Created attachment 828948
File: open_fds

Comment 10 Nicolas Mailhot 2013-11-25 22:13:25 UTC
Created attachment 828950
File: proc_pid_status

Comment 11 Nicolas Mailhot 2013-11-25 22:13:28 UTC
Created attachment 828952
File: var_log_messages

Comment 12 Fabio Valentini 2013-12-08 22:54:33 UTC
This seems to happen quite randomly ... the only thing I can say is that it happens more often when under high CPU load or when many windows are open.

reporter:       libreport-2.1.9
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: js::Shape::finalize
executable:     /usr/bin/gnome-shell
kernel:         3.12.3-1.fc21.x86_64
package:        gnome-shell-3.11.2-3.fc21
reason:         Process /usr/bin/gnome-shell was killed by signal 11 (SIGSEGV)
runlevel:       N 5
type:           CCpp
uid:            1000

Comment 13 Adam Williamson 2013-12-20 02:58:29 UTC
Another user experienced a similar problem:

Just switching windows from xchat to abrt (while it was busy reporting the *last* time Shell crashed :>)

reporter:       libreport-2.1.10
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: js::Shape::finalize
executable:     /usr/bin/gnome-shell
kernel:         3.13.0-0.rc4.git1.1.fc21.x86_64
package:        gnome-shell-3.11.2-3.fc21
reason:         gnome-shell killed by SIGSEGV
runlevel:       N 5
type:           CCpp
uid:            1001

Comment 14 Adam Williamson 2014-01-08 21:20:11 UTC
Another user experienced a similar problem:

Just happened during regular use of the desktop, no specific trigger.

reporter:       libreport-2.1.10
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
crash_function: js::Shape::finalize
executable:     /usr/bin/gnome-shell
kernel:         3.13.0-0.rc7.git1.1.fc21.x86_64
package:        gnome-shell-3.11.3-1.fc21
reason:         gnome-shell killed by SIGSEGV
runlevel:       N 5
type:           CCpp
uid:            1001

Comment 15 Adam Williamson 2014-01-08 23:22:16 UTC
*** Bug 1034468 has been marked as a duplicate of this bug. ***

Comment 16 Adam Williamson 2014-01-08 23:22:36 UTC
*** Bug 1037692 has been marked as a duplicate of this bug. ***

Comment 17 Adam Williamson 2014-01-08 23:22:46 UTC
*** Bug 1035285 has been marked as a duplicate of this bug. ***

Comment 18 Adam Williamson 2014-01-16 19:40:35 UTC
<owen> adamw: the js crashes are some sort of memory corruption issue, and aren't going to be debuggable without a high quality valgrind log or a triggerable-at-will reproducer in the hands of a developer
<owen> adamw: That is, the backtrace at time of crash is unlikely to provide a useful clue

Comment 19 Adam Williamson 2014-01-16 23:17:26 UTC
So, another attempt to pin down a common thread between people hitting this: what graphics adapter does everyone have?

Comment 20 Nicolas Mailhot 2014-01-17 10:24:00 UTC
VGA compatible controller: ATI Technologies Inc RV770 [Radeon HD 4850] (prog-if 00 [VGA controller])
	Subsystem: PC Partner Limited Sapphire HD 4850 512MB GDDR3 PCI-E Dual Slot Fansink

Comment 21 Igor Gnatenko 2014-01-21 23:59:36 UTC
00:02.0 VGA compatible controller [0300]: Intel Corporation 3rd Gen Core processor Graphics Controller [8086:0166] (rev 09)
	Subsystem: Lenovo Device [17aa:21f9]
	Kernel driver in use: i915

Comment 22 Adam Williamson 2014-02-02 21:08:01 UTC
and I have an NVIDIA, so *that* ain't it either.

Owen, could you perhaps give us a bit more detail on exactly what getting a 'high quality valgrind log' would entail? I'm willing to sit in front of a slow-as-molasses Shell for a while to try and get one, but I don't want to suffer through that if there's a danger the result would be useless :)

https://wiki.gnome.org/Valgrind seems mostly focused on debugging memory leaks, not corruption - would one of the invocations there be appropriate for this case, or should we use something different? Thanks!

Comment 23 Seppo Yli-Olli 2014-02-05 09:25:49 UTC
Btw, does this reproduce at all when there is *not* high system load? It seems to have been triggered by a yum upgrade for me today. High system load sounds like a plausible way to expose race conditions.

Comment 24 Victor Antonovich 2014-02-05 09:43:43 UTC
In my case gnome-shell is sometimes killed by SIGSEGV in js::Shape::finalize while the computer is idle.

Comment 25 Mathieu Bridon 2014-02-05 09:45:26 UTC
Adam, would this help?

https://wiki.gnome.org/Projects/GnomeShell/Debugging

Comment 26 Adam Williamson 2014-02-05 17:03:24 UTC
seppo: yeah, happens to me all the time on something as simple as an alt-tab when the system is otherwise idle.

mathieu: I'm not sure that's specific enough to this case. I asked the devs on IRC yesterday and they told me to ask the mozjs devs, so I'll do that, just haven't got around to it yet.

Comment 27 Adam Williamson 2014-02-07 02:25:20 UTC
so I'm working on this now, but in case I get distracted, here's what I got from the devs:

<jimb> adamw: You need to build SpiderMonkey with --enable-valgrind
<jimb> adamw: https://developer.mozilla.org/en-US/docs/Debugging_Mozilla_with_Valgrind

so, the plan is to build mozjs24 with --enable-valgrind - I don't think I then need to rebuild gjs or gnome-shell - then try and get a log here. My scratch build of mozjs24 is at http://koji.fedoraproject.org/koji/taskinfo?taskID=6503082 .

Comment 28 Adam Williamson 2014-02-07 02:45:15 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=6503123 is a scratch build which is actually *working*.

Comment 29 Adam Williamson 2014-02-07 03:27:37 UTC
well, I installed that mozjs24 plus the gjs and gnome-shell debuginfo packages and tried:

env G_SLICE=always-malloc valgrind --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=20 --log-file=/home/adamw/gnome-shell_valgrind.log --smc-check=all-non-file gnome-shell --replace

but that doesn't seem to launch gnome-shell fully, the log cuts off early and shell never seems to start. anyone else have any luck?

Comment 30 Seppo Yli-Olli 2014-02-12 14:29:51 UTC
Created attachment 862340
valgrind log on gnome-shell startup

Insanely slow. After waiting patiently for a while, there are a couple of memory errors in valgrind. I'd expect gnome-shell would eventually have come up, but running this without a really, really powerful machine with a lot of memory isn't a good idea. My laptop with 4G of memory definitely fell short. HTH
PS: I also needed to install the cogl and clutter debuginfos to get full information in valgrind.

Comment 31 Adam Williamson 2014-02-14 02:05:02 UTC
Finally managed to catch a crash in valgrind. Ran:

env G_SLICE=always-malloc valgrind --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=12 --log-file=/home/adamw/gnome-shell_valgrind.log --smc-check=all-non-file gnome-shell --replace

note, 12 callers not 20. With 12 callers it runs just about bearably, with 16GB of RAM. I just hope the crash was the JS one.

The valgrind log is here:

https://www.happyassassin.net/temp/gnome-shell_valgrind.log

I'll try and attach it as well, but it may be too large.

Comment 32 Adam Williamson 2014-02-14 02:05:40 UTC
Created attachment 863053
valgrind log of a crash

Comment 33 Adam Williamson 2014-02-14 02:08:32 UTC
I see this:

==6838== Process terminating with default action of signal 11 (SIGSEGV)
==6838==  Access not within mapped region at address 0x1D9C4000
==6838==    at 0x3692F68C78: js::jit::BaselineScript::pcForReturnOffset(JSScript*, unsigned int) (BaselineJIT.cpp:683)
==6838==    by 0x3692FDACE9: js::jit::IonFrameIterator::baselineScriptAndPc(JSScript**, unsigned char**) const (IonFrames.cpp:217)
==6838==    by 0x3692FDC961: js::jit::GetPcScript(JSContext*, JSScript**, unsigned char**) (IonFrames.cpp:1051)
==6838==    by 0x3692E4D3AF: js_InferFlags(JSContext*, unsigned int) (jscntxtinlines.h:536)
==6838==    by 0x3692E565F9: js::GetPropertyHelper(JSContext*, JS::Handle<JSObject*>, JS::Handle<long>, unsigned int, JS::MutableHandle<JS::Value>) (jsobj.cpp:3538)
==6838==    by 0x3692F6518F: js::jit::DoGetPropFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICGetProp_Fallback*, JS::MutableHandle<JS::Value>, JS::MutableHandle<JS::Value>) (BaselineIC.cpp:5453)
==6838==    by 0xCB7C0AD: ???
==6838==    by 0xCB8193E: ???
==6838==    by 0x37970B37: ???
==6838==    by 0xCB73853: ???
==6838==    by 0x3692F67384: EnterBaseline(JSContext*, js::jit::EnterJitData&) [clone .part.191] (BaselineJIT.cpp:105)
==6838==    by 0x3692F676D9: js::jit::EnterBaselineMethod(JSContext*, js::RunState&) (BaselineJIT.cpp:81)
==6838==  If you believe this happened as a result of a stack
==6838==  overflow in your program's main thread (unlikely but
==6838==  possible), you can try to increase the size of the
==6838==  main thread stack using the --main-stacksize= flag.
==6838==  The main thread stack size used in this run was 8388608.

Comment 34 Adam Williamson 2014-02-14 05:19:54 UTC
mozjs folks asked me to add another valgrind parameter and do it again, so...

==14803== Process terminating with default action of signal 11 (SIGSEGV)
==14803==  Bad permissions for mapped region at address 0x17D54060
==14803==    at 0x17D54060: ???
==14803==    by 0x3692F67384: EnterBaseline(JSContext*, js::jit::EnterJitData&) [clone .part.191] (BaselineJIT.cpp:105)
==14803==    by 0x3692F676D9: js::jit::EnterBaselineMethod(JSContext*, js::RunState&) (BaselineJIT.cpp:81)
==14803==    by 0x3692CFD182: Interpret(JSContext*, js::RunState&) (Interpreter.cpp:2334)
==14803==    by 0x3692CFDBC7: js::RunScript(JSContext*, js::RunState&) (Interpreter.cpp:438)
==14803==    by 0x3692CEFD2B: js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) (Interpreter.cpp:500)
==14803==    by 0x3692E0EADE: js::CallOrConstructBoundFunction(JSContext*, unsigned int, JS::Value*) (jsfun.cpp:1212)
==14803==    by 0x3692CEFD72: js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) (jscntxtinlines.h:321)
==14803==    by 0x3692E0EEDF: js_fun_apply(JSContext*, unsigned int, JS::Value*) (jsfun.cpp:982)
==14803==    by 0x3692CEFD72: js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) (jscntxtinlines.h:321)
==14803==    by 0x3692CFFBED: js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value*, JS::Value*) (Interpreter.cpp:531)
==14803==    by 0x3692F5F9DF: js::jit::DoCallFallback(JSContext*, js::jit::BaselineFrame*, js::jit::ICCall_Fallback*, unsigned int, JS::Value*, JS::MutableHandle<JS::Value>) (BaselineIC.cpp:7007)

Comment 35 Adam Williamson 2014-02-14 05:20:38 UTC
Created attachment 863090
better valgrind log of a crash

Comment 36 Adam Williamson 2014-02-14 05:33:17 UTC
Also filed with mozilla as https://bugzilla.mozilla.org/show_bug.cgi?id=972725 , as the mozilla folks suggested an upstream report may be appropriate.

Comment 37 Adam Williamson 2014-02-17 19:08:47 UTC
*** Bug 1052964 has been marked as a duplicate of this bug. ***

Comment 38 Adam Williamson 2014-02-17 19:26:08 UTC
So at this point mozjs folks say "So, this is probably an error which need to be fixed in the way the JS API is used in gjs_value_from_g_argument, the trick with --vgdb-error=0, is something which can be used by developers, to find out more about the context of valgrind error reports. (such as running 'where full' in gdb, instead of 'bt')"

I can't get a run through gdb to work, though - it never fully initializes Shell, it's like when I tried running it through valgrind with num-callers=20, like it's just too much work or something. It kills the existing Shell, and the new one starts to start up, but never seems to reach the point of actually running (I never see window decorations). No, gdb wasn't at a break point, I checked.

They also asked me to build mozjs24 with --enable-debug, and get yet another valgrind crash log with that. I tried. The mozjs24 with --enable-debug is here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=6531078

gjs rebuilt against that mozjs24 - which you need, or else Shell will crash on start - is here:

https://www.happyassassin.net/temp/gjs-1.39.3-2.1.fc21.x86_64.rpm
https://www.happyassassin.net/temp/gjs-debuginfo-1.39.3-2.1.fc21.x86_64.rpm

but when running Shell through valgrind with those builds, it didn't crash once for me - heisenbug! It *does* still crash when not running through valgrind.

If anyone else wants to try, please do. I'll try again shortly.

Comment 39 Nicolas Dufresne 2014-02-28 22:53:31 UTC
Hmm, could it be an infinite recursion loop on our side, or some cleanup cycle on the C side?

Comment 40 Adam Williamson 2014-03-21 00:39:52 UTC
Latest from upstream (02-18, this isn't new) is:

-----

(In reply to Nicolas B. Pierron [:nbp] from comment #16)
> Terrence: Any idea what could cause AssertHeapIsIdleOrIterating to fail
> during an interruption callback.

This asserts that we are not actively in a GC or otherwise tracing the heap, e.g. for JS_IterateCompartmentsCellIters for about:memory, JS_DumpHeap for debugging, etc. So this would imply that script is running from a finalizer, a JSTracer callback, while using CellIter, or perhaps someone is just calling the interrupt hook manually in one of these places. This is not allowed and, as far as I know, there is nowhere in SpiderMonkey proper where this can happen. However, I think in the past we've had trouble with finalizers in Gecko accidentally trying to run scripts.

----

Today I managed to catch one of the crashes in gdb (but not the gdb-via-valgrind arrangement someone requested, just direct in gdb) with an --enable-debug build of mozjs24, and got this:

Assertion failure: !rt->isHeapCollecting(), at /builddir/build/BUILD/mozjs-24.2.0/js/src/jsapi.cpp:206

Program received signal SIGSEGV, Segmentation fault.
0x0000003ac9652590 in AssertHeapIsIdleOrIterating (rt=<optimized out>)
    at /usr/src/debug/mozjs-24.2.0/js/src/jsapi.cpp:206
206	    JS_ASSERT(!rt->isHeapCollecting());

is that any use to anyone?
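
A toy model may help show what that assertion is guarding against. The sketch below is plain Python, not SpiderMonkey or gjs code; `ToyRuntime`, `api_call`, `collect`, and `bad_finalizer` are hypothetical names invented for illustration. It only mimics the rule described above: API entry points refuse to run while the heap is being collected, so a finalizer that calls back in trips the check.

```python
class ToyRuntime:
    """Toy stand-in for a JS runtime (hypothetical, not the real API)."""

    def __init__(self):
        self.heap_collecting = False
        self.finalizers = []

    def api_call(self, name):
        # Same idea as the failing assertion: entering the API while a
        # collection is in progress is treated as a usage error.
        if self.heap_collecting:
            raise AssertionError(f"{name}: API entered while heap is collecting")
        return f"{name} ok"

    def collect(self):
        # Finalizers run while the heap is marked as collecting.
        self.heap_collecting = True
        try:
            for finalizer in self.finalizers:
                finalizer(self)
        finally:
            self.heap_collecting = False


def bad_finalizer(rt):
    # Hypothetical embedding bug: calling back into the API from a finalizer.
    rt.api_call("get_property")


def run_demo():
    rt = ToyRuntime()
    outside = rt.api_call("get_property")  # fine: heap is idle
    rt.finalizers.append(bad_finalizer)
    try:
        rt.collect()
        tripped = False
    except AssertionError:
        tripped = True  # the finalizer re-entered the API during collection
    return outside, tripped


if __name__ == "__main__":
    print(run_demo())
```

In a non-debug build there is no such check, which would be consistent with the crash showing up later and somewhere unrelated, though that is only a guess from this model.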

Comment 41 Adam Williamson 2014-04-08 20:15:32 UTC
also an interesting comment from upstream:

https://bugzilla.mozilla.org/show_bug.cgi?id=972725#c24

"SpiderMonkey embeddings /must not/ call back into the API from a finalizer, full stop. We do allow API usage, including running (almost) arbitrary script code, during GC, but /only/ during the JSGCCallback when the phase is JSGC_END. Gecko has the same need: it implements something called "delayed finalization." The idea is that when finalizers need to interact with SpiderMonkey they push the operation into a list, then run these operations in order when they get the JSGC_END callback. I guess gnome-shell needs something similar."

But interestingly, I haven't seen this bug lately on either of my Rawhide systems, I'm almost sure. Seems like it magically got solved, somehow or other. mozjs hasn't changed, but gjs has had a few bumps in March and April. I can't recall exactly when it stopped happening to me.
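
For anyone trying to picture the "delayed finalization" idea from that upstream comment, here is a rough sketch in plain Python. It is not gjs, Gecko, or SpiderMonkey code: `ToyRuntime`, `defer`, `make_finalizer`, and the phase names are all hypothetical. The only point it shows is that finalizers queue their work instead of touching the API, and the queue is drained in order once the collector reaches its end-of-GC phase.

```python
class ToyRuntime:
    """Toy collector with a sweep phase and an end-of-GC phase (hypothetical)."""

    def __init__(self):
        self.phase = "idle"      # "idle", "sweeping", or "gc_end"
        self.finalizers = []
        self.deferred = []       # operations queued by finalizers
        self.log = []

    def api_call(self, what):
        # API use is refused while finalizers are running.
        if self.phase == "sweeping":
            raise AssertionError("API used from a finalizer")
        self.log.append(what)

    def defer(self, operation):
        # Safe to call from a finalizer: it only appends to a list.
        self.deferred.append(operation)

    def collect(self):
        self.phase = "sweeping"
        for finalizer in self.finalizers:
            finalizer(self)
        # End of GC: API use is allowed again, so run the queued work in order.
        self.phase = "gc_end"
        pending, self.deferred = self.deferred, []
        for operation in pending:
            operation(self)
        self.phase = "idle"


def make_finalizer(name):
    def finalizer(rt):
        # Do not call rt.api_call() here; queue it for the end of the GC.
        rt.defer(lambda r: r.api_call(f"release {name}"))
    return finalizer


def demo():
    rt = ToyRuntime()
    rt.finalizers.append(make_finalizer("a"))
    rt.finalizers.append(make_finalizer("b"))
    rt.collect()
    return rt.log


if __name__ == "__main__":
    print(demo())
```

Whether gjs would need exactly this shape is an open question; the sketch only restates the pattern the upstream comment describes.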

Comment 42 Jaroslav Reznik 2015-03-03 15:14:04 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 43 Igor Gnatenko 2016-03-20 10:17:48 UTC
No comments for a long time.

