Bug 648319
Summary: | SIGSEGV in FixedVMPoolAllocator from webkitgtk when starting surf, uzbl or luakit (WIP package)
---|---
Product: | [Fedora] Fedora
Reporter: | Pierre Carrier <prc>
Component: | webkitgtk
Assignee: | Kevin Fenzi <kevin>
Status: | CLOSED WONTFIX
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | urgent
Priority: | low
Version: | 14
CC: | andrew, boeuf32, damahevi, dkovalsk, ericosaucedo, farrellj, fedora, JimShip, jlcasado, kaloyan_petrov, kevin, kmcmartin, mariuszw, martin.sourada, mtasaka, nobody+296696, pbatkowski, petersen, roumano, schaiba, schplat, scott, scottt.tw, skaturn, social, steve.koppelman, theo148, uahello, vpvainio, zaitcev
Hardware: | x86_64
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2012-08-16 22:01:57 UTC

Description
Pierre Carrier
2010-11-01 01:52:20 UTC
Created attachment 456776 [details]
Output of rpm -qa
Created attachment 456777 [details]
core from surf
Created attachment 456779 [details]
core from luakit
=== core.15482.bt ===
[New Thread 15482]
[New Thread 15483]
[New Thread 15484]
Core was generated by `surf'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f3c5e7bc7c3 in FixedVMPoolAllocator (this=<value optimized out>) at JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp:308
308 CRASH();
#0 0x00007f3c5e7bc7c3 in FixedVMPoolAllocator (this=<value optimized out>) at JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp:308
#1 JSC::ExecutableAllocator::isValid (this=<value optimized out>) at JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp:460
#2 0x00007f3c5e72960d in ExecutableAllocator (this=0x7f3c5f458a00, globalDataType=<value optimized out>, threadStackType=JSC::ThreadStackTypeLarge) at JavaScriptCore/jit/ExecutableAllocator.h:170
#3 JSC::JSGlobalData::JSGlobalData (this=0x7f3c5f458a00, globalDataType=<value optimized out>, threadStackType=JSC::ThreadStackTypeLarge) at JavaScriptCore/runtime/JSGlobalData.cpp:151
#4 0x00007f3c5e729ff3 in JSC::JSGlobalData::create (type=JSC::ThreadStackTypeLarge) at JavaScriptCore/runtime/JSGlobalData.cpp:240
#5 0x00007f3c5e72a042 in JSC::JSGlobalData::createLeaked (type=JSC::ThreadStackTypeLarge) at JavaScriptCore/runtime/JSGlobalData.cpp:246
#6 0x00007f3c5d979462 in WebCore::JSDOMWindowBase::commonJSGlobalData () at WebCore/bindings/js/JSDOMWindowBase.cpp:160
#7 0x00007f3c5d951905 in WebCore::mainThreadNormalWorld () at WebCore/bindings/js/DOMWrapperWorld.cpp:81
#8 0x00007f3c5e13c0d6 in webkit_web_frame_get_global_context (frame=<value optimized out>) at WebKit/gtk/webkit/webkitwebframe.cpp:697
#9 0x0000000000404ae4 in newclient () at surf.c:493
#10 0x0000000000404f6d in main (argc=<value optimized out>, argv=<value optimized out>) at surf.c:839

=== core.15488.bt ===
[New Thread 15488]
[New Thread 15490]
[New Thread 15491]
Core was generated by `luakit'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fea6f64e7c3 in FixedVMPoolAllocator (this=<value optimized out>) at JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp:308 308 CRASH(); #0 0x00007fea6f64e7c3 in FixedVMPoolAllocator (this=<value optimized out>) at JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp:308 #1 JSC::ExecutableAllocator::isValid (this=<value optimized out>) at JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp:460 #2 0x00007fea6f5bb60d in ExecutableAllocator (this=0x7fea5c4c0a00, globalDataType=<value optimized out>, threadStackType=JSC::ThreadStackTypeLarge) at JavaScriptCore/jit/ExecutableAllocator.h:170 #3 JSC::JSGlobalData::JSGlobalData (this=0x7fea5c4c0a00, globalDataType=<value optimized out>, threadStackType=JSC::ThreadStackTypeLarge) at JavaScriptCore/runtime/JSGlobalData.cpp:151 #4 0x00007fea6f5bbff3 in JSC::JSGlobalData::create (type=JSC::ThreadStackTypeLarge) at JavaScriptCore/runtime/JSGlobalData.cpp:240 #5 0x00007fea6f5bc042 in JSC::JSGlobalData::createLeaked (type=JSC::ThreadStackTypeLarge) at JavaScriptCore/runtime/JSGlobalData.cpp:246 #6 0x00007fea6e80b462 in WebCore::JSDOMWindowBase::commonJSGlobalData () at WebCore/bindings/js/JSDOMWindowBase.cpp:160 #7 0x00007fea6e85d5bc in WebCore::ScriptController::getAllWorlds (worlds=...) 
at WebCore/bindings/js/ScriptController.cpp:187 #8 0x00007fea6eba5792 in WebCore::FrameLoader::dispatchDidClearWindowObjectsInAllWorlds (this=0x7fea5c482450) at WebCore/loader/FrameLoader.cpp:3340 #9 0x00007fea6eba59e7 in WebCore::FrameLoader::receivedFirstData (this=0x7fea5c482450) at WebCore/loader/FrameLoader.cpp:616 #10 0x00007fea6eb9e228 in WebCore::DocumentWriter::setEncoding (this=0x7fea5c4825a0, name=..., userChosen=false) at WebCore/loader/DocumentWriter.cpp:236 #11 0x00007fea6eb924c6 in WebCore::DocumentLoader::commitData (this=0x7fea5c464000, bytes=0x2126ed0 "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">\n<head>\n<meta http-eq"..., length=3670) at WebCore/loader/DocumentLoader.cpp:300 #12 0x00007fea6efb9265 in WebKit::FrameLoaderClient::committedLoad (this=0x7fea5c46cc60, loader=0x7fea5c464000, data=0x2126ed0 "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">\n<head>\n<meta http-eq"..., length=3670) at WebKit/gtk/WebCoreSupport/FrameLoaderClientGtk.cpp:252 #13 0x00007fea6eb930e6 in WebCore::DocumentLoader::commitLoad (this=0x7fea5c464000, data=0x2126ed0 "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">\n<head>\n<meta http-eq"..., length=3670) at WebCore/loader/DocumentLoader.cpp:287 #14 0x00007fea6ebe1a31 in WebCore::ResourceLoader::didReceiveData (this=0x7fea5c4dbb00, data=0x2126ed0 "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">\n<head>\n<meta http-eq"..., length=3670, lengthReceived=0, 
allAtOnce=<value optimized out>) at WebCore/loader/ResourceLoader.cpp:263 #15 0x00007fea6ebcf925 in WebCore::MainResourceLoader::didReceiveData (this=0x7fea5c4dbb00, data=0x2126ed0 "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">\n<head>\n<meta http-eq"..., length=3670, lengthReceived=0, allAtOnce=false) at WebCore/loader/MainResourceLoader.cpp:420 #16 0x00007fea6ebdfe4a in WebCore::ResourceLoader::didReceiveData (this=0x7fea5c4dbb00, data=0x2126ed0 "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">\n<head>\n<meta http-eq"..., length=3670, lengthReceived=0) at WebCore/loader/ResourceLoader.cpp:434 #17 0x00007fea6ef9cde5 in WebCore::gotChunkCallback (msg=<value optimized out>, chunk=0x208acc0, data=0x7fea5c4d7150) at WebCore/platform/network/soup/ResourceHandleSoup.cpp:295 #18 0x00007fea6be9d03e in g_closure_invoke (closure=0x2061390, return_value=0x0, n_param_values=2, param_values=0x2065560, invocation_hint=0x7fffc66e3de0) at gclosure.c:766 #19 0x00007fea6beade87 in signal_emit_unlocked_R (node=<value optimized out>, detail=0, instance=0x1fc9ae0, emission_return=0x0, instance_and_params=0x2065560) at gsignal.c:3252 #20 0x00007fea6beb77b5 in g_signal_emit_valist (instance=<value optimized out>, signal_id=<value optimized out>, detail=<value optimized out>, var_args=<value optimized out>) at gsignal.c:2983 #21 0x00007fea6beb7983 in g_signal_emit (instance=<value optimized out>, signal_id=<value optimized out>, detail=<value optimized out>) at gsignal.c:3040 #22 0x00007fea6db521c6 in ?? () from /usr/lib64/libsoup-2.4.so.1 #23 0x00007fea6db52e10 in ?? () from /usr/lib64/libsoup-2.4.so.1 #24 0x00007fea6db53838 in ?? 
() from /usr/lib64/libsoup-2.4.so.1 #25 0x00007fea6be9d03e in g_closure_invoke (closure=0x1fabf10, return_value=0x0, n_param_values=1, param_values=0x20623a0, invocation_hint=0x7fffc66e6340) at gclosure.c:766 #26 0x00007fea6beade87 in signal_emit_unlocked_R (node=<value optimized out>, detail=0, instance=0x1fd6ce0, emission_return=0x0, instance_and_params=0x20623a0) at gsignal.c:3252 #27 0x00007fea6beb77b5 in g_signal_emit_valist (instance=<value optimized out>, signal_id=<value optimized out>, detail=<value optimized out>, var_args=<value optimized out>) at gsignal.c:2983 #28 0x00007fea6beb7983 in g_signal_emit (instance=<value optimized out>, signal_id=<value optimized out>, detail=<value optimized out>) at gsignal.c:3040 #29 0x00007fea6db5f501 in ?? () from /usr/lib64/libsoup-2.4.so.1 #30 0x00007fea6b5b9e33 in g_main_dispatch (context=0x1ef2890) at gmain.c:2149 #31 g_main_context_dispatch (context=0x1ef2890) at gmain.c:2702 #32 0x00007fea6b5ba610 in g_main_context_iterate (context=0x1ef2890, block=1, dispatch=1, self=<value optimized out>) at gmain.c:2780 #33 0x00007fea6b5bac82 in g_main_loop_run (loop=0x1ef6e90) at gmain.c:2988 #34 0x00007fea6dec70b7 in IA__gtk_main () at gtkmain.c:1237 #35 0x00000000004075df in main (argc=1, argv=0x7fffc66e6998) at luakit.c:160 === core.15944.bt === [New Thread 15986] [New Thread 15983] [New Thread 15982] [New Thread 15944] Core was generated by `/usr/bin/uzbl-core'. Program terminated with signal 11, Segmentation fault. 
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
216 62: movq %rax, %r14
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
#1 0x00007ffff37db352 in g_cond_timed_wait_posix_impl (cond=<value optimized out>, entered_mutex=<value optimized out>, abs_time=<value optimized out>) at gthread-posix.c:242
#2 0x00007ffff32e0c8f in g_async_queue_pop_intern_unlocked (queue=0x6441c0, try=0, end_time=0x7fffdba9dc10) at gasyncqueue.c:423
#3 0x00007ffff3334bf9 in g_thread_pool_wait_for_new_task (data=<value optimized out>) at gthreadpool.c:274
#4 g_thread_pool_thread_proxy (data=<value optimized out>) at gthreadpool.c:308
#5 0x00007ffff3332446 in g_thread_create_proxy (data=0x7e2100) at gthread.c:1897
#6 0x00007ffff2d73d5b in start_thread (arg=0x7fffdba9e700) at pthread_create.c:301
#7 0x00007ffff2aad27d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

The last core is bigger, generated from gdb with gcore as "ulimit -c unlimited" cut the cheese. Please find it on http://pcarrier.fedorapeople.org/core.15944.xz if needed, and ping me once I can safely remove it :)

Cheers,
-- Pierre

Reproduced with webkitgtk-1.3.5-1.fc14 from update FEDORA-2010-17027.

huh. uzbl and surf both work fine here.

Did they work ok with previous webkitgtk versions? Which one(s)?

Can you create a new fresh user and run them from there and see if they crash (i.e., is it a user config issue, or something system wide?)

Hmm. I'm using a webkitgtk built here to remove dependencies on things I don't use (geolocation, gstreamer, and gobject-introspection) based on Fedora's 1.3.4 and haven't seen it with that one. I was getting crashes from 1.3.5 and was planning on doing debugging this week (I left negative karma on the 1.3.5 update).

> Did they work ok with previous webkitgtk versions? Which one(s)?
No, I was using chromium on this install so far. I installed webkitgtk while packaging luakit.

> Can you create a new fresh user and run them from there and see if they crash
Created a new user, reproduced the issue.

Odd. I am not sure what's going on here... x86_64 and all those work fine for me here. ;(

There must be some variable between all our installs, but not sure what it could be.

Hi Kevin,

What can I do to help you investigate further?

Not sure. It's most puzzling that it doesn't happen to me.

If you pull down a f14 x86_64 live image and boot off of it, does the crash still happen there?

Ben: You are seeing the exact same crash? or something else?

FYI it's in a VMware Fusion 2 virtual machine.

It seems limited to this system. I made sure I only have F14 packages, nothing from updates-testing, rebooted just in case, still the same issue.

Tried on a new install on a physical machine, I don't have the issue.

Tried with a new user, with a new homedir, in a Gnome session (after yum groupinstall 'Gnome Desktop Environment').

I'm keeping the VM for as long as needed, I'd really like to get to the bottom of this.

Well, I'm not sure. I don't have any vmware here, so I can't duplicate this. ;(

Perhaps it's worth going now to webkitgtk upstream and seeing if they can make anything out of this?

(In reply to comment #12)
> Ben: You are seeing the exact same crash? or something else?
Won't be able to debug until tomorrow at the earliest.

Well, seeing as 1.3.6 has made 1.3.5 obsolete and I am not seeing crashes with it, I'd call this "fixed". Of course my issue may not have been related. Never got around to debugging it fully. Can the reporter test 1.3.6 to see if it fixes things? I think I may have one machine on 1.3.4 yet if not to check my crashes.

Reproduced with webkitgtk-1.3.6-1.fc14.x86_64 in the same VM.

Both gwibber (bug 654677) and liferea (bug 655459) crash for me on start with very similar traceback. There's also a crash in epiphany (bug 651796). I suggest we reassign this bug to Webkit.

Crashing code:

    // Cook up an address to allocate at, using the following recipe:
    //   17 bits of zero, stay in userspace kids.
    //   26 bits of randomness for ASLR.
    //   21 bits of zero, at least stay aligned within one level of the pagetables.
    //
    // But! - as a temporary workaround for some plugin problems (rdar://problem/6812854),
    // for now instead of 2^26 bits of ASLR lets stick with 25 bits of randomization plus
    // 2^24, which should put up somewhere in the middle of usespace (in the address range
    // 0x200000000000 .. 0x5fffffffffff).
    intptr_t randomLocation = arc4random() & ((1 << 25) - 1);
    randomLocation += (1 << 24);
    randomLocation <<= 21;
    m_base = mmap(reinterpret_cast<void*>(randomLocation), m_totalHeapSize, INITIAL_PROTECTION_FLAGS, MAP_PRIVATE | MAP_ANON, VM_TAG_FOR_EXECUTABLEALLOCATOR_MEMORY, 0);
    if (!m_base)
        CRASH();

OH MY GOD

Rebooting under kernel-2.6.34.7-61.fc13.x86_64 fixes the issue.

I take it back. Applications can launch once or twice after reboot, then continue crashing as usual, even with F13 kernel.

Pete: are you in a vmware guest there? This bug seemed to be restricted to vmware guests...

I use no hypervisor.

Pete: you are seeing this with any version of webkitgtk? Have you tried 1.3.6 from updates-testing?

I'm already on webkitgtk-1.3.6-1.fc15, there's nothing newer in Koji.

Pete: You are seeing this on rawhide? or f14? or both?

Crashing started after upgrade from F13 to F14. I replaced the webkitgtk with one from Rawhide just in case, but it seems to be identical.

I looked closer at the tip version we're shipping (webkitgtk-1.3.6-1.fc15), and contrary to what I quoted in comment #19, Apple's ASLR appears to be disabled in the new code. The mmap should be invoked with 0 for address. Same crash though. So, I suspect it's the new heap size that blows us up, not the location.

*** Bug 654677 has been marked as a duplicate of this bug. ***

*** Bug 655459 has been marked as a duplicate of this bug. ***

As a quick workaround, I reduced the fixed size from 2GB to 1GB, with this patch:

--- webkit-1.3.6/JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp 2010-09-10 03:04:16.000000000 -0600
+++ webkit-1.3.6-p3/JavaScriptCore/jit/ExecutableAllocatorFixedVMPool.cpp 2010-11-24 19:33:15.920310398 -0700
@@ -40,7 +40,8 @@
 #if CPU(X86_64)
 // These limits suitable on 64-bit platforms (particularly x86-64, where we require all jumps to have a 2Gb max range).
- #define VM_POOL_SIZE (2u * 1024u * 1024u * 1024u) // 2Gb
+ /* #define VM_POOL_SIZE (2u * 1024u * 1024u * 1024u) */ // 2Gb
+ #define VM_POOL_SIZE (1u * 1024u * 1024u * 1024u) // 1Gb - bz#648319
 #define COALESCE_LIMIT (16u * 1024u * 1024u) // 16Mb
 #else
 // These limits are hopefully sensible on embedded platforms.

RPMs available for testing from http://people.redhat.com/zaitcev/ftp/648319/

After applying 1GB patch, /proc/N/maps looks like this:

7f59347e0000-7f59348e8000 rw-p 00000000 00:00 0
7f59348e8000-7f59748e8000 rwxp 00000000 00:00 0 <-------- JIT area
7f59748e8000-7f5978000000 r--p 00000000 fd:00 25482 /usr/share/icons/gnome/icon-theme.cache
.....
7f599d6d4000-7f59a3565000 r--p 00000000 fd:00 12786 /usr/lib/locale/locale-archive
7f59a3565000-7f59a36ff000 r-xp 00000000 fd:00 12796 /lib64/libc-2.12.90.so
.....

If 2GB were requested the end would've been 7f59b48e8000 (assuming allocation up). If libc were mapped previously, the new mapping would have no space. So, it looks to me that these fixed mappings only go into a limited area (instead of being allocated all over the address space).

I guess we could file this upstream.

I am quite confused however why everything works just great for me and some others and some folks get this crash. ;(

I have a dell d820 laptop, f14, x86_64 here.

Could it be hardware differences? This crash is in the JIT compiler... I could make a build without that to confirm?

Could it be a specific set of pages that causes it for people that I simply haven't hit?

Thoughts?

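The address arithmetic in the comments above is easy to re-check by hand. The sketch below is only an illustration: the helper names (`aslrBase`, `mappingEnd`, `overlaps`, `checkExcerpt`) are made up for this note, and the constants are copied from the quoted snippet and from the /proc/N/maps excerpt. By this arithmetic a 2GB region starting where the JIT area starts would end at 0x7f59b48e8000, which lies past the libc mapping, so the two would collide if libc were already in place.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kOneGiB = 1ull << 30;
constexpr std::uint64_t kTwoGiB = 2ull << 30;

// Mirrors the three arithmetic steps of the quoted snippet for a given
// 32-bit random value (arc4random() itself is not called here).
constexpr std::uint64_t aslrBase(std::uint32_t random)
{
    return (static_cast<std::uint64_t>(random & ((1u << 25) - 1)) + (1u << 24)) << 21;
}

// End address of a mapping that starts at `start` and spans `size` bytes.
constexpr std::uint64_t mappingEnd(std::uint64_t start, std::uint64_t size)
{
    return start + size;
}

// True if [aStart, aEnd) and [bStart, bEnd) share at least one address.
constexpr bool overlaps(std::uint64_t aStart, std::uint64_t aEnd,
                        std::uint64_t bStart, std::uint64_t bEnd)
{
    return aStart < bEnd && bStart < aEnd;
}

// Addresses copied from the /proc/N/maps excerpt.
constexpr std::uint64_t kJitStart  = 0x7f59348e8000ull;
constexpr std::uint64_t kLibcStart = 0x7f59a3565000ull;
constexpr std::uint64_t kLibcEnd   = 0x7f59a36ff000ull;

// Re-checks the figures discussed in the thread; aborts if one is off.
inline void checkExcerpt()
{
    // Lowest and highest base the quoted recipe can produce.
    assert(aslrBase(0) == 0x200000000000ull);
    assert(aslrBase(0xffffffffu) == 0x5fffffe00000ull);

    // The 1GB JIT area ends where the excerpt says it does, below libc.
    assert(mappingEnd(kJitStart, kOneGiB) == 0x7f59748e8000ull);
    assert(!overlaps(kJitStart, mappingEnd(kJitStart, kOneGiB), kLibcStart, kLibcEnd));

    // A 2GB area from the same start would run into the libc mapping.
    assert(overlaps(kJitStart, mappingEnd(kJitStart, kTwoGiB), kLibcStart, kLibcEnd));
}
```

Calling `checkExcerpt()` from any test program is enough to exercise all five checks.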
Upstream is going to ask us just why 1GB works, but not 2GB, and they will be right. Something is preventing allocation of the VMA. Note that F13 kernel is exactly the same. I think in F13 we still used ASLR in webkitgtk, so we never noticed that automatic allocation would fail. This needs to get investigated in kernel. Maybe bounce the bug to kernel component?

Hmm, I've been seeing something similar but also a little different maybe. I'll put it here for another data point...

System is x86_64 / Fedora 14

I've been seeing crashes in webkitgtk

epiphany[23848]: segfault at bbadbeef ip 00000039b2ed5b13 sp 00007fffa6aeca40 error 6 in libwebkitgtk-1.0.so.0.3.1[39b1c00000+16f2000]

Stracing a simple vala webkitgtk app shows this

mmap(NULL, 2147483648, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault

These always seem to occur after having applied some updates. Or maybe that's just coincidence. My workstation is on all the time, maybe it's just the first time trying to run a webkitgtk app after the night...

What I have found that fixes the problem, until the next update/day, is restarting firefox (well, just exiting it really). If I stop firefox I can then start say epiphany, and after I've started firefox again I can still run epiphany et al.

At first I thought it was that firefox was using some library that libwebkitgtk was using got updated and was causing webkitgtk apps to try and load the wrong thing. But I don't think that really makes sense. I know firefox baulks when it gets updated while it's running due to its versioned directory installation.

Looking at when I first noticed this problem (the 23rd) and then just now and what things got updated at those times and deleted entries in /proc/firefox/maps, there isn't any common thing that stands out.

Dunno if this is a help or a hindrance...

Andrew

Hi Pete, if you're seeing this on X86_64, then it's unlikely to be our ASLR patch (linux-2.6-32bit-mmap-exec-randomization.patch) since that only mucks with get_unmapped_area if mmap_is_ia32 which is only set if 32-bit, or we're running a compat process on X86_64... I won't rule it out though... do you have selinux enabled?

Ah, gimme a minute to dust off and nuke from orbit, I'll build some images without any Fedora patch-goo.

http://kyle.fedorapeople.org/kernel/vanilla-2.6.35.9-62.fc14.bz648319/x86_64/kernel-vanilla-2.6.35.9-62.fc14.x86_64.rpm

Does it go away running that?

No, still crashes. I verified that I reinstalled stock webkitgtk-1.3.6-1.fc14 by running gwibber once (if you recall, it starts ok once) and capturing the /proc/NNNN/maps. It shows 2GB area:

7f986ac44000-7f986b054000 rw-p 00000000 00:00 0
7f986b054000-7f98eb054000 rwxp 00000000 00:00 0
7f98eb054000-7f98eb0ef000 r--p 00000000 fd:00 34140 /usr/share/fonts/dejavu/DejaVuSans-Bold.ttf

Running gwibber again crashes it.

/proc/version:
Linux version 2.6.35.9-62.fc14.x86_64 (kyle.bos.redhat.com) (gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) ) #1 SMP Fri Nov 26 12:59:36 EST 2010

Ugh, that's really weird, looks like we have an upstream problem. :( I'll poke more at this tomorrow morning.

Hrm, I wonder, could you instrument webkit to make it print /proc/self/smaps when the mmap fails? I honestly can't see a trivial way that mmap could possibly fail... the anonymous space on x86_64 for a 64-bit process is so staggeringly huge that it's unlikely we couldn't find a 2GB space even with a bunch of other anonymous mappings.

Or perhaps I'm just being dumb and missing something obvious, I'm trying to reproduce it locally now.

I suspect the VA space is somehow segmented not in our favour, or limited (the ulimit was the first thing I checked), so the allocating process is not getting the freedom. Or it may be a bug somewhere. I was looking into creating an easy reproducer and then have kernel instrumented. I'll get you the smaps dump too, that should be easy.

I've not been able to break it by mapping a bunch of shlibs and then trying the mmap either. :/

The most bizarre thing is that the reporters say it only happens intermittently (i.e., after an upgrade, but not before.) Puzzling.

Created attachment 463889 [details]
capture of /proc/13459/smaps after mmap failure
Created attachment 463890 [details]
patch used to capture dump
This is so Kyle knows exactly what's printed and when.
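
The attached patch itself is not reproduced in this thread, so the following is not that patch. It is only a minimal sketch of the idea Kyle asked for, printing /proc/self/smaps at the moment the large mmap fails, written as a standalone helper rather than as a change to WebKit. The function names `dumpFile` and `mapOrDump` are hypothetical; the mmap flags are the ones visible in the strace output quoted earlier.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

// Copies a text file such as /proc/self/smaps to `out`.
// Returns the number of bytes copied, or -1 if the file cannot be opened.
long dumpFile(const char* path, std::FILE* out)
{
    std::FILE* in = std::fopen(path, "r");
    if (!in)
        return -1;

    char buffer[4096];
    long total = 0;
    std::size_t n;
    while ((n = std::fread(buffer, 1, sizeof buffer, in)) > 0) {
        std::fwrite(buffer, 1, n, out);
        total += static_cast<long>(n);
    }
    std::fclose(in);
    return total;
}

// Attempts the same kind of anonymous read/write/execute mapping seen in
// the strace output. On failure it reports errno, dumps the process's
// current mappings to stderr, and returns nullptr instead of crashing.
void* mapOrDump(std::size_t size)
{
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        std::perror("mmap");
        dumpFile("/proc/self/smaps", stderr);
        return nullptr;
    }
    return p;
}
```

Note that the check is against `MAP_FAILED`, which is how mmap reports an error; a successful call returns the mapped address.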
*** Bug 650567 has been marked as a duplicate of this bug. ***

*** Bug 652100 has been marked as a duplicate of this bug. ***

*** Bug 651617 has been marked as a duplicate of this bug. ***

Thanks Pete! That's really weird... ~650 VMAs making up some ~760MB of heap shouldn't be enough to cause this. The virtual space on x86_64 is hundreds of GB... I'll poke more today and try to find some way to instrument to narrow this down.

--Kyle

*** Bug 661797 has been marked as a duplicate of this bug. ***

I am seeing this on F14 on a physical machine when launching DevHelp:

[89048.181768] devhelp[8829]: segfault at bbadbeef ip 0000003f48cd5b13 sp 00007fffb91a31b0 error 6 in libwebkitgtk-1.0.so.0.3.1[3f47a00000+16f2000]

mmap(NULL, 2147483648, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)

$ rpm -qa | grep webkit
webkitgtk3-1.3.3-1.fc14.x86_64
webkitgtk-1.3.6-1.fc14.x86_64

$ uname -r
2.6.35.9-64.fc14.x86_64

$ free -m
             total       used       free     shared    buffers     cached
Mem:          1974       1797        176          0         52        728
-/+ buffers/cache:       1016        957
Swap:          511          9        502

There's a scratch build of 1.3.7 here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=2658851

I have no idea if it will solve or help here, but worth trying.

*** Bug 651745 has been marked as a duplicate of this bug. ***

*** Bug 654110 has been marked as a duplicate of this bug. ***

*** Bug 649216 has been marked as a duplicate of this bug. ***

Package: midori-0.2.8-1.fc14.1
Architecture: x86_64
OS Release: Fedora release 14 (Laughlin)

How to reproduce
-----
1. Just clicking on the icon on fedora desktop
2.
3.

*** Bug 663906 has been marked as a duplicate of this bug. ***

1.3.8 scratch build:

http://koji.fedoraproject.org/koji/taskinfo?taskID=2683278

Package: midori-0.2.8-1.fc14.1
Architecture: x86_64
OS Release: Fedora release 14 (Laughlin)

How to reproduce
-----
1. On starting the application
2.
3.

Package: midori-0.2.8-1.fc14.1
Architecture: x86_64
OS Release: Fedora release 14 (Laughlin)

How to reproduce
-----
1. Launching it
2.
3.

webkitgtk-1.3.8-1.fc14 crashes as usual. Kevin, I think it makes no sense to retest all those revisions unless upstream does something about the gigantic allocation (e.g. implements a fallback).

BTW, an easy workaround if you don't want to use my RPM from comment #31 is:

echo 1 > /proc/sys/vm/overcommit_memory

Well, or just buy a couple gigs of RAM.

Sure. Filed upstream.

Package: midori-0.2.8-1.fc14.1
Architecture: x86_64
OS Release: Fedora release 14 (Laughlin)

How to reproduce
-----
1. Launch it (from the GNOME Applications -> Internet menu)
2. Midori crashes on startup
3.

Comment
-----
A few minutes ago I installed the latest midori package on my Fedora 14 64 bit computer. This was a fresh install; I haven't had midori installed before. It crashed on the first and on every repeated attempt to launch it.

*** Bug 656512 has been marked as a duplicate of this bug. ***

FYI, I saw Bug 651617 (the one from gramps that was moved to duplicate of this one) when I was running Gramps in Fedora 14 in a QEMU virtual machine (libvirt). Now I'm running Fedora 14 on the physical machine, and there is no crash of gramps.

*** Bug 666774 has been marked as a duplicate of this bug. ***

*** Bug 667056 has been marked as a duplicate of this bug. ***

I've been running into this annoying webkit bug for months when using Miro:

kernel: [471036.173304] miro.real[24792]: segfault at bbadbeef ip 0000003598ed5b13 sp 00007fffe5b49e60 error 6 in libwebkitgtk-1.0.so.0.3.1[3597c00000+16f2000]

Similar to comment #35 above, Miro will work fine for a few days (or hours) after a fresh boot, then just die and crash on start, sometimes coming back to life after restarting firefox.

I can also confirm that Pete's workaround in comment #63 works for me, so I'll be adding "echo 1 > /proc/sys/vm/overcommit_memory" to my /etc/rc.local until webkit is properly fixed.

(And Pete: not sure what the 'or buy a couple gigs of ram' quip was for, as I've got 4GB, and am running f14 x86_64.)

Oh, and if it makes a difference: I have no swap.

It's hard to tell what may be eating all that RAM on any particular system. Some of it sizes with the available RAM legitimately. The "couple of gigs" remark referred to the amount of additional allocation that Webkit JIT requests, on the theory that if system had N gigs and Webkit wanted 2, installing N+2 would work. Your example shows that it does not work like that.

I think that upstream may be amenable to a dynamically sized JIT aperture, with a divide-by-half back-off. I suppose I can try to implement it, so Kevin could slap it on RPMs and get it tested, and ask upstream to incorporate it. It should work well on OSX too, I think, which is probably upstream's main concern regarding setting arbitrary small size.

Of course if anyone else on cc: list writes such a patch, I'll be very happy too.

Package: midori-0.2.8-1.fc14.1
Architecture: x86_64
OS Release: Fedora release 14 (Laughlin)

How to reproduce
-----
1. first start of midori on my system
2. I don't know what happened
3.

Comment
-----
just wanted to start midori the first time on my system

Changing upstream bug to point to the new upstream one (the one I reported was marked a dup of that one):

https://bugs.webkit.org/show_bug.cgi?id=42756

Created attachment 473256 [details]
back-off allocation over 1.3.10
This is a counter-proposal that I sent to Gavin against Xavier's
/proc-poking code.
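
The attachment is not shown in this thread, so what follows is not the attached patch. It is a rough sketch of what a "divide-by-half back-off" could look like: ask for the full pool, and on failure halve the request until a mapping succeeds or a minimum size is passed. All names here (`Reservation`, `tryMapRwx`, `reserveWithBackoff`) are hypothetical, and the mapping attempt is passed in as a callable purely so the loop can be exercised without really reserving gigabytes.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Result of a reservation attempt.
struct Reservation {
    void* base;        // nullptr if every attempt failed
    std::size_t size;  // bytes actually reserved, 0 on failure
};

// One real attempt: an anonymous read/write/execute mapping of `size`
// bytes. Returns nullptr on failure so callers can test it directly.
inline void* tryMapRwx(std::size_t size)
{
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Tries `wanted` bytes first, then wanted/2, wanted/4, ... and stops at
// the first size that `tryMap` accepts. Sizes below `minimum` are never
// tried. `tryMap` takes a size and returns a pointer, or nullptr on failure.
template <typename TryMap>
Reservation reserveWithBackoff(std::size_t wanted, std::size_t minimum, TryMap tryMap)
{
    for (std::size_t size = wanted; size > 0 && size >= minimum; size /= 2) {
        if (void* p = tryMap(size))
            return Reservation{p, size};
    }
    return Reservation{nullptr, 0};
}
```

With the real attempt this would be called as `reserveWithBackoff(2ul << 30, 32ul << 20, tryMapRwx)`; the 32MB floor is only an example value. The caller then has to size its bookkeeping from the returned `size` rather than from a compile-time constant, which is the main design cost of this approach compared with a fixed pool.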
Package: midori-0.2.8-1.fc14.1 Architecture: x86_64 OS Release: Fedora release 14 (Laughlin) How to reproduce ----- 1. yum install midori 2.midori 3. Comment ----- first run from command line: $ midori Segmentation fault (core dumped) not pretty. :( Gavin insists that we go with Xan's patch (that reads /proc). Honestly I don't care all that much, as long as it works. (In reply to comment #77) > Gavin insists that we go with Xan's patch (that reads /proc). Honestly > I don't care all that much, as long as it works. Agreed. For goodness sakes, just get it fixed. I hit a *first execution* **segfault** on a bug that's nearly three months old, for a package I had just now installed? Seriously? *** Bug 673730 has been marked as a duplicate of this bug. *** Package: midori-0.2.9-4.fc14 Architecture: x86_64 OS Release: Fedora release 14 (Laughlin) How to reproduce ----- 1.first-time start of midori on a freshly installed and up-to-date F14 2. 3. Bitten by (likely) the same issue with liferea: $ valgrind liferea ==31960== Memcheck, a memory error detector ==31960== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al. 
==31960== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info ==31960== Command: liferea ==31960== --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 --31960-- Warning: DWARF2 CFI reader: unhandled DW_OP_ opcode 0x8 ==31960== Syscall param write(buf) points to uninitialised byte(s) ==31960== at 0x31DE40DF1D: ??? (syscall-template.S:82) ==31960== by 0x31E7408F2E: _IceTransSocketWrite (Xtranssock.c:2163) ==31960== by 0x31E740D707: _IceWrite (misc.c:352) ==31960== by 0x31E740D7F3: IceFlush (misc.c:80) ==31960== by 0x44AC7E: ??? (in /usr/bin/liferea) ==31960== by 0x44ACEB: ??? (in /usr/bin/liferea) ==31960== by 0x44B188: session_init (in /usr/bin/liferea) ==31960== by 0x4318B1: main (in /usr/bin/liferea) ==31960== Address 0xd54a82c is 12 bytes inside a block of size 1,024 alloc'd ==31960== at 0x4A04896: calloc (vg_replace_malloc.c:418) ==31960== by 0x31E7405CD8: IceOpenConnection (connect.c:210) ==31960== by 0x3B9C40270A: SmcOpenConnection (sm_client.c:135) ==31960== by 0x44B113: session_init (in /usr/bin/liferea) ==31960== by 0x4318B1: main (in /usr/bin/liferea) ==31960== ==31960== Invalid write of size 4 ==31960== at 0x399D364025: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399D362212: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399D2C6D3C: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399D2C7AB2: ??? 
(in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399D2C7B01: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399C5AA131: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399C60404B: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399C986DB1: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399C987021: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399C9800F7: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399C973E75: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== by 0x399C485CD4: ??? (in /usr/lib64/libwebkitgtk-1.0.so.0.5.2) ==31960== Address 0xbbadbeef is not stack'd, malloc'd or (recently) free'd ==31960== Liferea did receive signal 11 (Segmentation fault). ==31960== ==31960== HEAP SUMMARY: ==31960== in use at exit: 10,548,398 bytes in 154,760 blocks ==31960== total heap usage: 273,218 allocs, 118,458 frees, 30,625,751 bytes allocated ==31960== ==31960== LEAK SUMMARY: ==31960== definitely lost: 6,441 bytes in 12 blocks ==31960== indirectly lost: 17,856 bytes in 555 blocks ==31960== possibly lost: 9,362,969 bytes in 144,671 blocks ==31960== still reachable: 1,161,132 bytes in 9,522 blocks ==31960== suppressed: 0 bytes in 0 blocks ==31960== Rerun with --leak-check=full to see details of leaked memory ==31960== ==31960== For counts of detected and suppressed errors, rerun with: -v ==31960== Use --track-origins=yes to see where uninitialised values come from ==31960== ERROR SUMMARY: 13 errors from 2 contexts (suppressed: 16 from 8) $ rpm -qf /usr/lib64/libwebkitgtk-1.0.so.0.5.2 webkitgtk-1.3.10-1.fc14.x86_64 liferea-1.6.5-1.fc14.x86_64 Blah, blah, blah. I see no light at the end of this tunnel. This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component. Package: midori-0.2.9-4.fc14 Architecture: x86_64 OS Release: Fedora release 14 (Laughlin) How to reproduce ----- 1. f -> Network -> midori 2. 3. 
Comment
-----
Normal install via yum, just trying to start it.

I just ran into this bug on fedora 14 and decided to fix it on my own. Keep in mind that this is totally unsupported:
http://people.redhat.com/jleddy/webkitovercommit/

Created attachment 488938 [details]
patch from upstream reducing from 2G -> 32M

I built it with this patch from upstream:

From: Xan Lopez <xan>
Date: Wed, 12 Jan 2011 23:25:23 +0100
Subject: [PATCH] 2011-01-12 Xan Lopez <xlopez>

Reviewed by NOBODY (OOPS!).

JIT requires VM overcommit (particularly on x86-64), Linux does not by default support this without swap?
https://bugs.webkit.org/show_bug.cgi?id=42756

The FixedVMPool Allocator does not work well on systems where allocating very large amounts of memory upfront is not reasonable, like Linux without overcommit enabled. As a workaround, on Linux, default to the values used in embedded environments (in the MB range), and only jump to the GB range if we detect at runtime that overcommit is enabled. Should fix crashes on Linux/x86_64 with less than 3 or 4GB of RAM.

* jit/ExecutableAllocatorFixedVMPool.cpp:
(JSC::FixedVMPoolAllocator::free): use new variables for VM pool size and coalesce limit.
(JSC::ExecutableAllocator::isValid): swap the variables from embedded to generic values at runtime, on linux, if overcommit is enabled.
(JSC::ExecutableAllocator::underMemoryPressure): use new variables for VM pool size and coalesce limit.

*** Bug 665371 has been marked as a duplicate of this bug. ***

*** Bug 694935 has been marked as a duplicate of this bug. ***

*** Bug 700236 has been marked as a duplicate of this bug. ***

(In reply to comment #63)
> BTW, an easy workaround if you don't want to use my RPM from comment #31 is:
> echo 1 > /proc/sys/vm/overcommit_memory
>
> Well, or just buy a couple gigs of RAM.

The setting above makes gnucash reports work with the file in question w/out crashing. So I guess this is the bug I had run into...

-Scott

Package: gnucash-2.4.7-1.fc14
Architecture: x86_64
OS Release: Fedora release 14 (Laughlin)

How to reproduce
-----
1. Imported a few months worth of qif files
2. Selected Report > Income & Expense > Expense Piechart
3. Blammo.

Comment
-----
Not sure.

Same issue on a RHEL 6.2 x86_64 VMWare guest with 1.5GB RAM used as a Jenkins node. In this case, the browser is a headless Qtwebkit instance invoked through a capybara test suite. I see the current EPEL qtwebkit packages were built in May 2011, which would explain why the problem is lingering on RH/Fedora distros.

This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
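
For anyone landing here later: the runtime check described in the upstream patch earlier in this thread (small pool by default, large pool only when overcommit is detected) can be sketched roughly as below. This is a Python illustration only, not the C++ from the patch. The two sizes come from the attachment title ("2G -> 32M"). The function and constant names are hypothetical. Treating only mode 1 as "overcommit enabled" is an assumption, chosen because it matches the `echo 1 > /proc/sys/vm/overcommit_memory` workaround quoted above.

```python
from pathlib import Path

# Sizes taken from the attachment title in this thread ("reducing from 2G -> 32M").
SMALL_POOL_BYTES = 32 * 1024 * 1024
LARGE_POOL_BYTES = 2 * 1024 * 1024 * 1024

# Standard Linux procfs location of the overcommit policy.
OVERCOMMIT_PATH = "/proc/sys/vm/overcommit_memory"


def read_overcommit_mode(path=OVERCOMMIT_PATH):
    """Return the overcommit mode as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file (non-Linux), no permission, or unexpected content.
        return None


def choose_pool_size(mode):
    """Pick a pool size for the given overcommit mode.

    Assumption: only mode 1 ("always overcommit") justifies reserving the
    large pool up front; anything else, including an unreadable setting,
    falls back to the small pool.
    """
    return LARGE_POOL_BYTES if mode == 1 else SMALL_POOL_BYTES


if __name__ == "__main__":
    mode = read_overcommit_mode()
    print(f"overcommit_memory={mode}, pool size={choose_pool_size(mode)} bytes")
```

Running it on an affected machine before and after applying the workaround should show the chosen size switch from the small value to the large one, which is the behaviour the patch description promises.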