Created attachment 1366093
Trace generated by gnome-weather

Description of problem:
gnome-weather gives a segmentation fault on startup if libproxy-mozjs is installed.

Version-Release number of selected component (if applicable):
gnome-weather-3.26.0-2.fc27.noarch
libproxy-mozjs-0.4.15-4.fc27.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install libproxy-mozjs
2. Start gnome-weather

Actual results:
Segmentation fault

Expected results:
Should give the weather forecast

Additional info:
Uninstalling libproxy-mozjs causes gnome-weather to work properly.

The problem seems to have been fixed upstream with the commit "Add symbol versions - be ready to introduce new APIs as needed". I applied this patch to 0.4.15 and gnome-weather works even if libproxy-mozjs is installed.

https://github.com/libproxy/libproxy/commit/2fb80a007a1de4cf8603d7f686fdbaf46ded961a

I have only triggered this with gnome-weather, but I guess it affects quite a lot of other GNOME software as well (anywhere libproxy is used). That is why I've given it a high severity. I've seen lots of tracebacks which look a bit like mine, but start with a different program.
The problem persists in rawhide.
I recommend reporting this upstream at https://github.com/libproxy/libproxy/issues, since nobody will notice it here
But it was already fixed upstream _before_ I reported it here. I even gave a link to the commit, which fixes it! I thought someone might apply that fix on Fedora. The fix works for me.
Ah wow cool, the problem here is that gjs and libproxy are loading incompatible versions of mozjs in the same process: it's a guaranteed explosion. I've been rambling on about this theoretical possibility for a while now, and was confused why nobody had ever reported it before. This is actually the first bug report I've seen.

You must be trying to run gnome-weather outside of GNOME, right? In that case, libproxy gets dlopened by glib-networking at runtime, then libproxy dlopens mozjs38, and meanwhile gjs of course links to mozjs52... boom. The fact that more people have not reported this probably indicates how rare it is that people try to run gjs applications outside of GNOME.

So the fact that the crashes went away after that commit is actually a really bad sign. I haven't investigated, but I suspect it means libproxy can't load its own builtin modules anymore, perhaps because the symbols are no longer exported. It looks like libproxy doesn't have any testsuite and hasn't had a release since that commit, which might explain why this hasn't been noticed yet. So we should definitely not backport that commit to Fedora.

And this would actually be a *really good* issue to report upstream, because this entire situation is nuts. There is https://github.com/libproxy/libproxy/issues/71 but that doesn't solve the underlying problem, which is that dlopening mozjs is inherently unsafe as applications don't expect it. More relevant discussion in: https://github.com/libproxy/libproxy/issues/68

But I would recommend reporting a new issue specifically for this. Actually, two new issues: one for the crash, one for the symbol versioning breaking the module loader.
I'll open them
Yes, I'm running gnome-weather outside of GNOME. So I guess the problem doesn't occur when gnome-weather starts within GNOME... at least not in this specific way.

From reading your post here and the ones you linked to, it sounds like quite a deep problem, of the sort that gives weird, apparently unconnected problems all over the place. Clearly the "it works for me" workaround is not generally useful. It would be nice if a proper fix could be done.

Indeed, if I run strace with the workaround applied, I see libmozjs-52 is opened once and libmozjs-38 twice:

openat(AT_FDCWD, "/lib64/libmozjs-52.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libmozjs-38.so", O_RDONLY|O_CLOEXEC) = 12
openat(AT_FDCWD, "/lib64/libmozjs-38.so", O_RDONLY|O_CLOEXEC) = 12

Without the workaround it only loads each one once. I suspect you are right that adding the symbol versions has just broken the loading of libmozjs-38.so. If I delete this shared object, gnome-weather works, with or without the workaround. So libmozjs-38.so is definitely causing the trouble.

I guess that if libproxy could be ported to libmozjs-52, it would be safe, but some mechanism is needed to prevent similar breakage in the future.

I guess px_proxy_factory_get_proxies needs to be made thread safe. In the patch you posted in issue 68, the code it removes is definitely not thread safe. There's a race between the delete this->pr; and the create immediately after. If it gets preempted between the two by a thread on the same path, this->pr is not NULL but points to a deleted object, and bang! Probably hard to trigger, but not impossible. The replacement certainly looks safer. However, I suspect this is only the tip of the iceberg. It doesn't look like it is ever safe to run two threads with libproxy simultaneously.

I think this problem is separate from the original one I had, though. I.e. issue 68 isn't an issue with the port to libmozjs-52 but a general multithreading issue.
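For anyone who wants to repeat the strace check on another application, here is a minimal Python sketch that counts successful openat() calls per libmozjs version in captured strace output. The function name count_mozjs_opens and the regular expression are my own illustration, not part of libproxy or strace; the sample input is the three lines quoted in the comment above. More than one distinct version in the result means two mozjs builds were opened by the same traced process.

```python
import re
from collections import Counter

# Matches a successful openat() on a libmozjs shared object and captures the
# version number in the file name. Failed calls (result -1) do not match.
OPENAT_RE = re.compile(
    r'openat\([^,]+,\s*"[^"]*/libmozjs-(\d+)\.so[^"]*",[^)]*\)\s*=\s*\d+'
)


def count_mozjs_opens(strace_text):
    """Return a Counter mapping mozjs version -> number of successful opens."""
    return Counter(m.group(1) for m in OPENAT_RE.finditer(strace_text))


# Sample input: the strace lines quoted in the comment above.
SAMPLE = '''\
openat(AT_FDCWD, "/lib64/libmozjs-52.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libmozjs-38.so", O_RDONLY|O_CLOEXEC) = 12
openat(AT_FDCWD, "/lib64/libmozjs-38.so", O_RDONLY|O_CLOEXEC) = 12
'''

counts = count_mozjs_opens(SAMPLE)
# Two different versions opened by one process is the dangerous combination.
conflict = len(counts) > 1
print(dict(counts), "conflict:", conflict)
```

Note that this only shows the files were opened, not that both ended up mapped and initialised; it is a quick first check rather than proof.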
(In reply to nvwarr from comment #6)
> I guess px_proxy_factory_get_proxies needs to be made thread safe. In the
> patch you posted in issue 68, the code it removes is definitely not thread
> safe. There's a race between the delete this->pr; and the create immediately
> after. If it gets preempted between the two, by a thread on the same path,
> this->pr is not NULL, but points to a deleted object and bang!

px_proxy_factory_get_proxies() uses a mutex already. The problem isn't a race condition with anything else in libproxy, it's just that the webkit/mozjs APIs get upset if you create a JSGlobalContextRef/JSContext object in one thread, and then later use it from a different thread, even if you aren't using it in more than one thread at the same time. (In the mozjs case, this is because it explicitly checks that you're calling from the right thread. I'm not sure if webkit does the same thing, or if it uses other threads internally and ends up racing with itself when you misuse it.)
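To make the distinction between "protected by a mutex" and "used from the creating thread" concrete, here is a toy Python sketch. ThreadBoundContext is a hypothetical stand-in, not a mozjs or webkit API; it only imitates the kind of owner-thread check described above. Both calls below hold the same lock, so they never overlap, yet the second one still fails because it runs on a different thread.

```python
import threading


class ThreadBoundContext:
    """Toy stand-in for a JS context that must stay on its creating thread."""

    def __init__(self):
        # Remember which thread created the context.
        self._owner = threading.get_ident()

    def evaluate(self, expr):
        # Imitates an explicit "called from the right thread?" check.
        if threading.get_ident() != self._owner:
            raise RuntimeError("context used from a thread other than its creator")
        return f"evaluated {expr!r}"


lock = threading.Lock()
ctx = ThreadBoundContext()  # created on the main thread
errors = []


def worker():
    # The lock serialises access, but does not change which thread is calling.
    with lock:
        try:
            ctx.evaluate("FindProxyForURL(url, host)")
        except RuntimeError as exc:
            errors.append(str(exc))


with lock:
    main_result = ctx.evaluate("FindProxyForURL(url, host)")  # same thread: fine

t = threading.Thread(target=worker)
t.start()
t.join()
print(main_result, errors)
```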
This message is a reminder that Fedora 27 is nearing its end of life. On 2018-Nov-30 Fedora will stop maintaining and issuing updates for Fedora 27. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '27'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 27 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
(In reply to Dan Winship from comment #7)
> (In the mozjs case, this is because it explicitly checks that you're calling
> from the right thread. I'm not sure if webkit does the same thing, or if it
> uses other threads internally and ends up racing with itself when you misuse
> it.)

For the record: if you want to use JSC in multiple threads, you indeed need to spin up a new JSC VM for each thread, and never share anything between threads.
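The "one VM per thread, nothing shared" rule can be sketched with thread-local storage. This is only an illustration in Python: ToyVM and vm_for_current_thread are hypothetical names, and no real JSC or mozjs binding is involved. Each thread lazily gets its own instance and never sees another thread's.

```python
import threading


class ToyVM:
    """Hypothetical stand-in for a per-thread JS VM."""

    def __init__(self):
        self.owner = threading.get_ident()


_local = threading.local()


def vm_for_current_thread():
    """Return this thread's VM, creating it on first use."""
    vm = getattr(_local, "vm", None)
    if vm is None:
        vm = _local.vm = ToyVM()
    return vm


vms = {}      # kept only so the result can be inspected afterwards
owner_ok = {}


def worker(name):
    vm = vm_for_current_thread()
    vms[name] = vm
    # The VM handed out was created by the thread that is using it.
    owner_ok[name] = vm.owner == threading.get_ident()


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(vms["a"] is vms["b"], owner_ok)
```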
libproxy-0.4.15-10.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-7285a847c4
libproxy-0.4.15-10.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-7285a847c4
That update should fix it for F29, yes, but it will break again in a subsequent Fedora release unless we change libproxy to use a separate process for loading mozjs
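As a rough sketch of the separate-process idea, the Python below runs the risky evaluation in a child process and talks to it over stdin/stdout. Everything here is an assumption made for illustration: HELPER_SOURCE is a placeholder where a real helper would load the JS engine and evaluate the PAC script, and get_proxies_out_of_process is a hypothetical name, not a libproxy function. The point is only that a crash or a conflicting library load in the helper cannot take down the calling application.

```python
import json
import subprocess
import sys

# Hypothetical helper program. A real one would load the JS engine and run the
# PAC script here; this placeholder just answers "no proxy" for any URL.
HELPER_SOURCE = r'''
import json, sys
request = json.load(sys.stdin)
json.dump({"url": request["url"], "proxies": ["direct://"]}, sys.stdout)
'''


def get_proxies_out_of_process(url, timeout=5.0):
    """Ask the helper process for proxies; fall back to direct on failure."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", HELPER_SOURCE],
            input=json.dumps({"url": url}),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return ["direct://"]
    if proc.returncode != 0:
        # The helper died (for example on a segfault); the caller survives.
        return ["direct://"]
    return json.loads(proc.stdout)["proxies"]


result = get_proxies_out_of_process("http://example.com/")
print(result)
```

The cost is a process spawn per lookup in this naive form; a long-lived helper would avoid that, at the price of more lifecycle handling.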
This bug appears to have been reported against 'rawhide' during the Fedora 30 development cycle. Changing version to '30'.
libproxy-0.4.15-10.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.