For bug 1131382, I need to add some error handling in the case of a repo being unreachable - for instance, network problems or a typo in the location. This seems pretty easy. I just take pyanaconda/packaging/rpmostreepayload.py:install and make it look like so: try: repo.pull(ostreesetup.remote, [ostreesetup.ref], 0, progress, cancellable) except GLib.GError as e: exn = PayloadInstallError("Failed to pull from repository: %s" % e) log.error(str(exn)) if errors.errorHandler.cb(exn) == errors.ERROR_RAISE: progressQ.send_quit(1) iutil.ipmi_report(IPMI_ABORTED) sys.exit(1) An important thing to note here is that this should display an error dialog to the screen and then anaconda will exit. Most of the time, what is actually happening is that the UI completely freezes. I can log the error, but any attempt at displaying an error message results in the UI being frozen. Additionally if I replace repo.pull with just raise GLib.GError("whatever"), everything works as it should. What it seems to me is that something about the error handling in the repo.pull path is causing problems for GTK. Unfortunately, I do not have any simpler way of reproducing this problem but I am hoping this is enough to start with. I am seeing this with both rawhide and RHEL7 nightly, which is where I eventually need my error handling fix to show up.
Just to add some more information about the potential cause of the issue: * the repo.pull() call is performed from a different thread than the thread running Anaconda's Gtk main loop (otherwise it would freeze the GUI) * running multiple GLib/Gtk main loops from different threads causes various issues ranging from hangs/freezes to segfaults * the ostree library seems to run and quit a GLib main loop in case repo.pull() fails which probably triggers all the hell happening in Anaconda's GUI * a library should not run a GLib/Gtk main loop * a running GLib/Gtk main loop should be detected and the idle_add function should be used to run the code that needs to run in the main loop in the existing main loop * an existing main loop should be used for stuff that needs a main loop but not particulary a new one being started
Ah yes, we need to create a new main context for the thread. (Also, with only a little bit of work I could make the pull API fully async) How about this (untested) patch? diff --git a/pyanaconda/packaging/rpmostreepayload.py b/pyanaconda/packaging/rpmostreepayload.py index 1dca94a..48b8d45 100644 --- a/pyanaconda/packaging/rpmostreepayload.py +++ b/pyanaconda/packaging/rpmostreepayload.py @@ -85,6 +85,8 @@ class RPMOSTreePayload(ArchivePayload): progressQ.send_message("Writing objects") def install(self): + mainctx = GLib.MainContext.new() + mainctx.push_thread_default() cancellable = None from gi.repository import OSTree ostreesetup = self.data.ostreesetup @@ -160,6 +162,7 @@ class RPMOSTreePayload(ArchivePayload): if os.path.isdir(srcpath): log.info("Copying bootloader data: " + fname) shutil.copytree(srcpath, os.path.join(physboot, fname)) + mainctx.pop_thread_default() def prepareMountTargets(self, storage): ostreesetup = self.data.ostreesetup
This appears to have fixed it up. Thanks! I'll deal with all the bug wrangling and committing tomorrow, but the brief outline is I'll just close this bug and add those lines as part of what's going in for bug 1131382.
If you want to do it a bit cleaner, maybe a context manager @newmaincontext More generally though, what I'd do is push a new GLib thread context wherever Anaconda spawns a thread. This will allow them to make sync (and async calls) without touching the default (i.e. Gtk) main context.
(In reply to Colin Walters from comment #4) > If you want to do it a bit cleaner, maybe a context manager > > @newmaincontext > > More generally though, what I'd do is push a new GLib thread context > wherever Anaconda spawns a thread. This will allow them to make sync (and > async calls) without touching the default (i.e. Gtk) main context. Can you please provide us a link to some documentation of the push/pop_thread_default() functions? I tried searching them in the official GLib documentation with no luck. In general, is there any documentation on how the GLib main loop and contexts work? Is it possible to do sync calls of Gtk functions from a thread without issues if new context is created for the thread? We were always told that the only thread-safe thing in the whole GLib/Gtk stack is idle_add so I'm quite confused by your suggestion.
(In reply to Vratislav Podzimek from comment #5) > Can you please provide us a link to some documentation of the > push/pop_thread_default() functions? I tried searching them in the official > GLib documentation with no luck. https://developer.gnome.org/glib/stable/glib-The-Main-Event-Loop.html#g-main-context-push-thread-default Right? > In general, is there any documentation on how the GLib main loop and > contexts work? In the same page, yes. However the simplest way to think about this is that a main loop is just a wrapper for a context with a little bit of convenience API. > Is it possible to do sync calls of Gtk functions from a > thread without issues if new context is created for the thread? No, not with Gtk. Not every GLib-based library exposes threadsafe objects, and Gtk is one that does not. A main context is just a wrapper for poll() really. It doesn't help if one's library or object has significant global state (as Gtk does, via the X server, among other things). > We were > always told that the only thread-safe thing in the whole GLib/Gtk stack is > idle_add so I'm quite confused by your suggestion. No; modern Gio APIs like GDBus are also thread-safe. Most of the Gio I/O APIs can be used sync and async from multiple threads. So are several other libraries like libsoup. However, now that I think about this more, this is still a libostree bug. The patch we're adding to Anaconda is still correct, but the libostree sync pull API shouldn't have been iterating the main context it was provided, but create a new one.
Okay, I've sent off a patch to anaconda that handles this from our end, just in case this bug doesn't end up going anywhere (especially for RHEL7). We'll be doubly protected if it does, though. Thanks again for the patch.
(In reply to Colin Walters from comment #6) > (In reply to Vratislav Podzimek from comment #5) > > > Can you please provide us a link to some documentation of the > > push/pop_thread_default() functions? I tried searching them in the official > > GLib documentation with no luck. > > https://developer.gnome.org/glib/stable/glib-The-Main-Event-Loop.html#g-main- > context-push-thread-default > > Right? Interesting I didn't see it in that page. Is it something new so that it is possible that I had some old cached version of that page? Even google says nothing about the stable docs when searching "glib push_thread_default" just about some prototyping.gnome.org stuff. > > > In general, is there any documentation on how the GLib main loop and > > contexts work? > > In the same page, yes. However the simplest way to think about this is that > a main loop is just a wrapper for a context with a little bit of convenience > API. > > > Is it possible to do sync calls of Gtk functions from a > > thread without issues if new context is created for the thread? > > No, not with Gtk. Not every GLib-based library exposes threadsafe objects, > and Gtk is one that does not. > > A main context is just a wrapper for poll() really. It doesn't help if > one's library or object has significant global state (as Gtk does, via the X > server, among other things). > > > We were > > always told that the only thread-safe thing in the whole GLib/Gtk stack is > > idle_add so I'm quite confused by your suggestion. > > No; modern Gio APIs like GDBus are also thread-safe. Most of the Gio I/O > APIs can be used sync and async from multiple threads. So are several other > libraries like libsoup. That's nice and good to now. I think we have seen some issues with GDBus because we didn't provide it a new context in non-main threads. I'll have a look at that and use new/separate contexts where possible. But still, why is GDBus not creating separate contexts for its calls on its own? > > However, now that I think about this more, this is still a libostree bug. > The patch we're adding to Anaconda is still correct, but the libostree sync > pull API shouldn't have been iterating the main context it was provided, but > create a new one. That sounds like a valid point. And if I understand it correctly we will be able to remove the added lines once it is fixed in libostree, right?
(In reply to Vratislav Podzimek from comment #8) > Interesting I didn't see it in that page. Is it something new * Since: 2.22 which is: CommitDate: Tue Sep 22 16:57:08 2009 -0400 > That's nice and good to now. I think we have seen some issues with GDBus > because we didn't provide it a new context in non-main threads. That is very likely. Particularly if you want to make async GDBus calls from a non-main thread, unless you push a new context, it will queue the reply for the default main context (i.e. the Gtk thread). > I'll have a > look at that and use new/separate contexts where possible. But still, why is > GDBus not creating separate contexts for its calls on its own? This is a highly complex topic, but the essential piece is that it maximizes flexibility for the APIs to consume the thread-default main context for async calls. Evolution for example has a main thread that makes async calls, and worker threads that also make async calls, each with independent contexts. > That sounds like a valid point. And if I understand it correctly we will be > able to remove the added lines once it is fixed in libostree, right? Yes. (But we still want Anaconda to push a new context for its worker threads)
This is in anaconda now.
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle. Changing version to '22'. More information and reason for this action is here: https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.