Version-Release number of selected component:
gnome-software-42.0-1.fc36

Additional info:
reporter: libreport-2.17.1
backtrace_rating: 3
cgroup: 0::/user.slice/user-1000.slice/user/app.slice/app-gnome-org.gnome.Software-6438.scope
cmdline: /usr/bin/gnome-software --gapplication-service
crash_function: dnf_context_invalidate_full
executable: /usr/bin/gnome-software
journald_cursor: s=b91dbf804a664a4ba9764e42ddeeefe9;i=147c0;b=91ec8075647348f283520c40867825cf;m=32d66f89a;t=5db6f7e6b4cd6;x=d148deb755509f65
kernel: 5.17.0-300.fc36.x86_64
rootdir: /
runlevel: N 5
type: CCpp
uid: 1000

Truncated backtrace:
Thread no. 1 (9 frames)
 #0 dnf_context_invalidate_full at /usr/src/debug/libdnf-0.66.0-1.fc36.x86_64/libdnf/dnf-context.cpp:2746
 #2 signal_emit_unlocked_R.isra.0 at ../gobject/gsignal.c:3743
 #5 g_cclosure_marshal_VOID__OBJECTv at ../gobject/gmarshal.c:1910
 #6 _g_closure_invoke_va at ../gobject/gclosure.c:893
 #9 ??
 #10 ??
 #14 g_main_context_iterate.constprop.0 at ../glib/gmain.c:4211
 #15 g_main_context_iteration at ../glib/gmain.c:4276
 #16 g_application_run at ../gio/gapplication.c:2569
Created attachment 1869436 File: backtrace
Created attachment 1869437 File: core_backtrace
Created attachment 1869438 File: cpuinfo
Created attachment 1869439 File: dso_list
Created attachment 1869440 File: environ
Created attachment 1869441 File: exploitable
Created attachment 1869442 File: limits
Created attachment 1869443 File: maps
Created attachment 1869444 File: mountinfo
Created attachment 1869445 File: open_fds
Created attachment 1869446 File: proc_pid_status
Created attachment 1869447 File: var_log_messages
Thanks for the bug report. If I read the backtrace properly, dnf_context_invalidate_full() is called by dnf_repo_loader_invalidate(), which is called from the signal callback dnf_repo_loader_mount_changed_cb(). That callback is connected inside dnf_repo_loader_init(), but it is not disconnected in dnf_repo_loader_finalize(). Either dnf_repo_loader_finalize() should disconnect the handler, or dnf_repo_loader_init() can use g_signal_connect_object() instead. I do not know libdnf, so this is only a rough guess.
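To illustrate the suspected lifetime problem, here is a minimal sketch in plain C. It is not libdnf code and does not use GLib; Monitor, Loader, loader_new(), loader_free() and monitor_emit() are hypothetical names standing in for the mount monitor, the repo loader and signal emission. The only point it makes is that the listener has to remove its callback from the longer-lived emitter before it is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a signal with a single handler slot. */
typedef void (*MountChangedFn)(void *user_data);

typedef struct {
    MountChangedFn callback;   /* NULL when nothing is connected */
    void *user_data;           /* passed back to the callback */
} Monitor;

/* Hypothetical stand-in for the object that listens to the signal. */
typedef struct {
    Monitor *monitor;          /* emitter, assumed to outlive the loader */
    int invalidations;         /* how often the callback ran on this loader */
} Loader;

static int total_invalidations = 0;

static void loader_mount_changed_cb(void *user_data)
{
    Loader *self = user_data;  /* dangling if the loader was already freed */
    self->invalidations++;
    total_invalidations++;
}

/* Plays the role of init(): connects the callback to the emitter. */
static Loader *loader_new(Monitor *monitor)
{
    assert(monitor != NULL);
    Loader *self = calloc(1, sizeof *self);
    if (self == NULL)
        return NULL;
    self->monitor = monitor;
    monitor->callback = loader_mount_changed_cb;
    monitor->user_data = self;
    return self;
}

/* Plays the role of finalize(): disconnects first, then frees. */
static void loader_free(Loader *self)
{
    Monitor *monitor = self->monitor;
    if (monitor->callback == loader_mount_changed_cb &&
        monitor->user_data == self) {
        monitor->callback = NULL;
        monitor->user_data = NULL;
    }
    free(self);
}

/* Emits the signal; returns 1 if a handler ran, 0 if none is connected. */
static int monitor_emit(Monitor *monitor)
{
    if (monitor->callback == NULL)
        return 0;
    monitor->callback(monitor->user_data);
    return 1;
}
```

Without the disconnect step in loader_free(), a later monitor_emit() would call the callback with a pointer to freed memory, which is the kind of use-after-free suspected here. In GLib terms, g_signal_connect_object() is meant to give the same guarantee by tying the handler to the lifetime of the listening object.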
*** Bug 2071465 has been marked as a duplicate of this bug. ***
Thank you, Milan, for the analysis. Disconnecting the signal in dnf_repo_loader_finalize() seems like the right approach; however, I have not been able to reproduce the crash. I can trigger the signal by mounting, but it has never crashed for me. Any idea how to ensure the context is NULL so that we can reproduce it?
I did not try to reproduce it myself. My idea for a reproducer would be to trigger the callback after the context is freed. That won't set the context to NULL (it cannot), but it can trigger a use-after-free, which tools like valgrind or libasan can catch. It would then show up in the valgrind log when you run your 'reproducer' under it:
$ G_SLICE=always-malloc valgrind --track-origins=yes --aspace-minaddr=0x100000000 ./reproducer
A very naive debug patch would add debug prints to the context constructor and destructor (printing the context address), with a similar print in the callback to verify that the callback was called after the object was freed.
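The debug-print idea could look roughly like the sketch below. It is plain C with hypothetical names (trace_event(), trace_find_callback_after_free(), the EV_* kinds); in a real debug patch the three trace_event() calls would sit in the context constructor, the context destructor and the signal callback. Besides printing, it keeps the events in a small buffer so the order can be checked afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { EV_NEW, EV_FREE, EV_CALLBACK } EventKind;

typedef struct {
    EventKind kind;
    void *context;             /* address only, never dereferenced */
} Event;

#define MAX_EVENTS 64
static Event trace[MAX_EVENTS];
static size_t trace_len = 0;

/* Print one event and remember it (events past MAX_EVENTS are only printed). */
static void trace_event(EventKind kind, void *context)
{
    static const char *names[] = { "new", "free", "callback" };
    assert(context != NULL);
    fprintf(stderr, "[trace] %-8s context=%p\n", names[kind], context);
    if (trace_len < MAX_EVENTS) {
        trace[trace_len].kind = kind;
        trace[trace_len].context = context;
        trace_len++;
    }
}

/*
 * Return the index of the first callback event whose context was not live
 * at that point (never created, or already freed), or -1 if there is none.
 * A later EV_NEW on the same address makes it live again, since the
 * allocator may reuse addresses.
 */
static int trace_find_callback_after_free(void)
{
    for (size_t i = 0; i < trace_len; i++) {
        if (trace[i].kind != EV_CALLBACK)
            continue;
        int live = 0;
        for (size_t j = 0; j < i; j++) {
            if (trace[j].context != trace[i].context)
                continue;
            if (trace[j].kind == EV_NEW)
                live = 1;
            else if (trace[j].kind == EV_FREE)
                live = 0;
        }
        if (!live)
            return (int)i;
    }
    return -1;
}
```

Reading the stderr output by eye is usually enough: a "callback" line that follows a "free" line with the same address, with no "new" in between, confirms the callback ran on a freed context.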
Similar problem has been detected:
Laptop was suspended at the time.

reporter: libreport-2.17.1
backtrace_rating: 4
cgroup: 0::/user.slice/user-1000.slice/user/app.slice/dbus-:1.2-org.gnome.Software
cmdline: /usr/bin/gnome-software --gapplication-service
crash_function: dnf_context_invalidate_full
executable: /usr/bin/gnome-software
journald_cursor: s=4d9209d19c614f01ad854f5039827b78;i=2b668;b=65e28c8a8b774e5185a74a4fe0ef9108;m=28f22a34c;t=5df2711603525;x=dc1b2b2159be29c5
kernel: 5.17.6-300.fc36.x86_64
package: gnome-software-42.1-1.fc36
reason: gnome-software killed by SIGSEGV
rootdir: /
runlevel: N 5
type: CCpp
uid: 1000
Created attachment 1880289 File: backtrace
*** Bug 2089723 has been marked as a duplicate of this bug. ***
Similar problem has been detected:
Nothing; crashes when starting Gnome

reporter: libreport-2.17.1
backtrace_rating: 4
cgroup: 0::/user.slice/user-1000.slice/user/app.slice/app-gnome-org.gnome.Software-1850.scope
cmdline: /usr/bin/gnome-software --gapplication-service
crash_function: dnf_context_invalidate_full
executable: /usr/bin/gnome-software
journald_cursor: s=1794480aea06419fb643eea69e6ed8ae;i=3872;b=dea4a017d28b474eb2ff022687d35f62;m=6ceeb16;t=5e1172d47ed58;x=5f88130de6d84e19
kernel: 5.17.12-300.fc36.x86_64
package: gnome-software-42.2-1.fc36
reason: gnome-software killed by SIGSEGV
rootdir: /
runlevel: N 5
type: CCpp
uid: 1000
Similar problem has been detected:
Happens just by itself, in the background. Seems to correlate with unmounting external disks, but hard to tell.

reporter: libreport-2.17.1
backtrace_rating: 4
cgroup: 0::/user.slice/user-1000.slice/user/app.slice/app-gnome-org.gnome.Software-2974.scope
cmdline: /usr/bin/gnome-software --gapplication-service
crash_function: dnf_context_invalidate_full
executable: /usr/bin/gnome-software
journald_cursor: s=f85d087d84bf49beaa6f50595ce40b07;i=132600;b=19fecc490cf844c584dc2de551dc5b1f;m=37155c1;t=5e46a68152b3a;x=912e68afefdf7bf7
kernel: 5.18.11-200.fc36.x86_64
package: gnome-software-42.3-1.fc36
reason: gnome-software killed by SIGSEGV
rootdir: /
runlevel: N 5
type: CCpp
uid: 1000
*** Bug 2131281 has been marked as a duplicate of this bug. ***
@amatej There are people hitting this, not to mention that a use-after-free is a bad thing. CVEs are usually filed for such things, because their consequences can be disastrous (for example, the stale access can modify an unpredictable part of the program). I can understand the need for a reproducer, to be able to confirm that the change really works, but I have nothing other than valgrind or a custom build with debug prints to confirm it. For such debug prints, I'd add a print at the place where the signal is connected and connect two handlers there: one that prints that the signal handler was called and with which context, and the original one. Then I'd add a print in the context free function. The sequence of the prints will show what was done when.
Thank you very much for highlighting the issue. The issue is in the part that originates in LIBHIF, a library of PackageKit. I know that the issue is in the LIBDNF library, but I guess that only experts on PackageKit and its workflows could resolve it, because there is no reproducer and no other hint from valgrind. Please take this as a request for help; I am changing the component to PackageKit.
Created attachment 1916468 reproducer
The first line contains a comment on how to compile & run it. Current output:
> Before first pass
> After first pass
>
> (process:67130): GLib-GObject-WARNING **: 14:29:43.721: instance with invalid (NULL) class pointer
>
> (process:67130): GLib-GObject-CRITICAL **: 14:29:43.721: g_signal_emit_valist: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
> Segmentation fault (core dumped)
while it should end with "After second pass", with no runtime warnings and no crash.
Thank you again for helping @mcrha! I made a PR for libdnf: https://github.com/rpm-software-management/libdnf/pull/1582
FEDORA-2023-d3d8f7571a has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-d3d8f7571a
FEDORA-2023-05391614c9 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2023-05391614c9
FEDORA-2023-d3d8f7571a has been pushed to the Fedora 38 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-d3d8f7571a` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-d3d8f7571a See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-05391614c9 has been pushed to the Fedora 37 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-05391614c9` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-05391614c9 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2023-d3d8f7571a has been pushed to the Fedora 38 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2023-05391614c9 has been pushed to the Fedora 37 stable repository. If problem still persists, please make note of it in this bug report.