Bug 1841855

| Field | Value |
| --- | --- |
| Summary | [LLNL 7.8 Bug] gnome-keyring-daemon eating 100% CPU |
| Product | Red Hat Enterprise Linux 7 |
| Reporter | Trent D'Hooge <tdhooge> |
| Component | gnome-keyring |
| Assignee | David King <dking> |
| Status | CLOSED MIGRATED |
| QA Contact | Desktop QE <desktop-qa-list> |
| Severity | high |
| Priority | unspecified |
| Version | 7.8 |
| CC | amike, brian.p.millett, desktop-qa-list, dima, dking, fernando, jdreese, marcel.nijenhof, mboisver, mcrha, morenodelgad1, pasteur, tgummels, tpelka, tpopela, vpakolu |
| Target Milestone | rc |
| Keywords | MigratedToJIRA, Triaged |
| Hardware | x86_64 |
| OS | Linux |
| Doc Type | If docs needed, set a value |
| Clones | 2144712 (view as bug list) |
| Last Closed | 2023-09-25 23:01:56 UTC |
| Type | Bug |
| Bug Blocks | 2144712 |
| Attachments | screenshot, Backtraces |
Description
Trent D'Hooge
2020-05-29 16:24:56 UTC
Bugzilla that seems to be related: https://bugzilla.redhat.com/show_bug.cgi?id=1636330

Could you possibly help test this issue with the new RPMs?

(In reply to Michael Boisvert from comment #6)
> Could you possibly help test this issue with the new RPMs?

Hello,

I've put the RPMs where Trent can get them and have asked for testing feedback.

Thank you for your prompt work on this bug!

Travis

(In reply to Travis Gummels from comment #7)
> I've put the RPMs where Trent can get them and have asked for testing
> feedback.

Feedback from LLNL is that so far it looks good.

Travis

Thanks, let me know if anything changes.

Created attachment 1696455 [details]
screenshot
When I update to gnome-keyring-3.28.2-2.el7.x86_64, restart the machine, and log in to GNOME, I'm greeted with a prompt to unlock the keyring (see screenshot). When I downgrade to gnome-keyring-3.28.2-1.el7.x86_64 the prompt is gone. Checking in Seahorse, the 'Login' keyring is unlocked properly with the previous version of gnome-keyring.
We have been testing the new packages. We rebooted the servers, which forced everyone to re-login. When doing this, we found the problem is not fixed with the test packages. We do not know how to fix the issue as of right now, so this bug is going to have to be pushed to z-stream.

I have been able to work around this issue for now by implementing a cron job that kills any gnome-keyring-daemon process using more than 20% CPU.

Does the patch implemented in Fedora not work in this case?

Which patch was given to us?

I'm seeing the same bug in Fedora 33, with gnome-keyring-3.36.0-4.fc33.x86_64 and gnome-keyring-pam-3.36.0-4.fc33.x86_64.

Still happening in Fedora 34. Not only using 100% CPU, but also preventing anything that relies on it from working. E.g., "ssh" just hangs.

(In reply to Dima Ryazanov from comment #17)
> Still happening in Fedora 34. Not only using 100% CPU, but also preventing
> anything that relies on it from working. E.g., "ssh" just hangs.

I'm wondering whether the backtrace of the daemon in this state is different in Fedora 34. Could you install the debuginfo packages for glib2, libsecret and gnome-keyring, then, when it happens, grab a backtrace, please? You can get the backtrace with a command like this:

$ gdb --batch --ex "t a a bt" --pid=`pidof gnome-keyring-daemon` &>bt.txt

Please check the bt.txt for any private information, like passwords, email addresses, server addresses, and so on. I usually search for "pass" at least (quotes for clarity only).

I didn't spot the high CPU usage for some time now.

Oh damn, I finally killed it just minutes before your response.
Here's what I got from gdb earlier, but without debuginfo, so maybe not that useful:

(gdb) t apply all bt
Thread 5 (LWP 388865 "pool-gnome-keyr"):
#0 0x00007f277b193178 in g_mutex_unlock () at /lib64/libglib-2.0.so.0
#1 0x00007f277b199249 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#2 0x00007f277b142c03 in g_main_context_iteration () at /lib64/libglib-2.0.so.0
#3 0x000055bdb6731b39 in gkd_ssh_agent_process_connect ()
#4 0x000055bdb673594e in on_run.lto_priv ()
#5 0x00007f277b2fae61 in _g_cclosure_marshal_BOOLEAN__OBJECT_OBJECTv () at /lib64/libgio-2.0.so.0
#6 0x00007f277b25b83a in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#7 0x00007f277b25b983 in g_signal_emit () at /lib64/libgobject-2.0.so.0
#8 0x00007f277b32ffd2 in g_threaded_socket_service_func () at /lib64/libgio-2.0.so.0
#9 0x00007f277b176b24 in g_thread_pool_thread_proxy.lto_priv () at /lib64/libglib-2.0.so.0
#10 0x00007f277b173c32 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#11 0x00007f277af9c299 in start_thread () at /lib64/libpthread.so.0
#12 0x00007f277aec46a3 in clone () at /lib64/libc.so.6

Thread 4 (LWP 387974 "timer"):
#0 0x00007f277aebf15d in syscall () at /lib64/libc.so.6
#1 0x00007f277b193243 in g_cond_wait () at /lib64/libglib-2.0.so.0
#2 0x000055bdb678840f in timer_thread_func.lto_priv ()
#3 0x00007f277b173c32 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4 0x00007f277af9c299 in start_thread () at /lib64/libpthread.so.0
#5 0x00007f277aec46a3 in clone () at /lib64/libc.so.6

Thread 3 (LWP 387831 "gdbus"):
#0 0x00007f277aeb99ff in poll () at /lib64/libc.so.6
#1 0x00007f277b19947c in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#2 0x00007f277b144a93 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3 0x00007f277b395d2a in gdbus_shared_thread_func.lto_priv () at /lib64/libgio-2.0.so.0
#4 0x00007f277b173c32 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5 0x00007f277af9c299 in start_thread () at /lib64/libpthread.so.0
#6 0x00007f277aec46a3 in clone () at /lib64/libc.so.6

Thread 2 (LWP 387830 "gmain"):
#0 0x00007f277aeb99ff in poll () at /lib64/libc.so.6
#1 0x00007f277b19947c in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#2 0x00007f277b142c03 in g_main_context_iteration () at /lib64/libglib-2.0.so.0
#3 0x00007f277b142c51 in glib_worker_main () at /lib64/libglib-2.0.so.0
#4 0x00007f277b173c32 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5 0x00007f277af9c299 in start_thread () at /lib64/libpthread.so.0
#6 0x00007f277aec46a3 in clone () at /lib64/libc.so.6

Thread 1 (LWP 387829 "gnome-keyring-d"):
#0 0x00007f277aebf15d in syscall () at /lib64/libc.so.6
#1 0x00007f277b19315c in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2 0x000055bdb67318ef in on_child_watch ()
#3 0x00007f277b141708 in g_child_watch_dispatch () at /lib64/libglib-2.0.so.0
#4 0x00007f277b1454cf in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#5 0x00007f277b1994e8 in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#6 0x00007f277b144a93 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#7 0x000055bdb6714d3f in main ()

I'll get a proper one next time it happens.

That's okay. It looks like it's the ssh part (gkd_ssh_agent_process_connect), which you also named in one of your examples. I'm wondering whether it would make sense to get a series of backtraces, to see which thread does something and eventually what it is. I do not know whether it will be helpful for the gnome-keyring developers. In any case, the command can look like this:

for i in {1..9}; do gdb --batch --ex "t a a bt" --pid=`pidof gnome-keyring-daemon` &>bt$i.txt; sleep 0.1; done

Created attachment 1786157 [details]
Backtraces
Just reproduced it again; attaching a zip with 9 backtraces.
Some distinct ones I noticed:
bt1.txt:
Thread 5 (LWP 110589 "pool-gnome-keyr"):
#0 0x00007f0a0655819d in g_mutex_lock (mutex=mutex@entry=0x55fc98213cd0) at ../glib/gthread-posix.c:1515
#1 0x00007f0a0655e28f in g_main_context_iterate.constprop.0 (context=context@entry=0x55fc98213cd0, block=block@entry=0, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4085
#2 0x00007f0a06507c03 in g_main_context_iteration (context=0x55fc98213cd0, context@entry=0x0, may_block=may_block@entry=0) at ../glib/gmain.c:4196
#3 0x000055fc96677b39 in gkd_ssh_agent_process_connect (self=0x55fc98219c40, cancellable=0x55fc98219ce0, error=error@entry=0x7f09eaace758) at daemon/ssh-agent/gkd-ssh-agent-process.c:232
#4 0x000055fc9667b94e in on_run (service=service@entry=0x55fc98229320, connection=connection@entry=0x55fc982712b0, source_object=source_object@entry=0x0, user_data=0x7f09eaace760) at daemon/ssh-agent/gkd-ssh-agent-service.c:297
#5 0x00007f0a066bfe61 in _g_cclosure_marshal_BOOLEAN__OBJECT_OBJECTv (closure=0x55fc98234310, return_value=0x7f09eaace950, instance=<optimized out>, args=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=0x55fc98233db0) at ../gio/gmarshal-internal.c:333
#6 0x00007f0a0662083a in _g_closure_invoke_va (param_types=0x55fc98233db0, n_params=<optimized out>, args=0x7f09eaacea00, instance=0x55fc98229320, return_value=0x7f09eaace950, closure=0x55fc98234310) at ../gobject/gclosure.c:873
#7 g_signal_emit_valist (instance=0x55fc98229320, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7f09eaacea00) at ../gobject/gsignal.c:3406
#8 0x00007f0a06620983 in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=detail@entry=0) at ../gobject/gsignal.c:3553
#9 0x00007f0a066f4fd2 in g_threaded_socket_service_func (job_data=0x55fc982665a0, user_data=<optimized out>) at ../gio/gthreadedsocketservice.c:96
#10 0x00007f0a0653bb24 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/gthreadpool.c:354
bt4.txt:
Thread 5 (LWP 110589 "pool-gnome-keyr"):
#0 0x00007f0a06557800 in g_private_get_impl (key=0x7f0a065ed4c0 <g_thread_specific_private>) at ../glib/gthread-posix.c:1053
#1 g_private_get (key=0x7f0a065ed4c0 <g_thread_specific_private>) at ../glib/gthread-posix.c:1082
#2 0x00007f0a0653a3f8 in g_thread_self () at ../glib/gthread.c:1036
#3 0x00007f0a065077e0 in g_main_context_acquire (context=0x55fc98213cd0) at ../glib/gmain.c:3399
#4 0x00007f0a0655e281 in g_main_context_iterate.constprop.0 (context=context@entry=0x55fc98213cd0, block=block@entry=0, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4081
#5 0x00007f0a06507c03 in g_main_context_iteration (context=0x55fc98213cd0, context@entry=0x0, may_block=may_block@entry=0) at ../glib/gmain.c:4196
#6 0x000055fc96677b39 in gkd_ssh_agent_process_connect (self=0x55fc98219c40, cancellable=0x55fc98219ce0, error=error@entry=0x7f09eaace758) at daemon/ssh-agent/gkd-ssh-agent-process.c:232
#7 0x000055fc9667b94e in on_run (service=service@entry=0x55fc98229320, connection=connection@entry=0x55fc982712b0, source_object=source_object@entry=0x0, user_data=0x7f09eaace760) at daemon/ssh-agent/gkd-ssh-agent-service.c:297
#8 0x00007f0a066bfe61 in _g_cclosure_marshal_BOOLEAN__OBJECT_OBJECTv (closure=0x55fc98234310, return_value=0x7f09eaace950, instance=<optimized out>, args=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=0x55fc98233db0) at ../gio/gmarshal-internal.c:333
#9 0x00007f0a0662083a in _g_closure_invoke_va (param_types=0x55fc98233db0, n_params=<optimized out>, args=0x7f09eaacea00, instance=0x55fc98229320, return_value=0x7f09eaace950, closure=0x55fc98234310) at ../gobject/gclosure.c:873
#10 g_signal_emit_valist (instance=0x55fc98229320, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7f09eaacea00) at ../gobject/gsignal.c:3406
#11 0x00007f0a06620983 in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=detail@entry=0) at ../gobject/gsignal.c:3553
#12 0x00007f0a066f4fd2 in g_threaded_socket_service_func (job_data=0x55fc982665a0, user_data=<optimized out>) at ../gio/gthreadedsocketservice.c:96
#13 0x00007f0a0653bb24 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/gthreadpool.c:354
#14 0x00007f0a06538c32 in g_thread_proxy (data=0x7f09f8016ea0) at ../glib/gthread.c:826
bt6.txt:
#0 0x00007ffedebf7a7a in clock_gettime ()
#1 0x00007f0a06246a25 in clock_gettime () at /lib64/libc.so.6
#2 0x00007f0a0655e265 in sysprof_clock_get_current_time () at /usr/include/sysprof-4/sysprof-clock.h:88
#3 g_main_context_iterate.constprop.0 (context=context@entry=0x55fc98213cd0, block=block@entry=0, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4079
#4 0x00007f0a06507c03 in g_main_context_iteration (context=0x55fc98213cd0, context@entry=0x0, may_block=may_block@entry=0) at ../glib/gmain.c:4196
#5 0x000055fc96677b39 in gkd_ssh_agent_process_connect (self=0x55fc98219c40, cancellable=0x55fc98219ce0, error=error@entry=0x7f09eaace758) at daemon/ssh-agent/gkd-ssh-agent-process.c:232
#6 0x000055fc9667b94e in on_run (service=service@entry=0x55fc98229320, connection=connection@entry=0x55fc982712b0, source_object=source_object@entry=0x0, user_data=0x7f09eaace760) at daemon/ssh-agent/gkd-ssh-agent-service.c:297
#7 0x00007f0a066bfe61 in _g_cclosure_marshal_BOOLEAN__OBJECT_OBJECTv (closure=0x55fc98234310, return_value=0x7f09eaace950, instance=<optimized out>, args=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=0x55fc98233db0) at ../gio/gmarshal-internal.c:333
#8 0x00007f0a0662083a in _g_closure_invoke_va (param_types=0x55fc98233db0, n_params=<optimized out>, args=0x7f09eaacea00, instance=0x55fc98229320, return_value=0x7f09eaace950, closure=0x55fc98234310) at ../gobject/gclosure.c:873
#9 g_signal_emit_valist (instance=0x55fc98229320, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7f09eaacea00) at ../gobject/gsignal.c:3406
#10 0x00007f0a06620983 in g_signal_emit (instance=<optimized out>, signal_id=<optimized out>, detail=detail@entry=0) at ../gobject/gsignal.c:3553
#11 0x00007f0a066f4fd2 in g_threaded_socket_service_func (job_data=0x55fc982665a0, user_data=<optimized out>) at ../gio/gthreadedsocketservice.c:96
#12 0x00007f0a0653bb24 in g_thread_pool_thread_proxy (data=<optimized out>) at ../glib/gthreadpool.c:354
#13 0x00007f0a06538c32 in g_thread_proxy (data=0x7f09f8016ea0) at ../glib/gthread.c:826
I'll leave it alone for a bit, so I can do more debugging if needed.
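The two gdb commands suggested above (a single capture, then a series of nine) can be wrapped in a small script, together with the advice to search the output for "pass" before attaching it anywhere. This is only a sketch: `capture_backtraces` and `scan_for_secrets` are hypothetical helper names, and the file naming follows the commands from the comments.

```shell
#!/bin/sh
# Capture a series of backtraces from a running gnome-keyring-daemon,
# as suggested in the comments. Requires gdb (and, for readable frames,
# the glib2, libsecret and gnome-keyring debuginfo packages).
capture_backtraces() {
    pid=$(pidof gnome-keyring-daemon) || return 1
    for i in 1 2 3 4 5 6 7 8 9; do
        gdb --batch --ex "t a a bt" --pid="$pid" > "bt$i.txt" 2>&1
        sleep 0.1
    done
}

# List any capture files that may contain private strings; "pass"
# catches password/passphrase. Review flagged files by hand before
# attaching them to a bug.
scan_for_secrets() {
    grep -li "pass" "$@" || true
}
```

Usage would be `capture_backtraces && scan_for_secrets bt*.txt`; an empty result from the scan does not guarantee the files are clean, so a manual read is still worthwhile.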
All the backtraces contain a thread with gkd_ssh_agent_process_connect(), from which I'd guess there's something wrong with the ssh-agent. The thread seems to be alive, not stuck on anything except waiting for a response from the ssh-agent. Looking into the code, it's very simple:

229     if (started && !self->ready) {
230         source = g_timeout_add_seconds (5, on_timeout, &timedout);
231         while (!self->ready && !timedout)
232             g_main_context_iteration (NULL, FALSE);
233         g_source_remove (source);
234     }

I'd argue that the code should not iterate the default context in a dedicated thread; that should be done in the main thread only. Whether that is the reason the 5-second timeout is not delivered, I do not know.

Why does it require the "gnome-keyring" daemon to be running? Red Hat provided a way to turn gnome-keyring off: https://access.redhat.com/solutions/3560101

> Why does it require "gnome-keyring" daemon to be running?
What do you mean by "it" here, please?
libsecret talks to anything that implements its D-Bus interface, which is usually the gnome-keyring-daemon process. It is used to store and read secrets (passwords/credentials/...) in a secure way.
(In reply to comment #22)
> 229     if (started && !self->ready) {
> 230         source = g_timeout_add_seconds (5, on_timeout, &timedout);
> 231         while (!self->ready && !timedout)
> 232             g_main_context_iteration (NULL, FALSE);
> 233         g_source_remove (source);
> 234     }

Thinking of it now, it could be possible to completely avoid the g_timeout_add_seconds() with something like:

    if (started && !self->ready) {
        gint64 start = g_get_monotonic_time ();
        while (!self->ready && (g_get_monotonic_time () - start) / G_USEC_PER_SEC < 5) {
            g_main_context_iteration (NULL, FALSE);
            g_usleep (100000); /* Do not try more than 10 times per second */
        }
    }

It's only a workaround; a more sophisticated solution could be to use a GCond or a similar structure, which can block the thread until some time passes or a condition is met, without using a busy loop, which burns the CPU.

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated. Be sure to add yourself to the Jira issue's "Watchers" field to continue receiving updates, and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like: "Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.