Bug 1724551
Summary: | gnome-shell process eat 100% CPU after resume | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Tomas Pelka <tpelka> | ||||||||||||
Component: | mutter | Assignee: | Jonas Ådahl <jadahl> | ||||||||||||
Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> | ||||||||||||
Severity: | urgent | Docs Contact: | |||||||||||||
Priority: | high | ||||||||||||||
Version: | 8.1 | CC: | fmuellner, hdegoede, jeff.burrell, jkoten, jwboyer, mkrajnak, rstrode, vbenes | ||||||||||||
Target Milestone: | rc | Keywords: | Regression | ||||||||||||
Target Release: | 8.1 | ||||||||||||||
Hardware: | Unspecified | ||||||||||||||
OS: | Unspecified | ||||||||||||||
Whiteboard: | |||||||||||||||
Fixed In Version: | mutter-3.32.2-7.el8 | Doc Type: | If docs needed, set a value | ||||||||||||
Doc Text: | Story Points: | --- | |||||||||||||
Clone Of: | Environment: | ||||||||||||||
Last Closed: | 2019-11-05 22:14:25 UTC | Type: | Bug | ||||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||||
Documentation: | --- | CRM: | |||||||||||||
Verified Versions: | Category: | --- | |||||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||
Embargoed: | |||||||||||||||
Bug Depends On: | |||||||||||||||
Bug Blocks: | 1660905, 1664096, 1678350 | ||||||||||||||
Attachments: |
|
Well actually gdm is not a process so it is rather gnome-shell Created attachment 1585495 [details]
gdm session journal
I'm also able to reproduce, gnome-shell running under gdm user is eating 100% cpu.
journalctl -b -u session-c1.scope
Created attachment 1585496 [details]
backtrace
probably mutter... jkoten, can you do (gdb) print (char *) g_source_get_name (source) in gdb? (In reply to Ray Strode [halfline] from comment #4) > probably mutter... > > jkoten, can you do > > (gdb) print (char *) g_source_get_name (source) > > in gdb? $1 = 0x7fdc38440a00 "[gio] emit_signal_instance_in_idle_cb" Also I most often got "No symbol "source" in current context." when I attach to the running proccess. The reproducer is changing screen configuration between suspend and resume e.g., use two external monitors; suspend; undock; resume okay so it seems like a stream of signals are coming in over the message bus. if you run busctl --user monitor does it have a constant flurry of activity? Created attachment 1591413 [details]
perf.log
tpopela kindly helped us to get a bit more info:
Random bt's from gdb:
(gdb) bt
#0 0x00007fe900ed11de in timerfd_settime () at ../sysdeps/unix/syscall-template.S:78
#1 0x00007fe8f6a31c57 in g_datetime_source_init_timerfd (unix_seconds=1563368880, expected_now_seconds=1563368879, datetime_source=0x562b5ead1b30) at ../libgnome-desktop/gnome-datetime-source.c:193
#2 _gnome_datetime_source_new (now=now@entry=0x562b5cb60600, expiry=expiry@entry=0x562b5cb61660, cancel_on_set=cancel_on_set@entry=1) at ../libgnome-desktop/gnome-datetime-source.c:275
#3 0x00007fe8f6a2c505 in update_clock (data=0x562b5dafa750, data@entry=<error reading variable: value has been optimized out>) at ../libgnome-desktop/gnome-wall-clock.c:411
#4 0x00007fe8f6a31a86 in g_datetime_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../libgnome-desktop/gnome-datetime-source.c:145
#5 0x00007fe902db667d in g_main_dispatch (context=0x562b5bfc9660) at gmain.c:3176
#6 g_main_context_dispatch (context=context@entry=0x562b5bfc9660) at gmain.c:3829
#7 0x00007fe902db6a48 in g_main_context_iterate (context=0x562b5bfc9660, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3902
#8 0x00007fe902db6d72 in g_main_loop_run (loop=0x562b5c6d64f0) at gmain.c:4098
#9 0x00007fe90124bc50 in meta_run () at ../src/core/main.c:685
#10 0x0000562b5a245536 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.c:499
(gdb) c
Continuing.
^C
Thread 1 "gnome-shell" received signal SIGINT, Interrupt.
g_mutex_unlock (mutex=0x7fe903080688 <g.aliases_lock>) at gthread-posix.c:1355
1355 if G_UNLIKELY (prev != 1)
(gdb) bt
#0 g_mutex_unlock (mutex=0x7fe903080688 <g.aliases_lock>) at gthread-posix.c:1355
#1 0x00007fe902d90639 in g_get_charset (charset=charset@entry=0x0) at gcharset.c:196
#2 0x00007fe8f6a2c1c4 in date_time_format (format=0x7fe8f6a366fe "%b %-e_%R", datetime=0x562b5cb60600) at ../libgnome-desktop/gnome-wall-clock.c:380
#3 gnome_wall_clock_string_for_datetime (self=self@entry=0x562b5dafa750, now=now@entry=0x562b5cb60600, clock_format=clock_format@entry=G_DESKTOP_CLOCK_FORMAT_24H, show_weekday=show_weekday@entry=0,
show_full_date=show_full_date@entry=1, show_seconds=show_seconds@entry=0) at ../libgnome-desktop/gnome-wall-clock.c:380
#4 0x00007fe8f6a2c574 in update_clock (data=0x562b5dafa750, data@entry=<error reading variable: value has been optimized out>) at ../libgnome-desktop/gnome-wall-clock.c:418
#5 0x00007fe8f6a31a86 in g_datetime_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../libgnome-desktop/gnome-datetime-source.c:145
#6 0x00007fe902db667d in g_main_dispatch (context=0x562b5bfc9660) at gmain.c:3176
#7 g_main_context_dispatch (context=context@entry=0x562b5bfc9660) at gmain.c:3829
#8 0x00007fe902db6a48 in g_main_context_iterate (context=0x562b5bfc9660, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3902
#9 0x00007fe902db6d72 in g_main_loop_run (loop=0x562b5c6d64f0) at gmain.c:4098
#10 0x00007fe90124bc50 in meta_run () at ../src/core/main.c:685
#11 0x0000562b5a245536 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.c:499
This is actually quite often seen
output from perf attached
(In reply to Ray Strode [halfline] from comment #8) > okay so it seems like a stream of signals are coming in over the message bus. > > if you run > > busctl --user monitor > > does it have a constant flurry of activity? Assuming this is meant to be executed from gdm? If so sudo -u gdm busctl --user monitor Failed to set address: No such file or directory If this was meant to be executed as root than no activity is displayed. busctl --user monitor Monitoring bus message stream. Even if I try to monitor the particular service it displays nothing: busctl monitor session-c1.scope Monitoring bus message stream. Created attachment 1591437 [details]
gdm bus
When I monitor the gdm bus while in user session, there is no activity - this is while the gnome-shell gdm process is eating 100% CPU.
When I switched the user I got blank screen instead of the login screen and there was an activity on the bus - please see attachment.
I can see similar backtrace from frozen GS user session after resume.. bt attached here: https://bugzilla.redhat.com/show_bug.cgi?id=1718807#c17 I've forced GDM not to run on Wayland and my user session is on X too. so I took at look at this today and found the issue. It's this code: priv->inhibit_sleep_fd = g_variant_get_handle (fd_variant); ... close (priv->inhibit_sleep_fd); The problem is the first line there isn't returning and fd, it's returning an index which is supposed to be used to fetch the fd. So mutter was doing close (0). Suspend once, and it closes stdin, suspend twice, and it closes the timerfd (which now takes fd 0) behind the timerfd's back leading an infinite loop. This bug existed in the 8.0 branch for a while, but I fixed it before release here: Fix bug in suspend/resume corruption patch leading to inhibit fd not getting fetched but when the patch got forward ported for the 3.32 rebase, it was the older version that got inadvertently rebased. fix should be trivial, just forward port the latest 8.0.0 patch, so devack *** Bug 1718807 has been marked as a duplicate of this bug. *** Thanks a lot to everyone who helped working on this. Fixed in mutter-3.32.2-7.el8 *** Bug 1732610 has been marked as a duplicate of this bug. *** Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:3553 |
Created attachment 1585112 [details] journal.log Description of problem: After resume gdm process work out CPU to 100% Version-Release number of selected component (if applicable): gdm-3.28.3-20.el8 gnome-shell-3.32.2-4.el8 gnome-session-3.28.1-6.el8 How reproducible: 50% Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: Adding journal log, 8:29 starts the suspend