Bug 1600161
Summary: | gnome-settings-daemon account plugin polling accountsservice in a continous loop | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | jigar <jraising> | ||||||
Component: | gnome-settings-daemon | Assignee: | Ray Strode [halfline] <rstrode> | ||||||
Status: | CLOSED ERRATA | QA Contact: | Desktop QE <desktop-qa-list> | ||||||
Severity: | high | Docs Contact: | |||||||
Priority: | high | ||||||||
Version: | 7.5 | CC: | alessandro.suardi, amike, aph, bgollahe, cgarnach, cisley, cpippin, dkochuka, dmulford, ekanter, elliot.li.tech, emak, hbjung, jaeshin, jomurphy, jraising, jseunghw, jwright, leigh.orf, matzilla, mboisver, mclasen, mkolbas, pasteur, paulrichards321, rkudyba, rstrode, rterzi, sfroemer, tpelka, tsorense, vpakolu, yferszt, yuokada, yzheng | ||||||
Target Milestone: | rc | ||||||||
Target Release: | --- | ||||||||
Hardware: | x86_64 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | gnome-settings-daemon-3.28.1-4.el7 | Doc Type: | If docs needed, set a value | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2019-08-06 12:37:45 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | |||||||||
Bug Blocks: | 1621561, 1656436 | ||||||||
Attachments: |
|
Description
jigar
2018-07-11 14:51:30 UTC
(In reply to jigar from comment #0) > Additional info: > > - appears to be polling 'GetPasswordExpirationPolicy' constantly Who polls this? > - strace reflects the polling issue Strace on what? > - no centralized authentication in use Can you elaborate what you mean with that? > - confirmed that the frequent logins of root are automation and normal > behaviour for their environment Which "frequent logins of root"? What do you mean with that? It is gsd-account that keeps polling polkit for GetPasswordExpirationPolicy. Following is my analysis. This is a bug in gnome-settings-daemon:plugins/account/gsd-account-manager.c. A timeout is added to check the password expiration time, and a notification would be shown a few days before the password expires. The timeout queues itself back after being triggered. However, the timeout relies on manager->priv->last_notify_time, which is _ONLY_ updated when a notification is shown, but not after a check. Therefore, the timeout is added with 0 second and is triggered again instantly after it is added, causing the plugin to keep busy checking password expiration policy. This bug should be reassigned to gsd maintainer. I can observe this on OL7.5 as well (2 laptops) and can add to observations at #3 that: a) this issue appears to pop up for me after suspend-to-memory/resume cycles (but not in 100% cases) b) strace'ing of /usr/libexec/gsd-account in a "normal" situation shows no ordinary activity on the daemon's threads (4 threads on restart_syscall) c) strace'ing of /usr/libexec/gsd-account in a "bad" situation shows looping of TID 4098 as described in #3 above: [asuardi@tippy ~]$ strace -fp 4098 strace: Process 4098 attached with 4 threads [pid 4138] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...> [pid 4112] restart_syscall(<... resuming interrupted poll ...> <unfinished ...> [pid 4110] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...> [pid 4098] restart_syscall(<... resuming interrupted poll ...> <unfinished ...> [pid 4112] <... restart_syscall resumed> ) = 1 [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\0010\0\0\0\0257\16\0005\0\0\0", 16}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 16 [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0) = 1 ([{fd=8, revents=POLLIN}]) [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\6xxxxxx\0\0\0\0\0\5\1u\0\0\275\4\0\6\1s\0\6\0\0\0"..., 104}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 104 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, 0) = 0 (Timeout) [pid 4112] write(3, "\1\0\0\0\0\0\0\0", 8) = 8 [pid 4098] <... restart_syscall resumed> ) = 1 [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0 <unfinished ...> [pid 4098] read(3, <unfinished ...> [pid 4112] <... poll resumed> ) = 0 (Timeout) [pid 4098] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1 <unfinished ...> [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: start ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: end ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] poll([{fd=3, events=POLLIN}], 1, 0) = 0 (Timeout) [pid 4098] write(6, "\1\0\0\0\0\0\0\0", 8 <unfinished ...> [pid 4112] <... poll resumed> ) = 1 ([{fd=6, revents=POLLIN}]) [pid 4098] <... write resumed> ) = 8 [pid 4112] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4112] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 4098] <... futex resumed> ) = 0 [pid 4112] read(6, <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4112] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8 [pid 4112] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1) = 1 [pid 4098] <... futex resumed> ) = 0 [pid 4112] futex(0x7f8238003f50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 4098] futex(0x7f8238003f50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4112] <... futex resumed> ) = 0 [pid 4098] <... futex resumed> ) = 1 [pid 4112] futex(0x7f8238003f50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4098] poll([{fd=3, events=POLLIN}], 1, 25000 <unfinished ...> [pid 4112] <... futex resumed> ) = 0 [pid 4112] sendmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\1\0\1\0\0\0\0\1\275\4\0\225\0\0\0\10\1g\0\0\0\0\0\1\1o\0\"\0\0\0"..., 168}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 168 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, 0) = 0 (Timeout) [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1) = 1 ([{fd=8, revents=POLLIN}]) [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\0010\0\0\0\0307\16\0005\0\0\0", 16}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 16 [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0) = 1 ([{fd=8, revents=POLLIN}]) [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\6xxxxxx\0\0\0\0\0\5\1u\0\1\275\4\0\6\1s\0\6\0\0\0"..., 104}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 104 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, 0) = 0 (Timeout) [pid 4112] write(3, "\1\0\0\0\0\0\0\0", 8) = 8 [pid 4098] <... poll resumed> ) = 1 ([{fd=3, revents=POLLIN}]) [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0 <unfinished ...> [pid 4098] read(3, <unfinished ...> [pid 4112] <... poll resumed> ) = 0 (Timeout) [pid 4098] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1 <unfinished ...> [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: start ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: end ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] poll([{fd=3, events=POLLIN}], 1, 0) = 0 (Timeout) [pid 4098] write(6, "\1\0\0\0\0\0\0\0", 8 <unfinished ...> [pid 4112] <... poll resumed> ) = 1 ([{fd=6, revents=POLLIN}]) [pid 4098] <... write resumed> ) = 8 [pid 4112] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4112] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [etc.] d) [partly offtopic, but maybe can add further light to interactions between polkitd and this accounts daemon issue] restarting polkitd using instructions in comment #5 in bug 1235681 eventually causes my user to lose GNOME desktop permissions to: 1. shutdown/suspend the laptop 2. open up a VPN connection (both functionalities available in top right corner GNOME menu) Relevant RPMS - gnome-settings-daemon-3.26.2-9.el7.x86_64 accountsservice-0.6.45-7.el7.x86_64 This issue also appears to be reported on CentOS 7 systems (top CPU using processes are the same I can observe on my laptops): https://www.centos.org/forums/viewtopic.php?t=67547 (In reply to Yan Li from comment #3) > It is gsd-account that keeps polling polkit for GetPasswordExpirationPolicy. > > Following is my analysis. This is a bug in > gnome-settings-daemon:plugins/account/gsd-account-manager.c. A timeout is > added to check the password expiration time, and a notification would be > shown a few days before the password expires. The timeout queues itself back > after being triggered. However, the timeout relies on > manager->priv->last_notify_time, which is _ONLY_ updated when a notification > is shown, but not after a check. Therefore, the timeout is added with 0 > second and is triggered again instantly after it is added, causing the > plugin to keep busy checking password expiration policy. > > This bug should be reassigned to gsd maintainer. Same issue here with several RHEL 7.5 servers and Yan Li's workaround works fine. A user name mlogic on the centos forum has found a workaround that may or may not be suitable for your environment: "A workaround that I've been using is to remove /etc/xdg/autostart/org.gnome.SettingsDaemon.Account.desktop until the above mentioned bug is fixed. The only function you will lose is the notification of expiring account password." I can also verify this workaround resolved the high cpu usage of polkitd on two systems. easiest work around is to set a notify period of 0. gsettings set org.gnome.settings-daemon.plugins.account notify-period 0 The work around in comment #12 is effective. Obviously it disables the notification in question, but not an issue for me. This reduced both polkitd and dbus-daemon CPU usage to 0, while previously they had been the top CPU users on my system. Created attachment 1503706 [details]
account: don't poll more frequently than notification period
At the moment, if an account has no reason to receive a notification,
the account plugin ends up polling continuously, desperately,
and unsuccessfully trying to find a reason to notify. That leads
to unnecessary CPU utilization.
The reason is an apparent think-o in the code. The code tracks
the last time a notification was shown, so it knows when to show
the next notification later, even if the notification period, or
account policy is updated by the admin in the interim.
The problem is that it's wrong to look at the last notification
time if there's no reason to show a notification. In that case
the wakeup is merely to poll updates on the account policy.
This commit addresses the problem by only looking at the previous
notification time, if it was within the current notification period.
Yan Li, thank you for investigating this issue.
I believe attachment 1503706 [details] should address the problem, though I haven't tested it yet.
This problem also occurs with RHEL7.6. Created attachment 1510185 [details]
test patch packages from halfline
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2044 |