Red Hat Bugzilla – Bug 1600161
polkitd process consuming a large amount of CPU resources
Last modified: 2018-11-01 22:55:11 EDT
Description of problem ( At Customer's end ) : On a VM cloned test environment, the process polkitd is constantly consuming about 15% of CPU. This behavior is not seen in the primary environment (which was recently upgraded to 7.5 and used as a template to create a VM clone). Since the upgrade to 7.5, it is noticed that gnome-shell is very active (you can constantly see it popping up as top CPU consumer on the "top" window. our "nomachine" client also kept on freezing once the screensaver kicks in and locks the window. This behaviour started after the upgrade to 7.5 from 7.4. Version-Release number of selected component (if applicable): polkit-0.112-14.el7 How reproducible: Frequently at customer's end Actual results: Polkitd consuming 15% of CPU Expected results: Polkitd shouldn't consume high CPU continuously Additional info: - appears to be polling 'GetPasswordExpirationPolicy' constantly - strace reflects the polling issue - no centralized authentication in use - confirmed that the frequent logins of root are automation and normal behaviour for their environment
(In reply to jigar from comment #0) > Additional info: > > - appears to be polling 'GetPasswordExpirationPolicy' constantly Who polls this? > - strace reflects the polling issue Strace on what? > - no centralized authentication in use Can you elaborate what you mean with that? > - confirmed that the frequent logins of root are automation and normal > behaviour for their environment Which "frequent logins of root"? What do you mean with that?
It is gsd-account that keeps polling polkit for GetPasswordExpirationPolicy. Following is my analysis. This is a bug in gnome-settings-daemon:plugins/account/gsd-account-manager.c. A timeout is added to check the password expiration time, and a notification would be shown a few days before the password expires. The timeout queues itself back after being triggered. However, the timeout relies on manager->priv->last_notify_time, which is _ONLY_ updated when a notification is shown, but not after a check. Therefore, the timeout is added with 0 second and is triggered again instantly after it is added, causing the plugin to keep busy checking password expiration policy. This bug should be reassigned to gsd maintainer.
I can observe this on OL7.5 as well (2 laptops) and can add to observations at #3 that: a) this issue appears to pop up for me after suspend-to-memory/resume cycles (but not in 100% cases) b) strace'ing of /usr/libexec/gsd-account in a "normal" situation shows no ordinary activity on the daemon's threads (4 threads on restart_syscall) c) strace'ing of /usr/libexec/gsd-account in a "bad" situation shows looping of TID 4098 as described in #3 above: [asuardi@tippy ~]$ strace -fp 4098 strace: Process 4098 attached with 4 threads [pid 4138] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...> [pid 4112] restart_syscall(<... resuming interrupted poll ...> <unfinished ...> [pid 4110] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...> [pid 4098] restart_syscall(<... resuming interrupted poll ...> <unfinished ...> [pid 4112] <... restart_syscall resumed> ) = 1 [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\0010\0\0\0\0257\16\0005\0\0\0", 16}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 16 [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0) = 1 ([{fd=8, revents=POLLIN}]) [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\6xxxxxx\0\0\0\0\0\5\1u\0\0\275\4\0\6\1s\0\6\0\0\0"..., 104}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 104 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, 0) = 0 (Timeout) [pid 4112] write(3, "\1\0\0\0\0\0\0\0", 8) = 8 [pid 4098] <... restart_syscall resumed> ) = 1 [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0 <unfinished ...> [pid 4098] read(3, <unfinished ...> [pid 4112] <... poll resumed> ) = 0 (Timeout) [pid 4098] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1 <unfinished ...> [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: start ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: end ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] poll([{fd=3, events=POLLIN}], 1, 0) = 0 (Timeout) [pid 4098] write(6, "\1\0\0\0\0\0\0\0", 8 <unfinished ...> [pid 4112] <... poll resumed> ) = 1 ([{fd=6, revents=POLLIN}]) [pid 4098] <... write resumed> ) = 8 [pid 4112] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4112] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [pid 4098] <... futex resumed> ) = 0 [pid 4112] read(6, <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4112] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8 [pid 4112] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1) = 1 [pid 4098] <... futex resumed> ) = 0 [pid 4112] futex(0x7f8238003f50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1) = 0 [pid 4098] futex(0x7f8238003f50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4112] <... futex resumed> ) = 0 [pid 4098] <... futex resumed> ) = 1 [pid 4112] futex(0x7f8238003f50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4098] poll([{fd=3, events=POLLIN}], 1, 25000 <unfinished ...> [pid 4112] <... futex resumed> ) = 0 [pid 4112] sendmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\1\0\1\0\0\0\0\1\275\4\0\225\0\0\0\10\1g\0\0\0\0\0\1\1o\0\"\0\0\0"..., 168}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 168 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, 0) = 0 (Timeout) [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1) = 1 ([{fd=8, revents=POLLIN}]) [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\0010\0\0\0\0307\16\0005\0\0\0", 16}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 16 [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0) = 1 ([{fd=8, revents=POLLIN}]) [pid 4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\6xxxxxx\0\0\0\0\0\5\1u\0\1\275\4\0\6\1s\0\6\0\0\0"..., 104}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 104 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, 0) = 0 (Timeout) [pid 4112] write(3, "\1\0\0\0\0\0\0\0", 8) = 8 [pid 4098] <... poll resumed> ) = 1 ([{fd=3, revents=POLLIN}]) [pid 4112] poll([{fd=8, events=POLLIN}], 1, 0 <unfinished ...> [pid 4098] read(3, <unfinished ...> [pid 4112] <... poll resumed> ) = 0 (Timeout) [pid 4098] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8 [pid 4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1 <unfinished ...> [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: start ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] access("MARK: gsd-account on_got_password_expiration_policy: end ", F_OK) = -1 ENOENT (No such file or directory) [pid 4098] poll([{fd=3, events=POLLIN}], 1, 0) = 0 (Timeout) [pid 4098] write(6, "\1\0\0\0\0\0\0\0", 8 <unfinished ...> [pid 4112] <... poll resumed> ) = 1 ([{fd=6, revents=POLLIN}]) [pid 4098] <... write resumed> ) = 8 [pid 4112] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> [pid 4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> [pid 4112] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable) [etc.] d) [partly offtopic, but maybe can add further light to interactions between polkitd and this accounts daemon issue] restarting polkitd using instructions in comment #5 in bug 1235681 eventually causes my user to lose GNOME desktop permissions to: 1. shutdown/suspend the laptop 2. open up a VPN connection (both functionalities available in top right corner GNOME menu) Relevant RPMS - gnome-settings-daemon-3.26.2-9.el7.x86_64 accountsservice-0.6.45-7.el7.x86_64 This issue also appears to be reported on CentOS 7 systems (top CPU using processes are the same I can observe on my laptops): https://www.centos.org/forums/viewtopic.php?t=67547
(In reply to Yan Li from comment #3) > It is gsd-account that keeps polling polkit for GetPasswordExpirationPolicy. > > Following is my analysis. This is a bug in > gnome-settings-daemon:plugins/account/gsd-account-manager.c. A timeout is > added to check the password expiration time, and a notification would be > shown a few days before the password expires. The timeout queues itself back > after being triggered. However, the timeout relies on > manager->priv->last_notify_time, which is _ONLY_ updated when a notification > is shown, but not after a check. Therefore, the timeout is added with 0 > second and is triggered again instantly after it is added, causing the > plugin to keep busy checking password expiration policy. > > This bug should be reassigned to gsd maintainer. Same issue here with several RHEL 7.5 servers and Yan Li's workaround works fine.
A user name mlogic on the centos forum has found a workaround that may or may not be suitable for your environment: "A workaround that I've been using is to remove /etc/xdg/autostart/org.gnome.SettingsDaemon.Account.desktop until the above mentioned bug is fixed. The only function you will lose is the notification of expiring account password." I can also verify this workaround resolved the high cpu usage of polkitd on two systems.