Bugzilla will be upgraded to version 5.0 on a still to be determined date in the near future. The original upgrade date has been delayed.
Bug 1600161 - polkitd process consuming a large amount of CPU resources [NEEDINFO]
polkitd process consuming a large amount of CPU resources
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: gnome-settings-daemon (Show other bugs)
7.5
x86_64 Linux
unspecified Severity medium
: rc
: ---
Assigned To: Carlos Garnacho
Desktop QE
:
Depends On:
Blocks: 1621561
  Show dependency treegraph
 
Reported: 2018-07-11 10:51 EDT by jigar
Modified: 2018-11-01 22:55 EDT (History)
20 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
dkochuka: needinfo? (cgarnach)


Attachments (Terms of Use)

  None (edit)
Description jigar 2018-07-11 10:51:30 EDT
Description of problem ( At Customer's end ) : On a VM cloned test environment, the process polkitd is constantly consuming about 15% of CPU.  This behavior is not seen in the primary environment (which was recently upgraded to 7.5 and used as a template to create a VM clone). Since the upgrade to 7.5, it is noticed that gnome-shell is very active (you can constantly see it popping up as top CPU consumer on the "top" window. our "nomachine" client also kept on freezing once the screensaver kicks in and locks the window. This behaviour started after the upgrade to 7.5 from 7.4.


Version-Release number of selected component (if applicable): 
polkit-0.112-14.el7

How reproducible: Frequently at customer's end

Actual results: Polkitd consuming 15% of CPU


Expected results: Polkitd shouldn't consume high CPU continuously 


Additional info:

- appears to be polling 'GetPasswordExpirationPolicy' constantly
- strace reflects the polling issue
- no centralized authentication in use
- confirmed that the frequent logins of root are automation and normal behaviour for their environment
Comment 2 David Herrmann 2018-07-12 10:57:55 EDT
(In reply to jigar from comment #0)
> Additional info:
> 
> - appears to be polling 'GetPasswordExpirationPolicy' constantly

Who polls this?

> - strace reflects the polling issue

Strace on what?

> - no centralized authentication in use

Can you elaborate what you mean with that?

> - confirmed that the frequent logins of root are automation and normal
> behaviour for their environment

Which "frequent logins of root"? What do you mean with that?
Comment 3 Yan Li 2018-07-17 20:24:30 EDT
It is gsd-account that keeps polling polkit for GetPasswordExpirationPolicy. 

Following is my analysis. This is a bug in gnome-settings-daemon:plugins/account/gsd-account-manager.c. A timeout is added to check the password expiration time, and a notification would be shown a few days before the password expires. The timeout queues itself back after being triggered. However, the timeout relies on manager->priv->last_notify_time, which is _ONLY_ updated when a notification is shown, but not after a check. Therefore, the timeout is added with 0 second and is triggered again instantly after it is added, causing the plugin to keep busy checking password expiration policy.

This bug should be reassigned to gsd maintainer.
Comment 4 Alessandro Suardi 2018-08-01 08:32:11 EDT
I can observe this on OL7.5 as well (2 laptops) and can add to observations at #3 that:

a) this issue appears to pop up for me after suspend-to-memory/resume cycles (but not in 100% cases)
b) strace'ing of /usr/libexec/gsd-account in a "normal" situation shows no ordinary activity on the daemon's threads (4 threads on restart_syscall)
c) strace'ing of /usr/libexec/gsd-account in a "bad" situation shows looping of TID 4098 as described in #3 above:


[asuardi@tippy ~]$ strace -fp 4098
strace: Process 4098 attached with 4 threads
[pid  4138] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...>
[pid  4112] restart_syscall(<... resuming interrupted poll ...> <unfinished ...>
[pid  4110] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...>
[pid  4098] restart_syscall(<... resuming interrupted poll ...> <unfinished ...>
[pid  4112] <... restart_syscall resumed> ) = 1
[pid  4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\0010\0\0\0\0257\16\0005\0\0\0", 16}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 16
[pid  4112] poll([{fd=8, events=POLLIN}], 1, 0) = 1 ([{fd=8, revents=POLLIN}])
[pid  4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\6xxxxxx\0\0\0\0\0\5\1u\0\0\275\4\0\6\1s\0\6\0\0\0"..., 104}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 104
[pid  4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid  4112] write(3, "\1\0\0\0\0\0\0\0", 8) = 8
[pid  4098] <... restart_syscall resumed> ) = 1
[pid  4112] poll([{fd=8, events=POLLIN}], 1, 0 <unfinished ...>
[pid  4098] read(3,  <unfinished ...>
[pid  4112] <... poll resumed> )        = 0 (Timeout)
[pid  4098] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8
[pid  4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1 <unfinished ...>
[pid  4098] access("MARK: gsd-account on_got_password_expiration_policy: start ", F_OK) = -1 ENOENT (No such file or directory)
[pid  4098] access("MARK: gsd-account on_got_password_expiration_policy: end ", F_OK) = -1 ENOENT (No such file or directory)
[pid  4098] poll([{fd=3, events=POLLIN}], 1, 0) = 0 (Timeout)
[pid  4098] write(6, "\1\0\0\0\0\0\0\0", 8 <unfinished ...>
[pid  4112] <... poll resumed> )        = 1 ([{fd=6, revents=POLLIN}])
[pid  4098] <... write resumed> )       = 8
[pid  4112] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  4112] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  4098] <... futex resumed> )       = 0
[pid  4112] read(6,  <unfinished ...>
[pid  4098] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  4112] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8
[pid  4112] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  4098] <... futex resumed> )       = 0
[pid  4112] futex(0x7f8238003f50, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  4098] futex(0x7f8238003f50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  4112] <... futex resumed> )       = 0
[pid  4098] <... futex resumed> )       = 1
[pid  4112] futex(0x7f8238003f50, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  4098] poll([{fd=3, events=POLLIN}], 1, 25000 <unfinished ...>
[pid  4112] <... futex resumed> )       = 0
[pid  4112] sendmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\1\0\1\0\0\0\0\1\275\4\0\225\0\0\0\10\1g\0\0\0\0\0\1\1o\0\"\0\0\0"..., 168}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 168
[pid  4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, 0) = 0 (Timeout)
[pid  4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1) = 1 ([{fd=8, revents=POLLIN}])
[pid  4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"l\2\1\0010\0\0\0\0307\16\0005\0\0\0", 16}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 16
[pid  4112] poll([{fd=8, events=POLLIN}], 1, 0) = 1 ([{fd=8, revents=POLLIN}])
[pid  4112] recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\10\1g\0\6xxxxxx\0\0\0\0\0\5\1u\0\1\275\4\0\6\1s\0\6\0\0\0"..., 104}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 104
[pid  4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, 0) = 0 (Timeout)
[pid  4112] write(3, "\1\0\0\0\0\0\0\0", 8) = 8
[pid  4098] <... poll resumed> )        = 1 ([{fd=3, revents=POLLIN}])
[pid  4112] poll([{fd=8, events=POLLIN}], 1, 0 <unfinished ...>
[pid  4098] read(3,  <unfinished ...>
[pid  4112] <... poll resumed> )        = 0 (Timeout)
[pid  4098] <... read resumed> "\1\0\0\0\0\0\0\0", 16) = 8
[pid  4112] poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 3, -1 <unfinished ...>
[pid  4098] access("MARK: gsd-account on_got_password_expiration_policy: start ", F_OK) = -1 ENOENT (No such file or directory)
[pid  4098] access("MARK: gsd-account on_got_password_expiration_policy: end ", F_OK) = -1 ENOENT (No such file or directory)
[pid  4098] poll([{fd=3, events=POLLIN}], 1, 0) = 0 (Timeout)
[pid  4098] write(6, "\1\0\0\0\0\0\0\0", 8 <unfinished ...>
[pid  4112] <... poll resumed> )        = 1 ([{fd=6, revents=POLLIN}])
[pid  4098] <... write resumed> )       = 8
[pid  4112] futex(0x1639c80, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  4098] futex(0x1639c80, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  4112] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)

[etc.]

d) [partly offtopic, but maybe can add further light to interactions between polkitd and this accounts daemon issue] restarting polkitd using instructions in comment #5 in bug 1235681 eventually causes my user to lose GNOME desktop permissions to:

  1. shutdown/suspend the laptop
  2. open up a VPN connection
 (both functionalities available in top right corner GNOME menu)


Relevant RPMS -

gnome-settings-daemon-3.26.2-9.el7.x86_64
accountsservice-0.6.45-7.el7.x86_64
 
This issue also appears to be reported on CentOS 7 systems (top CPU using processes are the same I can observe on my laptops): https://www.centos.org/forums/viewtopic.php?t=67547
Comment 5 Emanuel 2018-08-10 04:41:40 EDT
(In reply to Yan Li from comment #3)
> It is gsd-account that keeps polling polkit for GetPasswordExpirationPolicy. 
> 
> Following is my analysis. This is a bug in
> gnome-settings-daemon:plugins/account/gsd-account-manager.c. A timeout is
> added to check the password expiration time, and a notification would be
> shown a few days before the password expires. The timeout queues itself back
> after being triggered. However, the timeout relies on
> manager->priv->last_notify_time, which is _ONLY_ updated when a notification
> is shown, but not after a check. Therefore, the timeout is added with 0
> second and is triggered again instantly after it is added, causing the
> plugin to keep busy checking password expiration policy.
> 
> This bug should be reassigned to gsd maintainer.

Same issue here with several RHEL 7.5 servers and Yan Li's workaround works fine.
Comment 7 Paul F. Richards 2018-08-20 11:33:07 EDT
A user name mlogic on the centos forum has found a workaround that may or may not be suitable for your environment:

"A workaround that I've been using is to remove /etc/xdg/autostart/org.gnome.SettingsDaemon.Account.desktop until the above mentioned bug is fixed. The only function you will lose is the notification of expiring account password."

I can also verify this workaround resolved the high cpu usage of polkitd on two systems.

Note You need to log in before you can comment on or make changes to this bug.