Bug 2155424

Summary: ListUnitFiles impacts systemd a lot, causing high CPU consumption and delay in operations
Product: Red Hat Enterprise Linux 9
Reporter: Renaud Métrich <rmetrich>
Component: systemd
Assignee: systemd maint <systemd-maint>
Status: CLOSED MIGRATED
QA Contact: Frantisek Sumsal <fsumsal>
Severity: medium
Priority: medium
Version: 9.0
CC: dtardon, fkrska, larutiun, msekleta, rgreene, sbalasub, smahanga, systemd-maint, systemd-maint-list, ykhutale
Target Milestone: rc
Keywords: MigratedToJIRA, Performance, Reproducer, Triaged
Hardware: All
OS: Linux
Last Closed: 2023-09-21 12:20:03 UTC
Type: Bug

Description Renaud Métrich 2022-12-21 08:38:26 UTC
Description of problem:

This is a respin of BZ #1828759 ("systemctl list-unit-files" kills the system usability).

With RHEL 8.7 / systemd-239-68.el8 (older versions not checked yet), the ListUnitFiles DBus operation is very costly.
When running systemd under strace on a system with 145 units (a "minimal installation" of RHEL 8.7), the readlinkat/openat/getdents64 syscalls are called hundreds or even thousands of times on the same files (see the reproducer below), causing high CPU consumption.


Version-Release number of selected component (if applicable):

systemd-239-68.el8

How reproducible:

Always

Steps to Reproduce:
1. Execute the DBus ListUnitFiles operation while stracing systemd

    # strace -CttTvyy -s 128 -o list-unit-files.strace -p 1 &
    # time dbus-send --system --print-reply --reply-timeout=20000 --type=method_call --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager.ListUnitFiles
    # kill %1

2. Check number of syscalls

    # tail -20 list-unit-files.strace
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     30.08    0.277966           2    119048           readlinkat
     21.62    0.199787           2     79554     16869 openat
     21.01    0.194170           2     89605           getdents64
     10.02    0.092602           1     90426           fcntl
    ...

3. Check number of times readlinkat() is executed on a given path

    # grep " readlinkat(" list-unit-files.strace | cut -f2 -d'"' | sort | uniq -c | sort -k1 -rn > list-unit-files.strace.readlinkat
    # head list-unit-files.strace.readlinkat
    820 /etc/systemd/system/syslog.service
    819 /etc/systemd/system/multi-user.target.wants/remote-fs.target
    818 /etc/systemd/system/multi-user.target.wants/crond.service
    817 /etc/systemd/system/multi-user.target.wants/NetworkManager.service
    815 /etc/systemd/system/multi-user.target.wants/dnf-makecache.timer
    814 /etc/systemd/system/multi-user.target.wants/sssd.service
    ...

    # grep "NetworkManager.service" list-unit-files.strace.readlinkat
    817 /etc/systemd/system/multi-user.target.wants/NetworkManager.service
    793 /etc/systemd/system/dbus-org.freedesktop.NetworkManager.service
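
The pipeline in step 3 can be wrapped in a small helper so the same count can be repeated for other syscalls that take a path, such as openat. This is only a minimal sketch, assuming a trace captured as in step 1; the function name `count_syscall_paths` is hypothetical and not part of strace or systemd.

```shell
# Hypothetical helper (not part of strace or systemd): count how many times
# a given syscall was issued per path in a trace captured as in step 1.
# Usage: count_syscall_paths SYSCALL TRACE_FILE
count_syscall_paths() {
    # Keep only the lines for the requested syscall, extract the first
    # double-quoted argument (the path), then count identical paths and
    # list the most frequent ones first.
    grep -F " $1(" "$2" | cut -f2 -d'"' | sort | uniq -c | sort -k1,1nr
}
```

For example, `count_syscall_paths openat list-unit-files.strace | head` would list the most frequently opened paths. The helper only makes sense for syscalls whose first quoted argument is a path (readlinkat, openat); getdents64 operates on a file descriptor and would need different parsing.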

Actual results:

Slow operations and high CPU consumption by systemd.

Expected results:

Each file/symlink is resolved only once.
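
To put a number on how far a trace is from that expectation, the per-path counts saved in step 3 can be summarized as an average number of calls per distinct path. This is a minimal sketch, assuming the `count path` layout produced by `uniq -c`; the function name `redundancy_summary` is hypothetical.

```shell
# Hypothetical helper: summarize a "count path" file (as written by
# `uniq -c` in step 3) into total calls, distinct paths and their ratio.
# Usage: redundancy_summary COUNTS_FILE
redundancy_summary() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk '
        { total += $1; paths++ }      # first field is the per-path count
        END {
            if (paths > 0)
                printf "%d calls, %d distinct paths, %.1f calls per path\n",
                       total, paths, total / paths
        }' "$1"
}
```

For example, `redundancy_summary list-unit-files.strace.readlinkat`. A ratio close to 1.0 would correspond to each path being resolved once.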

Comment 23 RHEL Program Management 2023-09-21 12:16:22 UTC
Issue migration from Bugzilla to Jira is in progress at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 24 RHEL Program Management 2023-09-21 12:20:03 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.
