Bug 1762025

Summary: pacemaker_remoted spams close() with large file descriptor limit
Product: Red Hat Enterprise Linux 8 Reporter: John Eckersberg <jeckersb>
Component: pacemakerAssignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA QA Contact: pkomarov
Severity: high Docs Contact:
Priority: high    
Version: 8.0CC: abeekhof, aherr, cfeist, cluster-maint, michele, pkomarov, toneata
Target Milestone: rcKeywords: ZStream
Target Release: 8.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pacemaker-2.0.3-1.el8 Doc Type: No Doc Update
Doc Text:
This will be invisible to most end users
Story Points: ---
Clone Of:
: 1764181 (view as bug list) Environment:
Last Closed: 2020-04-28 15:38:28 UTC Type: Enhancement
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1764181    

Description John Eckersberg 2019-10-15 21:21:35 UTC
Description of problem:
pacemaker_remoted spams close() inside of openstack containers due to large file descriptor table size.

In this bit from action_launch_child in services_linux.c:

https://github.com/ClusterLabs/pacemaker/blob/81d51216d218e39051ba32a8d2f46a91e450217b/lib/services/services_linux.c#L478-L481

getdtablesize() is 2^20, or 1048576, meaning every time a child gets forked off we call close repeatedly on a million non-existent file descriptors.

When doing some profiling with perf while running openstack tempest tests, I see that pacemaker_remoted is using about the same amount of cpu time as all of the rest of the control plane services combined, after taking into account the time wasted constantly jumping back and forth across the syscall boundary.

I have a patch that I will post upstream shortly and will reference this BZ#.


Version-Release number of selected component (if applicable):
pacemaker-remote-2.0.1-4.el8_0.4.x86_64

How reproducible:
Always

Steps to Reproduce:
With OSP15 deployment:

[root@controller-0 ~]# ps -f 8557
UID          PID    PPID  C STIME TTY      STAT   TIME CMD
root        8557    8114  0 19:32 ?        Ss     0:03 /usr/sbin/pacemaker_remoted
[root@controller-0 ~]# perf stat -e 'syscalls:sys_enter_close' -p 8557 -- sleep 37                                                                                                                                                                             

 Performance counter stats for process id '8557':

         3,147,367      syscalls:sys_enter_close                                    

      37.009166783 seconds time elapsed

[root@controller-0 ~]#

Expected results:
Significantly fewer calls to close()

Comment 1 John Eckersberg 2019-10-15 21:35:23 UTC
https://github.com/ClusterLabs/pacemaker/pull/1912

Comment 2 John Eckersberg 2019-10-15 21:47:52 UTC
After the patch, calls to close() reduce from 3,147,367 above to 1,878 below:

[root@controller-0 ~]# ps -f 8939
UID          PID    PPID  C STIME TTY      STAT   TIME CMD
root        8939    8623  0 21:39 ?        Ss     0:01 /usr/sbin/pacemaker_remoted
[root@controller-0 ~]# perf stat -e 'syscalls:sys_enter_close' -p 8939 -- sleep 37

 Performance counter stats for process id '8939':

             1,878      syscalls:sys_enter_close                                    

      37.007315989 seconds time elapsed

[root@controller-0 ~]#

Comment 4 Ken Gaillot 2019-10-17 16:45:07 UTC
Fixed upstream as of commit 5a73027

Comment 7 pkomarov 2019-11-05 12:30:00 UTC
Verified ,

root       46431       1  0 Nov04 ?        00:00:00   /usr/libexec/podman/conmon -s -c f2b67e100201c2580300befece3737f8
d3ce11732ac371712f1545015e339974 -u f2b67e100201c2580300befece3737f8d3ce11732ac371712f1545015e339974 -n redis-bundle-po
dman-0 -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/f2b67e100201c2580300befece3737f8d3ce11732ac37
1712f1545015e339974/userdata -p /var/run/containers/storage/overlay-containers/f2b67e100201c2580300befece3737f8d3ce1173
2ac371712f1545015e339974/userdata/pidfile --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-comman
d-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/co
ntainers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-com
mand-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --runtime --exit-com
mand-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-comman
d-arg cleanup --exit-command-arg f2b67e100201c2580300befece3737f8d3ce11732ac371712f1545015e339974 --socket-dir-path /va
r/run/libpod/socket -l journald: --log-level error
root       46444   46431  0 Nov04 ?        00:00:00     dumb-init --single-child -- /bin/bash /usr/local/bin/kolla_star
t
root       46458   46444  0 Nov04 ?        00:00:40       /usr/sbin/pacemaker_remoted
42460      46796   46444  1 Nov04 ?        00:17:48       /usr/bin/redis-server 172.17.1.49:6379

[root@overcloud-controller-0 ~]# perf stat -e 'syscalls:sys_enter_close' -p 46458 -- sleep 37

 Performance counter stats for process id '46458':

         1,049,414      syscalls:sys_enter_close

      37.001985306 seconds time elapsed

dnf install -y http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/pacemaker/2.0.2/3.el8_1.2/x86_64/pacemaker-2.0.2-3.el8_1.2.x86_64.rpm http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/pacemaker/2.0.2/3.el8_1.2/x86_64/pacemaker-cli-2.0.2-3.el8_1.2.x86_64.rpm http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/pacemaker/2.0.2/3.el8_1.2/x86_64/pacemaker-cluster-libs-2.0.2-3.el8_1.2.x86_64.rpm http://download.eng.bos.redhat.com/brewroot/vol/rhel-8/packages/pacemaker/2.0.2/3.el8_1.2/x86_64/pacemaker-libs-2.0.2-3.el8_1.2.x86_64.rpm http://sts.lab.msp.redhat.com/dist/brewroot/repos/errata-rhel8.1.z/x86_64/pacemaker-schemas-2.0.2-3.el8_1.2.noarch.rpm http://sts.lab.msp.redhat.com/dist/brewroot/repos/errata-rhel8.1.z/x86_64/pacemaker-remote-2.0.2-3.el8_1.2.x86_64.rpm

[root@overcloud-controller-0 ~]# rpm -q pacemaker-remote
pacemaker-remote-2.0.2-3.el8_1.2.x86_64

[root@overcloud-controller-0 ~]# podman exec -it redis-bundle-podman-0 bash
()[root@overcloud-controller-0 /]# rpm -q pacemaker-remote
pacemaker-remote-2.0.2-3.el8_1.2.x86_64


[root@overcloud-controller-0 ~]# ps -efH|grep -A3 redis-bundle-p[o]
root      330769       1  0 11:11 ?        00:00:00   /usr/libexec/podman/conmon -s -c 2a54b643a1eb5cecea0058c5fcfbefa5d8900ac8b34f750c02adaaad7829f012 -u 2a54b643a1eb5cecea0058c5fcfbefa5d8900ac8b34f750c02adaaad7829f012 -n redis-bundle-podman-0 -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/2a54b643a1eb5cecea0058c5fcfbefa5d8900ac8b34f750c02adaaad7829f012/userdata -p /var/run/containers/storage/overlay-containers/2a54b643a1eb5cecea0058c5fcfbefa5d8900ac8b34f750c02adaaad7829f012/userdata/pidfile --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 2a54b643a1eb5cecea0058c5fcfbefa5d8900ac8b34f750c02adaaad7829f012 --socket-dir-path /var/run/libpod/socket -l journald: --log-level error
root      330781  330769  3 11:11 ?        00:00:00     dumb-init --single-child -- /bin/bash /usr/local/bin/kolla_start
root      330796  330781  2 11:11 ?        00:00:00       /usr/sbin/pacemaker_remoted
[root@overcloud-controller-0 ~]# perf stat -e 'syscalls:sys_enter_close' -p 330796 -- sleep 37

Broadcast message from systemd-journald@overcloud-controller-0 (Tue 2019-11-05 11:12:29 UTC):

haproxy[300409]: proxy redis has no server available!


 Performance counter stats for process id '330796':

             6,880      syscalls:sys_enter_close

      37.002650077 seconds time elapsed

Comment 9 errata-xmlrpc 2020-04-28 15:38:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1609