Bug 2215412

Summary: Performance issue with systemd-coredump and container process linking 2000 shared libraries.
Product: Red Hat Enterprise Linux 9 Reporter: Paulo Andrade <pandrade>
Component: systemd Assignee: David Tardon <dtardon>
Status: VERIFIED --- QA Contact: Frantisek Sumsal <fsumsal>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 9.2 CC: ayadav, dtardon, mjw, romain.geissler, sdodson, systemd-maint-list
Target Milestone: rc Keywords: Performance, Triaged, ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: systemd-252-16.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2222259 Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2222259    

Description Paulo Andrade 2023-06-15 21:01:23 UTC
Sample reproducer:

"""
cat > lib.c << EOF
int TOTO = 42;
EOF

cat > toto.c << EOF
#include <stdio.h>
int main() {
        printf("started\n");
        int* i = 0;
        *i = 42;
}
EOF


LIBS=""
for i in `seq 2000`; do
        [[ -f liba$i.so ]] || gcc -shared -fPIC -o liba$i.so lib.c -D TOTO=var$i &
        LIBS="${LIBS} -la$i"
done
wait

echo libs built
gcc -o toto toto.c -L. -Wl,-rpath,$(pwd) ${LIBS}
echo running binary
ulimit -c unlimited
time ./toto
"""

  This will trigger systemd-coredump (which forks and calls itself as
sd-parse-elf), which calls elf_getdata_rawchunk from elfutils. That function
does a linear search (could be better). Then, for every chunk parsed earlier,
sd-parse-elf calls elf_getdata_rawchunk again, making the whole thing O(N^2).
That is already bad.

  See https://sourceware.org/pipermail/elfutils-devel/2023q2/006225.html for
a sample/proposed patch for systemd-coredump that would turn the quadratic
performance issue into a linear one. The above link also shows that, while
elf_getdata_rawchunk could be made better, systemd-coredump is not using it
correctly.

  The possible performance issue in elfutils is https://sourceware.org/git/?p=elfutils.git;a=commitdiff;h=8db849976f07046d27b4217e9ebd08d5623acc4f

Comment 2 Mark Wielaard 2023-06-22 11:54:26 UTC
Note that the systemd upstream patch seems to fix the performance issue by reducing the calls to elfutils libelf elf_getdata_rawchunk.
But there was indeed also a performance issue in elf_getdata_rawchunk. Proposed patch for that is:
https://inbox.sourceware.org/elfutils-devel/3465f95ee22b1ac433f1268f113e3813430be70a.camel@klomp.org/
We can also try to backport that one. The systemd fix seems easier though, and has a bigger performance impact.

Comment 3 Romain Geissler 2023-06-22 23:39:07 UTC
Please note that the systemd patch has a follow up fix here: https://github.com/systemd/systemd/pull/28128

Comment 4 Scott Dodson 2023-06-26 13:59:43 UTC
Requesting 9.2.z