Bug 1752880

Summary: katello-host-tools-tracer calls stat() on process arguments indiscriminately, causing yum commands to hang or run slowly
Product: Red Hat Satellite
Component: katello-tracer
Version: 6.5.0
Status: CLOSED ERRATA
Severity: medium
Priority: high
Reporter: Renaud Métrich <rmetrich>
Assignee: Jonathon Turel <jturel>
QA Contact: Ondrej Gajdusek <ogajduse>
CC: akapse, bkearney, cdonnell, egolov, jturel, ktordeur, rbdiri, satellite6-bugs
Keywords: PrioBumpGSS, Triaged
Target Milestone: 6.8.0
Target Release: Unused
Hardware: All
OS: Linux
Fixed In Version: tracer-0.7.2-1
Type: Bug
Last Closed: 2020-10-27 12:59:02 UTC

Description Renaud Métrich 2019-09-17 13:24:33 UTC
Description of problem:

When the katello-host-tools-tracer package is installed on the system, many processes are scanned whenever a yum operation is performed.
In particular, it seems that the arguments of these processes are collected and stat() is then called on each of them, which is usually nonsensical:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
15160 15:09:27.340215 stat("--no-daemon", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000012>
15160 15:09:27.376073 stat("--keep-baud", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000012>
15160 15:09:27.380001 stat("--noclear", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000013>
...
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

If a process argument happens to look like a path name on a remote location that is not accessible (e.g. an automount), the yum process hangs indefinitely.


Version-Release number of selected component (if applicable):

katello-host-tools-tracer-3.5.0-2.el7sat.noarch
katello-host-tools-tracer-3.3.5-5.el7sat.noarch


How reproducible:

Always (the "dumb" stat() calls at least)


Steps to Reproduce:
1. Execute the following Python code under strace

# strace -fttTvyy -o /tmp/tracer.strace -s 4096 -e stat -- python << EOF
import katello.tracer
print katello.tracer.query_affected_apps()
EOF

2. Check the strace output for "ENOENT" results on arguments that are not paths

# grep -w ENOENT /tmp/tracer.strace | grep -v "/"
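
For anyone who wants to post-process the trace rather than eyeball the grep output, here is a rough Python equivalent of step 2. It is only a sketch: the function name and the regular expression are my own, and it checks the stat() argument for "/" instead of the whole line, so it is slightly stricter than the grep pipeline above.

```python
import re

# Matches strace lines such as:
#   15160 15:09:27.206950 stat("--system", 0x7ffec78c75f0) = -1 ENOENT (...) <0.000011>
# and captures the first argument of stat(). Hypothetical helper, not part of tracer.
STAT_ENOENT = re.compile(r'\bstat\("((?:[^"\\]|\\.)*)",.*=\s+-1\s+ENOENT')


def non_path_stat_failures(lines):
    """Return stat() arguments that failed with ENOENT and contain no '/'."""
    found = []
    for line in lines:
        match = STAT_ENOENT.search(line)
        if match and "/" not in match.group(1):
            found.append(match.group(1))
    return found


if __name__ == "__main__":
    # /tmp/tracer.strace is the file written by the strace command in step 1.
    with open("/tmp/tracer.strace") as trace:
        for arg in non_path_stat_failures(trace):
            print(arg)
```

Every argument it prints is a string that was passed to stat() even though it cannot be a path.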


Actual results:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
...
15160 15:09:27.206950 stat("--system", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000011>
15160 15:09:27.206999 stat("--address=systemd:", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000010>
15160 15:09:27.207050 stat("--nofork", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000010>
15160 15:09:27.207092 stat("--nopidfile", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000012>
15160 15:09:27.207136 stat("--systemd-activation", 0x7ffec78c75f0) = -1 ENOENT (No such file or directory) <0.000019>
...
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------


Expected results:

No stat() calls on process arguments, in particular when these are known remote paths
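
To make the expectation concrete, here is a minimal sketch of the kind of filtering I have in mind. It is not tracer's code and not a proposed patch: the function names, the list of remote file system types, and the decision to skip every non-absolute argument are all assumptions made for illustration. The only real interface it relies on is the /proc/mounts format (device, mount point, file system type, ...).

```python
# Hypothetical helpers, not part of tracer or katello-host-tools-tracer.
# Assumption: these file system types are the ones worth treating as remote.
REMOTE_FS_TYPES = {"nfs", "nfs4", "cifs", "autofs"}


def remote_mount_points(mounts_text):
    """Extract mount points of remote file systems from /proc/mounts content."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in REMOTE_FS_TYPES:
            points.append(fields[1])
    return points


def is_under(path, mount_point):
    """True if path is the mount point itself or lies below it."""
    prefix = mount_point.rstrip("/")
    return path == (prefix or "/") or path.startswith(prefix + "/")


def stat_candidates(args, remote_points):
    """Keep only arguments that are absolute paths outside remote mounts."""
    keep = []
    for arg in args:
        if not arg.startswith("/"):
            continue  # options such as --no-daemon, and relative names
        if any(is_under(arg, point) for point in remote_points):
            continue  # would risk blocking on an unreachable server
        keep.append(arg)
    return keep
```

With a mounts table that lists /nfsshare as nfs4, the arguments "--no-daemon", "sleeper.py", "/nfsshare/foo" and "/etc/hosts" would be reduced to "/etc/hosts" only. Skipping relative names is a deliberate simplification here; a real fix would need to decide how to resolve them.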


Additional info:

I don't know exactly when these stat() calls are performed. It looks like they happen when some packages are updated.

Comment 7 Bryan Kearney 2019-11-11 21:23:30 UTC
This work has not been slated for a release yet.

Comment 8 Jonathon Turel 2020-03-30 13:09:04 UTC
Created redmine issue https://projects.theforeman.org/issues/29436 from this bug

Comment 9 Bryan Kearney 2020-04-07 14:01:30 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/29436 has been resolved.

Comment 10 Ondrej Gajdusek 2020-08-06 18:33:54 UTC
Verified using the following versions:
python2-tracer-0.7.3-1
tracer-common-0.7.3-1
katello-host-tools-tracer-3.5.4-1


Reproducer used: https://github.com/FrostyX/tracer/pull/135#issue-395646444

With the older versions of tracer and katello-host-tools-tracer listed below, tracer hangs:
katello-host-tools-tracer-3.5.1-2
python2-tracer.noarch-0.7.1-2
tracer-common.noarch-0.7.1-2

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
63198 14:01:33.781417 stat("sleeper.py", {st_dev=makedev(253, 0), st_ino=201469471, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=48, st_atime=1596728843 /* 2020-08-06T11:47:23.237744393-0400 */, st_atime_nsec=237744393, st_mtime=1596728833 /* 2020-08-06T11:47:13.313567153-0400 */, st_mtime_nsec=313567153, st_ctime=1596728833 /* 2020-08-06T11:47:13.313567153-0400 */, st_ctime_nsec=313567153}) = 0 <0.000039>
63198 14:01:33.781626 stat("--file", 0x7ffd56997a10) = -1 ENOENT (No such file or directory) <0.000026>
63198 14:01:33.781740 stat("/nfsshare/foo",
Killed
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------


With the newer versions listed above, tracer kills the process that is blocked in stat() after a timeout:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
96246 14:07:45.263776 stat("/usr/lib64/libsystemd.so.0.6.0;5f2bd967 (deleted)", 0x7ffdea8cf4d0) = -1 ENOENT (No such file or directory) <0.000031>
96246 14:07:45.263911 stat("/usr/lib64/libsystemd.so.0.6.0;5f2bd967 (deleted)", 0x7ffdea8cf4d0) = -1 ENOENT (No such file or directory) <0.000015>
96265 14:07:45.297730 stat("/nfsshare/foo",  <unfinished ...>) = ?
96265 14:07:45.798067 +++ killed by SIGKILL +++
96266 14:07:45.798133 +++ exited with 0 +++
96246 14:07:45.798150 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=96265, si_uid=0, si_status=SIGKILL, si_utime=0, si_stime=0} ---
96246 14:07:45.809671 stat("/usr/lib64/libsystemd.so.0.6.0;5f2bd967 (deleted)", 0x7ffdea8cf4d0) = -1 ENOENT (No such file or directory) <0.000054>
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
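
The pattern visible in the trace above (a child process blocks in stat() and is killed with SIGKILL about half a second later) can be sketched as follows. This is only an illustration of the technique, not a copy of what tracer does internally: the function name is made up, and the 0.5 s default is an assumption read off the timestamps in the trace.

```python
import os
import signal
import time


def stat_with_timeout(path, timeout=0.5):
    """Run stat() in a forked child so that a blocked call can be killed.

    Returns True if the path exists, False if stat() failed, and None if
    the child had to be killed because it did not finish within `timeout`
    seconds. Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: exit status 0 means stat() succeeded, 1 means it failed.
        code = 1
        try:
            os.stat(path)
            code = 0
        except OSError:
            pass
        finally:
            os._exit(code)

    # Parent: poll for the child until the deadline passes.
    deadline = time.time() + timeout
    while time.time() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done:
            return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
        time.sleep(0.01)

    # Deadline reached: kill the child and reap it.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return None
    return os.WEXITSTATUS(status) == 0
```

The point of doing the stat() in a separate process is that the caller never enters the blocking system call itself, so an unreachable NFS server can only stall the short-lived child.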

Comment 13 errata-xmlrpc 2020-10-27 12:59:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.8 release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4366