Bug 2094448

Summary: regex split code in tracer not scaling very well
Product: Red Hat Enterprise Linux 8
Reporter: Paulo Andrade <pandrade>
Component: tracer
Assignee: Jakub Kadlčík <jkadlcik>
Status: NEW
QA Contact: rhel8-maint
Severity: medium
Priority: unspecified
Version: 8.6
CC: jkadlcik, ovasik
Target Milestone: rc
Flags: ovasik: needinfo? (jkadlcik)
Target Release: ---
Hardware: All
OS: Linux
Type: Bug

Description Paulo Andrade 2022-06-07 15:51:16 UTC
Tracer appears to hang because it takes too long to finish.

  The customer provided some data for debugging the issue.

  Based on strace, perf, and cProfile data, the bottleneck appears to be the
regex split code. In a very short profiling run, it is called almost 2 million
times and accounts for more than 90% of the processing time.
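
  For anyone who wants to collect comparable numbers, the following is a
minimal sketch of producing a cProfile report sorted by tottime. The
split_name() helper, its pattern, and the workload are hypothetical stand-ins
and are not taken from tracer's source.

```python
import cProfile
import io
import pstats
import re

# Hypothetical stand-in for the hot path; NOT tracer's actual code or pattern.
_SPLIT = re.compile(r"\s+")


def split_name(cmdline):
    """Split a command line on whitespace using a precompiled pattern."""
    return _SPLIT.split(cmdline)


def workload(n):
    """Call the regex split n times, loosely imitating a hot loop."""
    for i in range(n):
        split_name("/usr/bin/python3 -m demo %d" % i)


def profile_top(func, *args, limit=5):
    """Run func under cProfile and return the report text sorted by tottime."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("tottime").print_stats(limit)
    return out.getvalue()


report = profile_top(workload, 10000)
```

  Printing report gives a table with the same columns as the customer data
(ncalls, tottime, percall, cumtime), with the regex split method listed among
the entries.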

  Sample cProfile output:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1745231  413.627    0.000  413.627    0.000 {method 'split' of '_sre.SRE_Pattern' objects}
  9175435   35.478    0.000  473.426    0.000 /usr/lib/python3.6/site-packages/tracer/resources/processes.py:76(name)
  1746812   17.331    0.000   17.332    0.000 {method 'join' of 'str' objects}
 23604269    4.621    0.000    4.621    0.000 /usr/lib/python3.6/site-packages/tracer/resources/processes.py:114(_attr)
     4362    3.370    0.001    3.371    0.001 {method 'send_message_with_reply_and_block' of '_dbus_bindings.Connection' objects}
   164719    3.258    0.000    4.943    0.000 /usr/lib64/python3.6/site-packages/psutil/_pslinux.py:1800(get_blocks)
     1510    2.475    0.002    4.500    0.003 {method 'block' of 'dbus.lowlevel.PendingCall' objects}
  3058071    2.466    0.000  478.441    0.000 /usr/lib/python3.6/site-packages/tracer/resources/applications.py:272(<lambda>)
     2294    2.111    0.001    2.111    0.001 {built-in method posix.listdir}
  3058071    1.617    0.000  317.814    0.000 /usr/lib/python3.6/site-packages/tracer/resources/processes.py:276(real_name)
     1512    1.560    0.001    1.968    0.001 {method 'Parse' of 'pyexpat.xmlparser' objects}
  3935101    1.324    0.000    1.324    0.000 {method 'split' of 'bytes' objects}
     2281    1.206    0.001    6.560    0.003 /usr/lib/python3.6/site-packages/tracer/resources/processes.py:41(all)
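
  One possible direction, sketched here under the assumption that the same
strings are split over and over (the profile alone does not confirm this), is
to memoize the split result. The pattern and the split_cmdline() helper are
hypothetical and do not reflect tracer's actual code.

```python
import functools
import re

# Hypothetical pattern; tracer's real expression may differ.
_SEPARATORS = re.compile(r"\s+")


@functools.lru_cache(maxsize=4096)
def split_cmdline(cmdline):
    """Split each distinct string once; return a tuple so cached values are immutable."""
    return tuple(_SEPARATORS.split(cmdline))


# Repeated calls with the same argument are served from the cache.
for _ in range(1000):
    parts = split_cmdline("/usr/bin/python3 /usr/bin/tracer")

info = split_cmdline.cache_info()
```

  In this sketch the regex runs once and the remaining 999 calls are cache
hits, which split_cmdline.cache_info() reports. Whether this helps in tracer
depends on how often its inputs repeat.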