Red Hat Bugzilla – Bug 241071
Patch to speed up sos processing
Last modified: 2009-06-08 13:23:07 EDT
Description of problem:
Version-Release number of selected component (if applicable): sos-1.3-1.el5.noarch
The code in helpers.py to actually execute a command comes from the Python
Cookbook as stated in the code. As far as I can tell, that code is simply
executing the command and grabbing stdout and stderr. My patch removes the
Python Cookbook code and uses Python's commands library instead. The only
downside to this is that stderr and stdout are now combined into one chunk of
text -- so if you wanted to treat them separately, this wouldn't help. If it
doesn't matter that stderr and stdout are combined, then this works great.
I have done some rudimentary benchmarking with this. With the code that
currently exists in helper.py in sos, running an sos took 6m48s (Note there will
be error as I had to type my name and ticket number). Rebooting the machine and
running a patched version of sos took 3m27s. If we assume an error of 30 seconds
on either end for me to type in my name and ticket number, we still show a
significant speed increase.
Steps to Reproduce:
1. Run time sosreport -av
3. Patch /usr/lib/python-2.4/site-packages/sos/helper.py
4. Run time sosreport -av
Significant speed improvement with the patch.
Karl J. Abbott
Created attachment 155298 [details]
Patch for helpers.py that improves the performance of sos
stdout and stderr are separated because stderr is put in the sos log, and stdout
goes into the report.
It may be a good idea to combine them in the output to the report. Any errors
while running commands should be of interest to whoever is looking at the
report, and having errors buried in another log obscures them. I don't think
that there was any compelling reason that they were separated in the first place.
I agree, using popen unnecessarily complicates things. Committed to svn.
This seems to be unpatched in the latest svn trunk:
Was this intended or was this accidental?
I initially applied the patch in revision 200:
I later found it that it created problems with SELinux, generating a lot of
warnings. I didn't have a chance to investigate this further, but for this
reason I decided to go back to using popen in a similar way it is used from
Later on, I implemented a timeout to allow the main thread to kill the process
and continue (which is something we couldn't do with commands.getstatusoutput),
this is very helpful especially on systems where we might end up with processes
in D status:
Honestly I haven't done any benchmarking, but as we are using a very similar
code as in getstatusoutput() (which also uses popen), the execution speed should
be fairly similar.
Have you had a chance to try the code in trunk ?