Bug 1482574 - jars plugin makes sos sit indefinitely
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sos
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Pavel Moravec
QA Contact: Anna Khaitovich
URL: https://github.com/sosreport/sos/pull...
Keywords: ZStream
Duplicates: 1483397 1486377 1495872 1530401
Depends On:
Blocks: 1420851 1500290
Reported: 2017-08-17 11:24 EDT by Patrick C. F. Ernzer
Modified: 2018-06-08 12:59 EDT

See Also:
Fixed In Version: sos-3.5-1.el7
Doc Type: Bug Fix
Doc Text:
Previously, when the "jars" plug-in was used by the sosreport utility, sosreport became unresponsive. With this update, "jars" no longer performs unnecessary directory searches, and sosreport is generated as expected.
Story Points: ---
Clone Of:
Clones: 1500290
Environment:
Last Closed: 2018-04-10 14:04:13 EDT
Type: Bug
External Trackers
Tracker                            ID              Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  3155691         None      None    None     2017-08-21 05:36 EDT
Red Hat Knowledge Base (Solution)  3156441         None      None    None     2017-08-18 10:31 EDT
Red Hat Knowledge Base (Solution)  3306001         None      None    None     2018-01-02 18:03 EST
Red Hat Product Errata             RHEA-2018:0963  None      None    None     2018-04-10 14:05 EDT

Description Patrick C. F. Ernzer 2017-08-17 11:24:49 EDT
Description of problem:
If I run sosreport on one box (note to self: sysdsat01 at DBAG), it never completes (well, the longest I waited was 70 minutes).

If I add -vvv I see that it gets to setting up ipmitool, but there is no further output after that.

After some fiddling (trying all the profiles until I found one that hung, and then the plugins of that profile), I found that it's the jars plugin that is making it hang.

Version-Release number of selected component (if applicable):
sos-3.4-6.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. sosreport -v

Actual results:
gets to setting up ipmitool and then no further output but a lot of disk read (as per iotop -o)

Expected results:
sosreport finishes the setup phase and then runs the plugins

Additional info:
sosreport -n jars
works just fine

sosreport -o jars
hangs
Comment 3 Patrick C. F. Ernzer 2017-08-17 11:34:23 EDT
# sosreport -vvv -o jars
set sysroot to '/' (default)

sosreport (version 3.4)

This command will collect diagnostic and configuration information from
this Red Hat Enterprise Linux system and installed applications.

An archive containing the collected information will be generated in
/var/tmp/sos.sMtUxZ and may be provided to a Red Hat support
representative.

Any information provided to Red Hat will be treated in accordance with
the published support policies at:

  https://access.redhat.com/support/

The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before being
passed to any third party.

No changes will be made to system configuration.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [REDACTED]:
Please enter the case id that you are generating this report for []:

 Setting up archive ...
[archive:TarFileArchive] initialised empty FileCacheArchive at '/var/tmp/sos.sMtUxZ/sosreport-REDACTED-20170817173054'
[sos.sosreport:setup] executing 'sosreport -vvv -o jars'
 Setting up plugins ...


it will sit at that point seemingly indefinitely
Comment 6 Pavel Moravec 2017-08-18 09:59:44 EDT
The root cause is imho here:

/usr/lib/python2.7/site-packages/sos/plugins/jars.py

    jar_locations = (
        "/usr/share/java",  # common location for JARs
        "/usr/lib/java",    # common location for JARs containing native code
        "/opt",             # location for RHSCL and 3rd party software
        "/usr/local",       # used by sysadmins when installing SW locally
        "/var/lib"          # Java services commonly explode WARs there
    )

        locations = list(Jars.jar_locations)
        ..
        for location in locations:
            for dirpath, _, filenames in os.walk(location):
                <do something here>


If any of the jar_locations contains many files/dirs, os.walk can spend an arbitrarily long time there.

You can try playing with it via running:

find /var/lib | wc -l

(and the same for other directories from the list)

and if one of those dirs contains a really large number of files/directories, comment out that dir in jar_locations (a trailing comma after the last remaining item is fine in Python; just keep the commas between items), and try re-running sosreport.

Anyway, we will definitely need to remove the "/var/lib" dir from the list, as e.g. /var/lib/pulp can have millions of files in it.
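
For anyone who prefers to check this from Python rather than with find, here is a minimal sketch that counts entries under each location the same way the plugin traverses them, i.e. with os.walk. The location list is copied from the jars.py excerpt above; count_entries and its limit parameter are hypothetical helpers made up for illustration and are not part of sos. The cap is only there so the check itself does not run for 45 minutes on a huge tree.

```python
import os

# Copied from the jars.py excerpt quoted above.
JAR_LOCATIONS = (
    "/usr/share/java",
    "/usr/lib/java",
    "/opt",
    "/usr/local",
    "/var/lib",
)


def count_entries(root, limit=None):
    """Count files and directories below root, roughly like `find root | wc -l`.

    `limit` is a hypothetical safety cap (not an sos option): once the
    running total exceeds it, the walk stops and the partial count is returned.
    """
    if not os.path.isdir(root):
        return 0
    total = 1  # find also prints the starting directory itself
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
        if limit is not None and total > limit:
            break
    return total
```

Calling count_entries(location, limit=1000000) for each entry in JAR_LOCATIONS should quickly show which directory is the expensive one; a result above the limit just means "too many to walk comfortably".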
Comment 7 Pavel Moravec 2017-08-18 10:31:51 EDT
This needs to go to z-stream. Without that, e.g. Satellite6 having millions of files under /var/lib/pulp would not be able to run sosreport (in a reasonable time).

Workaround - disable jars plug-in.

(please update the KCS with other products affected)
Comment 8 Pavel Moravec 2017-08-21 02:53:05 EDT
*** Bug 1483397 has been marked as a duplicate of this bug. ***
Comment 9 Bryn M. Reeves 2017-08-21 05:29:28 EDT
You can avoid this for a single run by disabling jars with -n:

  # sosreport -n jars

Or persistently by adding a line to the 'plugins' section of /etc/sos.conf:

  [plugins]
  disable = jars

(use a comma-separated list if you wish to disable multiple plugins).
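
If the change has to be rolled out to many hosts, the persistent variant could be scripted. The following is only a sketch, assuming the file is plain INI as shown above and using Python 3's standard configparser; disable_plugins is a hypothetical helper, not something shipped with sos. Note that configparser rewrites the whole file, so any comments in it are lost and option names are lowercased; keep a backup of the original.

```python
import configparser


def disable_plugins(conf_path, names):
    """Add plugin names to 'disable' in the [plugins] section of an INI file.

    Hypothetical helper for illustration. Returns the resulting list of
    disabled plugins. Existing entries are kept and duplicates are skipped.
    """
    # interpolation=None so that '%' characters in existing values are left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(conf_path)  # a missing file is silently treated as empty
    if not parser.has_section("plugins"):
        parser.add_section("plugins")
    current = parser.get("plugins", "disable", fallback="")
    disabled = [p.strip() for p in current.split(",") if p.strip()]
    for name in names:
        if name not in disabled:
            disabled.append(name)
    # Comma-separated without spaces, as in the manual example above
    parser.set("plugins", "disable", ",".join(disabled))
    with open(conf_path, "w") as handle:
        parser.write(handle)
    return disabled
```

For example, disable_plugins("/etc/sos.conf", ["jars"]) run as root would add jars to whatever is already disabled there.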
Comment 10 Pavel Moravec 2017-08-22 03:03:19 EDT
Upstream PR:

https://github.com/sosreport/sos/pull/1077

Steve, could you please give pm_ack for 7.5? (In fact we would also like to get this into 7.4.z, for which we need pm_ack now.)
Comment 11 Pavel Moravec 2017-08-22 03:08:40 EDT
Reproducer steps for QE:

- have a system with millions of files/dirs under /var/lib (or /opt or /usr/local) - an example is Satellite6 with many synchronized repositories, putting all files (RPMs, repo metadata etc) under /var/lib/pulp
- run sosreport (with jars plug-in enabled)

If necessary, I can provide a reproducer machine for verification.
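
Where no such Satellite is at hand, a synthetic tree may be enough to see the effect. The sketch below is an assumption-laden stand-in, not the official reproducer: make_tree and time_walk are hypothetical helper names, the directory and file naming is arbitrary, and the tree only affects sosreport if it is created under one of the jar_locations (e.g. a scratch directory below /var/lib on a test machine). Creating a million empty files consumes inodes, so use a disposable system.

```python
import os
import time


def make_tree(root, dirs, files_per_dir):
    """Create `dirs` subdirectories under root, each holding empty files.

    Returns the number of files created. Names are arbitrary placeholders.
    """
    for d in range(dirs):
        subdir = os.path.join(root, "dir%05d" % d)
        os.makedirs(subdir)
        for f in range(files_per_dir):
            # Empty files are enough: only directory traversal is being exercised.
            open(os.path.join(subdir, "file%05d.rpm" % f), "w").close()
    return dirs * files_per_dir


def time_walk(root):
    """Return (entries seen, seconds taken) for a full os.walk of root."""
    start = time.time()
    seen = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        seen += len(dirnames) + len(filenames)
    return seen, time.time() - start
```

Something like make_tree(scratch_dir, 1000, 1000) followed by time_walk(scratch_dir) gives a rough idea of how long the plugin's traversal would take on that storage before running sosreport itself.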
Comment 12 Patrick C. F. Ernzer 2017-08-22 05:01:40 EDT
(In reply to Pavel Moravec from comment #6)

This is from a lightly loaded Sat6. Few hosts attached to it but a lot of software synced to it.
Give a shout if you want me to check a couple more (I have one prod Sat I can ask the find /var/lib on and I have a staging Sat at this customer)

[...]
> You can try playing with it via running:
> 
> find /var/lib | wc -l
> 
> (and the same for other directories from the list)

# find /usr/share/java | wc -l
812
# find /usr/lib/java | wc -l
3
# find /opt | wc -l
61069
# find /usr/local | wc -l
38
# time find /var/lib | wc -l
11639625

real    45m53.524s
user    0m31.590s
sys     4m18.416s

So it does indeed seem that the 11.6 million files and directories under /var/lib of this Satellite are the problem.
Comment 13 Bryn M. Reeves 2017-08-29 11:38:02 EDT
*** Bug 1486377 has been marked as a duplicate of this bug. ***
Comment 14 Pavel Moravec 2017-09-26 09:34:48 EDT
*** Bug 1495872 has been marked as a duplicate of this bug. ***
Comment 15 Pavel Moravec 2017-10-10 06:09:51 EDT
Posted to upstream via https://github.com/sosreport/sos/commit/6fc42802b87f95dba1d6bfda49ae158143e7799c
Comment 19 Pavel Moravec 2017-11-02 11:15:58 EDT
Fixed via sos 3.5 rebase.
Comment 22 Steffen Froemer 2018-01-02 18:03:50 EST
*** Bug 1530401 has been marked as a duplicate of this bug. ***
Comment 26 errata-xmlrpc 2018-04-10 14:04:13 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0963
