Bug 1482574 - jars plugin makes sos sit indefinitely
Summary: jars plugin makes sos sit indefinitely
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sos
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Pavel Moravec
QA Contact: Anna Khaitovich
URL: https://github.com/sosreport/sos/pull...
Whiteboard:
Duplicates: 1483397 1486377 1495872 1530401
Depends On:
Blocks: 1420851 1500290
Reported: 2017-08-17 15:24 UTC by Patrick C. F. Ernzer
Modified: 2021-09-09 12:31 UTC (History)
CC List: 32 users

Fixed In Version: sos-3.5-1.el7
Doc Type: Bug Fix
Doc Text:
Previously, when the "jars" plug-in was used by the sosreport utility, sosreport became unresponsive. With this update, "jars" no longer performs unnecessary directory searches, and sosreport is generated as expected.
Clone Of:
Clones: 1500290
Environment:
Last Closed: 2018-04-10 18:04:13 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3155691 | 0 | None | None | None | 2017-08-21 09:36:52 UTC
Red Hat Knowledge Base (Solution) | 3156441 | 0 | None | None | None | 2017-08-18 14:31:50 UTC
Red Hat Knowledge Base (Solution) | 3306001 | 0 | None | None | None | 2018-01-02 23:03:49 UTC
Red Hat Product Errata | RHEA-2018:0963 | 0 | None | None | None | 2018-04-10 18:05:22 UTC

Description Patrick C. F. Ernzer 2017-08-17 15:24:49 UTC
Description of problem:
If I run sosreport on one box (note to self: sysdsat01 at DBAG), it never completes (the longest I waited was 70 minutes).

If I add -vvv I see that it gets to setting up ipmitool, but there is no further output after that.

After some fiddling (trying all the profiles until I found one that hung, and then the plugins of that profile), I found that it's the jars plugin that is making it hang.

Version-Release number of selected component (if applicable):
sos-3.4-6.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. sosreport -v

Actual results:
gets to setting up ipmitool and then no further output but a lot of disk read (as per iotop -o)

Expected results:
sosreport finishes the setup phase and then runs the plugins

Additional info:
sosreport -n jars
works just fine

sosreport -o jars
hangs

Comment 3 Patrick C. F. Ernzer 2017-08-17 15:34:23 UTC
# sosreport -vvv -o jars
set sysroot to '/' (default)

sosreport (version 3.4)

This command will collect diagnostic and configuration information from
this Red Hat Enterprise Linux system and installed applications.

An archive containing the collected information will be generated in
/var/tmp/sos.sMtUxZ and may be provided to a Red Hat support
representative.

Any information provided to Red Hat will be treated in accordance with
the published support policies at:

  https://access.redhat.com/support/

The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before being
passed to any third party.

No changes will be made to system configuration.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [REDACTED]:
Please enter the case id that you are generating this report for []:

 Setting up archive ...
[archive:TarFileArchive] initialised empty FileCacheArchive at '/var/tmp/sos.sMtUxZ/sosreport-REDACTED-20170817173054'
[sos.sosreport:setup] executing 'sosreport -vvv -o jars'
 Setting up plugins ...


it will sit at that point seemingly indefinitely

Comment 6 Pavel Moravec 2017-08-18 13:59:44 UTC
The root cause is imho here:

/usr/lib/python2.7/site-packages/sos/plugins/jars.py

    jar_locations = (
        "/usr/share/java",  # common location for JARs
        "/usr/lib/java",    # common location for JARs containing native code
        "/opt",             # location for RHSCL and 3rd party software
        "/usr/local",       # used by sysadmins when installing SW locally
        "/var/lib"          # Java services commonly explode WARs there
    )

        locations = list(Jars.jar_locations)
        ..
        for location in locations:
            for dirpath, _, filenames in os.walk(location):
                <do something here>


If any of the jar_locations contains many files or directories, os.walk can spend an arbitrarily long time there.

You can try playing with it via running:

find /var/lib | wc -l

(and the same for other directories from the list)

If one of those directories contains a very large number of files or directories, comment it out in jar_locations and try re-running sosreport. (Python accepts a trailing comma after the last remaining item in the tuple, so no other change is needed.)

In any case, we will definitely need to remove "/var/lib" from the list, since e.g. /var/lib/pulp can contain millions of files.
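
The find check above can also be sketched in Python, walking each location the same way the plugin excerpt does. This is a minimal sketch, not code from sos: the helper name count_entries and the limit of 100000 are assumptions, added so that the check itself gives up early instead of hanging on a huge tree. It is written for Python 3, and unlike find it does not count the starting directory itself.

```python
import os

# Locations copied from the jars plugin excerpt above.
JAR_LOCATIONS = ("/usr/share/java", "/usr/lib/java", "/opt", "/usr/local", "/var/lib")


def count_entries(root, limit=100000):
    """Count files and directories below root, stopping once limit is reached.

    Returns (count, truncated). The helper name and the default limit are
    illustrative assumptions; they are not part of sos.
    """
    count = 0
    # os.walk silently yields nothing for a missing or unreadable root.
    for _dirpath, dirnames, filenames in os.walk(root):
        count += len(dirnames) + len(filenames)
        if count >= limit:
            return count, True
    return count, False


if __name__ == "__main__":
    for location in JAR_LOCATIONS:
        count, truncated = count_entries(location)
        suffix = "+ (stopped early)" if truncated else ""
        print("%-18s %d%s" % (location, count, suffix))
```

A location reported as stopped early is a likely candidate for the directory that makes the plugin appear to hang.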

Comment 7 Pavel Moravec 2017-08-18 14:31:51 UTC
This needs to go to z-stream. Without the fix, a system such as Satellite6 with millions of files under /var/lib/pulp cannot run sosreport in a reasonable time.

Workaround - disable jars plug-in.

(please update the KCS with other products affected)

Comment 8 Pavel Moravec 2017-08-21 06:53:05 UTC
*** Bug 1483397 has been marked as a duplicate of this bug. ***

Comment 9 Bryn M. Reeves 2017-08-21 09:29:28 UTC
You can avoid this for a single run by disabling jars with -n:

  # sosreport -n jars

Or persistently by adding a line to the 'plugins' section of /etc/sos.conf:

  [plugins]
  disable = jars

(use a comma-separated list if you wish to disable multiple plugins).
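
To illustrate the comma-separated form, here is a minimal sketch that reads such a [plugins] section with Python's standard configparser module. It only shows how the list splits into names; it makes no claim about how sos itself parses /etc/sos.conf. The second plugin name, exampleplugin, and the helper disabled_plugins are made up for the example.

```python
import configparser

# Mirrors the /etc/sos.conf snippet above, extended with a made-up second
# plugin name ("exampleplugin") to show the comma-separated form.
SOS_CONF_TEXT = """
[plugins]
disable = jars, exampleplugin
"""


def disabled_plugins(conf_text):
    """Return the plugin names listed under 'disable' in the [plugins] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback covers both a missing section and a missing key.
    raw = parser.get("plugins", "disable", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


print(disabled_plugins(SOS_CONF_TEXT))  # ['jars', 'exampleplugin']
```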

Comment 10 Pavel Moravec 2017-08-22 07:03:19 UTC
Upstream PR:

https://github.com/sosreport/sos/pull/1077

Steve, could you please grant pm_ack for 7.5? (In fact, we would like to get this into 7.4.z as well, which is where we need the pm_ack now.)

Comment 11 Pavel Moravec 2017-08-22 07:08:40 UTC
Reproducer steps for QE:

- have a system with millions of files/dirs under /var/lib (or /opt or /usr/local); an example is Satellite6 with many synchronized repositories, which puts all its files (RPMs, repo metadata etc.) under /var/lib/pulp
- run sosreport (with jars plug-in enabled)

If necessary, I can provide a reproducer machine for verification.
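
Where no such system is at hand, a large tree can be generated artificially. The following is a minimal sketch under stated assumptions: the helper make_tree and its directory and file naming are hypothetical, and the demonstration writes only a small tree into a throwaway temporary directory. For an actual reproduction the target would have to be a scratch directory under one of the scanned locations, with counts in the millions, which needs root privileges and enough free inodes.

```python
import os
import tempfile


def make_tree(root, dirs, files_per_dir):
    """Create `dirs` subdirectories under root, each holding empty files.

    Returns the number of files created. Name and layout are illustrative.
    """
    created = 0
    for d in range(dirs):
        subdir = os.path.join(root, "dir%05d" % d)
        os.makedirs(subdir, exist_ok=True)
        for f in range(files_per_dir):
            # Empty files are enough: the walk cost depends on entry count.
            open(os.path.join(subdir, "file%05d" % f), "w").close()
            created += 1
    return created


if __name__ == "__main__":
    # Small demonstration only; scale the numbers up for a real reproducer.
    with tempfile.TemporaryDirectory() as scratch:
        print(make_tree(scratch, dirs=10, files_per_dir=100))
```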

Comment 12 Patrick C. F. Ernzer 2017-08-22 09:01:40 UTC
(In reply to Pavel Moravec from comment #6)

This is from a lightly loaded Sat6. Few hosts attached to it but a lot of software synced to it.
Give a shout if you want me to check a couple more (I have one prod Sat I can ask the find /var/lib on and I have a staging Sat at this customer)

[...]
> You can try playing with it via running:
> 
> find /var/lib | wc -l
> 
> (and the same for other directories from the list)

# find /usr/share/java | wc -l
812
# find /usr/lib/java | wc -l
3
# find /opt | wc -l
61069
# find /usr/local | wc -l
38
# time find /var/lib | wc -l
11639625

real    45m53.524s
user    0m31.590s
sys     4m18.416s

So it does indeed seem that the 11.6 million files under /var/lib of this Satellite are the problem.
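
As a rough cross-check of those numbers, the walk rate and the time to expect for other tree sizes can be worked out as below. This is a back-of-the-envelope sketch: it assumes the cost scales linearly with the number of entries and that os.walk proceeds at about the same rate as find on this box, neither of which was measured here. The helper estimate_minutes is made up for the example.

```python
# Figures taken from the find run above.
ENTRIES = 11639625
ELAPSED_SECONDS = 45 * 60 + 53.524  # "real 45m53.524s"

# Average number of directory entries visited per second.
RATE = ENTRIES / ELAPSED_SECONDS


def estimate_minutes(entry_count, rate=RATE):
    """Estimate the walk time in minutes for entry_count entries (linear model)."""
    return entry_count / rate / 60.0


print("rate: about %.0f entries/s" % RATE)
print("1 million entries: about %.1f minutes" % estimate_minutes(1000000))
```

At that rate, a single pass over /var/lib accounts for most of the 70 minutes waited in the original report.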

Comment 13 Bryn M. Reeves 2017-08-29 15:38:02 UTC
*** Bug 1486377 has been marked as a duplicate of this bug. ***

Comment 14 Pavel Moravec 2017-09-26 13:34:48 UTC
*** Bug 1495872 has been marked as a duplicate of this bug. ***

Comment 15 Pavel Moravec 2017-10-10 10:09:51 UTC
Posted to upstream via https://github.com/sosreport/sos/commit/6fc42802b87f95dba1d6bfda49ae158143e7799c

Comment 19 Pavel Moravec 2017-11-02 15:15:58 UTC
Fixed via sos 3.5 rebase.

Comment 22 Steffen Froemer 2018-01-02 23:03:50 UTC
*** Bug 1530401 has been marked as a duplicate of this bug. ***

Comment 26 errata-xmlrpc 2018-04-10 18:04:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0963

