Description of problem:
When running sosreport, the rhui plugin consistently times out, resulting in incomplete information submitted as part of sosreport.

Version-Release number of selected component (if applicable):
rh-rhui-tools-debug-script 2.1, sos 3.2

How reproducible:

Steps to Reproduce:
1. Run sosreport on a rhua or cds server that has several repositories synced.
2. Wait to see if you get
   [plugin:rhui] command 'python /usr/share/rh-rhua/rhui-debug.py --dir /tmp/sosreport/sos_commands/rhui' timed out after 300s
3. Profit!

Actual results:
sosreport may submit incomplete information to Red Hat Support.

Expected results:

Additional info:
I suspect these commands are the culprit:

CDS_COMMANDS = {
    "ls -lR /var/lib/pulp-cds": {"filename" : "ls_-lR_var.lib.pulp-cds",
                                 "access_path" : "/var/lib/pulp-cds"}
}

PULP_COMMANDS = {
    "ls -lR /var/lib/pulp": {"filename" : "ls_-lR_var.lib.pulp",
                             "access_path" : "/var/lib/pulp"}
}

On a RHUA or CDS that has dozens of repos configured, these commands can take a LONG time to complete. The parent sosreport only allows 300 seconds for plugin scripts to execute.

Suggest making generating the file list an option of the plugin (default to NOT collecting the file list.)
Diff that adds file list generation as a command line option. This is against Version : 2.1.37, Release : 2.el6

# diff -u /usr/share/rh-rhua/rhui-debug.py /home/jstoner/rhui-debug.py
--- /usr/share/rh-rhua/rhui-debug.py 2014-10-20 11:04:31.000000000 -0400
+++ /home/jstoner/rhui-debug.py 2016-01-06 16:32:22.524121306 -0500
@@ -80,6 +80,12 @@
         dest="cds",
         default=False,
         help="If the script needs to be executed to collect CDS information instead of RHUA")
+    parser.add_option(
+        '--genfilelist',
+        dest='genfilelist',
+        action="store_true",
+        default=False,
+        help='Generate a file list of all repositories (warning: can be slow')
     return parser
 
 # Checks whether user has root access to run this script
@@ -160,11 +166,13 @@
         # Collect CDS specific debugging information
         copy_dirs(CDS_DIRS, base_dir)
         copy_files(CDS_FILES, base_dir)
-        run_commands(CDS_COMMANDS, base_dir)
+        if opt.genfilelist:
+            run_commands(CDS_COMMANDS, base_dir)
     else:
         # Collect RHUA specific debugging information
         copy_dirs(PULP_DIRS, base_dir)
         copy_files(PULP_FILES, base_dir)
-        run_commands(PULP_COMMANDS, base_dir)
+        if opt.genfilelist:
+            run_commands(PULP_COMMANDS, base_dir)
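
For anyone who wants to try the gating logic outside of rhui-debug.py, here is a minimal, self-contained sketch of the same idea. It is not the actual script: the `--cds` flag spelling, `build_parser` and `select_commands` are assumptions made for illustration (the diff only shows `dest="cds"`), and the command tables are copied from the bug description.

```python
from optparse import OptionParser

# Command tables as quoted in the bug description.
CDS_COMMANDS = {
    "ls -lR /var/lib/pulp-cds": {"filename": "ls_-lR_var.lib.pulp-cds",
                                 "access_path": "/var/lib/pulp-cds"},
}
PULP_COMMANDS = {
    "ls -lR /var/lib/pulp": {"filename": "ls_-lR_var.lib.pulp",
                             "access_path": "/var/lib/pulp"},
}


def build_parser():
    """Hypothetical parser; only --genfilelist mirrors the patch."""
    parser = OptionParser()
    # Assumption: the real script's flag name for dest="cds" is not shown.
    parser.add_option("--cds", dest="cds", action="store_true", default=False,
                      help="Collect CDS information instead of RHUA")
    parser.add_option("--genfilelist", dest="genfilelist",
                      action="store_true", default=False,
                      help="Generate a file list of all repositories "
                           "(warning: can be slow)")
    return parser


def select_commands(argv):
    """Return the slow listing commands only when --genfilelist is given."""
    opt, _args = build_parser().parse_args(argv)
    if not opt.genfilelist:
        return {}
    return CDS_COMMANDS if opt.cds else PULP_COMMANDS
```

With no arguments `select_commands([])` returns an empty dict, so the recursive listing is skipped by default, which is the behaviour the patch aims for.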
This is not an sos problem. The paths mentioned:

CDS_COMMANDS = {
    "ls -lR /var/lib/pulp-cds": {"filename" : "ls_-lR_var.lib.pulp-cds",
                                 "access_path" : "/var/lib/pulp-cds"}
}

PULP_COMMANDS = {
    "ls -lR /var/lib/pulp": {"filename" : "ls_-lR_var.lib.pulp",
                             "access_path" : "/var/lib/pulp"}
}

are from the rhui-debug script (as is the patch in comment #3).

We could temporarily increase the timeout used for rhui-debug in RHEL; however, this would be a workaround (and not a great one). If the script is blowing a 300s command timeout, it's clear that the script needs to be improved or the slow collection made optional.
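
To illustrate why a slow child command leaves the report incomplete, here is a rough sketch of a per-command timeout. This is not how sos implements its timeout; `run_with_timeout` is a hypothetical helper built on the standard `subprocess` module, shown only to make the failure mode concrete.

```python
import subprocess


def run_with_timeout(cmd, timeout):
    """Run cmd (a list of arguments) and return (timed_out, output).

    If the command exceeds `timeout` seconds it is killed and whatever
    it would have produced is lost, so the caller gets empty output.
    """
    try:
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, timeout=timeout)
    except subprocess.TimeoutExpired:
        return True, b""
    return False, result.stdout
```

For example, `run_with_timeout(["ls", "-lR", "/var/lib/pulp"], 300)` would return `(True, b"")` on a system where the listing takes longer than 300 seconds, which corresponds to the "timed out after 300s" message in the description.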
I fixed the bug title to make it clear which plugin we're talking about but I think it would be better to move this to the relevant component for rhui-debug since that's where the problem is.
I couldn't reproduce it on RHEL7 iso 20160719.

>> sosreport
....
Setting up archive ...
Setting up plugins ...
Running plugins. Please wait ...

  Running 82/82: yum...
Creating compressed archive...

Your sosreport has been generated and saved in:
  /var/tmp/sosreport-rhua.example.com.001-20160727030740.tar.xz

The checksum is: d83a15b9f458b411045288f8345b885f

Please send this file to your support representative

I can't check it for RHEL6 iso 20160719 because of BZ1358564.
Looks like BZ1358564 is VERIFIED. Can you see if it is reproducible on RHEL6?
I couldn't reproduce it on RHEL6 iso 201025.

>> sosreport
...
Setting up archive ...
Setting up plugins ...
Running plugins. Please wait ...

  Running 85/85: yum...
Creating compressed archive...

Your sosreport has been generated and saved in:
  /tmp/sosreport-rhua.example.com.123-20161101113737.tar.xz

The checksum is: f971368c9c73a4672ac84595ee905d46

Please send this file to your support representative.
> I couldn't reproduce it on RHEL6 iso 201025.

It's not so much the host, as the environment, that affects this: it's the quantity of data that foreman-debug needs to fetch and format that causes the very long runtimes. A fresh install on a lightly-configured server isn't likely to hit this at all.
How many repositories did you configure in RHUI? You need to configure many repositories and let them fully sync to properly reproduce the problem because the problem is triggered when you have 10,000+ files in the repo directories.
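
One way to check whether a test system is large enough is to count the entries under the repo directory before running sosreport. The sketch below is only an illustration: `count_files` is a hypothetical helper, and it counts every non-directory entry (including symlinks), so the number can be higher than what `find . -type f` reports.

```python
import os


def count_files(root):
    """Count non-directory entries below root, without following
    symlinked directories (os.walk's default)."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total
```

Running `count_files("/var/lib/pulp")` on the RHUA (or `count_files("/var/lib/pulp-cds")` on a CDS) gives a rough idea of how much `ls -lR` has to list.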
Hello, Bryn and Jeff

I did it with 31 repos, which include e.g.:

Beta RHEL RHUI Everything 7 Source Srpms (x86_64)
Beta RHEL RHUI Server 7 Debug (x86_64)
Beta RHEL RHUI Server 7 Optional OS (x86_64)
RHEL RHUI Server 6 Rhscl 1 Source Srpms (6Server-x86_64)
RHEL RHUI Server 7 Optional OS (7Server-x86_64)
Red Hat Enterprise Linux Server 6 (RPMs) (6Server-i386)
Red Hat Enterprise Linux Server 6 (SRPMS) (6Server-i386)

> [root@rhua ~]# cd /var/lib/pulp
> [root@rhua pulp]# find . -type f | wc -l
52549

Please let me know if that's OK or if I should do something else.
I forgot to mention, I tested it on RHUI3 iso, not RHUI2.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days