+++ This bug is a downstream clone. The original bug is: +++
+++ bug 1490447 +++
======================================================================

Description of problem:
ovirt-log-collector collects logs from all hosts, which makes its output too big to handle. Specifying a list of hypervisors by name is tedious and error-prone, so people tend to use --no-hypervisor instead, which defeats the purpose of collecting host logs.

This RFE is about adding a middle-ground option, --one-hypervisor-per-cluster. When it is supplied, the script chooses one host per cluster and collects its logs. The current SPM and the host currently running the self-hosted engine should be included as well.

(Originally by danken)
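The selection rule requested above could be sketched roughly as follows. This is only an illustration of the idea, not the log collector's code: the function name `select_hosts` and the dictionary keys `name`, `cluster`, `is_spm` and `runs_hosted_engine` are hypothetical, and picking the first host seen in each uncovered cluster is an assumption, since the request does not say which host to prefer.

```python
from collections import OrderedDict


def select_hosts(hosts):
    """Return one host per cluster, plus every SPM / hosted-engine host.

    `hosts` is a list of dicts with the hypothetical keys
    'name', 'cluster', 'is_spm' and 'runs_hosted_engine'.
    """
    selected = OrderedDict()  # host name -> host; dedups and keeps order
    covered_clusters = set()

    # First pass: always keep the SPM and the host running the hosted engine.
    for host in hosts:
        if host.get("is_spm") or host.get("runs_hosted_engine"):
            selected[host["name"]] = host
            covered_clusters.add(host["cluster"])

    # Second pass: for clusters not covered yet, take the first host seen.
    for host in hosts:
        if host["cluster"] not in covered_clusters:
            selected[host["name"]] = host
            covered_clusters.add(host["cluster"])

    return list(selected.values())
```

With this rule a cluster that already contributes the SPM or the hosted-engine host gets no extra host, which keeps the archive as small as the request intends.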
We would like to backport this to 3.6-els, to ease log collection prior to upgrade. (Originally by danken)
My 2 cents:
1. If this is turned off by default, customers won't use it and will keep using --no-hypervisor.
2. The name "--one-hypervisor-per-cluster" is too long; I would prefer something shorter and more user-friendly such as
3. An easy way to fix 1. would be to ask, when --no-hypervisor is given, whether to include at least one host or whether the user is sure that hypervisors should be omitted.

Question: Which host do we choose? SPM? With PM? First host? Random host? Does it matter for DEV on the debugging side?

(Originally by Lukas Svaty)
I moved this to async because ovirt-log-collector --hypervisor-per-cluster eases creation of meaningful log analysis and helps upgrades from 3.6.
log collection is blocked on rhel7 hosts with (rhv4 or rhev3.6 repositories) - no data collected

log collector output:

[root@~ tmp]# engine-log-collector --hypervisor-per-cluster
This command will collect system configuration and diagnostic information from this system.
The generated archive may contain data considered sensitive and its content should be reviewed by the originating organization before being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip):
ERROR: Failure fetching information about hypervisors from API.Error: global name 'ovirtsdk4' is not defined
ERROR: _get_hypervisors_from_api: global name 'ovirtsdk4' is not defined
INFO: Gathering oVirt Engine information...
INFO: Gathering PostgreSQL the oVirt Engine database and log files from localhost...
INFO: Hypervisor data will not be collected, Error while selecting hypervisors
Reason: global name 'orig_hosts' is not defined
Creating compressed archive...
INFO: Log files have been collected and placed in /tmp/sosreport-LogCollector-20180125111602.tar.xz.
The MD5 for this file is 6051b2c31bae18335ff5fa38d473f0dd and its size is 8.2M

tested in rhevm-log-collector-3.6.2-1.el6ev.noarch
ouch, we should have thought of this. it won't be very quick to fix, so i'd take it out of the erratum.
actually, it is the *only* bug in the erratum, so we should not ship it as of yet.
Please note current 3.6 git code has missing imports:

$ pyflakes-2 src/helper/hypervisors.py
src/helper/hypervisors.py:88: undefined name 'ovirtsdk4'
src/helper/hypervisors.py:89: undefined name 'ovirtsdk4'
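For illustration only, one defensive pattern against this kind of failure is to probe for the SDK module explicitly and bind the result to a name, so a missing module is handled up front instead of surfacing later as an undefined global. This is a sketch under assumptions and is not the change that was merged for this bug: the helper `load_sdk` and its `candidates` parameter are hypothetical, and the fallback order (v4 SDK module `ovirtsdk4`, then v3 SDK module `ovirtsdk`) is an assumption.

```python
import importlib


def load_sdk(candidates=("ovirtsdk4", "ovirtsdk")):
    """Return (module_name, module) for the first importable candidate.

    Returns (None, None) when none of the candidates can be imported,
    so the caller can decide to skip hypervisor collection cleanly.
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            # This candidate is not installed; try the next one.
            continue
    return None, None
```

A caller would check for `(None, None)` before touching any SDK attribute, and running pyflakes on the module as shown above remains the cheap way to catch a name that was never imported.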
Ala, I see https://gerrit.ovirt.org/#/c/87646/ has been merged, anything else needed for moving this bug to modified?
(In reply to Sandro Bonazzola from comment #13)
> Ala, I see https://gerrit.ovirt.org/#/c/87646/ has been merged, anything
> else needed for moving this bug to modified?

Moved to MODIFIED.
It's a mess. There's rhevm-log-collector-3.6.2-2.el6ev.noarch.rpm in BZ's errata[1] but BZ's 'fixed in version' field has rhevm-log-collector-3.6.3-1.el6ev.

[1] https://errata.devel.redhat.com/advisory/32228/builds

So?

- rhevm-log-collector-3.6.2-2.el6ev.noarch.rpm

issue:

# engine-log-collector --hypervisor-per-cluster
This command will collect system configuration and diagnostic information from this system.
The generated archive may contain data considered sensitive and its content should be reviewed by the originating organization before being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip):
INFO: Gathering oVirt Engine information...
INFO: Gathering PostgreSQL the oVirt Engine database and log files from localhost...
INFO: Hypervisor data will not be collected, Error while selecting hypervisors
                                             ^^^^ ERROR!!
Reason: global name 'orig_hosts' is not defined
                    ^^^^^^^^^^^^ ????
Creating compressed archive...
INFO: Log files have been collected and placed in /tmp/sosreport-LogCollector-20180227185400.tar.xz.
The MD5 for this file is 58bcb356d64cf093d4c970a979ec7d4e and its size is 8.3M

nothing from hypervisor:

# tar tJf /tmp/sosreport-LogCollector-20180227185400.tar.xz | grep 10\.37\.137\.250
sosreport-LogCollector-20180227185400/sosreport-10-37-137-244.rhev.lab.eng.brq.redhat.com-20180227185322/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20180227163737-10.37.137.250-8817052.log
sosreport-LogCollector-20180227185400/sosreport-10-37-137-244.rhev.lab.eng.brq.redhat.com-20180227185322/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20180227182736-10.37.137.250-32988ddf.log

- rhevm-log-collector-3.6.3-1.el6ev.noarch

# engine-log-collector --hypervisor-per-cluster
This command will collect system configuration and diagnostic information from this system.
The generated archive may contain data considered sensitive and its content should be reviewed by the originating organization before being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip):
INFO: Gathering oVirt Engine information...
INFO: Gathering PostgreSQL the oVirt Engine database and log files from localhost...
INFO: Hypervisor data will not be collected, Error while selecting hypervisors
Reason: global name 'orig_hosts' is not defined
Creating compressed archive...
INFO: Log files have been collected and placed in /tmp/sosreport-LogCollector-20180227185736.tar.xz.
The MD5 for this file is 31763f65817acb9255ef7601846bda52 and its size is 8.3M

[root@10-37-137-244 ~]# rpm -qf `which engine-log-collector`
rhevm-log-collector-3.6.3-1.el6ev.noarch
I used:

rhevm-3.6.12.3-0.1.el6.noarch
rhevm-sdk-python-3.6.9.1-1.el6ev.noarch

&

vdsm-4.17.44-2.el7ev.noarch
redhat-release-server-7.3-7.el7.x86_64
ok

rhevm-3.6.12.3-0.1.el6.noarch
rhevm-log-collector-3.6.2-3.el6ev.noarch

tested default execution, --hypervisor-per-cluster, -H <host>, --no-hypervisors

host data present, db from 3.6 env present

# tar xOJf /tmp/sosreport-LogCollector-20180228093707.tar.xz sosreport-LogCollector-20180228093707/log-collector-data/10.37.137.250/10.37.137.250-sosreport-10-37-137-250.rhev.lab.eng.brq.redhat.com-20180228093620.tar.xz | tar -tJf - | wc -l
4128
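The nested-archive check above (extract the per-host sosreport from the outer archive, then count its entries) can also be scripted. The following is a minimal sketch using Python's standard `tarfile` module; the function name `count_nested_members` is hypothetical, and the inner archive is read fully into memory, which is assumed to be acceptable for archives of the size seen here.

```python
import io
import tarfile


def count_nested_members(outer_path, inner_name):
    """Count entries of an xz-compressed tar stored inside another one.

    Rough equivalent of:
        tar xOJf OUTER INNER | tar -tJf - | wc -l
    """
    with tarfile.open(outer_path, "r:xz") as outer:
        inner_file = outer.extractfile(inner_name)
        if inner_file is None:
            raise ValueError("%s is not a regular file" % inner_name)
        # Buffer the inner archive so it can be opened as its own tar.
        inner_bytes = io.BytesIO(inner_file.read())
    with tarfile.open(fileobj=inner_bytes, mode="r:xz") as inner:
        return sum(1 for _ in inner)
```

A result greater than zero for a host's inner archive indicates that hypervisor data was actually collected, which is what the count of 4128 shows for this run.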
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:0383