Bug 1512307

Summary: [downstream clone - 3.6.12] [RFE] option to collect logs from one host per cluster
Product: Red Hat Enterprise Virtualization Manager
Reporter: rhev-integ
Component: ovirt-log-collector
Assignee: Ala Hino <ahino>
Status: CLOSED ERRATA
QA Contact: Jiri Belka <jbelka>
Severity: urgent
Docs Contact:
Priority: high
Version: 2.1.0
CC: ahino, bugs, danken, dnecpal, dougsland, eheftman, gveitmic, mkalinin, trichard, ylavi
Target Milestone: ovirt-3.6.z-async
Keywords: FutureFeature, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rhevm-log-collector-3.6.2-3.el6ev
Doc Type: Enhancement
Doc Text:
Previously, data was collected from all hosts in a cluster, which created an output file that was too large to handle. In this release, the hypervisor-per-cluster option enables you to collect data from a single host (the Storage Pool Manager, if available) per cluster.
Story Points: ---
Clone Of: 1490447
Environment:
Last Closed: 2018-03-01 16:35:00 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1490447
Bug Blocks:

Description rhev-integ 2017-11-12 15:09:32 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1490447 +++
======================================================================

Description of problem:
ovirt-log-collector collects logs from all hosts, which makes its output too big to handle. Specifying a list of hypervisors by name is tedious and error-prone, so people tend to use --no-hypervisors, which defeats the purpose of collecting logs.

This RFE is about adding a middle-ground option, --one-hypervisor-per-cluster. When it is supplied, the script chooses one host per cluster and collects its logs. The current SPM and the current host of a self-hosted engine should be included as well.

(Originally by danken)
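
The selection rule requested above can be sketched roughly as follows. This is only an illustration of the idea, not the log collector's actual code: the Host record, its field names, and the select_hosts function are all hypothetical, and the preference for the SPM as a cluster's representative is an assumption taken from the Doc Text.

```python
from collections import namedtuple

# Hypothetical record type; the real tool's host objects are not shown in this report.
Host = namedtuple("Host", ["name", "cluster", "is_spm", "runs_hosted_engine"])


def select_hosts(hosts):
    """Return one host per cluster, plus any SPM or hosted-engine host."""
    chosen = {}
    for host in hosts:
        current = chosen.get(host.cluster)
        # Prefer an SPM as the cluster's representative; otherwise keep the first host seen.
        if current is None or (host.is_spm and not current.is_spm):
            chosen[host.cluster] = host
    selected = list(chosen.values())
    # Always include SPM and hosted-engine hosts, even if their cluster already has a representative.
    for host in hosts:
        if (host.is_spm or host.runs_hosted_engine) and host not in selected:
            selected.append(host)
    return selected
```

With two clusters of two hosts each, this returns at most one extra host per special role, rather than every host in the setup.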

Comment 1 rhev-integ 2017-11-12 15:09:39 UTC
We would like to backport this to 3.6-els, to ease log collection prior to upgrade.

(Originally by danken)

Comment 4 rhev-integ 2017-11-12 15:09:47 UTC
My 2 cents:
1. If this is turned off by default, customers won't use it and will keep using --no-hypervisors
2. "--one-hypervisor-per-cluster" name is too long, I would prefer something shorter and more user-friendly such as 
3. An easy way to fix 1. would be to prompt on --no-hypervisors, asking whether to include at least one host or whether the user is sure that hypervisors should be omitted

Questions:
Which host do we choose? The SPM? One with PM? The first host? A random host? Does it matter to developers when debugging?

(Originally by Lukas Svaty)

Comment 7 Dan Kenigsberg 2017-12-11 13:15:12 UTC
I moved this to async because

  ovirt-log-collector --hypervisor-per-cluster

eases creation of meaningful log analysis and helps upgrades from 3.6.

Comment 9 Lukas Svaty 2018-01-25 10:17:12 UTC
Log collection is blocked on RHEL 7 hosts (with either RHV 4 or RHEV 3.6 repositories): no data is collected.

log collector output:
[root@~ tmp]# engine-log-collector --hypervisor-per-cluster
This command will collect system configuration and diagnostic
information from this system.
The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before
being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip): 
ERROR: Failure fetching information about hypervisors from API.Error: global name 'ovirtsdk4' is not defined
ERROR: _get_hypervisors_from_api: global name 'ovirtsdk4' is not defined
INFO: Gathering oVirt Engine information...
INFO: Gathering PostgreSQL the oVirt Engine database and log files from localhost...
INFO: Hypervisor data will not be collected, Error while selecting hypervisors
Reason: global name 'orig_hosts' is not defined
Creating compressed archive...
INFO: Log files have been collected and placed in /tmp/sosreport-LogCollector-20180125111602.tar.xz.
The MD5 for this file is 6051b2c31bae18335ff5fa38d473f0dd and its size is 8.2M


tested in rhevm-log-collector-3.6.2-1.el6ev.noarch

Comment 10 Dan Kenigsberg 2018-02-06 10:11:00 UTC
ouch, we should have thought of this. it won't be very quick to fix, so i'd take it out of the erratum.

Comment 11 Dan Kenigsberg 2018-02-06 10:14:24 UTC
actually, it is the *only* bug in the erratum, so we should not ship it as of yet.

Comment 12 Sandro Bonazzola 2018-02-12 14:47:19 UTC
Please note current 3.6 git code has missing imports:

$ pyflakes-2 src/helper/hypervisors.py 
src/helper/hypervisors.py:88: undefined name 'ovirtsdk4'
src/helper/hypervisors.py:89: undefined name 'ovirtsdk4'
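
For illustration, one common way to keep a missing module from surfacing as an undefined name at call time is to guard the import and check the result before use. This is a generic sketch, not the change that was merged for this bug; load_optional_module and describe_sdk are hypothetical names, and only the module name ovirtsdk4 comes from the pyflakes output above.

```python
import importlib


def load_optional_module(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# 'ovirtsdk4' is the module named in the pyflakes output; it may not be installed.
sdk4 = load_optional_module("ovirtsdk4")


def describe_sdk():
    # Checking the module object avoids an undefined-name error at call time.
    if sdk4 is None:
        return "ovirtsdk4 not available"
    return "ovirtsdk4 available"
```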

Comment 13 Sandro Bonazzola 2018-02-26 09:02:23 UTC
Ala, I see https://gerrit.ovirt.org/#/c/87646/ has been merged, anything else needed for moving this bug to modified?

Comment 14 Ala Hino 2018-02-26 09:44:44 UTC
(In reply to Sandro Bonazzola from comment #13)
> Ala, I see https://gerrit.ovirt.org/#/c/87646/ has been merged, anything
> else needed for moving this bug to modified?

Moved to MODIFIED.

Comment 16 Jiri Belka 2018-02-27 17:59:40 UTC
It's a mess. There's rhevm-log-collector-3.6.2-2.el6ev.noarch.rpm in BZ's errata[1]
but BZ's 'fixed in version' field has rhevm-log-collector-3.6.3-1.el6ev.

[1] https://errata.devel.redhat.com/advisory/32228/builds

So?

- rhevm-log-collector-3.6.2-2.el6ev.noarch.rpm issue:

# engine-log-collector --hypervisor-per-cluster                                                                
This command will collect system configuration and diagnostic
information from this system.
The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before
being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip): 
INFO: Gathering oVirt Engine information...
INFO: Gathering PostgreSQL the oVirt Engine database and log files from localhost...
INFO: Hypervisor data will not be collected, Error while selecting hypervisors

                                             ^^^^ ERROR!!

Reason: global name 'orig_hosts' is not defined
                                    ^^^^^^^^^^^^ ????

Creating compressed archive...
INFO: Log files have been collected and placed in /tmp/sosreport-LogCollector-20180227185400.tar.xz.
The MD5 for this file is 58bcb356d64cf093d4c970a979ec7d4e and its size is 8.3M

nothing from hypervisor:

# tar tJf /tmp/sosreport-LogCollector-20180227185400.tar.xz | grep 10\.37\.137\.250
sosreport-LogCollector-20180227185400/sosreport-10-37-137-244.rhev.lab.eng.brq.redhat.com-20180227185322/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20180227163737-10.37.137.250-8817052.log
sosreport-LogCollector-20180227185400/sosreport-10-37-137-244.rhev.lab.eng.brq.redhat.com-20180227185322/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20180227182736-10.37.137.250-32988ddf.log


- rhevm-log-collector-3.6.3-1.el6ev.noarch

# engine-log-collector --hypervisor-per-cluster                                                                
This command will collect system configuration and diagnostic
information from this system.
The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before
being passed to any third party.
No changes will be made to system configuration.
Please provide the REST API password for the admin@internal oVirt Engine user (CTRL+D to skip): 
INFO: Gathering oVirt Engine information...
INFO: Gathering PostgreSQL the oVirt Engine database and log files from localhost...
INFO: Hypervisor data will not be collected, Error while selecting hypervisors
Reason: global name 'orig_hosts' is not defined
Creating compressed archive...
INFO: Log files have been collected and placed in /tmp/sosreport-LogCollector-20180227185736.tar.xz.
The MD5 for this file is 31763f65817acb9255ef7601846bda52 and its size is 8.3M
[root@10-37-137-244 ~]# rpm -qf `which engine-log-collector`
rhevm-log-collector-3.6.3-1.el6ev.noarch

Comment 17 Jiri Belka 2018-02-27 18:02:18 UTC
I used:

rhevm-3.6.12.3-0.1.el6.noarch
rhevm-sdk-python-3.6.9.1-1.el6ev.noarch

& 

vdsm-4.17.44-2.el7ev.noarch
redhat-release-server-7.3-7.el7.x86_64

Comment 19 Jiri Belka 2018-02-28 08:51:19 UTC
ok

rhevm-3.6.12.3-0.1.el6.noarch
rhevm-log-collector-3.6.2-3.el6ev.noarch

tested default execution, --hypervisor-per-cluster, -H <host>, --no-hypervisors

host data present, db from 3.6 env present

# tar xOJf /tmp/sosreport-LogCollector-20180228093707.tar.xz sosreport-LogCollector-20180228093707/log-collector-data/10.37.137.250/10.37.137.250-sosreport-10-37-137-250.rhev.lab.eng.brq.redhat.com-20180228093620.tar.xz | tar -tJf - | wc -l
4128
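
The same check (counting the entries of a host's sosreport nested inside the collector archive) can be sketched in Python with the standard tarfile module. The function name count_nested_members is hypothetical, and the sketch assumes both archives are xz-compressed tarballs, as in the command above.

```python
import io
import tarfile


def count_nested_members(outer_path, inner_name):
    """Count entries in a .tar.xz archive stored inside another .tar.xz archive."""
    with tarfile.open(outer_path, mode="r:xz") as outer:
        inner_file = outer.extractfile(inner_name)
        if inner_file is None:
            raise ValueError("%s is not a regular file in the archive" % inner_name)
        # Read the inner archive into memory so tarfile can seek within it.
        inner_bytes = io.BytesIO(inner_file.read())
    with tarfile.open(fileobj=inner_bytes, mode="r:xz") as inner:
        # getnames() lists files and directories, like `tar -t | wc -l`.
        return len(inner.getnames())
```

A count of zero, or a KeyError because the inner archive is absent, would correspond to the "nothing from hypervisor" result seen with the earlier builds.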

Comment 22 errata-xmlrpc 2018-03-01 16:35:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0383

Comment 23 Franta Kust 2019-05-16 12:54:53 UTC
BZ<2>Jira re-sync