Bug 1560520

Summary: [RFE] - Improve log collection time in log collector tool
Product: [oVirt] ovirt-log-collector
Component: General
Status: CLOSED NOTABUG
Severity: high
Priority: unspecified
Version: 4.2.4
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Reporter: Ilan Zuckerman <izuckerm>
Assignee: Douglas Schilling Landgraf <dougsland>
QA Contact: Daniel Gur <dagur>
Docs Contact:
CC: bugs, izuckerm, mlehrer, ratamir, sbonazzo, ylavi
Keywords: FutureFeature, Performance
Flags: rule-engine: planning_ack?, rule-engine: devel_ack?, rule-engine: testing_ack?
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-06-12 11:20:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ilan Zuckerman 2018-03-26 11:56:10 UTC
Description of problem:
Running ovirt-log-collector against a list of 20 hosts takes about 1 hour to complete, including archive creation.

Version-Release number of selected component (if applicable):
ovirt-log-collector-4.2.4

How reproducible:
100%

Steps to Reproduce:
1. Have an environment where multiple hosts are managed by one oVirt engine (ovirt-log-collector-4.2.4).
2. Run log collection from the engine machine using the following pattern:
ovirt-log-collector --user=admin@internal --hosts=XXX,XXX,XXX -v
3. Measure the time it takes to complete.
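One way to make step 3 repeatable is to wrap the command in a small timing script. The following is a minimal sketch, not part of the tool: the helper names build_collector_argv and timed_run are hypothetical, and the flags are only those shown in the pattern in step 2. It assumes the collector can run unattended; if it prompts for a password, the measured time includes the wait at the prompt.

```python
import subprocess
import sys
import time


def build_collector_argv(hosts, user="admin@internal"):
    """Build the argument list for the command pattern in step 2 (hypothetical helper)."""
    # Hosts are joined with commas and no spaces so that the whole list
    # reaches the tool as a single --hosts value.
    return [
        "ovirt-log-collector",
        f"--user={user}",
        f"--hosts={','.join(hosts)}",
        "-v",
    ]


def timed_run(argv):
    """Run argv and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()  # monotonic clock: unaffected by wall-clock changes
    completed = subprocess.run(argv, check=False)
    return completed.returncode, time.monotonic() - start


if __name__ == "__main__":
    # Harmless stand-in command so the sketch can be tried on any machine.
    code, seconds = timed_run([sys.executable, "-c", "pass"])
    print(f"exit code {code}, {seconds:.2f} s")
```

On the engine machine, the stand-in command would be replaced by timed_run(build_collector_argv([...])) with the real host names.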

Actual results:
~60 minutes to complete

Expected results:
Collection should not take this long.

Additional info:
Logs and nmon reports are attached in Google Drive (see private comment). PLEASE DOWNLOAD THEM ASAP as they might get deleted.
Log times are in UTC.
Log collector run time:
2018-03-22 07:56:34 - 2018-03-22 08:52:13

Here are my thoughts:

The nmon report for engine CPU (all_CPUs.png) shows an activity spike from 08:36 till 08:52.
When looking at the log (ovirt-log-collector-20180322075629.log), I could visibly divide it into the following parts:

* 07:58 - 08:36 collecting logs from all the specified hosts. Collection took about 14 minutes per host.
For example, the monitored host b01-h14 took 14 minutes (start: 08:12, end: 08:26). You can see this in the nmon report for host CPU (All_CPUs.png).
The last host collection (b02-h09) ended at 08:36.

* 08:36 - 08:52 creating archive

A few conclusions from this analysis:
- Collecting logs from one host takes about 14 minutes.
- Most of the engine CPU load occurs during archive creation, which takes about 16 minutes (08:36 - 08:52).
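The phase durations can be checked directly from the timestamps quoted above. This is a small sketch of that arithmetic; the phase boundaries are the minute-level times read from the log, so their seconds are assumed to be zero, and minutes_between is a hypothetical helper name.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def minutes_between(start, end):
    """Return the elapsed time between two timestamps, in minutes."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


# Whole run, using the exact start and end times from the report.
total = minutes_between("2018-03-22 07:56:34", "2018-03-22 08:52:13")
# Phases, using the minute-level boundaries read from the log.
collection = minutes_between("2018-03-22 07:58:00", "2018-03-22 08:36:00")
archive = minutes_between("2018-03-22 08:36:00", "2018-03-22 08:52:00")
# Single monitored host b01-h14.
one_host = minutes_between("2018-03-22 08:12:00", "2018-03-22 08:26:00")

print(f"total={total:.2f} collection={collection:.0f} "
      f"archive={archive:.0f} one_host={one_host:.0f}")
```

With these inputs the whole run is 55 minutes 39 seconds, of which 38 minutes fall in the host collection phase and 16 minutes in archive creation. Since 20 hosts at about 14 minutes each would far exceed 38 minutes if run one after another, the per-host collections evidently overlap.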

Comment 2 Sandro Bonazzola 2018-03-28 07:10:18 UTC
What's the request here? Collecting from 20 hosts and creating an archive that includes all the data takes time; there's not much we can do about it.
Please note that you can filter the host list on the command line to reduce the number of hosts included in the report. That's the only reasonable option for reducing the time needed to collect logs.

Comment 5 Daniel Gur 2018-04-25 09:39:03 UTC
A user who has a large RHV setup needs the ability to collect logs from it easily and efficiently. When we presented the current time to developers (Yaniv Kaul) and PMs, the feedback was that it is not acceptable, and we were requested to open a bug on it.

If you would like to question the purpose of this use-case please check it with them.

Comment 6 Yaniv Lavi 2018-06-12 11:20:05 UTC
(In reply to Daniel Gur from comment #5)
> A user who has a large RHV setup needs the ability to collect logs
> from it easily and efficiently. When we presented the current time to
> developers (Yaniv Kaul) and PMs, the feedback was that it is not
> acceptable, and we were requested to open a bug on it.
> 
> If you would like to question the purpose of this use-case please check it
> with them.

We are not planning for users to collect from 200 hosts at the same time.
We have two options:
- Aggregate logs with the metrics store.
- Collect from one host from each cluster.
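
The second option could be sketched as follows. This is only an illustration under stated assumptions: the host-to-cluster mapping, the cluster names, and the host b01-h15 are hypothetical, and one_host_per_cluster is not part of ovirt-log-collector. The result is formatted as a comma-separated value suitable for the --hosts option used in the description.

```python
def one_host_per_cluster(host_to_cluster):
    """Pick one representative host per cluster (hypothetical helper)."""
    chosen = {}
    # Iterate in sorted order so the choice is deterministic:
    # the alphabetically first host of each cluster wins.
    for host in sorted(host_to_cluster):
        chosen.setdefault(host_to_cluster[host], host)
    return sorted(chosen.values())


# Hypothetical inventory; the cluster assignment is an assumption.
inventory = {
    "b01-h14": "cluster-a",
    "b01-h15": "cluster-a",
    "b02-h09": "cluster-b",
}

hosts_value = ",".join(one_host_per_cluster(inventory))
print(f"--hosts={hosts_value}")
```

Picking the first host alphabetically is an arbitrary rule chosen for reproducibility; any other selection rule would serve as long as each cluster contributes exactly one host.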