Bug 1542849 - hosted-engine plugin collects only last log, alphabetically-sorted
Summary: hosted-engine plugin collects only last log, alphabetically-sorted
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: sos
Version: 27
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Yedidyah Bar David
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-02-07 07:31 UTC by Yedidyah Bar David
Modified: 2018-06-05 10:30 UTC (History)
5 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-05-16 12:46:08 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github sosreport sos issues 1243 0 None None None Never
Red Hat Bugzilla 1585739 0 unspecified CLOSED ovirt-hosted-engine-cleanup causing to "The host has been set in non_operational status" during redeployment. 2021-02-22 00:41:40 UTC

Internal Links: 1585739

Description Yedidyah Bar David 2018-02-07 07:31:16 UTC
Description of problem:

In 4.2, a single run of 'hosted-engine --deploy --ansible' writes multiple log files (one for the otopi code and one per ansible playbook), with different prefixes.

Then, running sosreport collects only the last one, which (always?) happens to be "ovirt-hosted-engine-setup-ansible-initial_clean*".

That's not very useful.

Not sure what's the best approach, assuming we do not want to always collect all of them. Perhaps something like:
1. Check the timestamp of the latest file, last_ts
2. Collect all logs with timestamp <= last_ts - 1 day

Version-Release number of selected component (if applicable):

I think all versions of sos, current master branch of ovirt-hosted-engine-setup (and a bit earlier).

How reproducible:

Always

Steps to Reproduce:
1. Install current ovirt-hosted-engine-setup from master snapshot
2. Run hosted-engine --deploy --ansible
3. Run sosreport

Actual results:

Collects only a single log

Expected results:

Collects at least all logs relevant to the last run

Additional info:

Comment 1 Nikolai Sednev 2018-02-07 07:33:37 UTC
It happens also on RHEL7.4.

Comment 2 Yedidyah Bar David 2018-02-07 07:34:36 UTC
(In reply to Nikolai Sednev from comment #1)
> It happens also on RHEL7.4.

Of course, but you know - upstream first...

Comment 3 Nikolai Sednev 2018-02-07 09:31:31 UTC
Sure. Just added it here for verification purposes.

Comment 4 Bryn M. Reeves 2018-02-07 09:48:56 UTC
By default sos only collects the most recent log per set of globs. The globs passed to add_copy_spec() when limiting is enabled (i.e. log files) are supposed to be a single logical set (a current log and all its rotated copies).

The ovirt_hosted_engine plugin currently mashes these all together:

    SETUP_LOG_GLOB = '/var/log/ovirt-hosted-engine-setup/*.log'
    HA_LOG_GLOB = '/var/log/ovirt-hosted-engine-ha/*.log'

So it's a crapshoot as to what you'll actually end up with. There should be one glob per type of log file.

That said, you should not be seeing alpha sorted file names since this change was merged in 3.5 (unless you are running some older version):

commit ae76de192e6c7a42e4515971249298decd602f54
Author: Bryn M. Reeves <bmr>
Date:   Thu Oct 19 16:34:04 2017 +0100

    [Plugin] reverse sort file list when size limiting
    
    Reverse sort the file list by modification time, to ensure that
    the most recent data is included when limits apply.
    
    Fixes: #1082
    
    Signed-off-by: Bryn M. Reeves <bmr>

Anyway: it's a plugin bug. Sos is designed to only collect the most recent logs by default since this keeps tarballs to a manageable size. In situations where you want more you should use the --log-size or --all-logs (and possibly --profile to select a more targeted group of plugins).

Comment 5 Yedidyah Bar David 2018-02-07 09:59:44 UTC
(In reply to Bryn M. Reeves from comment #4)
> By default sos only collects the most recent log per set of globs. The globs
> passed to add_copy_spec() when limiting is enabled (i.e. log files) are
> supposed to be a single logical set (a current log and all its rotated
> copies).
> 
> The ovirt_hosted_engine plugin currently mashes these all together:
> 
>     SETUP_LOG_GLOB = '/var/log/ovirt-hosted-engine-setup/*.log'
>     HA_LOG_GLOB = '/var/log/ovirt-hosted-engine-ha/*.log'
> 
> So it's a crapshoot as to what you'll actually end up with. There should be
> one glob per type of log file.

Understood. But we decided (for simplicity?) to keep each playbook in its own log. And would rather not maintain the list of playbook names both in hosted-engine-setup and in sos, if possible.

Of course we can decide to change the decision and concatenate the ansible logs to the main setup log and have just a single one per run.

Not sure what's the best approach.

> 
> That said, you should not be seeing alpha sorted file names since this
> change was merged in 3.5 (unless you are running some older version):

I think I saw this with 3.4 in RHEL7, but:

> 
> commit ae76de192e6c7a42e4515971249298decd602f54
> Author: Bryn M. Reeves <bmr>
> Date:   Thu Oct 19 16:34:04 2017 +0100
> 
>     [Plugin] reverse sort file list when size limiting
>     
>     Reverse sort the file list by modification time, to ensure that
>     the most recent data is included when limits apply.
>     
>     Fixes: #1082
>     
>     Signed-off-by: Bryn M. Reeves <bmr>

I don't think this is relevant, because the ovirt plugin sorts the logs by itself and passes only one to add_copy_spec.

> 
> Anyway: it's a plugin bug.

Understood. Taken the bug for now.

> Sos is designed to only collect the most recent
> logs by default since this keeps tarballs to a manageable size. In
> situations where you want more you should use the --log-size or --all-logs
> (and possibly --profile to select a more targeted group of plugins).

This makes sense for logs of daemons, that run all the time. Not sure it's such a bad idea to always collect all _setup_ logs, which are (usually?) ran only manually and quite infrequently.

Comment 6 Bryn M. Reeves 2018-02-07 10:22:41 UTC
> would rather not maintain the list of playbook names both in hosted-engine-setup 

Then there are two choices: provide a mechanism in h-e-s to tell sos what the log file names are from the current playbooks, or implement some hacky goop in the plugin to find each unique prefix in the directory and add it separately. That's kinda ugly but if you insist on not putting the names directly in sos and don't want to provide a way for sos to interrogate them it's really the only option.

> I don't think this is relevant, because the ovirt plugin sorts the logs by itself and passes only one to add_copy_spec.

That's a bit of a hack and not really a good idea (what's the reason for doing it? If you need some modified sort order it's usually better to extend the collection methods): doing your own globbing and sorting may end up giving confusing results for users.

> This makes sense for logs of daemons, that run all the time. Not sure it's such 
> a bad idea to always collect all _setup_ logs, which are (usually?) ran only 
> manually and quite infrequently.

Then don't limit them! The add_copy_spec() API allows the caller to choose. If you want all these collected every time (since they are setup logs rather than runtime) then collect them without any limits.

It's fine to do that, although a comment in the code explaining why it does it is helpful. See for e.g. the anaconda plugin which applies no size limits to the logs it collects.

Comment 7 Bryn M. Reeves 2018-02-07 10:24:01 UTC
Dropping all the limits also means that you avoid any need for sos to know the exact file names it should collect: a glob is sufficient as long as you really do want to collect everything it matches.

Comment 8 Yedidyah Bar David 2018-03-06 14:07:05 UTC
https://github.com/sosreport/sos/pull/1244

Comment 9 Sandro Bonazzola 2018-05-16 12:46:08 UTC
Closing with upstream resolution, being https://github.com/sosreport/sos/pull/1244 merged.


Note You need to log in before you can comment on or make changes to this bug.