Bug 1304394 - [RFE] add options to sosreport to limit time range of collected logs
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sos
Version: 7.4
Hardware: Unspecified Unspecified
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Pavel Moravec
QA Contact: BaseOS QE - Apps
URL: https://github.com/sosreport/sos/issu...
Keywords: FutureFeature, Reopened
Depends On:
Blocks: 1464262 1477664
Reported: 2016-02-03 08:32 EST by Yaniv Lavi
Modified: 2018-04-02 09:51 EDT
CC List: 19 users

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: 1020790
Environment:
Last Closed: 2016-02-21 07:00:17 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 458323 | None | None | None | 2016-02-03 08:32 EST

Description Yaniv Lavi 2016-02-03 08:32:17 EST
+++ This bug was initially created as a clone of Bug #1020790 +++

Description of problem:
Sometimes it makes sense to limit the time range for logs so that large amounts of irrelevant info are omitted from the sosreport archive.
Comment 2 Bryn M. Reeves 2016-02-03 09:09:55 EST
This is not a simple problem since end-users may not use the default syslog formatting options.

For now we offer the ability to limit text logs by size - this will collect the most recent log entries up to the specified limit.
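
The size-limited behaviour described above can be sketched roughly as follows. This is an illustration only and not the sos implementation: `tail_bytes` and its parameters are hypothetical names, and the sketch assumes a plain-text, newline-delimited log file.

```python
import os


def tail_bytes(path, limit):
    """Return at most `limit` trailing bytes of `path`, starting on a line boundary.

    Hypothetical helper for illustration; not part of sos.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if size > limit:
            # Step back one extra byte so that a line starting exactly at
            # the cut-off is kept whole; readline() then discards either
            # that single newline or the remainder of a partial line.
            f.seek(size - limit - 1)
            f.readline()
        return f.read()
```

Because the cut is made by size and not by time, a burst of messages can push older entries out of the collected window.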

With journald this is much simpler and more reliable since timestamps are stored in an unambiguous form.
Comment 3 Yaniv Lavi 2016-02-03 09:30:52 EST
Bryn, what will happen if the log was rotated and you set a log size? Will it do it in a smart way, taking from the most recent file to the rotated ones, by size?
Sandro, what do you think of this suggestion?
Comment 4 Sandro Bonazzola 2016-02-05 02:09:22 EST
(In reply to Yaniv Dary from comment #3)
> Sandro, what do you think of this suggestion?

If --log-size takes most recent entries, that may be used.
However, note that limiting the size doesn't guarantee that you collect, for example, the last 24 hours of logs: a flood in the last hour may fill the logs, making earlier logs unavailable in the report.

The time-based filtering feature was opened upstream by Bryn 2 years ago here: https://github.com/sosreport/sos/issues/284 so he's aware of this.

If it's acceptable to have a limit on the size instead of on the time (requiring a new sosreport execution if needed logs are not included), using the log-size option is fine for me (adding a big warning about it).
Comment 7 Bryn M. Reeves 2017-04-27 09:55:46 EDT
We can do this easily (with a little bit of work upstream) for journald logs: the existing Plugin.add_journal() interface supports the underlying --since and --until switches of journalctl:

    def add_journal(self, units=None, boot=None, since=None, until=None,
                    lines=None, allfields=False, output=None, timeout=None):
        """ Collect journald logs from one or more units.

        Keyword arguments:
        units     -- A string, or list of strings specifying the systemd
                     units for which journal entries will be collected.
        boot      -- A string selecting a boot index using the journalctl
                     syntax. The special values 'this' and 'last' are also
                     accepted.
        since     -- A string representation of the start time for journal
                     messages.
        until     -- A string representation of the end time for journal
                     messages.
        lines     -- The maximum number of lines to be collected.
        allfields -- Include all journal fields regardless of size or
                     non-printable characters.
        output    -- A journalctl output control string, for example
                     "verbose".
        timeout   -- An optional timeout in seconds.
        """
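
As a rough illustration of how the `since` and `until` arguments could map onto a journalctl command line, here is a minimal sketch. The helper name `journalctl_args` is hypothetical and this is not the code sos uses; only the journalctl switches themselves (`--unit`, `--since`, `--until`, `--lines`, `--no-pager`) are real.

```python
def journalctl_args(units=None, since=None, until=None, lines=None):
    """Build a journalctl argument list (hypothetical helper, not sos code)."""
    args = ["journalctl", "--no-pager"]
    # Accept either a single unit name or a list of names.
    if isinstance(units, str):
        units = [units]
    for unit in units or []:
        args += ["--unit", unit]
    if since:
        args += ["--since", since]
    if until:
        args += ["--until", until]
    if lines:
        args += ["--lines", str(lines)]
    return args


# Example: entries for one unit within a fixed window.
print(journalctl_args("sshd", since="2017-04-24 00:00:00",
                      until="2017-04-27 00:00:00"))
```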

Right now there are no users of this part of the interface for RHEL (on Fedora, where /var/log/messages no longer exists, it is used to grab the last three days of logs), but the number of plugins using journal logs is growing quite rapidly:

$ grep add_journal sos/plugins/[a-zA-Z]*.py | wc -l
20

I've hesitated to wire these up to new global options (e.g. --logsince, --loguntil) to avoid creating an expectation that this will work with syslog too - implementing that is considerably more work, and is more fragile due to the syslog formatting problems mentioned in comment #2.

If it's felt worth hooking up just for journal logs (e.g. --journalsince...), that's a relatively small piece of work - the main thing is fixing up the 20 plugins that already use the interface so that they honour the options.

It's possible to do this for syslog also, with some limitations - but that is a larger piece of work - if we're thinking of this for the next update we should start planning now when the work will get done.
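
To illustrate the kind of limitation involved, here is a minimal sketch of time filtering for text logs. It assumes the traditional default syslog timestamp (for example "Feb  3 08:32:17"), which carries no year, so the caller has to supply one; any other format is simply not recognised. The name `in_range` and its parameters are hypothetical, and month-name parsing assumes an English or C locale.

```python
from datetime import datetime


def in_range(line, since, until, year):
    """Return True if a default-format syslog line falls within [since, until].

    Hypothetical helper for illustration; not part of sos.
    """
    stamp = line[:15]  # e.g. "Feb  3 08:32:17"
    try:
        # The traditional format omits the year, so it is passed in.
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    except ValueError:
        # Custom or ISO-style timestamps are not handled by this sketch.
        return False
    return since <= when <= until
```

A log that spans a year boundary, or one written with a custom rsyslog template, would defeat this approach, which is the fragility described above.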
Comment 8 Pavel Moravec 2017-08-30 06:05:37 EDT
(In reply to Yaniv Lavi (Dary) from comment #3)
> Bryn, what will happen is the log was rotated and you set a log size? Will
> it do it in a smart way taking from most recent to rotated by size?
> Sandro, what do you think of this suggestion?

Sadly, this does not work (for now). Calling e.g. "sosreport -o logs" collects /var/log/messages* with a size limit but sorts the files alphabetically, so files are added (until the limit is reached) in this order:

/var/log/messages
/var/log/messages-20160101
/var/log/messages-20170101
/var/log/messages-20170828

Anyway, changing this should be simple.

We have the log limit / filter on the sos roadmap but don't have the capacity to implement it - at least within the 7.5 scope. The above could be a feasible workaround - would you appreciate it?
Comment 9 Bryn M. Reeves 2017-08-30 06:45:13 EDT
(In reply to Pavel Moravec from comment #8)
> /var/log/messages
> /var/log/messages-20160101
> /var/log/messages-20170101
> /var/log/messages-20170828
> 
> Anyway changing this shall be simple.

This is a bug but sadly it's not that simple; the current sorting convention the sos file collector uses is alphanumeric. This gives correct results for the "old" rotation naming convention of appending ".N" (older files have higher numbers), but it fails for the "new" convention of appending the rotation date:

$ echo -e 'messages\nmessages.2\nmessages.1' | LC_ALL=C sort
messages
messages.1
messages.2

$ echo -e 'messages-20161102\nmessages-20170801' | LC_ALL=C sort
messages-20161102
messages-20170801

There are two obvious solutions: checking the file name pattern and adapting, and performing a stat(2) check to sort files by mtime. Both of these add some complexity (although I think the stat approach is more complex).
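
The two approaches could be sketched as below. This is a minimal illustration under stated assumptions, not the sos file collector: the function names are hypothetical, the date suffix is assumed to be eight digits (YYYYMMDD), and an optional compression extension such as ".gz" is tolerated.

```python
import os
import re

# Assumed rotation naming patterns (illustrative only).
_DATE_SUFFIX = re.compile(r"-(\d{8})(?:\.\w+)?$")
_NUM_SUFFIX = re.compile(r"\.(\d+)(?:\.\w+)?$")


def name_sort_key(path):
    """Sort key: the live log first, then rotated files newest first."""
    m = _DATE_SUFFIX.search(path)
    if m:
        # A later date means a newer file, so negate it.
        return (1, -int(m.group(1)))
    m = _NUM_SUFFIX.search(path)
    if m:
        # A higher number means an older file.
        return (1, int(m.group(1)))
    return (0, 0)


def newest_first_by_mtime(paths):
    """Alternative: order files by modification time using stat(2)."""
    return sorted(paths, key=lambda p: os.stat(p).st_mtime, reverse=True)


names = ["/var/log/messages-20160101", "/var/log/messages",
         "/var/log/messages-20170828", "/var/log/messages-20170101"]
print(sorted(names, key=name_sort_key))
```

The name-based key needs no extra system calls but only covers the conventions it knows about, while the mtime variant is independent of naming but relies on timestamps being preserved.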

We can try to get this addressed for 3.5 (even if it's a temporary change to improve the behaviour for the two current common conventions - but I would rather have a "proper" fix).
Comment 10 Pavel Moravec 2017-09-01 15:00:51 EDT
My proposal:

- in RHEL 7.5, fix just https://bugzilla.redhat.com/show_bug.cgi?id=1486952 so that the newest logrotated files are always collected properly

- keep this BZ open for the original, reasonable RFE request

Updating flags accordingly.
Comment 13 Pavel Moravec 2018-04-02 09:51:22 EDT
No space in the limited scope of 7.6.
