Red Hat Bugzilla – Bug 1304394
[RFE] add options to sosreport to limit time range of collected logs
Last modified: 2018-04-02 09:51:22 EDT
+++ This bug was initially created as a clone of Bug #1020790 +++
Description of problem:
Sometimes it makes sense to limit the time range for logs so that large amounts of irrelevant information are omitted from the sosreport archive.
This is not a simple problem since end-users may not use the default syslog formatting options.
For now we offer the ability to limit text logs by size - this will collect the most recent log entries up to the specified limit.
With journald this is much simpler and more reliable since timestamps are stored in an unambiguous form.
Bryn, what will happen if the log was rotated and you set a log size? Will it do it in a smart way taking from most recent to rotated by size?
Sandro, what do you think of this suggestion?
(In reply to Yaniv Dary from comment #3)
> Sandro, what do you think of this suggestion?
If --log-size takes most recent entries, that may be used.
However, note that limiting the size doesn't guarantee that you collect, for example, the last 24 hours of logs: a flood in the last hour may fill the size limit, leaving earlier logs out of the report.
The time based filtering feature has been opened upstream here: https://github.com/sosreport/sos/issues/284 by Bryn 2 years ago so he's aware of this.
If it's acceptable to have a limit on the size instead of on the time (requiring a new sosreport execution if needed logs are not included), using the log-size option is fine for me (adding a big warning about it).
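To illustrate the concern about floods, here is a minimal sketch of a size-limited tail over timestamped entries. The helper tail_by_size() and the sample data are hypothetical and are not sos code; they only show how little time a fixed size limit can cover when the last hour is noisy.

```python
from datetime import datetime, timedelta


def tail_by_size(entries, limit_bytes):
    """Keep the most recent entries whose combined size fits in limit_bytes.

    entries is a list of (timestamp, line) tuples, oldest first.
    Hypothetical helper for illustration only.
    """
    kept, used = [], 0
    for stamp, line in reversed(entries):
        size = len(line.encode("utf-8")) + 1  # +1 for the newline
        if used + size > limit_bytes:
            break
        kept.append((stamp, line))
        used += size
    kept.reverse()  # restore oldest-first order
    return kept


now = datetime(2017, 8, 1, 12, 0, 0)
# One routine entry per hour for the previous day (oldest first) ...
quiet = [(now - timedelta(hours=h), "routine message")
         for h in range(24, 1, -1)]
# ... followed by one entry per second during the last hour.
flood = [(now - timedelta(seconds=s), "error: something is flapping")
         for s in range(3600, 0, -1)]
entries = quiet + flood

kept = tail_by_size(entries, limit_bytes=10_000)
covered = now - kept[0][0]
print(f"kept {len(kept)} entries covering the last {covered}")
```

With this made-up data the size limit is used up by the flood alone, so none of the earlier hourly entries are kept.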
We can do this easily (with a little bit of work upstream) for journald logs: the existing Plugin.add_journal() interface supports the underlying --since and --until switches of journalctl:
def add_journal(self, units=None, boot=None, since=None, until=None,
                lines=None, allfields=False, output=None, timeout=None):
    """ Collect journald logs from one or more units.
    units     -- A string, or list of strings specifying the systemd
                 units for which journal entries will be collected.
    boot      -- A string selecting a boot index using the journalctl
                 syntax. The special values 'this' and 'last' are also
    since     -- A string representation of the start time for journal
    until     -- A string representation of the end time for journal
    lines     -- The maximum number of lines to be collected.
    allfields -- Include all journal fields regardless of size or
    output    -- A journalctl output control string, for example
    timeout   -- An optional timeout in seconds.
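As a rough illustration of how the since and until arguments relate to the journalctl switches, the sketch below builds a journalctl argument list. The function journalctl_command() is a hypothetical helper written for this example, not the sos implementation; only the journalctl switches themselves (--unit, --since, --until, --lines, --no-pager) are real.

```python
def journalctl_command(units=None, since=None, until=None, lines=None):
    """Build a journalctl argument list (hypothetical helper, not sos code)."""
    cmd = ["journalctl", "--no-pager"]
    if isinstance(units, str):
        units = [units]  # accept a single unit name as well as a list
    for unit in units or []:
        cmd += ["--unit", unit]
    if since:
        cmd += ["--since", since]
    if until:
        cmd += ["--until", until]
    if lines:
        cmd += ["--lines", str(lines)]
    return cmd


# Example: the last three days of entries for one unit.
cmd = journalctl_command(units="sshd.service", since="-3days", until="now")
print(" ".join(cmd))
```

Because journald stores timestamps in an unambiguous form, the time strings can be passed straight through and no log parsing is needed on the collecting side.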
Right now there are no users of this part of the interface for RHEL (on Fedora, where /var/log/messages no longer exists it is used to grab the last three days of logs), but plugins using journal logs are growing quite rapidly:
$ grep add_journal sos/plugins/[a-zA-Z]*.py | wc -l
I've hesitated to wire these up to new global options (e.g. --logsince, --loguntil) to avoid creating an expectation that this will work with syslog too - implementing that is considerably more work, and is more fragile due to the syslog formatting problems mentioned in comment #2.
If it's felt worth hooking this up just for journal logs (e.g. --journalsince...), that's a relatively small piece of work - the main thing is fixing up the 20 plugins that already use the interface so that they honour the options.
It's possible to do this for syslog also, with some limitations, but that is a larger piece of work. If we're thinking of this for the next update, we should start planning now when the work will get done.
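To give a feel for the syslog limitations, here is a minimal sketch of time filtering for the traditional "Mmm dd HH:MM:SS" prefix. The function syslog_lines_since() is hypothetical and not part of sos. It assumes the default timestamp format, an English locale for month names, and a year supplied by the caller, since the traditional format does not record one.

```python
from datetime import datetime


def syslog_lines_since(lines, since, year):
    """Yield lines whose traditional syslog timestamp is at or after `since`.

    Hypothetical helper. The default prefix carries no year, so the caller
    must supply one (an assumption that breaks across New Year). Lines
    whose prefix does not parse are kept rather than silently dropped.
    """
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            # Custom syslog format: we cannot tell how old the entry is.
            yield line
            continue
        if stamp >= since:
            yield line


sample = [
    "Nov  1 23:59:58 host sshd[812]: old entry",
    "Nov  2 10:15:01 host CROND[901]: recent entry",
    "2017-11-02T10:16:00+00:00 host app: custom format entry",
]
recent = list(syslog_lines_since(sample, datetime(2017, 11, 2), year=2017))
for line in recent:
    print(line)
```

The third sample line shows the problem from comment #2: once a site changes the timestamp format, a filter like this can no longer decide whether the entry is inside the requested range.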
(In reply to Yaniv Lavi (Dary) from comment #3)
> Bryn, what will happen if the log was rotated and you set a log size? Will
> it do it in a smart way taking from most recent to rotated by size?
> Sandro, what do you think of this suggestion?
Sadly, this does not work (for now). Calling e.g. "sosreport -o logs" collects /var/log/messages* with a size limit, but sorts the files alphabetically, so files are added (until the limit is reached) in this order:
Anyway, changing this should be simple.
We have the log limit / filter on the sos roadmap but don't have the capacity to implement it - at least not within the 7.5 scope. The above can be a feasible workaround - would you appreciate it?
(In reply to Pavel Moravec from comment #8)
> Anyway, changing this should be simple.
This is a bug but sadly it's not that simple; the current sorting convention the sos file collector uses is alphanumeric. This gives correct results for the "old" rotation naming convention of appending ".N" (older files have higher numbers), but it fails for the "new" convention of appending the rotation date:
$ echo -e 'messages\nmessages.2\nmessages.1' | LC_ALL=C sort
$ echo -e 'messages-20161102\nmessages-20170801' | LC_ALL=C sort
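The same comparison can be reproduced in Python, whose default string sort orders by code point in the same way as sort in the C locale. The file names are the ones used above, with the live "messages" file added to the second list for illustration.

```python
old_style = ["messages", "messages.2", "messages.1"]
new_style = ["messages", "messages-20170801", "messages-20161102"]

# ".N" suffixes: ascending order happens to be newest first.
print(sorted(old_style))
# → ['messages', 'messages.1', 'messages.2']

# Date suffixes: after the live file, the oldest rotation sorts first.
print(sorted(new_style))
# → ['messages', 'messages-20161102', 'messages-20170801']
```

With a size limit applied in this order, the 2016 file would be collected before the 2017 one.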
There are two obvious solutions: checking the file name pattern and adapting, and performing a stat(2) check to sort files by mtime. Both of these add some complexity (although I think the stat approach is more complex).
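Both approaches can be sketched briefly. The functions newest_first_by_name() and newest_first_by_mtime() are hypothetical and not the sos implementation. The name-based variant assumes that only the two conventions discussed here occur (optionally with a compression suffix such as .gz) and that they are not mixed in one directory.

```python
import os
import re

# "messages-20170801" or "messages-20170801.gz"
_DATE_SUFFIX = re.compile(r"-(\d{8})(?:\.\w+)?$")
# "messages.1" or "messages.1.gz"
_NUM_SUFFIX = re.compile(r"\.(\d+)(?:\.\w+)?$")


def newest_first_by_name(names):
    """Order rotated log names newest first from their suffix alone."""
    def key(name):
        date = _DATE_SUFFIX.search(name)
        if date:
            # A later date means a newer file, so negate it.
            return (1, -int(date.group(1)))
        num = _NUM_SUFFIX.search(name)
        if num:
            # A smaller rotation number means a newer file.
            return (1, int(num.group(1)))
        return (0, 0)  # no rotation suffix: the live log comes first
    return sorted(names, key=key)


def newest_first_by_mtime(paths):
    """Order existing files newest first using stat(2) modification times."""
    return sorted(paths, key=lambda p: os.stat(p).st_mtime, reverse=True)


print(newest_first_by_name(
    ["messages-20161102", "messages", "messages-20170801"]))
print(newest_first_by_name(["messages.2", "messages", "messages.1"]))
```

The mtime variant does not depend on any naming convention, but it needs a stat call per file and trusts that modification times were preserved when the files were rotated or copied.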
We can try to get this addressed for 3.5 (even if it's a temporary change to improve the behaviour for the two current common conventions - but I would rather have a "proper" fix).
- in RHEL 7.5, fix just https://bugzilla.redhat.com/show_bug.cgi?id=1486952 so that the newest logrotated files are always collected properly
- keep this BZ open for the original, reasonable RFE request
Updating flags accordingly.
No space in the limited scope of 7.6.