Bug 1304394
Summary: | [RFE] add options to sosreport to limit time range of collected logs | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Yaniv Lavi <ylavi> | |
Component: | sos | Assignee: | Pavel Moravec <pmoravec> | |
Status: | CLOSED ERRATA | QA Contact: | Miroslav Hradílek <mhradile> | |
Severity: | medium | Docs Contact: | Michal Stubna <mstubna> | |
Priority: | medium | |||
Version: | 7.4 | CC: | agk, bmr, cww, dfediuck, djasa, jjansky, mhradile, pdwyer, plambri, pmoravec, pstehlik, Rhev-m-bugs, sbonazzo, sbradley, srevivo | |
Target Milestone: | rc | Keywords: | FutureFeature, Reopened | |
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
URL: | https://github.com/sosreport/sos/issues/284 | |||
Whiteboard: | ||||
Fixed In Version: | sos-3.8-6.el7 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | 1020790 | |||
: | 1789049 (view as bug list) | Environment: | ||
Last Closed: | 2020-03-31 20:04:09 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1594286, 1648022, 1789049 |
Description
Yaniv Lavi
2016-02-03 13:32:17 UTC
This is not a simple problem since end-users may not use the default syslog formatting options. For now we offer the ability to limit text logs by size - this will collect the most recent log entries up to the specified limit. With journald this is much simpler and more reliable since timestamps are stored in an unambiguous form.

Bryn, what will happen if the log was rotated and you set a log size? Will it do it in a smart way, taking from most recent to rotated, by size?
Sandro, what do you think of this suggestion?

(In reply to Yaniv Dary from comment #3)
> Sandro, what do you think of this suggestion?

If --log-size takes the most recent entries, that may be used. However, note that limiting the size doesn't guarantee that you collect, for example, the last 24 hours of logs: a flood in the last hour may fill the logs, making previous logs unavailable in the report.
The time-based filtering feature was opened upstream here: https://github.com/sosreport/sos/issues/284 by Bryn 2 years ago, so he's aware of this.
If it's acceptable to have a limit on the size instead of on the time (requiring a new sosreport execution if needed logs are not included), using the log-size option is fine for me (adding a big warning about it).

We can do this easily (with a little bit of work upstream) for journald logs: the existing Plugin.add_journal() interface supports the underlying --since and --until switches of journalctl:

    def add_journal(self, units=None, boot=None, since=None, until=None,
                    lines=None, allfields=False, output=None, timeout=None):
        """ Collect journald logs from one or more units.

        Keyword arguments:
        units     -- A string, or list of strings specifying the systemd
                     units for which journal entries will be collected.
        boot      -- A string selecting a boot index using the journalctl
                     syntax. The special values 'this' and 'last' are also
                     accepted.
        since     -- A string representation of the start time for journal
                     messages.
        until     -- A string representation of the end time for journal
                     messages.
        lines     -- The maximum number of lines to be collected.
        allfields -- Include all journal fields regardless of size or
                     non-printable characters.
        output    -- A journalctl output control string, for example
                     "verbose".
        timeout   -- An optional timeout in seconds.
        """

Right now there are no users of this part of the interface for RHEL (on Fedora, where /var/log/messages no longer exists, it is used to grab the last three days of logs), but plugins using journal logs are growing quite rapidly:

    $ grep add_journal sos/plugins/[a-zA-Z]*.py | wc -l
    20

I've hesitated to wire these up to new global options (e.g. --logsince, --loguntil) to avoid creating an expectation that this will work with syslog too - implementing that is considerably more work, and is more fragile due to the syslog formatting problems mentioned in comment #2.

If it's felt worth hooking up just for journal logs (e.g. --journalsince...), that's a relatively small piece of work - the main thing is fixing up the 20 plugins that already use the interface so that they honour the options.

It's possible to do for syslog also, with some limitations - but that is a larger piece of work - if we're thinking of this for the next update we should start planning now when the work will get done.

(In reply to Yaniv Lavi (Dary) from comment #3)
> Bryn, what will happen if the log was rotated and you set a log size? Will
> it do it in a smart way, taking from most recent to rotated, by size?
> Sandro, what do you think of this suggestion?

Sadly, this does not work (now). When calling e.g. "sosreport -o logs", it collects /var/log/messages* with a sizelimit but sorts the files alphabetically, so files are added (until the limit is reached) in this order:

    /var/log/messages
    /var/log/messages-20160101
    /var/log/messages-20170101
    /var/log/messages-20170828

Anyway changing this shall be simple.
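Returning to the journald route described earlier in this thread, the mapping from add_journal()-style keyword arguments onto journalctl switches can be sketched roughly as follows. This is only an illustration: `build_journalctl_cmd` is a hypothetical helper and not part of sos; the only things taken as given are the journalctl switches `--unit`, `--since`, `--until`, `--lines` and `--no-pager`.

```python
def build_journalctl_cmd(units=None, since=None, until=None, lines=None):
    """Return a journalctl argument list (hypothetical helper, not sos code)."""
    cmd = ["journalctl", "--no-pager"]
    # Accept a single unit name or a list of unit names.
    if isinstance(units, str):
        units = [units]
    for unit in units or []:
        cmd += ["--unit", unit]
    # Time bounds are passed through as strings for journalctl to interpret.
    if since:
        cmd += ["--since", since]
    if until:
        cmd += ["--until", until]
    if lines:
        cmd += ["--lines", str(lines)]
    return cmd


print(build_journalctl_cmd(units="sshd", since="yesterday"))
```

Because journal timestamps are stored unambiguously, the time strings need no parsing on the caller's side, which is why this route is so much cheaper than the syslog one.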
We have the log limit / filter on the sos roadmap but don't have capacity to implement it - at least within 7.5 scope. The above can be a feasible workaround - would you appreciate it?

(In reply to Pavel Moravec from comment #8)
> /var/log/messages
> /var/log/messages-20160101
> /var/log/messages-20170101
> /var/log/messages-20170828
>
> Anyway changing this shall be simple.

This is a bug but sadly it's not that simple; the current sorting convention the sos file collector uses is alphanumeric. This gives correct results for the "old" rotation naming convention of appending ".N" (older files have higher numbers), but it fails for the "new" convention of appending the rotation date, where the oldest rotated file sorts ahead of the newest:

    $ LC_ALL=C echo -e 'messages\nmessages.2\nmessages.1' | sort
    messages
    messages.1
    messages.2
    $ LC_ALL=C echo -e 'messages-20161102\nmessages-20170801' | sort
    messages-20161102
    messages-20170801

There are two obvious solutions: checking the file name pattern and adapting, and performing a stat(2) check to sort files by mtime. Both of these add some complexity (although I think the stat approach is more complex).

We can try to get this addressed for 3.5 (even if it's a temporary change to improve the behaviour for the two current common conventions - but I would rather have a "proper" fix).

My proposal:
- fix in RHEL 7.5 just https://bugzilla.redhat.com/show_bug.cgi?id=1486952, to properly collect the newest logrotated files every time
- keep this BZ open for the initial, reasonable RFE request

Updating flags accordingly.

No space in the limited scope of 7.6.

To recap:
- bz1486952 implemented this RFE for journal logs
- the pending request is to have an option "collect logfiles not newer/older than .."
- so collecting of logfiles should be conditional based on these parameters

Sadly, logfiles are collected by the same method as config files (add_copy_spec), while config files must be collected regardless of their age.
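The stat(2)-based ordering mentioned above can be sketched as follows. This is a simplified model under stated assumptions, not the sos collector: `newest_first` and `pick_within_limit` are hypothetical names, and the size limit is modelled as "stop before the file that would exceed the limit", which may differ from how sos treats the last file.

```python
import os


def newest_first(paths):
    """Order paths by modification time, newest first (hypothetical helper)."""
    return sorted(paths, key=lambda p: os.stat(p).st_mtime, reverse=True)


def pick_within_limit(paths, limit_bytes):
    """Take files newest-first until adding another would exceed the limit.

    The newest file is always taken, even if it alone exceeds the limit.
    """
    picked, total = [], 0
    for path in newest_first(paths):
        size = os.stat(path).st_size
        if picked and total + size > limit_bytes:
            break
        picked.append(path)
        total += size
    return picked
```

Sorting on mtime is independent of the rotation naming convention, so it behaves the same for "messages.1" and "messages-20170828", at the cost of one stat call per candidate file.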
So for a proper implementation, we would need to:
- have a dedicated method add_copy_log that applies the time range condition
- update almost *all* plugins (all those collecting a log file) accordingly

OK, challenge for the Christmas silent period accepted.

Preliminary ACKing for 7.7.

The upstream PR https://github.com/sosreport/sos/pull/1586 will just allow the possibility to filter (log)files collected by sosreport by their mtime. No change in sosreport data collection itself will happen.

If you have ideas about which particular logs to filter based on mtime, then please either comment here (soon) or open a new bugzilla, stating:
- logfile pattern
- maxage and/or minage (in hours)
- usual and maximal sizes of files in such a pattern (do we really want sosreport to collect a 10GB logfile created in the latest day?)
- if/what sizelimit should still be applied to that pattern - note that sizelimit is applied *independently* of the age limit. So when adding an age limit, we shall probably increase (or remove) the sizelimit to move from size to age filtering of the given file pattern.

For QE: how I tested it:

1) in sos/plugins/qpid.py, I added:

    "/var/log/cumin"
    ], minage=1, maxage=3)

   (or played with those values or specified just one of the params)

2) generated fake logs with fake mtime:

    date >> /var/log/cumin   # this location is usually dir but.. we just fake something, right?
    date >> /var/log/cumin.log
    date >> /var/log/qpidd.log
    touch -m --date="$(date -d '1 hour ago')" /var/log/cumin
    touch -m --date="$(date -d '3 hours ago')" /var/log/cumin.log
    touch -m --date="$(date -d '1 minute ago')" /var/log/qpidd.log
    sosreport -o qpid --batch --build

   and checked which of these 3 files were collected, based on the minage/maxage setup.

Scope of 7.7 closed, rescheduling for potential inclusion in 7.8.

The --since option might be implemented in 7.8, but it will suffer from https://github.com/sosreport/sos/issues/1750 (collecting directories instead of files will ignore the --since option). That is an expected limitation so far.
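The minage/maxage check exercised in the QE steps above can be modelled with a small sketch. The exact semantics in the upstream PR are not spelled out in this thread, so this assumes that a file is kept when its mtime age in hours lies within the inclusive range [minage, maxage]; `within_age` is a hypothetical name.

```python
import os
import time


def within_age(path, minage=None, maxage=None, now=None):
    """True if the file's mtime age in hours lies within [minage, maxage].

    Assumed semantics (not confirmed by the thread): both bounds are
    inclusive and an unset bound is not checked.
    """
    now = time.time() if now is None else now
    age_hours = (now - os.stat(path).st_mtime) / 3600.0
    if minage is not None and age_hours < minage:
        return False
    if maxage is not None and age_hours > maxage:
        return False
    return True
```

Under these assumptions, with minage=1 and maxage=3, a file touched one minute ago would be skipped as too new, while a file whose age falls between one and three hours would be kept. Files touched exactly "1 hour ago" or "3 hours ago" end up marginally older than the boundary by the time the check runs, so the boundary cases in the QE steps depend on the precise comparison used upstream.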
Concise specification of the implemented feature: the --since option will filter out logarchive files older than the given timestamp, as well as journal log entries older than the timestamp.

Detailed description:
- if the --since option is not used, no change
- if --since YYYYMMDD[HHMMSS] is provided, then:
  - no journal log older than the timestamp will be collected
  - no "logrotated file" older than the given timestamp will be collected
  - "logrotated file" = file matching the regular expression at https://github.com/sosreport/sos/blob/master/sos/plugins/__init__.py#L857
  - other files (not matching the expression - ideally all configs or current logs) will still be collected

Good findings! I raised:

https://github.com/sosreport/sos/issues/1847 (--since option wrongly applied to some configs also)
- this must be somehow resolved in 7.8

https://github.com/sosreport/sos/issues/1848 (journalctl shall apply --since as well (I think), or at least remove the --all-logs this/prev boot nonsense)
- I see this as rather optional (say, a known issue), but would like to have it fixed as well in 7.8

This has been mostly fixed in 3.8-1 already due to the sos rebase; some final bits are pending to be committed to dist-git now.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1127
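The --since behaviour specified above (for files, not the journal) can be sketched roughly as below. This is an illustration under assumptions: `parse_since` and `should_collect` are hypothetical names, and `ROTATED` is a stand-in pattern, not the regular expression sos actually uses at the linked source line.

```python
import os
import re
from datetime import datetime

# Stand-in pattern (an assumption, not the sos regex): names ending in
# "-DIGITS" or ".DIGITS", or carrying a compression suffix.
ROTATED = re.compile(r".*([-.]\d+|\.(gz|bz2|xz|zip))$")


def parse_since(value):
    """Parse YYYYMMDD or YYYYMMDDHHMMSS into a naive local datetime."""
    fmt = "%Y%m%d%H%M%S" if len(value) == 14 else "%Y%m%d"
    return datetime.strptime(value, fmt)


def should_collect(path, since=None):
    """Skip only rotated files whose mtime is older than `since`."""
    if since is None or not ROTATED.match(os.path.basename(path)):
        # No --since given, or a config / current log: always collect.
        return True
    return datetime.fromtimestamp(os.stat(path).st_mtime) >= since
```

Restricting the age check to names that look rotated is what keeps configs and current logs in the report regardless of their mtime; a pattern that is too broad would produce exactly the kind of problem reported in issue 1847.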