Bug 1416535
Summary: | Custom corosync and pacemaker log locations are not collected | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Ken Gaillot <kgaillot> |
Component: | sos | Assignee: | Pavel Moravec <pmoravec> |
Status: | CLOSED ERRATA | QA Contact: | Miroslav HradĂlek <mhradile> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.3 | CC: | abeekhof, agk, bmr, cfeist, gavin, isenfeld, jfriesse, kgaillot, plambri, sbradley |
Target Milestone: | rc | Keywords: | OtherQA |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
URL: | https://github.com/sosreport/sos/pull/924, https://github.com/sosreport/sos/pull/1002 | ||
Whiteboard: | |||
Fixed In Version: | sos-3.4-3.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-08-01 23:10:42 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Ken Gaillot
2017-01-25 18:13:14 UTC
How often a customer uses non-default location of these logfiles? Just one / very few, or is it more common? And do they disable logging to syslog (if not, check journalctl logs)? Also how to *uniquely* identify value of logfile option? Does it need to be like: logfile: /path/to/log.log or can it be: logfile: /path/to/log.log ? What if some subsystem changes this? Shall we grep the very first value only and rely on it? @Pavel, Ken asked me to reply to your question. corosync.conf parser is basically very simple (and sometimes little stupid ;) ). Parser is line based. Line can be one of: - Beginning section. Format is then section_name {. It's line based so nothing like section1_name { subsection_name {. - End of section. Format is then }. Because it's line based, there cannot be things like }}. - key + value. Format is key: value. No new lines allowed. So ---- logfile: /path/to/log.log --- cannot happen. Spaces (space + tab) at the beginning and end of line are stripped. Lines beginning with # are comments and they are ignored (keep in mind first step is strip of spaces, so " #comment" is valid comment line and it's ignored). If key set is found (so key: value), spaces are stripped from beginning and end of both key and value. So "logfile: /path/to/log.log " is valid and sets logfile to "/path/to/log.log"). Problem is, that values may be overwritten. So config file: --- logging { log_file: test something: some_value log_file: /tmp/test.log } --- results in log_file set to /tmp/test.log. Same applies to --- logging { log_file: test } logging { log_file: /tmp/test.log } --- again means log_file is /tmp/test.log. As a bonus key can have dot in it's name and it means same as start of section, so: ---- logging { log_file: test } logging.log_file: /tmp/test.log --- again means log_file is set to /tmp/test.log. Of course subsystem change is problematic. Question is, if we really need to try to solve it. Another question is, if we need to be collect logs even if corosync is not running. Because if not, we can use corosync-cmapctl. This would solve all "parsing" problems. If collecting logs with not running corosync is needed I would recommend to make it simple. Just grep 'log_file:', read value, check if value is valid existing file and if so, add it to sosreport. We do need to collect logs without the cluster running, so I would go with the parsing. We can use the currently configured value, and not worry about previous values. Users that change the value typically do so when they initially set up the system, and the window for problems (someone changing the location between when an issue occurs and when they run sosreport) is small enough that we can handle any such case manually. I wouldn't say it's common that RHEL users change the location (more so upstream and in other distributions, e.g. /var/log/corosync/corosync.log is common), but I believe the impact on investigating issues is high enough to warrant the effort. It doesn't have to be a high priority. Proper processing the logfile still would need more logic than a sos plugin shall have on parsing config files. I suggest searching of (first? all?) reg.exp: '^\s*(logging.)?log_file:\s*(\S+)\s*$' (i.e. either "log_file:" followed by filename, or "logging.log_file:" followed by filename, both with whitespaces anywhere around) I guess log_file can't appear elsewhere than under logging section, so this simplification is safe. Question: does it make sense to search: - just for the very first occurrence (usually the generic/non-subsystem one, I expect) of log_file (easier/faster to get, shall be sufficient), - or do we need/utilize all logfiles (i.e. also from all subsystems (cant we obtain something unwanted?) - more (maybe relevant) logs to get, more lengthy? @Pavel:
I believe there is misunderstanding:
> I guess log_file can't appear elsewhere than under logging section, so this simplification is safe.
logging {
log_file: something
}
is equal to:
logging.log_file: something
but not to:
logging {
logging.log_file: something
}
For first/all question. If possible go with ALL. Just test if file exists and if so, add it to SOS report. Possibility of storing more information is almost always better than storing less.
(In reply to Jan Friesse from comment #6) > @Pavel: > > I believe there is misunderstanding: > > > I guess log_file can't appear elsewhere than under logging section, so this simplification is safe. > > logging { > log_file: something > } > > is equal to: > > logging.log_file: something > > but not to: > > logging { > logging.log_file: something > } This "something" will be captured while it shouldnt. No big deal, esp. when assuming customers wont have such (mis)config often. That simplified parsing shall not skip collecting a log_file but might collect something else that shall be rather syntax error. If sos would have to parse also "logging" section, the parsing would be more complex with a little gain only. > > > For first/all question. If possible go with ALL. Just test if file exists > and if so, add it to SOS report. Possibility of storing more information is > almost always better than storing less. OK, raised it in upstream as https://github.com/sosreport/sos/pull/924, subject to change if needed (esp. wrt. parsing the log_file locations). Feel free to play with the code if it will collect all expected logfiles. (In reply to Pavel Moravec from comment #7) > (In reply to Jan Friesse from comment #6) > > @Pavel: > > > > I believe there is misunderstanding: > > > > > I guess log_file can't appear elsewhere than under logging section, so this simplification is safe. > > > > logging { > > log_file: something > > } > > > > is equal to: > > > > logging.log_file: something > > > > but not to: > > > > logging { > > logging.log_file: something > > } > > This "something" will be captured while it shouldnt. No big deal, esp. when Yes. And it's really nothing bad. > assuming customers wont have such (mis)config often. Yes. > > That simplified parsing shall not skip collecting a log_file but might > collect something else that shall be rather syntax error. > > If sos would have to parse also "logging" section, the parsing would be more > complex with a little gain only. > > > > > > > For first/all question. If possible go with ALL. Just test if file exists > > and if so, add it to SOS report. Possibility of storing more information is > > almost always better than storing less. > > OK, raised it in upstream as https://github.com/sosreport/sos/pull/924, > subject to change if needed (esp. wrt. parsing the log_file locations). Feel > free to play with the code if it will collect all expected logfiles. I was testing the code and it doesn't work as expected. It is probably caused by me, because I was always writing "log_file" what is not correct. Correct one is "logfile". Again, sorry for confusing you. After changing regexp to: ^\s*(logging.)?logfile:\s*(\S+)$ it works as expected. Thanks for the check, fixed in upstream PR in https://github.com/sosreport/sos/pull/924/commits/774e930a0324795077885a7e58a743b3d25e48a2 The corosync plugin looks good to me. The pacemaker log will be easy if you already have common functions for detecting init environment location (/etc/sysconfig vs. /etc/default etc.) and parsing shell-style variables.(VAR=value). Hello, would you be able to test the fix in candidate RPM for RHEL7.4 (once available)? Our QE would appreciate it. (In reply to Pavel Moravec from comment #11) > Hello, > would you be able to test the fix in candidate RPM for RHEL7.4 (once > available)? > > Our QE would appreciate it. Yes, put the link here when it's ready, and I'll test it as soon as I can. Thanks! Closed #924 via 7015223. Ah sorry, forgot the 2nd requirement. Captured here: https://github.com/sosreport/sos/pull/1002 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2203 |