Bug 831559 - Pacemaker's logging is too extensive
Pacemaker's logging is too extensive
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker (Show other bugs)
7.0
Unspecified Unspecified
medium Severity medium
: rc
: ---
Assigned To: Andrew Beekhof
Cluster QE
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-13 06:16 EDT by Jaroslav Kortus
Modified: 2014-06-16 02:35 EDT (History)
0 users

See Also:
Fixed In Version: pacemaker-1.1.8-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-16 02:35:44 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Jaroslav Kortus 2012-06-13 06:16:01 EDT
Description of problem:
When starting cluster, too many messages are printed by default.

Example:
Jun 12 11:22:43 r7-node01 crmd[1649]:     info: do_election_count_vote: Election 734 (owner: 9) pass: vote from r7-node09 (Uptime)
Jun 12 11:22:43 r7-node01 rsyslogd-2177: imuxsock begins to drop messages from pid 1649 due to rate-limiting
Jun 12 11:22:49 r7-node01 rsyslogd-2177: imuxsock lost 2523 messages from pid 1649 due to rate-limiting

2523 messages in 6 seconds is too much for default settings. Especially when this goes to /var/log/messages queue which uses sync writes.


Version-Release number of selected component (if applicable):
pacemaker-1.1.7-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. start 15-node virt cluster, for example
2.
3.
  
Actual results:
/var/log/messages full of more-or-less debug info with no real value for admins

Expected results:
only important messages printed. So in this case it would be, for example:
"... Voting started..."
"... Voting result: "

An option should be added (command line, via /etc/sysconfig/[..] or others) to increase the verbosity when needed.

Any better suggestion is welcome :)

Additional info:
Comment 1 Andrew Beekhof 2012-06-13 19:40:52 EDT
2523 messages seems excessive.  Do you have debug enabled?  How many nodes?

Logging is continually being improved*, but the highest priority is the ability to diagnose problems based on the log contents.  This has been invaluable in allowing us to fix bugs the first time they occur, without requiring a reproducer.

Granted however that we shouldn't be swamping syslog like this :-(

Unfortunately, in this particular case the suggested logs are not possible due to the election algorithm being used (as there are multiple concurrent elections).


* For example, the next releases will support a corosync style "blackbox" and only log info and debug messages to a file (not syslog).  The work for this is already complete and all are configurable from /etc/sysconfig.  See  https://github.com/ClusterLabs/pacemaker/blob/master/mcp/pacemaker.sysconfig#L15 for the default sysconfig file.
Comment 2 Jaroslav Kortus 2012-06-14 11:06:53 EDT
This test was in 15-node virtual cluster. The sync IO really makes a huge difference here as they all share one physical machine. I was unable to form the cluster with sync logging. Turning it off ("-/var/log/messages") made it possible.

Would it be possible to either move the logging to separate file (i.e. /var/log/cluster/pacemaker.log) or change the priority of most of the messages to debug? (it would not be caught by *.info then).

I'm looking forward to having blackbox logging :)
Comment 3 Andrew Beekhof 2012-06-14 20:52:44 EDT
(In reply to comment #2)
> This test was in 15-node virtual cluster. 

Ok, I can imagine that might have this effect. 

> The sync IO really makes a huge
> difference here as they all share one physical machine. I was unable to form
> the cluster with sync logging. Turning it off ("-/var/log/messages") made it
> possible.
> 
> Would it be possible to either move the logging to separate file (i.e.
> /var/log/cluster/pacemaker.log) or change the priority of most of the
> messages to debug? (it would not be caught by *.info then).

The new version won't even send *.info to syslog in the first place, just to the logfile from corosync.conf if one is configured.

Until then, perhaps change the syslog filter to drop *.info?

> 
> I'm looking forward to having blackbox logging :)

You are not alone :-)

We're currently coming to the end of a "break everything" phase, I'll do a new build once everything is stable again (hopefully mid-nextweek).
Comment 4 Andrew Beekhof 2012-07-17 02:59:42 EDT
Logging should be much improved in pacemaker-1.1.8-1.el7

Note You need to log in before you can comment on or make changes to this bug.