Bug 858197 - It's not possible to change loglevels at runtime for TOTEM subsystem
It's not possible to change loglevels at runtime for TOTEM subsystem
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: libqb (Show other bugs)
rawhide
Unspecified Unspecified
high Severity high
: ---
: ---
Assigned To: Angus Salkeld
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2012-09-18 05:59 EDT by Jan Friesse
Modified: 2014-01-12 20:40 EST (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-11-16 02:39:46 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Jan Friesse 2012-09-18 05:59:51 EDT
Description of problem:
See "steps to reproduce"

Version-Release number of selected component (if applicable):
git master (both corosync and libqb)

How reproducible:
99.7%

Steps to Reproduce:
1. set debug:off in corosync.conf
2. corosync -f
3. corosync-cmapctl -s logging.debug str on
4. start corosync -f on other node


Actual results:

NO debugging information (messages in debug level) on stderr nor syslog from TOTEM subsystem. Other subsystems (like cpg, libqb, ...) seems to handle change.


Expected results:
It's possible to change loglevel for TOTEM.

Additional info:
It can (of course) be bug in corosync, but filling to libqb, because after 10+ hours, I was unable to find out, HOW libqb log works).
Comment 1 Fabio Massimo Di Nitto 2012-09-19 08:04:22 EDT
using both libqb and corosync master branch from git as of today.

        logger_subsys {
                subsys: TOTEM
                debug: off
        }

corosync-cmapctl -s logging.logger_subsys.TOTEM.debug str trace

It appears that there is memory corruption and list entries duplication in call site handling.

libtotem does not link with libqb nor uses libqb directly.

corosync/exec/main.c defines _logsys_log_printf as:

static void
_logsys_log_printf(int level, int subsys,
                const char *function_name,
                const char *file_name,
                int file_line,
                const char *format, ...)
{
        va_list ap;

        va_start(ap, format);
        qb_log_from_external_source_va(function_name, file_name,
                                    format, level, file_line,
                                    subsys, ap);
        va_end(ap);
}

libtotem (and all parts of totem) are passed that function as logging function.
Including other bits such as subsys.

Every call libtotem does to log_printf is an alias to totem_log_printf that jumps back to qb_log_from_external_source_va.
Also to be noted that TOTEM "filename" for filtering is defined as totem.

All other subsystems are using qb_log directly. This is the first and most important difference.

qb_log_from_external_source_va -> calls into qb_log_callsite_get that
calls into qb_log_dcs_get.

For some reasons, that I can´t really figure out yet, qb_log_dcs_get, does allocated different callsite for the same file and I suspect that there is a memory leak and memory corruption. I don´t exclude that what we do in corosync is wrong, but....

diff --git a/exec/main.c b/exec/main.c
index c21f47f..140845f 100644
--- a/exec/main.c
+++ b/exec/main.c
@@ -827,11 +827,17 @@ _logsys_log_printf(int level, int subsys,
                const char *format, ...)
 {
        va_list ap;
+       struct qb_log_callsite *cs = qb_log_callsite_get(function_name,
+                                                        file_name,
+                                                        format,
+                                                        level,
+                                                        file_line,
+                                                        0);
+
+       fprintf(stderr, "DEBUG: callsite: %p %s\n", cs, file_name);
 
        va_start(ap, format);
-       qb_log_from_external_source_va(function_name, file_name,
-                                   format, level, file_line,
-                                   subsys, ap);
+       qb_log_real_va_(cs, ap);
        va_end(ap);
 }

changing _logsys_log_printf as in this patch shows:
(short snippet)
Sep 19 13:49:26 trace   [TOTEM ] Received ringid(192.168.2.97:304) seq 1
DEBUG: callsite: 0x7fec263c3320 /home/fabbione/work/cluster/corosync/corosync/exec/totemsrp.c
Sep 19 13:49:26 trace   [YKD   ] Received ringid(192.168.2.97:304) seq 2
DEBUG: callsite: 0x7fec263be110 /home/fabbione/work/cluster/corosync/corosync/exec/totemsrp.c
DEBUG: callsite: 0x7fec263be138 /home/fabbione/work/cluster/corosync/corosync/exec/totemsrp.c

that doesn´t look correct at all. subsystem has become YKD from TOTEM (note i don´t load anything in corosync.conf) after changing debug=off to debug=trace.

and it´s odd to see that new cs are allocated (this is just a short summary of a full run).

Adding debugging log to libqb somehow confirms that new_dcs are allocated from time to time.
Comment 3 Fedora Update System 2012-10-28 22:22:12 EDT
libqb-0.14.3-2.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/libqb-0.14.3-2.fc17
Comment 4 Fedora Update System 2012-10-28 22:22:24 EDT
libqb-0.14.3-2.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/libqb-0.14.3-2.fc18
Comment 5 Fedora Update System 2012-10-29 14:16:09 EDT
Package libqb-0.14.3-2.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libqb-0.14.3-2.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-17170/libqb-0.14.3-2.fc18
then log in and leave karma (feedback).
Comment 6 Fedora Update System 2012-11-16 02:39:47 EST
libqb-0.14.3-2.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.