Bug 858197 - It's not possible to change loglevels at runtime for TOTEM subsystem
Summary: It's not possible to change loglevels at runtime for TOTEM subsystem
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: libqb
Version: rawhide
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
Assignee: Angus Salkeld
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-09-18 09:59 UTC by Jan Friesse
Modified: 2014-01-13 01:40 UTC (History)
5 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-11-16 07:39:46 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)

Description Jan Friesse 2012-09-18 09:59:51 UTC
Description of problem:
See "steps to reproduce"

Version-Release number of selected component (if applicable):
git master (both corosync and libqb)

How reproducible:
99.7%

Steps to Reproduce:
1. set debug:off in corosync.conf
2. corosync -f
3. corosync-cmapctl -s logging.debug str on
4. start corosync -f on other node


Actual results:

NO debugging information (messages in debug level) on stderr nor syslog from TOTEM subsystem. Other subsystems (like cpg, libqb, ...) seems to handle change.


Expected results:
It's possible to change loglevel for TOTEM.

Additional info:
It can (of course) be bug in corosync, but filling to libqb, because after 10+ hours, I was unable to find out, HOW libqb log works).

Comment 1 Fabio Massimo Di Nitto 2012-09-19 12:04:22 UTC
using both libqb and corosync master branch from git as of today.

        logger_subsys {
                subsys: TOTEM
                debug: off
        }

corosync-cmapctl -s logging.logger_subsys.TOTEM.debug str trace

It appears that there is memory corruption and list entries duplication in call site handling.

libtotem does not link with libqb nor uses libqb directly.

corosync/exec/main.c defines _logsys_log_printf as:

static void
_logsys_log_printf(int level, int subsys,
                const char *function_name,
                const char *file_name,
                int file_line,
                const char *format, ...)
{
        va_list ap;

        va_start(ap, format);
        qb_log_from_external_source_va(function_name, file_name,
                                    format, level, file_line,
                                    subsys, ap);
        va_end(ap);
}

libtotem (and all parts of totem) are passed that function as logging function.
Including other bits such as subsys.

Every call libtotem does to log_printf is an alias to totem_log_printf that jumps back to qb_log_from_external_source_va.
Also to be noted that TOTEM "filename" for filtering is defined as totem.

All other subsystems are using qb_log directly. This is the first and most important difference.

qb_log_from_external_source_va -> calls into qb_log_callsite_get that
calls into qb_log_dcs_get.

For some reasons, that I can´t really figure out yet, qb_log_dcs_get, does allocated different callsite for the same file and I suspect that there is a memory leak and memory corruption. I don´t exclude that what we do in corosync is wrong, but....

diff --git a/exec/main.c b/exec/main.c
index c21f47f..140845f 100644
--- a/exec/main.c
+++ b/exec/main.c
@@ -827,11 +827,17 @@ _logsys_log_printf(int level, int subsys,
                const char *format, ...)
 {
        va_list ap;
+       struct qb_log_callsite *cs = qb_log_callsite_get(function_name,
+                                                        file_name,
+                                                        format,
+                                                        level,
+                                                        file_line,
+                                                        0);
+
+       fprintf(stderr, "DEBUG: callsite: %p %s\n", cs, file_name);
 
        va_start(ap, format);
-       qb_log_from_external_source_va(function_name, file_name,
-                                   format, level, file_line,
-                                   subsys, ap);
+       qb_log_real_va_(cs, ap);
        va_end(ap);
 }

changing _logsys_log_printf as in this patch shows:
(short snippet)
Sep 19 13:49:26 trace   [TOTEM ] Received ringid(192.168.2.97:304) seq 1
DEBUG: callsite: 0x7fec263c3320 /home/fabbione/work/cluster/corosync/corosync/exec/totemsrp.c
Sep 19 13:49:26 trace   [YKD   ] Received ringid(192.168.2.97:304) seq 2
DEBUG: callsite: 0x7fec263be110 /home/fabbione/work/cluster/corosync/corosync/exec/totemsrp.c
DEBUG: callsite: 0x7fec263be138 /home/fabbione/work/cluster/corosync/corosync/exec/totemsrp.c

that doesn´t look correct at all. subsystem has become YKD from TOTEM (note i don´t load anything in corosync.conf) after changing debug=off to debug=trace.

and it´s odd to see that new cs are allocated (this is just a short summary of a full run).

Adding debugging log to libqb somehow confirms that new_dcs are allocated from time to time.

Comment 3 Fedora Update System 2012-10-29 02:22:12 UTC
libqb-0.14.3-2.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/libqb-0.14.3-2.fc17

Comment 4 Fedora Update System 2012-10-29 02:22:24 UTC
libqb-0.14.3-2.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/libqb-0.14.3-2.fc18

Comment 5 Fedora Update System 2012-10-29 18:16:09 UTC
Package libqb-0.14.3-2.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libqb-0.14.3-2.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-17170/libqb-0.14.3-2.fc18
then log in and leave karma (feedback).

Comment 6 Fedora Update System 2012-11-16 07:39:47 UTC
libqb-0.14.3-2.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.