Bug 798582 - Please include ringid in some log messages
Summary: Please include ringid in some log messages
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: libqb
Version: 16
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Angus Salkeld
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-02-29 10:15 UTC by Andrew Beekhof
Modified: 2012-06-04 09:25 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-04 09:25:57 UTC
Type: ---


Attachments (Terms of Use)
Proposed patch (1.11 KB, patch)
2012-05-10 08:15 UTC, Jan Friesse
no flags Details | Diff

Description Andrew Beekhof 2012-02-29 10:15:22 UTC
Description of problem:

Its difficult/impossible to correlate syslog entries[1] with their blackbox counterparts[2] as there is no timestamp or otherwise identifying information.

This is particularly relevant when seeing loops of [1] and/or when the blackbox is dumped after the fact.
Possibly after corosync has been restarted.

[1] Logs from syslog 
Feb  1 19:30:00 virt1 corosync[943]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Feb  1 19:30:00 virt1 corosync[943]:   [CPG   ] chosen downlist: sender r(0) ip(192.168.122.2) ; members(old:1 left:0)
Feb  1 19:30:00 virt1 corosync[943]:   [MAIN  ] Completed service synchronization, ready to provide service.

[2] blackbox logs which may or may not correspond to the above syslog messages
rec=[320] Log Message=A processor joined or left the membership and a new membership was formed.
rec=[360] Log Message=Completed service synchronization, ready to provide service.

Actual results:

Feb  1 19:30:00 virt1 corosync[943]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.

Expected results:

Feb  1 19:30:00 virt1 corosync[943]:   [TOTEM ] A processor joined or left the membership and a new membership (1452) was formed.

Additional info:

Ideally the blackbox output would also indicate the PID and timestamp.

Comment 1 Steven Dake 2012-02-29 15:43:04 UTC
makes sense.

Comment 2 Jan Friesse 2012-05-10 08:15:04 UTC
Andrew,
corosync 2.0 has support for timestamps in logs. It's nonsense to backport this functionality back to flatiron.

Attached is patch to add ringid to membership messages.

Does that satisfy your needs?

Honza

Comment 3 Jan Friesse 2012-05-10 08:15:32 UTC
Created attachment 583471 [details]
Proposed patch

Comment 4 Andrew Beekhof 2012-05-11 01:01:22 UTC
(In reply to comment #2)
> Andrew,
> corosync 2.0 has support for timestamps in logs. 

Does this include blackbox output?

> It's nonsense to backport this
> functionality back to flatiron.

I think your definition of nonsense differs from mine.
Hard to do != good to do.

But as long as we'll have timestamps in RHEL7 I will survive :-)

> Attached is patch to add ringid to membership messages.
> 
> Does that satisfy your needs?

What about pid and timestamps in the blackbox output for future 2.0.x releases?

Comment 5 Jan Friesse 2012-05-11 06:14:17 UTC
(In reply to comment #4)
> (In reply to comment #2)
> > Andrew,
> > corosync 2.0 has support for timestamps in logs. 
> 
> Does this include blackbox output?

Ya,
# corosync-blackbox
Ringbuffer:
 ->NORMAL
 ->write_pt [2725]
 ->read_pt [0]
 ->size [2097152 words]
 =>free [8377704 bytes]
 =>used [10900 bytes]
May 09 04:25:55 main():1047 Corosync Cluster Engine ('2.0.0.4-f244-dirty'): started and ready to provide service.
May 09 04:25:55 main():1048 Corosync built-in features: debug
May 09 04:25:55 totemsrp_initialize():837 Token Timeout (1000 ms) retransmit timeout (238 ms)
May 09 04:25:55 totemsrp_initialize():840 token hold (180 ms) retransmits before loss (4 retrans)
...

> 
> > It's nonsense to backport this
> > functionality back to flatiron.
> 
> I think your definition of nonsense differs from mine.
> Hard to do != good to do.
> 

It's not as much hard as impossible without radical changes in log structures/functions/... So even it is good thing, breaking API/ABI is bad thing.

> But as long as we'll have timestamps in RHEL7 I will survive :-)
> 

;)

> > Attached is patch to add ringid to membership messages.
> > 
> > Does that satisfy your needs?
> 
> What about pid and timestamps in the blackbox output for future 2.0.x releases?

Timestamps are there. I'm unsure which PID you mean. PID of corosync? PID of IPC clients?

Comment 6 Jan Friesse 2012-05-17 12:59:23 UTC
Patch included in master.

Andrew, can you please give me more information about PID you would like to include in logs?

Comment 7 Andrew Beekhof 2012-05-21 01:52:36 UTC
(In reply to comment #5)
> (In reply to comment #4)
...
> It's not as much hard as impossible without radical changes in log
> structures/functions/... So even it is good thing, breaking API/ABI is bad
> thing.

Sounds like the right decision.

> > What about pid and timestamps in the blackbox output for future 2.0.x releases?
> 
> Timestamps are there. I'm unsure which PID you mean. PID of corosync? PID of
> IPC clients?

PID of whoever the log message came from.  So corosync in this case.
To simplify correlating the blackbox contents with /var/log/messages.

Comment 8 Jan Friesse 2012-05-21 08:21:08 UTC
(In reply to comment #7)
> (In reply to comment #5)
> > (In reply to comment #4)
> ...
> > It's not as much hard as impossible without radical changes in log
> > structures/functions/... So even it is good thing, breaking API/ABI is bad
> > thing.
> 
> Sounds like the right decision.
> 
> > > What about pid and timestamps in the blackbox output for future 2.0.x releases?
> > 
> > Timestamps are there. I'm unsure which PID you mean. PID of corosync? PID of
> > IPC clients?
> 
> PID of whoever the log message came from.  So corosync in this case.
> To simplify correlating the blackbox contents with /var/log/messages.

Ok. This is impossible to handle directly in corosync because blackbox functionality is implemented in libqb. So reassign bug to libqb.

Comment 9 Angus Salkeld 2012-05-21 08:53:01 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #5)
> > > (In reply to comment #4)
> > ...
> > > It's not as much hard as impossible without radical changes in log
> > > structures/functions/... So even it is good thing, breaking API/ABI is bad
> > > thing.
> > 
> > Sounds like the right decision.
> > 
> > > > What about pid and timestamps in the blackbox output for future 2.0.x releases?
> > > 
> > > Timestamps are there. I'm unsure which PID you mean. PID of corosync? PID of
> > > IPC clients?
> > 
> > PID of whoever the log message came from.  So corosync in this case.
> > To simplify correlating the blackbox contents with /var/log/messages.
> 
> Ok. This is impossible to handle directly in corosync because blackbox
> functionality is implemented in libqb. So reassign bug to libqb.

Hi

Super easy to do in corosync (one liner).

logsys.c:324

qb_log_format_set(QB_LOG_BLACKBOX, "what ever format you want here");

http://libqb.org/html/doxygen/qblog_8h.html#a0e8cf298ff5fd224a16537b30a899df9
 n FUNCTION NAME 
 f FILENAME 
 l FILELINE 
 p PRIORITY 
 t TIMESTAMP 
 b BUFFER 
 g TAGS 
 N name (passed into qb_log_init)
 P PID
 H hostname

Comment 10 Angus Salkeld 2012-05-28 22:25:46 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > (In reply to comment #5)
> > > > (In reply to comment #4)
> > > ...
> > > > It's not as much hard as impossible without radical changes in log
> > > > structures/functions/... So even it is good thing, breaking API/ABI is bad
> > > > thing.
> > > 
> > > Sounds like the right decision.
> > > 
> > > > > What about pid and timestamps in the blackbox output for future 2.0.x releases?
> > > > 
> > > > Timestamps are there. I'm unsure which PID you mean. PID of corosync? PID of
> > > > IPC clients?
> > > 
> > > PID of whoever the log message came from.  So corosync in this case.
> > > To simplify correlating the blackbox contents with /var/log/messages.
> > 
> > Ok. This is impossible to handle directly in corosync because blackbox
> > functionality is implemented in libqb. So reassign bug to libqb.
> 
> Hi
> 
> Super easy to do in corosync (one liner).
> 
> logsys.c:324
> 
> qb_log_format_set(QB_LOG_BLACKBOX, "what ever format you want here");
> 
> http://libqb.org/html/doxygen/qblog_8h.html#a0e8cf298ff5fd224a16537b30a899df9
>  n FUNCTION NAME 
>  f FILENAME 
>  l FILELINE 
>  p PRIORITY 
>  t TIMESTAMP 
>  b BUFFER 
>  g TAGS 
>  N name (passed into qb_log_init)
>  P PID
>  H hostname

Actually I am wrong, sorry - I forgot myself.
I have added all the non-static options here:
https://github.com/asalkeld/libqb/issues/36

Andrew has encoded the pid into the name of the blackbox file
so don't think he needs it anymore.

Comment 11 Andrew Beekhof 2012-06-01 04:51:26 UTC
(In reply to comment #10)
> Andrew has encoded the pid into the name of the blackbox file
> so don't think he needs it anymore.

Right, Pacemaker includes the pid in the filename.
Perhaps corosync could do this too?

Comment 12 Jan Friesse 2012-06-04 09:25:57 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > Andrew has encoded the pid into the name of the blackbox file
> > so don't think he needs it anymore.
> 
> Right, Pacemaker includes the pid in the filename.
> Perhaps corosync could do this too?

Andrew,
I like that idea (and also seems to be MUCH better then cluttering log with still same pid line). I've sent patch to ML, which creates fdata file in format fdata-ISO_DATE-PID. I believe this bug can be closed now.


Note You need to log in before you can comment on or make changes to this bug.