Bug 508881

Summary: [RFE] Initial support for Corosync diagnostics
Product: Red Hat Enterprise Linux 6
Reporter: Carl Trieloff <cctrieloff>
Component: corosync
Assignee: Jan Friesse <jfriesse>
Status: CLOSED CURRENTRELEASE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Priority: medium
Version: 6.0
CC: cluster-maint, edamato, sdake, syeghiay
Target Milestone: rc
Keywords: FutureFeature
Target Release: ---
Hardware: All
OS: Linux
Fixed In Version: corosync-1.2.0-1.el6
Doc Type: Enhancement
Last Closed: 2010-02-12 12:45:58 UTC
Attachments:
Description: Implementation of cpg_iteration + tool
Flags: none

Description Carl Trieloff 2009-06-30 12:02:38 UTC
Command-line tool(s) that can provide the following information for AIS:

- Show what CPG groups exist
- Show who (IPs) is in what group
- Show the number of retries for a given node
- Show if a node is queuing
- Show the status of a given node (active, failed to r-ring (if rr configured), or down)
- Be able to run a ping test between nodes and print the current latency for the ping test

Comment 1 Steven Dake 2009-07-01 03:36:27 UTC
As agreed in email, targeting this for upstream feature development in corosync.

Comment 2 Steven Dake 2009-07-07 02:57:58 UTC
Honza,

Handle this as per our published or soon-to-be-published roadmap.

Thanks.
-steve

Comment 3 Jan Friesse 2009-08-27 13:23:40 UTC
Created attachment 358867 [details]
Implementation of cpg_iteration + tool

This is the same patch that was sent to the mailing list. It implements the first two points and should be pushed upstream soon.

Comment 4 Jan Friesse 2009-12-01 13:48:36 UTC
Carl,
I'm pretty sure that my patch (pushed upstream) plus bug 529138 cover the functionality you requested.

Can you please confirm that this is true, so I can close this bug?

Thanks,
  Honza

Comment 6 Carl Trieloff 2009-12-08 20:19:53 UTC
Can you give me sample output from the tool so I can close this off? I can't take the patch right now to play with it, but I should be able to close this from a dump of the tool run with each option.

Comment 8 Jan Friesse 2010-01-12 16:47:36 UTC
[root@node-06 ~/corosync/trunk/tools]# ./corosync-cpgtool
Group Name             PID         Node ID
abc\x7Fa\x00
                      20960              67 (10.34.38.106)
GROUP\x00
                      97685              66 (10.34.38.108)
                      20962              67 (10.34.38.106)
                      20961              67 (10.34.38.106)
[root@node-06 ~/corosync/trunk/tools]# ./corosync-cpgtool -d ';'
GRP_NAME;PID;NODEID
abc\x7Fa\x00;20960;67
GROUP\x00;97685;66
GROUP\x00;20962;67
GROUP\x00;20961;67
[root@node-06 ~/corosync/trunk/tools]# ./corosync-cpgtool -n -e
abca
GROUP 
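
The delimiter-separated form shown above (`-d ';'`) is the easiest of the three outputs to consume from a script. The following is only a minimal sketch of doing so, not part of the patch: the function name `group_members` is hypothetical, the sample text is copied from the run above, and it assumes the last two fields of every row are always the PID and the node ID.

```python
from collections import defaultdict

# Sample copied from the `corosync-cpgtool -d ';'` run above.
# A raw string keeps the escaped bytes (\x7F, \x00) exactly as the tool printed them.
CPGTOOL_SAMPLE = r"""GRP_NAME;PID;NODEID
abc\x7Fa\x00;20960;67
GROUP\x00;97685;66
GROUP\x00;20962;67
GROUP\x00;20961;67
"""


def group_members(text, delimiter=";"):
    """Map each group name to a list of (pid, nodeid) tuples.

    Hypothetical helper: assumes a single header line followed by rows
    whose last two fields are PID and NODEID.
    """
    members = defaultdict(list)
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the GRP_NAME;PID;NODEID header
        # Split from the right so a delimiter inside a group name
        # does not shift the PID and NODEID columns.
        name, pid, nodeid = line.rsplit(delimiter, 2)
        members[name].append((int(pid), int(nodeid)))
    return dict(members)


if __name__ == "__main__":
    for name, procs in group_members(CPGTOOL_SAMPLE).items():
        print(name, procs)
```

Group names are kept in their escaped form here; decoding them back to raw bytes is left out because the exact escaping rules are not stated in this report.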

You can get the other information by dumping objdb. I must ask Steve which tool is provided in RHEL6.

Comment 9 Jan Friesse 2010-01-13 09:07:16 UTC
For dumping objdb, there is corosync-objctl. Output:
[root@node-06 ~/corosync/trunk/tools]# corosync-objctl
totem.version=2
totem.secauth=off
totem.threads=0
totem.nodeid=6
totem.token=10000
totem.interface.ringnumber=0
totem.interface.bindnetaddr=10.34.38.106
totem.interface.mcastaddr=226.94.1.3
totem.interface.mcastport=5407
logging.fileline=off
logging.to_stderr=yes
logging.to_logfile=yes
logging.to_syslog=yes
logging.logfile=/tmp/corosync.log
logging.debug=off
logging.timestamp=on
logging.logger_subsys.subsys=AMF
logging.logger_subsys.debug=off
amf.mode=disabled
runtime.services.evs.service_id=0
runtime.services.evs.0.tx=0
runtime.services.evs.0.rx=0
runtime.services.cfg.service_id=7
runtime.services.cpg.service_id=8
runtime.services.cpg.0.tx=0
runtime.services.cpg.0.rx=0
runtime.services.cpg.1.tx=0
runtime.services.cpg.1.rx=0
runtime.services.cpg.2.tx=0
runtime.services.cpg.2.rx=0
runtime.services.cpg.3.tx=0
runtime.services.cpg.3.rx=0
runtime.services.cpg.4.tx=2
runtime.services.cpg.4.rx=3
runtime.services.confdb.service_id=11
runtime.services.pload.service_id=13
runtime.services.pload.0.tx=0
runtime.services.pload.0.rx=0
runtime.services.pload.1.tx=0
runtime.services.pload.1.rx=0
runtime.services.quorum.service_id=12
runtime.connections.active=1
runtime.connections.closed=3
runtime.connections.9.service_id=11
runtime.connections.9.client_pid=0
runtime.connections.9.responses=111
runtime.connections.9.dispatched=0
runtime.connections.9.requests=114
runtime.connections.9.sem_retry_count=0
runtime.connections.9.send_retry_count=0
runtime.connections.9.recv_retry_count=0
runtime.connections.9.flow_control=0
runtime.connections.9.flow_control_count=0
runtime.connections.9.queue_size=0
runtime.totem.pg.mrp.srp.orf_token_tx=2
runtime.totem.pg.mrp.srp.orf_token_rx=190
runtime.totem.pg.mrp.srp.memb_merge_detect_tx=89
runtime.totem.pg.mrp.srp.memb_merge_detect_rx=89
runtime.totem.pg.mrp.srp.memb_join_tx=2
runtime.totem.pg.mrp.srp.memb_join_rx=4
runtime.totem.pg.mrp.srp.mcast_tx=24
runtime.totem.pg.mrp.srp.mcast_retx=0
runtime.totem.pg.mrp.srp.mcast_rx=24
runtime.totem.pg.mrp.srp.memb_commit_token_tx=4
runtime.totem.pg.mrp.srp.memb_commit_token_rx=4
runtime.totem.pg.mrp.srp.token_hold_cancel_tx=0
runtime.totem.pg.mrp.srp.token_hold_cancel_rx=0
runtime.totem.pg.mrp.srp.operational_entered=2
runtime.totem.pg.mrp.srp.operational_token_lost=0
runtime.totem.pg.mrp.srp.gather_entered=2
runtime.totem.pg.mrp.srp.gather_token_lost=0
runtime.totem.pg.mrp.srp.commit_entered=2
runtime.totem.pg.mrp.srp.commit_token_lost=0
runtime.totem.pg.mrp.srp.recovery_entered=2
runtime.totem.pg.mrp.srp.recovery_token_lost=0
runtime.totem.pg.mrp.srp.consensus_timeouts=0
runtime.totem.pg.mrp.srp.mtt_rx_token=890
runtime.totem.pg.mrp.srp.avg_token_workload=0
runtime.totem.pg.mrp.srp.avg_backlog_calc=0
runtime.totem.pg.mrp.srp.rx_msg_dropped=0
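
Because the dump above is a flat list of `key=value` lines, it can be turned into a lookup table with a few lines of scripting. The sketch below is an illustration only: the helper names `parse_objctl` and `connection_stats` are hypothetical, the sample is a subset of the dump above, and reading counters such as `send_retry_count` or `queue_size` as answers to the "retries" and "queuing" items in the description is an assumption based on the key names alone.

```python
# Subset of the corosync-objctl dump shown above.
OBJCTL_SAMPLE = """\
runtime.connections.active=1
runtime.connections.closed=3
runtime.connections.9.send_retry_count=0
runtime.connections.9.recv_retry_count=0
runtime.connections.9.flow_control=0
runtime.connections.9.queue_size=0
runtime.totem.pg.mrp.srp.mcast_tx=24
runtime.totem.pg.mrp.srp.mcast_retx=0
"""


def parse_objctl(text):
    """Parse `key=value` lines into a dict of strings (hypothetical helper)."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' sign
            values[key.strip()] = value.strip()
    return values


def connection_stats(values, prefix="runtime.connections."):
    """Group numeric per-connection counters by connection id.

    Keys directly under the prefix (for example `active`, `closed`)
    have no connection id and are skipped.
    """
    stats = {}
    for key, value in values.items():
        if not key.startswith(prefix):
            continue
        conn_id, _, field = key[len(prefix):].partition(".")
        if field and value.isdigit():
            stats.setdefault(conn_id, {})[field] = int(value)
    return stats


if __name__ == "__main__":
    values = parse_objctl(OBJCTL_SAMPLE)
    print("multicast retransmits:", values["runtime.totem.pg.mrp.srp.mcast_retx"])
    for conn_id, counters in connection_stats(values).items():
        print("connection", conn_id, counters)
```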

Comment 10 Jan Friesse 2010-01-13 09:13:59 UTC
I hope this is all the information you need. If something is missing, please ask.

Regards,
  Honza

Comment 12 Steven Dake 2010-02-12 12:45:58 UTC
The requested features were resolved in corosync 1.2.0 prior to RHEL6 branching except for the network test feature.  That feature will be addressed as a separate RFE in Bug #543938.