Bug 688260 - corosync-cpgtool does not specify both interfaces in a dual ring configuration
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: corosync
Version: 6.3
Hardware: Unspecified  OS: Unspecified
Priority: low  Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Jan Friesse
QA Contact: Cluster QE
Keywords: TechPreview
Depends On:
Blocks:
Reported: 2011-03-16 12:48 EDT by dan clark
Modified: 2012-04-03 10:23 EDT (History)

See Also:
Fixed In Version: corosync-1.4.0-1.el6
Doc Type: Technology Preview
Doc Text:
Cause
1. configure two rings
2. run an application that registers with the same group on each node
3. corosync-cpgtool

Consequence
% corosync-cpgtool
Group Name             PID         Node ID
aGroup\x00
                      4774       990357696 (10.0.0.910.0.0.9)
                      4694      1040689344 (10.0.0.110.0.0.1)
                      4682      1023912128 (10.0.0.210.0.0.2)

-> Duplicated ring IP

Fix
Fix cfg service to correctly return two interfaces instead of doubled one interface.

Result
% corosync-cpgtool
Group Name             PID         Node ID
aGroup\x00
                      4774       990357696 (192.168.7.59 10.0.0.9)
                      4694      1040689344 (192.168.7.61 10.0.0.1)
                      4682      1023912128 (192.168.7.62 10.0.0.2)
Last Closed: 2011-12-06 06:50:15 EST
Attachments
Proposed patch for MAIN problem (3.41 KB, patch)
2011-03-22 12:38 EDT, Jan Friesse
Proposed patch adding space between two IP items (756 bytes, patch)
2011-03-22 12:39 EDT, Jan Friesse

Description dan clark 2011-03-16 12:48:24 EDT
Description of problem:
In a dual ring configuration, the expanded 'Node ID' column of the corosync-cpgtool output shows one of the node's interface addresses twice, concatenated with no separator.  A useful enhancement would be to print both interfaces of the dual ring, with a space separating the two fields.

Version-Release number of selected component (if applicable):
corosync 3.0

How reproducible:
trivial

Steps to Reproduce:
1. configure two rings
2. run an application that registers with the same group on each node
3. corosync-cpgtool
  
Actual results:
% corosync-cpgtool
Group Name             PID         Node ID
aGroup\x00
                      4774       990357696 (10.0.0.910.0.0.9)
                      4694      1040689344 (10.0.0.110.0.0.1)
                      4682      1023912128 (10.0.0.210.0.0.2)

Expected results:
% corosync-cpgtool
Group Name             PID         Node ID
aGroup\x00
                      4774       990357696 (192.168.7.59 10.0.0.9)
                      4694      1040689344 (192.168.7.61 10.0.0.1)
                      4682      1023912128 (192.168.7.62 10.0.0.2)

Additional info:
When two rings are enabled with different IP subnets the output status
provides unexpected results for the node IP identifiers.  It appears
that under the two ring situation one of the rings is arbitrarily
selected and used to provide the IP address data, concatenated and
duplicated. 
Perhaps there is a simple fix to avoid the concatenation: in
tools/corosync-cpgtool.c, at about line 84, add a space after the
print of the string (or, fancier, consider 1 versus 2 rings):
                        inet_ntop(ss->ss_family, saddr, buf, sizeof(buf));
<                        fprintf(f, "%s", buf);
>                        fprintf(f, "%s ", buf);

In the example case above (based on the multiple ip addresses of the
source nodes) would it be more helpful to accurately represent each of
the node addresses?    I did not find right away why a single IP
address was selected of the two which represent each node.
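
The "fancier" separator handling mentioned above could look roughly like the sketch below. It is only an illustration, not the attached patch: format_addrs is a hypothetical helper that does not exist in corosync, and it assumes the caller already has the node's addresses as an array of struct sockaddr_storage. It puts a single space between entries and no separator after the last one, so the output is the same shape for one ring or two.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not part of corosync: writes the addresses in
 * addrs[0..n-1] into out, separated by a single space, with no trailing
 * separator. Returns 0 on success, -1 on an unsupported address family
 * or if out is too small.
 */
static int format_addrs(char *out, size_t outlen,
                        const struct sockaddr_storage *addrs, unsigned int n)
{
    size_t used = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (unsigned int i = 0; i < n; i++) {
        char buf[INET6_ADDRSTRLEN];
        const void *saddr;
        int written;

        /* Pick the raw address field that matches the family. */
        if (addrs[i].ss_family == AF_INET) {
            saddr = &((const struct sockaddr_in *)&addrs[i])->sin_addr;
        } else if (addrs[i].ss_family == AF_INET6) {
            saddr = &((const struct sockaddr_in6 *)&addrs[i])->sin6_addr;
        } else {
            return -1;
        }
        if (inet_ntop(addrs[i].ss_family, saddr, buf, sizeof(buf)) == NULL) {
            return -1;
        }

        /* Prefix every entry except the first with one space. */
        written = snprintf(out + used, outlen - used, "%s%s",
                           i > 0 ? " " : "", buf);
        if (written < 0 || (size_t)written >= outlen - used) {
            return -1;
        }
        used += (size_t)written;
    }
    return 0;
}
```

Called with the two ring addresses of one node, the helper fills out with both addresses separated by one space, which is the form shown in the expected results.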

In this case the above expected results would provide a very powerful diagnostic to verify both rings and endpoints!  It would be nice when specifying the delimiter field to get a similar output with new delimiters.

What is particularly interesting about the above output is that it
provides a perspective across multiple nodes (given an application
utilizing a group).

Perhaps an additional diagnostic enhancement is providing the
multiple node perspective from the cfgtool which as seen below only
shows the current node, but none of the remaining members.  An 'all nodes' query would be great to show all the endpoints across the system.

% corosync-cfgtool -s
Printing ring status.
Local node ID 1023912128
RING ID 0
        id      = 192.168.7.61
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.0.0.1
        status  = ring 1 active with no faults

Thanks for an overall great start to some very helpful command line tools.  I appreciate the consideration of this fine tuning!
Comment 4 Steven Dake 2011-03-18 16:12:13 EDT
Honza,

Please work on this as an upstream feature of corosync 2.0 merging all this cpgtool functionality into the confdb.

Thanks
-steve
Comment 5 Jan Friesse 2011-03-22 12:38:12 EDT
Created attachment 486843 [details]
Proposed patch for MAIN problem

Zero-element array behavior is very different from that of a normal
array or pointer. This behavior is the root of the problem: the array
of addresses was not returned correctly filled. It appeared only in rrp
mode, where more than one address is returned.

All memcpy's are now correctly converted to copy pointer to char.
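
The patch itself is in attachment 486843 and is not reproduced here. As a general illustration of the technique named above (copying through a pointer to char), the sketch below packs several fixed-size address records behind a header that ends in a trailing array. Everything in it is an assumption made for the example: ADDR_LEN, struct addr_reply, pack_addrs and addr_at are hypothetical names, not corosync's structures or wire format, and the standard C99 flexible array member stands in for the zero-element array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ADDR_LEN 16 /* hypothetical fixed size of one address record */

/* Hypothetical reply layout, not corosync's actual wire format. */
struct addr_reply {
    unsigned int num_addrs;
    char addrs[]; /* C99 flexible array member; adds nothing to sizeof */
};

/*
 * Copies n fixed-size records into a freshly allocated reply.
 * The destination is walked with a char pointer that advances by
 * ADDR_LEN per record, so each record lands in its own slot.
 * The caller frees the result with free().
 */
static struct addr_reply *pack_addrs(const char (*src)[ADDR_LEN],
                                     unsigned int n)
{
    struct addr_reply *reply;
    char *dst;

    assert(src != NULL || n == 0);

    /* sizeof(*reply) excludes the trailing array, so add it by hand. */
    reply = malloc(sizeof(*reply) + (size_t)n * ADDR_LEN);
    if (reply == NULL) {
        return NULL;
    }
    reply->num_addrs = n;

    dst = reply->addrs;
    for (unsigned int i = 0; i < n; i++) {
        memcpy(dst, src[i], ADDR_LEN);
        dst += ADDR_LEN; /* move to the next slot */
    }
    return reply;
}

/* Returns a pointer to record i inside the reply. */
static const char *addr_at(const struct addr_reply *reply, unsigned int i)
{
    assert(i < reply->num_addrs);
    return reply->addrs + (size_t)i * ADDR_LEN;
}
```

With two records as input, addr_at(reply, 0) and addr_at(reply, 1) return the two distinct addresses. If the destination pointer did not advance, every copy would land in the first slot.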
Comment 6 Jan Friesse 2011-03-22 12:39:04 EDT
Created attachment 486844 [details]
Proposed patch adding space between two IP items

cpgtool: print list of IP with space between items
Comment 7 Jan Friesse 2011-03-24 12:47:10 EDT
Steve,
I'm pretty sure we should split this bug into two parts:
- this part, which fixes the root problem and for which we have a patch, so it can be in 6.1. Simply because cfg contains the problem, even if we ignore the fact that rrp mode is not supported
- second part, which is "put cpg groups information to objdb". This is Fedora material (maybe even not in 6.3)
Comment 16 Jan Friesse 2011-09-29 03:16:43 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause
1. configure two rings
2. run an application that registers with the same group on each node
3. corosync-cpgtool

Consequence
% corosync-cpgtool
Group Name             PID         Node ID
aGroup\x00
                      4774       990357696 (10.0.0.910.0.0.9)
                      4694      1040689344 (10.0.0.110.0.0.1)
                      4682      1023912128 (10.0.0.210.0.0.2)

-> Duplicated ring IP

Fix
Fix cfg service to correctly return two interfaces instead of doubled one interface.

Result
% corosync-cpgtool
Group Name             PID         Node ID
aGroup\x00
                      4774       990357696 (192.168.7.59 10.0.0.9)
                      4694      1040689344 (192.168.7.61 10.0.0.1)
                      4682      1023912128 (192.168.7.62 10.0.0.2)
Comment 18 errata-xmlrpc 2011-12-06 06:50:15 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1515.html
