Bug 688260
Summary: | corosync-cpgtool does not specify both interfaces in a dual ring configuration
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | dan clark <2clarkd>
Component: | corosync | Assignee: | Jan Friesse <jfriesse>
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list>
Severity: | low | Docs Contact: |
Priority: | low
Version: | 6.3 | CC: | 2clarkd, cluster-maint, djansa, jkortus, sdake
Target Milestone: | rc | Keywords: | TechPreview
Target Release: | ---
Hardware: | Unspecified
OS: | Unspecified
Whiteboard: |
Fixed In Version: | corosync-1.4.0-1.el6 | Doc Type: | Technology Preview
Doc Text: |
Cause
1. configure two rings
2. run an application that registers with the same group on each node
3. corosync-cpgtool
Consequence
% corosync-cpgtool
Group Name PID Node ID
aGroup\x00
4774 990357696 (10.0.0.910.0.0.9)
4694 1040689344 (10.0.0.110.0.0.1)
4682 1023912128 (10.0.0.210.0.0.2)
-> Duplicated ring IP
Fix
The cfg service now correctly returns both interfaces instead of one interface duplicated.
Result
% corosync-cpgtool
Group Name PID Node ID
aGroup\x00
4774 990357696 (192.168.7.59 10.0.0.9)
4694 1040689344 (192.168.7.61 10.0.0.1)
4682 1023912128 (192.168.7.62 10.0.0.2)
|
Last Closed: | 2011-12-06 11:50:15 UTC
Attachments: |
|
Honza,

Please work on this as an upstream feature of corosync 2.0, merging all this cpgtool functionality into the confdb.

Thanks
-steve

Created attachment 486843
Proposed patch for MAIN problem
A zero-element array behaves very differently from a normal array or a
pointer. This behavior is the root of the problem: the array of addresses
was not returned correctly filled. The problem appeared only in rrp mode,
where more than one address is returned.
All memcpy calls now copy through a pointer to char.
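The patch itself is not reproduced in this report. As a minimal sketch of the general technique the description names (stepping through a trailing array of fixed-size records with a plain `char *` and an explicit stride), the following may help. Everything here is hypothetical and chosen for illustration: `struct addr_reply`, `ADDR_LEN`, and the `reply_*` helpers are not corosync's actual types or functions, and the sketch uses a C99 flexible array member rather than the GNU zero-length array form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ADDR_LEN 4 /* hypothetical fixed record size (one IPv4 address) */

/* Hypothetical reply layout: a header followed by num_addrs packed records. */
struct addr_reply {
    unsigned int num_addrs;
    char addrs[]; /* flexible array member: adds nothing to sizeof */
};

/* Allocate a reply with room for n records after the header. */
static struct addr_reply *reply_alloc(unsigned int n)
{
    struct addr_reply *r = calloc(1, sizeof(*r) + (size_t)n * ADDR_LEN);

    if (r != NULL)
        r->num_addrs = n;
    return r;
}

/* Store record i, addressing the payload through a plain char pointer. */
static void reply_set(struct addr_reply *r, unsigned int i, const void *addr)
{
    assert(i < r->num_addrs);
    char *dst = r->addrs + (size_t)i * ADDR_LEN;
    memcpy(dst, addr, ADDR_LEN);
}

/* Fetch record i using the same explicit stride. */
static void reply_get(const struct addr_reply *r, unsigned int i, void *out)
{
    assert(i < r->num_addrs);
    const char *src = r->addrs + (size_t)i * ADDR_LEN;
    memcpy(out, src, ADDR_LEN);
}
```

Because the trailing array has no element count of its own, the stride has to be stated explicitly at every access. Getting it wrong tends to go unnoticed with a single record and only shows up once a second one is present, which matches the report that the problem appeared only with more than one address.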
Created attachment 486844
Proposed patch adding space between two IP items
cpgtool: print list of IP with space between items
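The attachment is not included in this report, so the following is only a sketch of one way to print a list of addresses with a separator between items. `join_addrs` is a hypothetical helper, not a function from corosync-cpgtool, and it handles IPv4 only to stay short. It places the separator between items rather than after each one, so no trailing separator ends up before the closing parenthesis.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: format n IPv4 addresses into out, with sep written
 * between consecutive items. Returns 0 on success, -1 if out is too small
 * or an address cannot be converted.
 */
static int join_addrs(const struct in_addr *addrs, size_t n,
                      const char *sep, char *out, size_t outlen)
{
    char buf[INET_ADDRSTRLEN];
    size_t used = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        if (inet_ntop(AF_INET, &addrs[i], buf, sizeof(buf)) == NULL)
            return -1;

        /* Emit the separator only before the second and later items. */
        int w = snprintf(out + used, outlen - used, "%s%s",
                         i > 0 ? sep : "", buf);
        if (w < 0 || (size_t)w >= outlen - used)
            return -1;
        used += (size_t)w;
    }
    return 0;
}
```

With a space as the separator, two ring addresses come out as `192.168.7.59 10.0.0.9`. With an empty separator they run together the way the original output did.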
Steve,

I'm pretty sure we should split this bug into two parts:
- this part, which fixes the root problem and for which we have a patch, so it can be in 6.1. Simply because cfg contains the problem, even if we ignore the fact that rr mode is not supported
- a second part, which is "put cpg groups information into objdb". This is Fedora material (maybe not even in 6.3)

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause
1. configure two rings
2. run an application that registers with the same group on each node
3. corosync-cpgtool
Consequence
% corosync-cpgtool
Group Name PID Node ID
aGroup\x00
4774 990357696 (10.0.0.910.0.0.9)
4694 1040689344 (10.0.0.110.0.0.1)
4682 1023912128 (10.0.0.210.0.0.2)
-> Duplicated ring IP
Fix
Fix cfg service to correctly return two interfaces instead of doubled one interface.
Result
% corosync-cpgtool
Group Name PID Node ID
aGroup\x00
4774 990357696 (192.168.7.59 10.0.0.9)
4694 1040689344 (192.168.7.61 10.0.0.1)
4682 1023912128 (192.168.7.62 10.0.0.2)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1515.html
|
Description of problem:
The output of corosync-cpgtool reports the expanded 'Node ID' column with a concatenated repeat of one of the interfaces in a dual ring. An outstanding enhancement would be to include both interfaces in a dual ring with a space separating the two fields.

Version-Release number of selected component (if applicable):
corosync 3.0

How reproducible:
trivial

Steps to Reproduce:
1. configure two rings
2. run an application that registers with the same group on each node
3. corosync-cpgtool

Actual results:
% corosync-cpgtool
Group Name PID Node ID
aGroup\x00
4774 990357696 (10.0.0.910.0.0.9)
4694 1040689344 (10.0.0.110.0.0.1)
4682 1023912128 (10.0.0.210.0.0.2)

Expected results:
% corosync-cpgtool
Group Name PID Node ID
aGroup\x00
4774 990357696 (192.168.7.59 10.0.0.9)
4694 1040689344 (192.168.7.61 10.0.0.1)
4682 1023912128 (192.168.7.62 10.0.0.2)

Additional info:
When two rings are enabled with different IP subnets, the status output gives unexpected results for the node IP identifiers. It appears that in the two-ring case one of the rings is arbitrarily selected and used to provide the IP address data, which is then concatenated and duplicated.

Perhaps there is a simple fix to avoid the concatenation: in tools/corosync-cpgtool.c, at about line 84, add a space after the print of the string (or, fancier, consider 1 versus 2 rings).
inet_ntop(ss->ss_family, saddr, buf, sizeof(buf));
< fprintf(f, "%s", buf);
> fprintf(f, "%s ", buf);

In the example case above (based on the multiple IP addresses of the source nodes), would it be more helpful to accurately represent each of the node addresses? I did not find right away why only one of the two IP addresses that represent each node was selected. In this case the expected results above would provide a very powerful diagnostic to verify both rings and endpoints! It would be nice, when specifying the delimiter field, to get a similar output with the new delimiters.
What is particularly interesting about the above output is that it provides a perspective across multiple nodes (given an application utilizing a group). Perhaps an additional diagnostic enhancement would be to provide the multiple-node perspective from the cfgtool, which, as seen below, shows only the current node and none of the remaining members. An 'all nodes' query would be great to show all the endpoints across the system.

% corosync-cfgtool -s
Printing ring status.
Local node ID 1023912128
RING ID 0
id = 192.168.7.61
status = ring 0 active with no faults
RING ID 1
id = 10.0.0.1
status = ring 1 active with no faults

Thanks for an overall great start to some very helpful command line tools. I appreciate the consideration of this fine tuning!