Description of problem:
As described in the Messaging User Guide, chapter 7.1. Starting a Broker in a Cluster, in Table 7.1. Options for High Availability Messaging Cluster:

###cite
... By default, all local addresses for the broker are advertised. You only need to set this if
1. Your host has more than one active network interface, and
2. You want to restrict client fail-over to a specific interface or interfaces.
###end cite

Statement 1, about more than one network interface, contradicts "all local addresses". In practice, all (non-localhost) IPv4 addresses from all interfaces are advertised by default, but no IPv6 addresses.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Set up an interface with an IPv4 address and an IPv6 address (not link-local fe80::/64)
2. qpidd --cluster-name=testcluster
3. qpid-cluster

Actual results:
Cluster Name: testcluster
Cluster Status: ACTIVE
Cluster Size: 1
Members: ID=192.168.6.2:10474 URL=amqp:tcp:192.168.6.2:5672,tcp:172.16.35.1:5672

Expected results:
Cluster Name: testcluster
Cluster Status: ACTIVE
Cluster Size: 1
Members: ID=192.168.6.2:10474 URL=amqp:tcp:[IPv6 address]:5672,tcp:192.168.6.2:5672,tcp:172.16.35.1:5672

Additional info:
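The URL layout shown in the actual and expected results can be sketched as follows. This is only an illustration of the format seen in the qpid-cluster output, not the broker's code: the function name cluster_url is hypothetical, and 2001:db8::1 is a documentation-prefix address standing in for the unspecified "[IPv6 address]".

```python
import ipaddress


def cluster_url(addresses, port=5672):
    """Join addresses into an 'amqp:tcp:host:port,tcp:host:port' string.

    IPv6 literals are wrapped in square brackets so the colons inside the
    address cannot be confused with the host/port separator.
    """
    parts = []
    for address in addresses:
        ip = ipaddress.ip_address(address)
        host = f"[{ip}]" if ip.version == 6 else str(ip)
        parts.append(f"tcp:{host}:{port}")
    return "amqp:" + ",".join(parts)


if __name__ == "__main__":
    # Hypothetical address set: one IPv6 and the two IPv4 addresses from the report.
    print(cluster_url(["2001:db8::1", "192.168.6.2", "172.16.35.1"]))
```

With only the two IPv4 addresses from the report, the sketch reproduces the URL shown under "Actual results".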
Sorry, here are the packages.

Version-Release number of selected component (if applicable):
qpid-cpp-server-devel-0.14-1.el6.
qpid-cpp-server-0.14-1.el6.
qpid-cpp-server-store-0.14-1.el6.
qpid-cpp-server-ssl-0.14-1.el6.
qpid-cpp-server-cluster-0.14-1.el6.
qpid-cpp-server-xml-0.14-1.el6.

qpid-cpp-server-0.14-3.el5
qpid-cpp-server-ssl-0.14-3.el5
qpid-cpp-server-cluster-0.14-3.el5
qpid-cpp-server-store-0.14-3.el5
qpid-cpp-server-devel-0.14-3.el5
qpid-cpp-server-xml-0.14-3.el5
A side note - This default behaviour is actually badly broken: If a machine running a broker has multiple interfaces then the administrator needs to specify explicitly which interfaces are good for fail-over purposes. Anything else leads to advertising addresses that cannot/should not be reached by the client. However in the case of a single interface it is true that the cluster should advertise both IPv4 and IPv6 addresses.
I agree with the side note, but then the broker should not advertise all IPv4 addresses from all interfaces either.
This bug should now be fixed on upstream trunk in change r1340279. This change should be in upstream 0.18.
Some issues have been discovered with this fix.

1/ All non-localhost, non-link-local addresses are offered in ClusterURL. Unfortunately, this includes such addresses on loopback interface(s). An address like that cannot be reached by any means other than a local connection, so no address from a loopback interface should be offered in ClusterURL.

2/ When many IP addresses (v4, v6) are set on various interfaces, the broker puts them all into ClusterURL and then starts to generate encoding errors, for example:

--cite--
2012-10-26 09:04:59 [System] error Exception thrown by timer task ManagementAgent::periodicProcessing: Could not encode string of 281 bytes as uint8_t string. (qpid/framing/Buffer.cpp:246)
--end--

qpid-cpp-server-0.18-2.el5
qpid-cpp-server-cluster-0.18-2.el5
qpid-cpp-server-devel-0.18-2.el5
qpid-cpp-server-ssl-0.18-2.el5
qpid-cpp-server-store-0.18-2.el5
qpid-cpp-server-xml-0.18-2.el5

qpid-cpp-server-0.18-2.el6_3
qpid-cpp-server-cluster-0.18-2.el6_3
qpid-cpp-server-devel-0.18-2.el6_3
qpid-cpp-server-store-0.18-2.el6_3
qpid-cpp-server-xml-0.18-2.el6_3

Moving this bug back -> ASSIGNED
Issue 2 is an independent bug - it would have previously occurred if the list of interfaces were long enough. It's exacerbated because IPv6 literal addresses tend to be long.
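To illustrate why the list length matters, here is a minimal sketch. It assumes that the "uint8_t string" in the logged error is a string stored with a one-byte length prefix, which caps the payload at 255 bytes. The function name put_short_string is hypothetical and is not the qpid Buffer API.

```python
import struct

MAX_SHORT_STRING = 0xFF  # largest length a single unsigned byte can hold


def put_short_string(text: str) -> bytes:
    """Encode text as a one-byte length prefix followed by the UTF-8 payload."""
    payload = text.encode("utf-8")
    if len(payload) > MAX_SHORT_STRING:
        # Mirrors the wording of the logged error; the real check lives in the broker.
        raise ValueError(
            f"Could not encode string of {len(payload)} bytes as uint8_t string"
        )
    return struct.pack("B", len(payload)) + payload


if __name__ == "__main__":
    print(put_short_string("amqp:tcp:192.168.6.2:5672"))
    try:
        put_short_string("x" * 281)  # same size as the string in the log
    except ValueError as err:
        print(err)
```

Under that assumption, any URL longer than 255 bytes fails regardless of address family; bracketed IPv6 literals simply reach the limit with fewer entries.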
The code in SystemInfo::getLocalIpAddresses() in sys/posix/SystemInfo.cpp explicitly checks for loopback addresses (line 85) and link local addresses (line 94). You will need to give the specific case where this fails so I can try to understand why this logic is failing.
Let's assume you have a machine with interfaces (lo, eth0), and this sequence of commands:

ip addr add 172.30.1.1/24 dev lo
qpidd --cluster-name=mycluster --auth=no
qpid-cluster

You can see that the address 172.30.1.1/24, which is in a local class B range and thus not localhost/link-local etc., is put into ClusterURL even though it is assigned to the loopback interface.
Issue 2 is now tracked in bug 875962.

(In reply to comment #9)
> Issue 2 is an independent bug - it would have previously occurred if the
> list of interfaces were long enough. It's exacerbated because IPv6 literal
> addresses tend to be long.
Andrew, I forgot. What's the status of issue 1 here? We decided to move this issue (issue 1) to 2.4 as well, true?
(In reply to comment #11)
> let assume you have a machine with interfaces (lo, eth0), and this sequence
> of commands:
>
> ip addr add 172.30.1.1/24 dev lo
> qpidd --cluster-name=mycluster --auth=no
> qpid-cluster
>
> and you can see that address 172.30.1.1/24 that is in local class B thus non
> localhost/linklocal, etc. is regardless of loopback interface assigned to
> ClusterURL.

I think you are making a mistake here: if you use a local class address, it should still be put in the cluster set. There is absolutely no way the broker can exclude these sorts of addresses. They are very commonly used internally in networks and so can very definitely be valid contact addresses for a broker.

The underlying problem here is that it is impossible for the broker to know in general whether it is reachable using any given address from any given client.

If this is the case you are talking about, then I don't think this is a bug at all.
Incidentally, the same might be true for bridge interfaces and any number of interfaces that aren't actually connected to clients. I don't think you can filter on the basis of network type here either. The essential issue is that the underlying feature is misguided and useless in nearly all practical situations.
Moving this to modified, since the original issue has been addressed in the 0.18 builds.
I know that a local class address should be put into ClusterURL. But this is not about the address, this is about the LOOPBACK interface. You could substitute the address from my example with something like 8.8.8.8 and qpid would still put it into ClusterURL, despite the loopback interface, leaving the address unreachable.
I understand the point about not using the loopback interface. However, if the administrator configures their machine like that, why should the qpid daemon try to second-guess why that has happened? The networking APIs have a specific concept of a loopback address. That the loopback interface is called "lo" is a convention which is not even universal: certain Unixes in the past had different names for it (lo0 was certainly used as well). I don't think we should teach the daemon about some (albeit widespread) conventions. Just don't configure the machine like that unless you have a good reason.
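The distinction between a loopback address and an address that merely sits on the loopback interface can be sketched like this. It is an illustration using Python's ipaddress module, not the logic in SystemInfo::getLocalIpAddresses(); the function name is_advertisable is hypothetical.

```python
import ipaddress


def is_advertisable(address: str) -> bool:
    """Filter on properties of the address itself, not on the interface name.

    Loopback (127.0.0.0/8, ::1) and link-local (169.254.0.0/16, fe80::/10)
    addresses are rejected; everything else is kept, whichever interface
    the administrator assigned it to.
    """
    ip = ipaddress.ip_address(address)
    return not (ip.is_loopback or ip.is_link_local)


if __name__ == "__main__":
    for candidate in ["127.0.0.1", "::1", "fe80::1", "172.30.1.1", "8.8.8.8"]:
        print(candidate, is_advertisable(candidate))
```

A filter of this kind keeps 172.30.1.1 and 8.8.8.8 even when they are configured on lo, because nothing in the address itself marks it as loopback.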
That is true, I agree.
Fix looks okay according to comments above.

Tested on: RHEL 5.8, RHEL 6.3 && i386, x86_64

Testing packages:
qpid-cpp-server-0.18-10.el5
qpid-cpp-server-cluster-0.18-10.el5
qpid-cpp-server-devel-0.18-10.el5
qpid-cpp-server-rdma-0.18-10.el5
qpid-cpp-server-ssl-0.18-10.el5
qpid-cpp-server-store-0.18-10.el5
qpid-cpp-server-xml-0.18-10.el5

qpid-cpp-server-0.18-10.el6_3
qpid-cpp-server-cluster-0.18-10.el6_3
qpid-cpp-server-devel-0.18-10.el6_3
qpid-cpp-server-ha-0.18-10.el6_3
qpid-cpp-server-rdma-0.18-10.el6_3
qpid-cpp-server-ssl-0.18-10.el6_3
qpid-cpp-server-store-0.18-10.el6_3
qpid-cpp-server-xml-0.18-10.el6_3

Moving -> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0561.html