Description of problem:
Storage nodes appear to replace the user-specified host name with a locally resolved IP address. This prevents JBoss ON and its storage node from being used in enterprise environments that use network high-availability implementations or disaster-recovery configurations. These environments rely on virtual LANs whose physical IP addresses can change, whereas the host name can always be mapped back to the correct IP address.

More details from Tom F in our support group:

> I spoke with Biljana about the issue of Cassandra using hard coded IP addresses.
> It seems that while we enter a fully qualified host.domain name, the
> setup enters the resolved IP address in a number of locations in the
> Cassandra files.
>
> The problems with this are:
>
> - DHCP gives the computer another IP number
> - working from home, using VPN (the "tun0" interface) also gets another
>   IP number
> - working from home, no VPN, yet another IP number
> - switching from wired to wireless
> - etc.
>
> Of course I understand that in production the IP number usually stays
> the same, but even then companies sometimes move installations to
> different networks (IPs).
>
> So all in all, storing that IP number and not the host name will cause
> issues, not only for support people, but also for customers who are
> testing JON on non-production servers.
>
> I see two solutions needed:
> - store the actual hostname -> gets around changing IP numbers
> - have a tool (script?) that allows an easy change of the setting
>   without having to modify this manually in several places.
>
> I'll leave it to Larry to push this as a BUG, he's better positioned
> than myself.
>
> cheers
> Tom

Version-Release number of selected component (if applicable):
3.2.0.ER3
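The failure mode described above can be illustrated with a minimal sketch. This is not installer code: the `FakeDns` class, the host name `storage1.example.com`, and the addresses are hypothetical stand-ins. It only shows why a value resolved once at install time goes stale after an address change, while a stored host name is looked up again when the node starts.

```python
class FakeDns:
    """Hypothetical in-memory resolver standing in for real DNS."""

    def __init__(self, records):
        self.records = dict(records)

    def resolve(self, hostname):
        return self.records[hostname]


def install_storing_ip(dns, hostname):
    # Behaviour reported in this bug: resolve once and persist the IP.
    return {"listen_address": dns.resolve(hostname)}


def install_storing_hostname(dns, hostname):
    # Requested behaviour: persist the name the user entered.
    return {"listen_address": hostname}


def effective_address(dns, config):
    """Address a node would bind to when it starts with this config."""
    value = config["listen_address"]
    # A stored host name is looked up at start time; a stored IP is used as is.
    return dns.records.get(value, value)


dns = FakeDns({"storage1.example.com": "192.0.2.10"})
by_ip = install_storing_ip(dns, "storage1.example.com")
by_name = install_storing_hostname(dns, "storage1.example.com")

# DHCP, VPN, or a network move hands the machine a new address.
dns.records["storage1.example.com"] = "192.0.2.77"

print(effective_address(dns, by_ip))    # address captured at install time
print(effective_address(dns, by_name))  # address currently in DNS
```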
The storage installer is the culprit. It has a hostname parameter which can be either a hostname or an IP address, and it uses the IP address even if a hostname is specified. That value is used for the listen_address and rpc_address properties in cassandra.yaml.

Originally, the installer did not do this. There were a couple of reasons why the change was made. First, I have read in several places, such as Cassandra mailing list threads, that it is much better to use IP addresses instead of hostnames. Second, the initial implementation of the node deployment code (run after a node is imported into inventory) fetched a storage node configuration property that contained all of the node endpoints. The endpoints are provided as IP addresses. This forced us to do a reverse mapping, which was an unwelcome change to already complex code. We no longer rely on that configuration property during storage node deployment.

As for using hostnames, I have done some investigation, and while I have not found anything conclusive, the main motivation I have come across for not using hostnames, particularly for the listen_address, is unreliable (or absent) DNS. I have not, however, seen anything that says Cassandra will break if hostnames are used. In fact, http://wiki.apache.org/cassandra/MultinodeCluster says it is perfectly fine to use hostnames provided you have DNS properly configured. The comments in cassandra.yaml for the listen_address property make similar statements.

I did a test today with a 3 node cluster to verify that things will work if hostnames are specified for the listen_address and/or rpc_address properties. These properties were configured with hostnames on each node. I changed the IP address of each machine, restarted the DNS server, restarted each machine, and then finally restarted each storage node. The cluster continued to function without error. Messages are logged to show that Cassandra updates token range mappings. There are statements like:

INFO [GossipStage:1] 2013-11-06 17:26:19,186 StorageService.java (line 1422) Nodes /10.16.23.55 and /10.16.23.185 have the same token 915988031550474468. /10.16.23.55 is the new owner
WARN [GossipStage:1] 2013-11-06 17:26:19,187 TokenMetadata.java (line 197) Token 7267129206972664462 changing ownership from /10.16.23.185 to /10.16.23.55

We can revert the storage installer to its original behavior, which is a minor, low-risk change. We need to review the server and storage plugin code to see what other changes are required. I will report any additional changes in this BZ.
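When repeating a test like this, the ownership changes can be pulled out of the log with a small script. The following is a minimal sketch, assuming the WARN message format shown in this comment; the function name `ownership_changes` is made up for illustration, and the pattern has only been checked against the sample lines quoted here.

```python
import re

# Matches the TokenMetadata WARN message quoted in this comment.
OWNERSHIP_CHANGE = re.compile(
    r"Token (?P<token>-?\d+) changing ownership "
    r"from /(?P<old>\S+) to /(?P<new>\S+)"
)


def ownership_changes(log_lines):
    """Yield (token, old_endpoint, new_endpoint) for each matching line."""
    for line in log_lines:
        match = OWNERSHIP_CHANGE.search(line)
        if match:
            yield match.group("token"), match.group("old"), match.group("new")


sample = [
    "INFO [GossipStage:1] 2013-11-06 17:26:19,186 StorageService.java "
    "(line 1422) Nodes /10.16.23.55 and /10.16.23.185 have the same token "
    "915988031550474468. /10.16.23.55 is the new owner",
    "WARN [GossipStage:1] 2013-11-06 17:26:19,187 TokenMetadata.java "
    "(line 197) Token 7267129206972664462 changing ownership "
    "from /10.16.23.185 to /10.16.23.55",
]

changes = list(ownership_changes(sample))
for token, old, new in changes:
    print(f"token {token}: {old} -> {new}")
```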
master commit hashes: be500838 bfa7fc0fe fbdd630 8c0b13f0a
release/jon3.2.x commit hashes: 80b0ad45ab 3a6ef2bf c0055022 91b55ece
Moving to ON_QA as available for testing with new brew build.
Mass moving all of these from ER6 to target milestone ER07 since the ER6 build was bad and QE was halted for the same reason.
Moving to verified state, as it is using the hostname.

jon-server-3.2.0.GA/bin/rhq-server.properties
    rhq.storage.nodes=rhel6-vm

jon-server-3.2.0.GA/rhq-storage/conf/cassandra.yaml
    - seeds: "rhel6-vm"
    listen_address: rhel6-vm
    rpc_address: rhel6-vm

Version: JBoss Operations Network 3.2.0.GA
Build Number: dcb8b6f:734bd56
GWT Version: 2.5.0
SmartGWT Version: 3.0p
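The same check can be scripted. This is a rough sketch rather than part of the product: it scans the text of a cassandra.yaml for the listen_address and rpc_address properties and reports any that hold an IP literal instead of a host name. The helper names are invented for illustration, and the line-based parsing is a simplification that assumes each property sits on a single line, as in the excerpt above.

```python
import ipaddress
import re

ADDRESS_KEYS = ("listen_address", "rpc_address")


def is_ip_literal(value):
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def address_settings(yaml_text):
    """Return {key: value} for the address properties found in the text."""
    found = {}
    for line in yaml_text.splitlines():
        # Commented-out lines start with '#' and therefore do not match.
        match = re.match(r"\s*(\w+)\s*:\s*(\S+)", line)
        if match and match.group(1) in ADDRESS_KEYS:
            found[match.group(1)] = match.group(2).strip("\"'")
    return found


def ip_literal_settings(yaml_text):
    """Subset of the address properties whose value is an IP literal."""
    settings = address_settings(yaml_text)
    return {key: value for key, value in settings.items() if is_ip_literal(value)}


verified = "listen_address: rhel6-vm\nrpc_address: rhel6-vm\n"
print(ip_literal_settings(verified))  # empty dict when only host names are used
```

To run it against a real installation, read the file contents first, for example with `open(path).read()`, and pass the text to `ip_literal_settings`.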