Bug 1027458 - Storage node is overriding user supplied host name with locally resolved IP address
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Database
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ER07
Target Release: JON 3.2.0
Assignee: John Sanda
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: 1012435
 
Reported: 2013-11-06 21:53 UTC by Larry O'Leary
Modified: 2014-01-02 20:38 UTC
4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:



Description Larry O'Leary 2013-11-06 21:53:08 UTC
Description of problem:
Storage nodes seem to swap out user specified host name with a locally resolved IP address.

This prevents JBoss ON and its storage node from being used in enterprise environments that employ network high-availability or disaster-recovery configurations. Such environments rely on virtual LANs whose underlying physical IP addresses can change; the host name, however, can always be mapped back to the correct IP address.

More details from Tom F in our support group:

> I spoke with Biljana about the issue of Cassandra using hard coded IP addresses.
> It seems that while we enter a fully qualified host.domain name, the 
> setup enters the resolved IP address in a number of locations in the 
> Cassandra files.
> 
> The problems with this are:
> 
> - DHCP gives the computer another IP number
> - working from home, using VPN (the "tun0" interface) also gets another 
> IP number
> - working from home, no VPN, yet another IP number.
> - switching from wired to wireless
> - etc....
> 
> Of course I understand that in production the IP number usually stays 
> the same, but even then companies sometimes move installation to 
> different networks (IP's)
> 
> So all in all, storing that IP number and not the host name will cause 
> issue, not only for support people, but also for customers who are 
> testing JON on non-production servers.
> 
> I see two solutions needed
> - store the actual hostname -> gets around changing IP numbers
> - have a tool (script?) that allows an easy change of the setting 
> without having to modify this manually in several places.
> 
> I'll leave it to Larry to push this as a BUG, he's better positioned 
> than myself.
> 
> cheers
> Tom
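
Tom's second suggestion (a tool to change the address in one step) could be sketched along these lines. This is a naive, illustrative Python sketch that rewrites the three settings named in this bug with a line-based substitution; the function name is invented, and a real tool would use a proper YAML parser rather than regular expressions:

```python
import re
from pathlib import Path

def set_storage_address(yaml_path, new_address):
    """Rewrite the address-bearing settings in cassandra.yaml in place.

    Covers the settings this bug mentions: the seeds list,
    listen_address, and rpc_address. Illustrative only; not the
    actual JBoss ON tooling.
    """
    text = Path(yaml_path).read_text()
    # Replace the value after listen_address: / rpc_address: on their lines.
    text = re.sub(r'(?m)^(listen_address:\s*).*$', r'\g<1>' + new_address, text)
    text = re.sub(r'(?m)^(rpc_address:\s*).*$', r'\g<1>' + new_address, text)
    # Replace the quoted seed entry.
    text = re.sub(r'(?m)(- seeds:\s*)"[^"]*"', r'\g<1>"' + new_address + '"', text)
    Path(yaml_path).write_text(text)
```

With a script like this, moving an installation to a different network (or switching from an IP to a hostname) becomes a single command instead of hand-editing several files.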

Version-Release number of selected component (if applicable):
3.2.0.ER3

Comment 1 John Sanda 2013-11-07 02:40:01 UTC
The storage installer is the culprit. It has a hostname parameter which can be either a hostname or an IP address, but it resolves and uses the IP address even when a hostname is specified. That value is written to the listen_address and rpc_address properties in cassandra.yaml. Originally, the installer did not do this; the change was made for a couple of reasons. First, I have read in several places, such as Cassandra mailing list threads, that it is better to use IP addresses instead of hostnames. Secondly, the initial implementation of the node deployment code (run after a node is imported into inventory) fetched a storage node configuration property that contained all of the node endpoints. Those endpoints are provided as IP addresses, which forced us to do a reverse mapping, an unwelcome change to already complex code.
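
The before/after behavior described here can be sketched as follows. This is a minimal Python illustration, not the actual installer code, and the function and parameter names are invented:

```python
import socket

def storage_address(user_supplied, resolve_to_ip=False):
    """Return the address to write into cassandra.yaml's
    listen_address/rpc_address (property names from this bug; the
    function itself is illustrative, not the real installer API)."""
    if resolve_to_ip:
        # Pre-fix behavior: a hostname collapses to a point-in-time IP,
        # which goes stale when DHCP, VPN, or a network move changes it.
        return socket.gethostbyname(user_supplied)
    # Fixed behavior: keep whatever the user entered, so a hostname is
    # re-resolved via DNS each time Cassandra binds to it.
    return user_supplied
```

The fixed path simply preserves the user-supplied value, so a hostname like `rhel6-vm` survives into cassandra.yaml intact.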

We no longer rely on that configuration property during storage node deployment. As for using hostnames, I have done some investigation and, while I have not found anything conclusive, the main argument I have come across against hostnames, particularly for listen_address, is unreliable (or nonexistent) DNS. I have not, however, seen anything that says Cassandra will break if hostnames are used. In fact, http://wiki.apache.org/cassandra/MultinodeCluster says it is perfectly fine to use hostnames provided you have DNS properly configured. The comments for the listen_address property in cassandra.yaml make similar statements.

I did a test today with a 3-node cluster to verify that things work if hostnames are specified for the listen_address and/or rpc_address properties. Each node's properties were configured with hostnames. I changed the IP address of each machine, restarted the DNS server, restarted each machine, and finally restarted each storage node. The cluster continued to function without error. Cassandra logs messages showing that it updates its token range mappings, for example:

INFO [GossipStage:1] 2013-11-06 17:26:19,186 StorageService.java (line 1422) Nodes /10.16.23.55 and /10.16.23.185 have the same token 915988031550474468.  /10.16.23.55 is the new owner
WARN [GossipStage:1] 2013-11-06 17:26:19,187 TokenMetadata.java (line 197) Token 7267129206972664462 changing ownership from /10.16.23.185 to /10.16.23.55

We can revert the storage installer to its original functionality, which is a minor, low-risk change. We also need to review the server and storage plugin code to see what other changes are required. I will report any additional changes in this BZ.

Comment 2 John Sanda 2013-11-11 03:12:26 UTC
master commit hashes:

be500838
bfa7fc0fe
fbdd630
8c0b13f0a

release/jon3.2.x commit hashes:
80b0ad45ab
3a6ef2bf
c0055022
91b55ece

Comment 3 Simeon Pinder 2013-11-19 15:48:42 UTC
Moving to ON_QA as available for testing with new brew build.

Comment 4 Simeon Pinder 2013-11-22 05:14:09 UTC
Mass moving all of these from ER6 to target milestone ER07 since the ER6 build was bad and QE was halted for the same reason.

Comment 5 Jeeva Kandasamy 2013-12-16 13:57:42 UTC
Moving to verified state, as the hostname is now used:

jon-server-3.2.0.GA/bin/rhq-server.properties
rhq.storage.nodes=rhel6-vm

jon-server-3.2.0.GA/rhq-storage/conf/cassandra.yaml
- seeds: "rhel6-vm"
listen_address: rhel6-vm
rpc_address: rhel6-vm


Version:
JBoss Operations Network
3.2.0.GA
Build Number : dcb8b6f:734bd56
GWT Version : 2.5.0
SmartGWT Version : 3.0p

