Description of problem:
When running the server installer on a second server in an HA deployment, Cassandra schema changes are applied when they should not be. I had a single server running with a 4-node cluster, which means a replication_factor of 3 for the rhq and system_auth keyspaces. Here are the relevant log statements from the second server's installer:

21:39:26,450 INFO [org.rhq.cassandra.schema.TopologyManager] Applying topology updates...
21:39:26,450 INFO [org.rhq.cassandra.schema.TopologyManager] Starting to execute UpdateReplicationFactor task.
21:39:26,462 INFO [org.rhq.cassandra.schema.AbstractManager] Applying update file: org.rhq.cassandra.schema.UpdateFile@1b382d35
21:39:26,463 INFO [org.rhq.cassandra.schema.AbstractManager] Statement: ALTER KEYSPACE rhq WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}
21:39:27,602 INFO [org.rhq.cassandra.schema.AbstractManager] Statement: ALTER KEYSPACE system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}
21:39:28,823 INFO [org.rhq.cassandra.schema.AbstractManager] Applied update file: org.rhq.cassandra.schema.UpdateFile@1b382d35
21:39:28,823 INFO [org.rhq.cassandra.schema.TopologyManager] Updated replication factor from 3 to 2
21:39:28,824 INFO [org.rhq.cassandra.schema.TopologyManager] Successfully executed UpdateReplicationFactor task.

No schema changes should be made. In particular, the replication_factor should not be changed, since it gets updated when nodes are added or removed. Moreover, we only ever reduce the replication_factor when removing a node.

The problem is that we unconditionally call SchemaManager.updateTopology(). The updateTopology method should only be invoked for a new install.
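The guard described above could look roughly like the following. This is only a minimal sketch, not the actual installer code: the SchemaOps interface, the RecordingSchemaOps test double, and the newInstall flag are hypothetical names introduced for illustration, and how the installer decides that it is performing a new install is left as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class TopologyUpdateGuard {

    /** Hypothetical stand-in for the schema manager; not the real RHQ API. */
    public interface SchemaOps {
        void updateTopology();
    }

    /** Test double that records calls instead of talking to Cassandra. */
    public static class RecordingSchemaOps implements SchemaOps {
        public final List<String> calls = new ArrayList<>();

        @Override
        public void updateTopology() {
            calls.add("updateTopology");
        }
    }

    /**
     * Runs the topology update only for a new install. On an additional
     * server in an existing HA deployment nothing is executed, so the
     * replication_factor of the keyspaces is left untouched.
     */
    public static void maybeUpdateTopology(SchemaOps schema, boolean newInstall) {
        if (newInstall) {
            schema.updateTopology();
        }
    }
}
```

With this shape, the second server's installer would pass newInstall = false and no ALTER KEYSPACE statement would be issued.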
I forgot to mention in the initial write-up that, prior to the install, I needed to update the storage properties in rhq-server.properties, namely rhq.storage.nodes. The installer should instead be getting the values for rhq.storage.nodes, rhq.storage.cql-port, and rhq.storage.gossip-port from the database.
Updated the installer code to always load storage cluster information from the database if it was previously persisted.

release/jon3.2.x branch commits:
https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?h=release/jon3.2.x&id=0facc965c95822a6b704015662ad79c15f60d37e
https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?h=release/jon3.2.x&id=285de9c889ffbc7a5ca58dbae159e56b6dbb099b
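For illustration, preferring previously persisted storage cluster settings over the values in rhq-server.properties might be sketched as below. This is not the code from the commits above: the StorageSettingsResolver class and its resolve method are hypothetical, and the persisted settings are modeled as a plain java.util.Properties object rather than an actual database lookup.

```java
import java.util.Properties;

public class StorageSettingsResolver {

    /** The storage settings named in this report. */
    public static final String[] STORAGE_KEYS = {
        "rhq.storage.nodes",
        "rhq.storage.cql-port",
        "rhq.storage.gossip-port"
    };

    /**
     * Returns a copy of the file-based properties in which any storage
     * setting that was previously persisted replaces the file value.
     * A null argument means nothing has been persisted yet (new install).
     */
    public static Properties resolve(Properties fromFile, Properties persisted) {
        Properties result = new Properties();
        result.putAll(fromFile);
        if (persisted != null) {
            for (String key : STORAGE_KEYS) {
                String value = persisted.getProperty(key);
                if (value != null) {
                    result.setProperty(key, value);
                }
            }
        }
        return result;
    }
}
```

Under this assumption, the rhq.storage.nodes value in rhq-server.properties on the second server would no longer need to be edited by hand before running the installer.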
We will keep the RHQ 4.10 release for this BZ since it post-dates the RHQ 4.9 release.

master branch commits:
https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=ff85ff537ec99fa3729c6734491faea74b19d7b1
https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=fa463dd7dbd51e8775155ecbd32c30c94e7204e7
So as they are in 4.10, we can put this one on_qa - thanks for checking.
*** Bug 1032199 has been marked as a duplicate of this bug. ***