Bug 1015835 - HA installation breaks storage schema
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Installer
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ER05
Target Release: RHQ 4.10
Assigned To: Stefan Negrea
QA Contact: Mike Foley
Duplicates: 1032199
Depends On:
Blocks: 1012435 1067644
 
Reported: 2013-10-05 22:05 EDT by John Sanda
Modified: 2014-05-06 21:06 EDT
CC: 5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1067644
Environment:
Last Closed: 2014-03-25 17:00:20 EDT
Type: Bug
Regression: ---


Attachments: None
Description John Sanda 2013-10-05 22:05:05 EDT
Description of problem:
When running the server installer on a second server in an HA deployment, Cassandra schema changes are applied when they should not be. I had a single server running with a 4 node cluster, which means a replication_factor of 3 for the rhq and system_auth keyspaces. Here are the relevant log statements from the second server installer:

21:39:26,450 INFO  [org.rhq.cassandra.schema.TopologyManager] Applying topology updates...
21:39:26,450 INFO  [org.rhq.cassandra.schema.TopologyManager] Starting to execute UpdateReplicationFactor task.
21:39:26,462 INFO  [org.rhq.cassandra.schema.AbstractManager] Applying update file: org.rhq.cassandra.schema.UpdateFile@1b382d35
21:39:26,463 INFO  [org.rhq.cassandra.schema.AbstractManager] Statement:

    ALTER KEYSPACE rhq WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}

21:39:27,602 INFO  [org.rhq.cassandra.schema.AbstractManager] Statement:

    ALTER KEYSPACE system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}

21:39:28,823 INFO  [org.rhq.cassandra.schema.AbstractManager] Applied update file: org.rhq.cassandra.schema.UpdateFile@1b382d35
21:39:28,823 INFO  [org.rhq.cassandra.schema.TopologyManager] Updated replication factor from 3 to  2
21:39:28,824 INFO  [org.rhq.cassandra.schema.TopologyManager] Successfully executed UpdateReplicationFactor task.

No schema changes should be made here. In particular, the replication_factor should not be changed, since it only gets updated when nodes are added or removed; moreover, we only ever reduce the replication_factor when removing a node.

The problem is that we unconditionally call SchemaManager.updateTopology(). The updateTopology method should only be invoked for a new install.
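
A minimal sketch of the intended guard, in Java. SchemaManager.updateTopology() is the call named above; the surrounding class, the isNewSchemaInstall flag, and the comments are assumptions for illustration, not the actual installer code:

    // Sketch only: run topology/schema updates solely for a brand new install.
    public class StorageSchemaInstallStep {

        private final SchemaManager schemaManager;

        public StorageSchemaInstallStep(SchemaManager schemaManager) {
            this.schemaManager = schemaManager;
        }

        public void execute(boolean isNewSchemaInstall) {
            if (isNewSchemaInstall) {
                // First server in the deployment: create the schema and set the
                // initial topology, including the replication_factor.
                schemaManager.updateTopology();
            }
            // Additional HA servers: leave the schema alone. The replication_factor
            // is only adjusted when storage nodes are added or removed.
        }
    }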

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Comment 1 John Sanda 2013-10-06 07:27:07 EDT
I forgot to mention in the initial write-up that, prior to the install, I needed to update the storage properties in rhq-server.properties, namely rhq.storage.nodes. The installer should be getting the values for rhq.storage.nodes, rhq.storage.cql-port, and rhq.storage.gossip-port from the database.
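
For illustration, the properties involved look like the following in rhq-server.properties. The property names are the ones listed above; the host names and port values are example values only, not taken from this bug:

    # rhq-server.properties - storage cluster settings (example values)
    rhq.storage.nodes=storage-node1.example.com,storage-node2.example.com
    rhq.storage.cql-port=9142
    rhq.storage.gossip-port=7100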
Comment 2 Stefan Negrea 2013-10-16 16:48:12 EDT
Updated the installer code to always load the storage cluster information from the database if it was previously persisted.


release/jon3.2.x branch commits:
https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?h=release/jon3.2.x&id=0facc965c95822a6b704015662ad79c15f60d37e

https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?h=release/jon3.2.x&id=285de9c889ffbc7a5ca58dbae159e56b6dbb099b
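
A rough sketch of that approach, with hypothetical names. StorageClusterSettingsDao, StorageClusterSettings, and their methods are illustrative assumptions, not the classes used in the linked commits:

    // Sketch only: prefer the storage cluster settings persisted in the RHQ
    // database over the values found in rhq-server.properties.
    public StorageClusterSettings resolveStorageSettings(StorageClusterSettingsDao dao,
        StorageClusterSettings fromProperties) {

        StorageClusterSettings persisted = dao.findClusterSettings();
        if (persisted != null) {
            // Existing deployment: the database is the source of truth for
            // rhq.storage.nodes, rhq.storage.cql-port and rhq.storage.gossip-port.
            return persisted;
        }
        // First install: nothing persisted yet, fall back to the properties file.
        return fromProperties;
    }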
Comment 11 Stefan Negrea 2014-02-20 15:16:13 EST
We will keep the RHQ 4.10 release target for this BZ since the fix post-dates the RHQ 4.9 release.


master branch commits:
https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=ff85ff537ec99fa3729c6734491faea74b19d7b1

https://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=fa463dd7dbd51e8775155ecbd32c30c94e7204e7
Comment 12 Heiko W. Rupp 2014-02-21 04:22:00 EST
Since the commits are in 4.10, we can move this one to ON_QA - thanks for checking.
Comment 13 Elias Ross 2014-05-06 21:06:54 EDT
*** Bug 1032199 has been marked as a duplicate of this bug. ***
