Red Hat Bugzilla – Bug 1032308
Storage node should output a warning from RhqInternodeAuthenticator if node not found
Last modified: 2013-11-20 16:31:17 EST
Description of problem:
I spent a few hours trying to figure out why certain nodes were not connecting and my cluster could not start. It turned out that rhq-storage-auth.conf was missing nodes that I needed.
This was through a manual process.
The authenticator should output a WARN message if it fails to authenticate, perhaps at most 2 or 3 times, to keep the log from flooding.
Version-Release number of selected component (if applicable): 4.9
How reproducible: Always
Steps to Reproduce:
1. Remove a host from rhq-storage-auth.conf
2. Attempt to start the cluster
Actual results: Cluster can't start
Expected results: The cluster still does not start, but a log message explains why.
RHQ updates the rhq-storage-auth.conf file when nodes are added to or removed from the cluster. The only time a user edits the file directly is when multiple storage nodes are deployed before the RHQ server is installed. With that said, it is entirely possible for the file to be incorrect.
We could certainly look into adding some logging, but we would want to keep it light and fast, since the authenticator executes at the bottom of the C* stack, in the messaging layer. More importantly, though, we need a comprehensive solution for when new nodes fail to join the cluster or when existing nodes cannot communicate with the cluster. The cluster status column in the storage node UI already addresses deployment scenarios. If a node's cluster status is DOWN, then it can be assumed that the node is not part of the cluster. It does not, however, address post-deployment scenarios.
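For illustration only, here is a minimal sketch of what a capped warning could look like, assuming the authenticator holds the set of allowed addresses in memory. The class name CappedAuthWarning, its methods, the limit of 3, and the use of java.util.logging are all assumptions made for this example; this is not the actual RhqInternodeAuthenticator code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Logger;

/** Hypothetical sketch; names and limits are illustrative, not RHQ's real code. */
public class CappedAuthWarning {
    private static final Logger LOG = Logger.getLogger(CappedAuthWarning.class.getName());

    // Assumed cap, following the "at most 2 or 3 times" suggestion in the report.
    static final int MAX_WARNINGS_PER_NODE = 3;

    private final Set<String> allowedAddresses;
    private final ConcurrentHashMap<String, AtomicInteger> warnCounts = new ConcurrentHashMap<>();

    public CappedAuthWarning(Set<String> allowedAddresses) {
        this.allowedAddresses = allowedAddresses;
    }

    /** Returns true if the address is allowed; otherwise warns (up to the cap) and returns false. */
    public boolean authenticate(String remoteAddress) {
        if (allowedAddresses.contains(remoteAddress)) {
            return true; // common path: no logging, no counter access
        }
        if (shouldWarn(remoteAddress)) {
            LOG.warning("Rejected connection from " + remoteAddress
                    + ": address not found in rhq-storage-auth.conf");
        }
        return false;
    }

    /** True only for the first MAX_WARNINGS_PER_NODE rejections of a given address. */
    boolean shouldWarn(String remoteAddress) {
        AtomicInteger count = warnCounts.computeIfAbsent(remoteAddress, k -> new AtomicInteger());
        if (count.get() >= MAX_WARNINGS_PER_NODE) {
            return false; // cap reached: a plain read, so the counter never overflows
        }
        return count.incrementAndGet() <= MAX_WARNINGS_PER_NODE;
    }

    /** Number of warnings emitted so far for the given address. */
    public int warningsLogged(String remoteAddress) {
        AtomicInteger count = warnCounts.get(remoteAddress);
        return count == null ? 0 : Math.min(count.get(), MAX_WARNINGS_PER_NODE);
    }
}
```

Once the cap is reached, a rejected connection costs one set lookup, one map lookup, and one integer read, which should keep the overhead in the messaging layer small. This only covers the logging request and does not address the broader question of detecting nodes that cannot communicate with the cluster after deployment.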