Bug 1032308 - Storage node should output a warning from RhqInternodeAuthenticator if node not found
Keywords:
Status: NEW
Alias: None
Product: RHQ Project
Classification: Other
Component: Storage Node
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nobody
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-11-19 23:59 UTC by Elias Ross
Modified: 2022-03-31 04:28 UTC
CC List: 0 users

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments:

Description Elias Ross 2013-11-19 23:59:34 UTC
Description of problem:

I spent a few hours trying to figure out why certain nodes were not connecting and my cluster could not start. It turned out that rhq-storage-auth.conf was missing entries for nodes that it needed.

The file had been maintained through a manual process.

The authenticator should output a WARN message when it fails to authenticate a node, perhaps limited to 2 or 3 occurrences to keep the log from flooding.
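For illustration, a minimal sketch of the requested behavior, assuming RhqInternodeAuthenticator implements Cassandra's IInternodeAuthenticator and checks peers against the addresses loaded from rhq-storage-auth.conf (the constructor, field names, and the cap value below are hypothetical, not the actual RHQ implementation):

import java.net.InetAddress;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.cassandra.auth.IInternodeAuthenticator;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch only - the real RhqInternodeAuthenticator loads its
// allowed addresses from rhq-storage-auth.conf; loading is omitted here.
public class RhqInternodeAuthenticator implements IInternodeAuthenticator {

    private static final Logger log = LoggerFactory.getLogger(RhqInternodeAuthenticator.class);

    // Cap the number of WARN messages so a misconfigured peer cannot flood the log.
    private static final int MAX_WARNINGS = 3;
    private final AtomicInteger warningsLogged = new AtomicInteger();

    // Populated from rhq-storage-auth.conf (loading omitted in this sketch).
    private final Set<InetAddress> allowedAddresses;

    public RhqInternodeAuthenticator(Set<InetAddress> allowedAddresses) {
        this.allowedAddresses = allowedAddresses;
    }

    @Override
    public boolean authenticate(InetAddress remoteAddress, int remotePort) {
        if (allowedAddresses.contains(remoteAddress)) {
            return true;
        }
        // Only pay for the atomic increment on the rare failure path.
        if (warningsLogged.incrementAndGet() <= MAX_WARNINGS) {
            log.warn("Rejecting internode connection from {}: address is not listed in rhq-storage-auth.conf",
                remoteAddress);
        }
        return false;
    }

    @Override
    public void validateConfiguration() {
        // No extra validation in this sketch.
    }
}

The successful path stays a single set lookup, so the warning logic adds no cost when the configuration is correct.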


Version-Release number of selected component (if applicable): 4.9


How reproducible: Always


Steps to Reproduce:
1. Remove a host from rhq-storage-auth.conf
2. Attempt to start the cluster

Actual results: Cluster can't start

Expected results: The cluster still fails to start, but a log message explains why.

Additional info:

Comment 1 John Sanda 2013-11-20 21:31:17 UTC
RHQ updates the rhq-storage-auth.conf file when nodes are added to or removed from the cluster. The only time a user directly edits the file is when multiple storage nodes are deployed before the RHQ server is installed. That said, it is entirely possible for the file to be incorrect.

We could certainly look at adding some logging, but we would want to keep it light and fast, since the authenticator executes at the bottom of the C* stack in the messaging layer. More importantly, though, we need a comprehensive solution for when new nodes fail to join the cluster or when existing nodes cannot communicate with it. The cluster status column in the storage node UI already covers deployment scenarios: if a node's cluster status is DOWN, it can be assumed that the node is not part of the cluster. It does not, however, address post-deployment scenarios.
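As a rough sketch of keeping the overhead low, a small helper along these lines (hypothetical names, not existing RHQ code) warns at most once per rejected address and bounds how many addresses it remembers; the authenticator's failure branch would simply call warnOnce(remoteAddress) and the success path would be untouched:

import java.net.InetAddress;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical helper, not RHQ code: remembers a bounded number of rejected
// addresses so each one is warned about only once.
public final class RejectionLogger {

    private static final Logger log = LoggerFactory.getLogger(RejectionLogger.class);

    // Hard cap on how many distinct addresses are tracked, to bound memory use.
    private static final int MAX_TRACKED_ADDRESSES = 16;

    private final Set<InetAddress> alreadyWarned =
        Collections.newSetFromMap(new ConcurrentHashMap<InetAddress, Boolean>());

    public void warnOnce(InetAddress remoteAddress) {
        if (alreadyWarned.size() < MAX_TRACKED_ADDRESSES && alreadyWarned.add(remoteAddress)) {
            log.warn("Internode authentication failed for {}; check rhq-storage-auth.conf on this node",
                remoteAddress);
        }
    }
}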

