Red Hat Bugzilla – Bug 973742
Globally uncaught exception on Storage Node click under Administration/Topology
Last modified: 2015-09-02 20:01:41 EDT
Created attachment 760208
Globally uncaught exception
Description of problem:
A globally uncaught exception is thrown on clicking Storage Node under Administration/Topology when the RHQ storage node is started as root and the JON agent and server are started as a non-root user.
Version-Release number of selected component (if applicable):
build id: 2c9b8df
Steps to Reproduce:
1. Run rhqctl install
2. Stop rhq storage node
3. Start storage as root user
4. Uninventory platform (fully)
Actual results:
After step 3: globally uncaught exception on Storage Node click under Administration/Topology.
After step 4: the RHQ storage node is listed under Administration/Topology without a resource ID.
The RHQ server is not set to maintenance mode (even though it cannot connect to storage).
Expected results:
After step 3: storage node data is removed from the DB.
The RHQ server is put into maintenance mode.
No globally uncaught exception.
This should be fixed before RHQ 4.9 / JON 3.2 beta.
The expectations for the scenario you described above are:
1) No error in the Storage Nodes UI
2) There is no Resource ID in the UI
3) If the storage node process is not stopped, the server will continue to run since the CQL connection is still maintained.
So in this case, it is not possible to manage the storage node from RHQ, but the server should keep running as long as the storage node is running correctly.
Please retest with the new set of expectations.
With respect to inventory, after removing the platform from inventory, I would expect to see it back in inventory because of the auto import functionality that has been introduced. The platform and all its resources are removed from the agent's inventory, and then the agent does a discovery scan. It will rediscover the platform and the storage node and report them to the server at which point both should get auto imported back into inventory. And because the StorageNode entity still exists, no cluster maintenance work will be scheduled.
I agree that we need to handle this gracefully in the UI, and this should be fully addressed with bug 1002174.
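The inventory flow described above can be sketched as a toy model. This is not RHQ code: the function name, the "platform" and "storage_node" labels, and the rule that maintenance would be scheduled only for a storage node with no existing StorageNode entity are assumptions inferred from the comment, not confirmed behavior.

```python
def rediscover_after_uninventory(discovered, storage_node_entities):
    """Toy model of uninventory followed by auto import; not RHQ code.

    discovered: list of (name, kind) pairs reported by the discovery scan,
        where kind is "platform" or "storage_node" (illustrative labels).
    storage_node_entities: names that still have a StorageNode entity.
    Returns (inventory, maintenance_scheduled_for).
    """
    inventory = []  # uninventory removed the platform and its resources
    maintenance_scheduled_for = []
    for name, kind in discovered:
        inventory.append(name)  # auto import puts the resource back
        # Assumption: only a storage node without an existing entity
        # would cause cluster maintenance work to be scheduled.
        if kind == "storage_node" and name not in storage_node_entities:
            maintenance_scheduled_for.append(name)
    return inventory, maintenance_scheduled_for
```

With a platform and one storage node whose entity still exists, the model re-imports both resources and schedules no maintenance, which matches the expectation stated above.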
A non-root agent can see the storage node, and the node is visible in Storage Nodes administration as an available node, but the agent gets an "Error creating path for yaml file" exception when running an operation against the node.
Reproduced the issue described in comment #4 with JON 3.2 ER1; moving back to ASSIGNED.
Can you please clarify comment #4? Which user runs the agent, and which user runs the storage node?
During the install, the same hudson user was running the storage node, agent, and server; after a restart, root is running the storage node.
Running the agent and storage node under different users is NOT a supported use case. The agent and the storage node need to be run by the same user because the agent must be able to update cassandra.yaml. This is a hard requirement for the Storage Node agent plugin.
In your latest test case, if you are still able to navigate to the Storage Admin UI page (even if there are warnings), then the initially reported bug has been fixed.
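One way to check the cassandra.yaml requirement on a given host is to test whether the user who runs the agent can write to the file. The following is a minimal sketch, not part of RHQ; the helper name can_update_config is hypothetical, and the path to cassandra.yaml must be supplied by the caller because it depends on the installation.

```python
import os


def can_update_config(path):
    """Return True if the current user may write to the existing file at path.

    Hypothetical helper, not part of RHQ. It only checks file permissions
    for the user running this script, so run it as the agent's user.
    """
    return os.path.isfile(path) and os.access(path, os.W_OK)
```

Run as the agent's user, a False result for the storage node's cassandra.yaml would be consistent with the "Error creating path for yaml file" failure reported above. Note that root typically passes this check for any existing file.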
Created attachment 823390
Created attachment 823391
An agent started as root can discover a storage node started as non-root: storage nodes are listed under Administration/Topology, and storage resource operations can be run without issues (tried Update Configuration).