Bug 973742

Summary: Globally uncaught exception on Storage Node click under Administration/Topology
Product: [JBoss] JBoss Operations Network
Reporter: Armine Hovsepyan <ahovsepy>
Component: Core Server
Assignee: Stefan Negrea <snegrea>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Priority: high
Version: JON 3.2
CC: ahovsepy, hrupp, jsanda, mfoley, snegrea
Target Milestone: ER01
Target Release: JON 3.2.0
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2014-01-02 20:34:02 UTC
Type: Bug
Bug Blocks: 951619    
Attachments:
  Globally uncaught exception (flags: none)
  root_agent_hudson_storage-admin-topology (flags: none)
  storage-update-config (flags: none)

Description Armine Hovsepyan 2013-06-12 15:21:52 UTC
Created attachment 760208 [details]
Globally uncaught exception

Description of problem:
Globally uncaught exception on Storage Node click under Administration/Topology when the RHQ storage node is started as root and the JON agent/server are started as a non-root user.

Version-Release number of selected component (if applicable):
build id: 2c9b8df

How reproducible:
always

Steps to Reproduce:
1. Run rhqctl install
2. Stop the RHQ storage node
3. Start the storage node as the root user
4. Fully uninventory the platform (see the command sketch after this list)
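
A minimal command sketch of the reproduction, assuming a default install with rhqctl on the PATH and the install run as a non-root user; step 4 is done from the UI, so it only appears here as a note:

  rhqctl install                 # step 1: install server, storage node, and agent
  rhqctl stop --storage          # step 2: stop only the storage node
  sudo rhqctl start --storage    # step 3: restart the storage node as root
  # step 4: fully uninventory the platform from the UI
  # (Inventory > Platforms > select the platform > Uninventory)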


Actual results:
After step 3: a globally uncaught exception on Storage Node click under Administration/Topology.
After step 4: the RHQ storage node is listed under Administration/Topology without a resource ID.
The RHQ server is not set to maintenance mode (cannot connect to storage).

Expected results:

After step 3: the storage node data is removed from the DB,
the RHQ server is put into maintenance mode,
and no globally uncaught exception is thrown.


Additional info:
screenshot attached

Comment 1 Heiko W. Rupp 2013-08-24 17:49:44 UTC
That should be fixed before RHQ 4.9 / 3.2 beta.

Comment 2 Stefan Negrea 2013-08-26 21:34:28 UTC
The expectations for the scenario you described above are:
1) No error in the Storage Nodes UI
2) There is no Resource ID in the UI
3) If the storage node process is not stopped, the server will continue to run since the CQL connection is still maintained. 

So in this case, it is not possible to manage the storage node from RHQ, but the server should keep running as long as the storage node is running correctly.


Please retest with the new set of expectations.
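
A quick way to check the third expectation, i.e. that the storage node still accepts CQL connections even though it cannot be managed from RHQ. This is a sketch: the host and port are assumptions, so check native_transport_port in the node's cassandra.yaml, and add -u/-p if authentication is enabled:

  # connect to the storage node's CQL port (9142 is a guess at the
  # RHQ default; a stock Cassandra would listen on 9042)
  cqlsh 127.0.0.1 9142

If the cqlsh prompt comes up, the server's CQL connection can still be served.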

Comment 3 John Sanda 2013-08-29 13:58:52 UTC
With respect to inventory, after removing the platform from inventory, I would expect to see it back in inventory because of the auto-import functionality that has been introduced. The platform and all its resources are removed from the agent's inventory, and then the agent does a discovery scan. It will rediscover the platform and the storage node and report them to the server, at which point both should get auto-imported back into inventory. And because the StorageNode entity still exists, no cluster maintenance work will be scheduled.

I agree that we need to handle this gracefully in the UI, and this should be fully addressed with bug 1002174.
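
To watch the auto-import happen without waiting for the next scheduled scan, the discovery scan can be triggered from the agent's interactive prompt (a sketch; the script location is the usual default, and prompt command flags vary by version):

  # at the agent prompt started via <agent-install>/bin/rhq-agent.sh
  > discovery
  # then check the UI: the platform and the storage node should be
  # reported to the server and auto-imported back into inventory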

Comment 4 Armine Hovsepyan 2013-08-30 13:10:38 UTC
reassigning bug

The non-root agent can see the storage node, and the node is visible in Storage Nodes administration as an available node, but the agent gets an "Error creating path for yaml file" exception while running an operation against the node.
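
The "Error creating path for yaml file" exception points at file ownership: once the storage node has run as root, its configuration can end up root-owned and no longer writable by the non-root agent. A sketch of how to check (the cassandra.yaml path is an assumption; adjust it to the actual storage install directory):

  # which users own the agent and storage processes?
  ps aux | grep -i rhq

  # is cassandra.yaml still writable by the agent's user?
  ls -l <rhq-install>/rhq-storage/conf/cassandra.yaml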

Comment 6 Armine Hovsepyan 2013-09-20 15:00:22 UTC
Reproduced the issue described in comment #4 with JON 3.2 ER1; moving back to ASSIGNED.

Comment 7 Stefan Negrea 2013-11-11 15:55:41 UTC
Can you please clarify comment #4? What user runs the agent, and what user runs the storage node?

Comment 8 Armine Hovsepyan 2013-11-12 08:33:06 UTC
During the install, the same hudson user was running the storage node, agent, and server; after a restart, root is running the storage node.

Comment 9 Stefan Negrea 2013-11-12 15:43:31 UTC
Running the agent and storage node under different users is NOT a supported use case. The agent and the storage node need to be run by the same user because the agent must be able to update cassandra.yaml. This is a hard requirement for the Storage Node agent plugin.
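
If the storage node was accidentally started as root, a sketch of the way back to the supported same-user setup (the paths are placeholders, and the user name hudson is taken from this report; the data directories to re-own are the ones named in cassandra.yaml):

  sudo rhqctl stop --storage
  # return ownership of the storage install and its data to the agent user
  # (also re-own data_file_directories, commitlog_directory, and
  # saved_caches_directory from cassandra.yaml if they live elsewhere)
  sudo chown -R hudson:hudson <rhq-install>/rhq-storage
  rhqctl start --storage    # run this as hudson, not root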

In your latest test case, if you are still able to navigate to the Storage Admin UI page (even if there are warnings), then the initially reported bug has been fixed.

Comment 10 Armine Hovsepyan 2013-11-13 12:30:56 UTC
Created attachment 823390 [details]
root_agent_hudson_storage-admin-topology

Comment 11 Armine Hovsepyan 2013-11-13 12:37:12 UTC
Created attachment 823391 [details]
storage-update-config

Comment 12 Armine Hovsepyan 2013-11-13 12:48:16 UTC
Verified.

An agent started as root can discover a storage node started as non-root: under Administration/Topology the storage nodes are listed, and storage resource operations can be run without issues (tried Update Configuration).

Screenshots attached.