Bug 973742 - Globally uncaught exception on Storage Node click under Administration/Topology
Status: CLOSED CURRENTRELEASE
Product: JBoss Operations Network
Classification: JBoss
Component: Core Server
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ER01
Target Release: JON 3.2.0
Assigned To: Stefan Negrea
QA Contact: Mike Foley
Depends On:
Blocks: 951619
Reported: 2013-06-12 11:21 EDT by Armine Hovsepyan
Modified: 2015-09-02 20:01 EDT

Doc Type: Bug Fix
Last Closed: 2014-01-02 15:34:02 EST
Type: Bug


Attachments
Globally uncaught exception (529.18 KB, image/png)
2013-06-12 11:21 EDT, Armine Hovsepyan
root_agent_hudson_storage-admin-topology (336.65 KB, image/png)
2013-11-13 07:30 EST, Armine Hovsepyan
storage-update-config (609.55 KB, image/png)
2013-11-13 07:37 EST, Armine Hovsepyan

Description Armine Hovsepyan 2013-06-12 11:21:52 EDT
Created attachment 760208
Globally uncaught exception

Description of problem:
A globally uncaught exception occurs when clicking Storage Node under Administration/Topology if the RHQ storage node was started as root while the JON agent and server were started as a non-root user.

Version-Release number of selected component (if applicable):
build id: 2c9b8df

How reproducible:
always

Steps to Reproduce:
1. Run rhqctl install
2. Stop rhq storage node
3. Start storage as root user
4. Uninventory platform (fully)


Actual results:
After step 3: globally uncaught exception on Storage Node click under Administration/Topology.
After step 4: the RHQ storage node is listed under Administration/Topology without a resource ID.
The RHQ server is not set to maintenance mode (cannot connect to storage).

Expected results:

After step 3: storage node data is removed from the DB,
the RHQ server is put into maintenance mode,
and there is no globally uncaught exception.


Additional info:
screenshot attached
Comment 1 Heiko W. Rupp 2013-08-24 13:49:44 EDT
That should be fixed before RHQ 4.9 / 3.2 beta.
Comment 2 Stefan Negrea 2013-08-26 17:34:28 EDT
The expectations for the scenario you described above are:
1) No error in the Storage Nodes UI
2) There is no Resource ID in the UI
3) If the storage node process is not stopped, the server will continue to run since the CQL connection is still maintained. 

So in this case, it is not possible to manage the storage node from RHQ, but the server should keep running as long as the storage node is running correctly.


Please retest with the new set of expectations.
Comment 3 John Sanda 2013-08-29 09:58:52 EDT
With respect to inventory, after removing the platform from inventory, I would expect to see it back in inventory because of the auto import functionality that has been introduced. The platform and all its resources are removed from the agent's inventory, and then the agent does a discovery scan. It will rediscover the platform and the storage node and report them to the server at which point both should get auto imported back into inventory. And because the StorageNode entity still exists, no cluster maintenance work will be scheduled.

I agree that we need to handle this gracefully in the UI, and this should be fully addressed with bug 1002174.
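
The flow described in this comment can be sketched as a toy model. This is only an illustration under stated assumptions: `ToyServer`, `report_discovered`, `uninventory_and_rediscover`, and the resource name strings are all hypothetical and are not RHQ classes or APIs. The rule that maintenance is scheduled only for a storage node with no existing entity is an assumption inferred from the sentence above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical stand-in for the server side; not RHQ's real classes."""
    inventory: set = field(default_factory=set)
    storage_node_entities: set = field(default_factory=set)
    scheduled_maintenance: list = field(default_factory=list)

    def report_discovered(self, resources):
        """Auto-import every reported resource back into inventory."""
        for res in resources:
            self.inventory.add(res)
            # Assumption: maintenance is scheduled only when a storage node
            # is reported and no StorageNode entity exists for it yet.
            is_storage = res.startswith("storage-node:")
            if is_storage and res not in self.storage_node_entities:
                self.storage_node_entities.add(res)
                self.scheduled_maintenance.append(res)


def uninventory_and_rediscover(server, agent_inventory, discoverable):
    """Remove the platform everywhere, then let the agent rediscover it."""
    server.inventory.clear()               # platform and children leave the server inventory
    agent_inventory.clear()                # and the agent's own inventory
    agent_inventory.update(discoverable)   # discovery scan finds them again
    server.report_discovered(agent_inventory)


if __name__ == "__main__":
    resources = {"platform:host1", "storage-node:host1"}
    server = ToyServer(inventory=set(resources),
                       storage_node_entities={"storage-node:host1"})
    uninventory_and_rediscover(server, set(resources), resources)
    print(sorted(server.inventory), server.scheduled_maintenance)
```

In this model the storage node entity survives the uninventory step, so the re-imported node does not trigger any maintenance work.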
Comment 4 Armine Hovsepyan 2013-08-30 09:10:38 EDT
reassigning bug

A non-root agent can see the storage node, and the node is visible in the storage nodes administration as an available node, but the agent gets an "Error creating path for yaml file" exception while running an operation against the node.
Comment 6 Armine Hovsepyan 2013-09-20 11:00:22 EDT
Reproduced the issue described in comment #4 with JON 3.2 ER1; moving back to assigned.
Comment 7 Stefan Negrea 2013-11-11 10:55:41 EST
Can you please clarify comment #4? What user runs that agent and what user runs the storage node?
Comment 8 Armine Hovsepyan 2013-11-12 03:33:06 EST
During the install, the same hudson user was running the storage node, agent and server. After a restart, root is running the storage node.
Comment 9 Stefan Negrea 2013-11-12 10:43:31 EST
Running the agent and storage node under different users is NOT a supported use case. The agent and the storage node need to be run by the same user because the agent must be able to update cassandra.yaml. This is a hard requirement for the Storage Node agent plugin.

In your latest test case, if you are still able to navigate to the Storage Admin UI page (even if there are warnings) then the initial reported bug has been fixed.
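
The requirement above comes down to file ownership and write permission on cassandra.yaml. A minimal sketch of such a check on a Unix host follows, assuming it runs as the same user as the agent. The function name `can_update_config` is hypothetical and the check is not part of RHQ; the demo uses a temporary file as a stand-in because the real location of cassandra.yaml depends on the install.

```python
import os
import tempfile


def can_update_config(path):
    """Report whether the user running this process may rewrite `path`.

    Returns a (same_owner, writable) tuple:
      same_owner: the file is owned by the real user of this process
      writable:   os.access() grants write permission (checked against
                  the real uid/gid, as os.access does by default)
    Unix only: os.getuid() is not available on Windows.
    """
    st = os.stat(path)
    same_owner = st.st_uid == os.getuid()
    writable = os.access(path, os.W_OK)
    return same_owner, writable


if __name__ == "__main__":
    # Stand-in for cassandra.yaml, created by the current user.
    with tempfile.NamedTemporaryFile(suffix=".yaml", delete=False) as f:
        f.write(b"cluster_name: example\n")
        demo_path = f.name
    try:
        print(can_update_config(demo_path))
    finally:
        os.unlink(demo_path)
```

Note that a process running as root normally passes the writability check regardless of who owns the file, so the two values can differ.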
Comment 10 Armine Hovsepyan 2013-11-13 07:30:56 EST
Created attachment 823390
root_agent_hudson_storage-admin-topology
Comment 11 Armine Hovsepyan 2013-11-13 07:37:12 EST
Created attachment 823391
storage-update-config
Comment 12 Armine Hovsepyan 2013-11-13 07:48:16 EST
Verified.

An agent started as root can discover a storage node started as non-root. Under Administration/Topology the storage nodes are listed, and storage resource operations can be run without issues (tried update configuration).

Screenshots attached.
