Bug 1083894

Summary: Storage node's cluster status "NORMAL" written in RED cannot be changed
Product: [JBoss] JBoss Operations Network
Reporter: bkramer <bkramer>
Component: UI
Assignee: Jirka Kremser <jkremser>
Status: CLOSED CURRENTRELEASE
QA Contact: Armine Hovsepyan <ahovsepy>
Severity: high
Priority: high
Version: JON 3.2
CC: ahovsepy, jkremser, jsanda, jshaughn, loleary, mfoley
Target Milestone: CR01
Target Release: JON 3.3.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Doc Text:
If a job ran on a storage node timed out but finished properly, the node's cluster status was displayed as NORMAL but written in a red font, which was atypical of this UX pattern. While the red text correctly indicated an underlying issue with the job, there was no way to clear the warning. The fix adds functionality to the storage node details page that provides more information about the issue and allows the user to re-run the operation if circumstances require it, or to acknowledge the error in order to remove the red font color.
Last Closed: 2014-12-11 14:03:24 UTC
Type: Bug
Bug Blocks: 1083895    
Attachments: node_status (flags: none)

Description bkramer 2014-04-03 08:18:48 UTC
Description of problem:
If an operation run on a storage node times out (but actually finishes properly), the node's cluster status will be NORMAL but written in a red font. Currently there appears to be no way to acknowledge this and change the colour back to black, short of modifying the database directly.
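
For reference, a minimal sketch of that manual workaround; the table name rhq_storage_node comes from this bug, but the column names error_msg and failed_operation_id are assumptions, so verify against the actual schema before running anything like this:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Hypothetical workaround: clear the failure markers directly in the RHQ
// database so the UI stops rendering the status in red. Column names are
// guesses; check the rhq_storage_node schema first.
public class ClearStorageNodeFailure {
    public static void main(String[] args) throws Exception {
        String nodeAddress = args[0]; // address of the affected storage node
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/rhq", "rhqadmin", "rhqadmin");
             PreparedStatement ps = conn.prepareStatement(
                 "UPDATE rhq_storage_node"
                 + " SET error_msg = NULL, failed_operation_id = NULL"
                 + " WHERE address = ?")) {
            ps.setString(1, nodeAddress);
            System.out.println(ps.executeUpdate() + " row(s) updated");
        }
    }
}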

Version-Release number of selected component (if applicable):
JON 3.2.0

How reproducible:
Always

Steps to Reproduce:
1. Install a JON server with two storage nodes;
2. Execute the "Repair" operation on a storage node and make it time out;
3. Check the operation history for that node; the operation will be marked as "failed";
4. Check the log file and confirm that the "repair" actually finished properly.

Actual results:
The cluster status of the storage node will be NORMAL in red.

Expected results:
The cluster status of the storage node may still be shown as NORMAL in red, but there should be a way to acknowledge that the ERROR/WARN message has been read and change the font colour back to black.

Additional info:

Comment 1 Jay Shaughnessy 2014-09-08 15:59:22 UTC
I'm not exactly sure what this is about; perhaps an issue with the storage node hook in OperationManager. Asking Jirka to take a closer look for ER04.

Comment 2 John Sanda 2014-09-24 15:09:12 UTC
Jay is correct in that this does involve the hook in OperationManagerBean. Any time a resource operation is updated/finished, OperationManagerBean gets called, and if it is a storage node operation, StorageNodeOperationsHandlerBean gets called. That hook was intended for cluster maintenance workflows like adding/removing nodes and repair. If a user submits a storage resource operation that fails, for example, the StorageNode.failedOperation and StorageNode.errorMessage fields get set, which in turn causes the problem reported.

I discussed with Jirka changes he is working on that allow the user to manually clear those StorageNode fields on the details view in the storage node admin UI. I think/hope that adequately addresses the problem.
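
For illustration, a simplified, self-contained sketch of that flow. The class names and the two StorageNode fields come from this bug; the method signatures and helper types below are invented for readability, not the actual RHQ API:

// Illustrative sketch only; signatures and helper types are made up.
enum OperationRequestStatus { SUCCESS, FAILURE }

class ResourceOperationHistory {
    OperationRequestStatus status;
    String errorMessage;
    boolean onStorageNodeResource;
}

class StorageNode {
    // A non-null failedOperation/errorMessage is what makes the admin UI
    // render the cluster status in red.
    ResourceOperationHistory failedOperation;
    String errorMessage;
}

class StorageNodeOperationsHandlerBean {
    void handleOperationUpdate(ResourceOperationHistory history, StorageNode node) {
        if (history.status == OperationRequestStatus.FAILURE) {
            // Any failure, including a timeout on an operation that in fact
            // completed, stamps the node, and nothing ever clears the fields.
            node.failedOperation = history;
            node.errorMessage = history.errorMessage;
        }
    }
}

class OperationManagerBean {
    private final StorageNodeOperationsHandlerBean handler =
            new StorageNodeOperationsHandlerBean();

    // Called every time a resource operation is updated or finishes.
    void operationHistoryUpdated(ResourceOperationHistory history, StorageNode node) {
        if (history.onStorageNodeResource) {
            handler.handleOperationUpdate(history, node);
        }
    }
}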

Comment 3 Jirka Kremser 2014-09-25 14:45:13 UTC
branch:  master
link:    https://github.com/rhq-project/rhq/commit/ed4714ee8
time:    2014-09-25 16:43:51 +0200
commit:  ed4714ee81a3b59e2350523cea16e17fc7bc0084
author:  Jirka Kremser - jkremser
message: [BZ 1083894] - Storage node's cluster status "NORMAL" written in RED
         cannot be changed - Adding a way to our UI to explicitly reset
         error message and last failed operation on the rhq_storage_node
         table.

+ 1 i-test
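
A rough sketch of what such a reset could look like on the server side. The rhq_storage_node table and the intent come from the commit message, and the entity field names follow comment 2, but the method and JPQL below are guesses, not the actual code from ed4714ee8:

import javax.persistence.EntityManager;

// Hypothetical reset mirroring the commit's intent: null out the error
// message and last failed operation for one storage node.
class StorageNodeErrorResetter {
    private final EntityManager entityManager;

    StorageNodeErrorResetter(EntityManager entityManager) {
        this.entityManager = entityManager;
    }

    int resetErrorState(int storageNodeId) {
        return entityManager.createQuery(
                "UPDATE StorageNode n"
                + " SET n.errorMessage = NULL, n.failedOperation = NULL"
                + " WHERE n.id = :id")
            .setParameter("id", storageNodeId)
            .executeUpdate();
    }
}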

Comment 4 Jirka Kremser 2014-09-25 14:59:29 UTC
branch:  release/jon3.3.x
link:    https://github.com/rhq-project/rhq/commit/5880b8b72
time:    2014-09-25 16:58:33 +0200
commit:  5880b8b72b41950e640f33aa6ddb858f795d92fe
author:  Jirka Kremser - jkremser
message: [BZ 1083894] - Storage node's cluster status "NORMAL" written in RED
         cannot be changed - Adding a way to our UI to explicitly reset
         error message and last failed operation on the rhq_storage_node
         table.

         (cherry picked from commit
         ed4714ee81a3b59e2350523cea16e17fc7bc0084) Signed-off-by: Jirka
         Kremser <jkremser>

Comment 5 Simeon Pinder 2014-10-01 21:33:40 UTC
Moving to ON_QA as available for test with build:
https://brewweb.devel.redhat.com/buildinfo?buildID=388959

Comment 7 Armine Hovsepyan 2014-10-13 17:02:17 UTC
The issue is still visible in JON 3.3 ER04.
Screenshot attached.

reproduction steps:
1. Install JON (server, storage & agent) on IP1 (Windows 2008)
2. Start the server and storage; the agent is not connected to the server, so the storage node stays in INSTALLED status
3. Install the JON storage node and agent on IP2 and connect them to the server on IP1, while the storage node on IP1 is still not installed
4. Start and connect the agent on IP1; the storage node on IP2 is now in DOWN state
5. Run "deploy node" on IP2

After step 5, the node becomes available (with status NORMAL) but is shown in red and bold.

Expected: after step 5, both nodes should be available and have a NORMAL (black) status.

Comment 8 Armine Hovsepyan 2014-10-13 17:05:19 UTC
Created attachment 946491 [details]
node_status

Comment 9 Jirka Kremser 2014-10-16 11:40:14 UTC
It's working as expected. The red text denotes that there was a failure (either an error message was set or an operation didn't run successfully). If you open the storage node details page, there should be more information about what went wrong, and the user can acknowledge the notification / re-run the operation / consult the docs / whatever.

Note that I didn't address the root cause of the problem; I've only added a way to acknowledge the error in the UI, as described under "Expected results:" and in John's comment 2.

Comment 10 Armine Hovsepyan 2014-10-20 09:03:12 UTC
Marking as verified based on comment 9.
thank you!