Bug 1133611

Summary: No health check alert is issued in ui after first message was issued
Product: [Retired] oVirt Reporter: sefi litmanovich <slitmano>
Component: ovirt-engine-core Assignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE QA Contact: sefi litmanovich <slitmano>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.5 CC: ecohen, gklein, iheim, oourfali, rbalakri, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: ovirt-3.5.0_rc2 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-10-17 12:40:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine log none

Description sefi litmanovich 2014-08-25 14:45:53 UTC
Created attachment 930541
engine log

Description of problem:

When the power management health check is enabled and a power management agent with invalid credentials is added to a host, a warning is issued the first time the check fails. After that first warning, no further warning is issued for the same agent, even when it fails again.

Version-Release number of selected component (if applicable):

ovirt-engine-3.5.0-0.0.master.20140821064931.gitb794d66.el6.noarch

How reproducible:

always

Steps to Reproduce:

1. With engine-config, set PMHealthCheckEnabled=true and PMHealthCheckIntervalInSec=20.
2. Restart the engine.
3. Add two hosts, one of them with a PM agent.
4. Configure the PM agent with invalid credentials (wrong username/password/port).
5. Wait for the health check alert to be issued.
6.a. Configure the PM agent with the correct credentials and wait until a check completes successfully.
7.a. Configure the PM agent with invalid credentials again (wrong username/password/port).
6.b. Alternatively, remove the alert from the alerts tab.

Actual results:

In either case (6.a-7.a or 6.b), a message about the issued STATUS command appears every 20 seconds, but no new alert is issued for the health check failure, although the failure does appear in engine.log.


Expected results:

After a health check alert is removed, either manually or by correcting the initial problem, a new alert should be issued if the same power management agent becomes faulty again.
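
The expected behavior can be sketched as a small model in Python. This is only an illustration of the alerting rule described above, not oVirt engine code: the class HealthCheckAlerts, its methods, and the agent name "agent-1" are hypothetical, and the sketch assumes that a failure is not re-alerted while its alert is still open.

```python
class HealthCheckAlerts:
    """Toy model of per-agent alert handling (hypothetical, not oVirt engine code)."""

    def __init__(self):
        self.active = set()   # agents that currently have an open alert
        self.issued = []      # every alert raised, in order

    def report(self, agent, check_ok):
        """Record one health check result; return True if a new alert was raised."""
        if check_ok:
            # A successful check clears the open alert, so a later failure alerts again.
            self.active.discard(agent)
            return False
        if agent in self.active:
            # Assumption: while the alert is still open, the failure is not repeated.
            return False
        self.active.add(agent)
        self.issued.append(agent)
        return True

    def dismiss(self, agent):
        """The user removes the alert from the alerts tab."""
        self.active.discard(agent)


alerts = HealthCheckAlerts()
first = alerts.report("agent-1", check_ok=False)          # step 5: first failure
repeat = alerts.report("agent-1", check_ok=False)         # same failure, alert still open
alerts.report("agent-1", check_ok=True)                   # step 6.a: credentials corrected
after_fix = alerts.report("agent-1", check_ok=False)      # step 7.a: invalid again
alerts.dismiss("agent-1")                                 # step 6.b: alert removed manually
after_dismiss = alerts.report("agent-1", check_ok=False)  # next failing check
```

In this model both after_fix and after_dismiss are new alerts, which corresponds to the expected result. The reported bug corresponds to the open-alert state never being cleared after the first alert, so every later failure is silently dropped.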

Additional info:

Comment 1 sefi litmanovich 2014-09-16 14:43:40 UTC
Verified with rhevm-3.5.0-0.12.beta.el6ev.noarch according to the scenarios in the description. Also tested that all the different error messages (sequential primary, sequential secondary, concurrent primary, concurrent secondary) behave the same way and reappear after their first occurrence.

Comment 2 Sandro Bonazzola 2014-10-17 12:40:12 UTC
oVirt 3.5 has been released and should include the fix for this issue.