Bug 2234399 - VirtControllerRESTErrorsHigh not fired
Summary: VirtControllerRESTErrorsHigh not fired
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Metrics
Version: 4.14.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.15.1
Assignee: João Vilaça
QA Contact: Natalie Gavrielov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-24 10:07 UTC by Ohad
Modified: 2024-01-18 08:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-10-24 17:02:09 UTC
Target Upstream Version:
Embargoed:


Attachments
screenshot of the alerts metrics on the same time (122.73 KB, application/pdf)
2023-08-24 10:07 UTC, Ohad
screenshot of the alerts metrics on the same time (123.26 KB, application/pdf)
2023-08-24 10:10 UTC, Ohad


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CNV-32395 0 None None None 2023-08-24 10:09:59 UTC

Description Ohad 2023-08-24 10:07:43 UTC
Created attachment 1985006 [details]
screenshot of the alerts metrics on the same time

Description of problem:
The VirtControllerRESTErrorsHigh alert did not fire while the VirtControllerRESTErrorsBurst alert fired.

VirtControllerRESTErrorsHigh is when more than 5% of REST calls failed. This alert didn't fire, while VirtControllerRESTErrorsBurst did fire (more than 80% of the REST calls failed).


Version-Release number of selected component (if applicable):


How reproducible:
https://polarion.engineering.redhat.com/polarion/#/project/CNV/workitem?id=CNV-9992

Steps to Reproduce:
1.
2.
3.

Actual results:
The alert did not fire

Expected results:
The alert should fire

Additional info:
Attached screenshots of the metrics for both alerts, taken at the same time while the test was running.

Comment 1 Ohad 2023-08-24 10:10:19 UTC
Created attachment 1985007 [details]
screenshot of the alerts metrics on the same time

Comment 2 Krzysztof Majcher 2023-09-13 10:28:35 UTC
Hi, it seems this bug should be fixed by the Virt team. Can you fix it? (We can assist if needed.)

Comment 3 Antonio Cardace 2023-09-22 09:41:36 UTC
Deferring to 4.15 due to capacity since we're in blockers-only phase.

Comment 5 Krzysztof Majcher 2023-10-11 13:11:11 UTC
@acardace - I thought that since it's a Virt alert it should be on the Virt team, but let me double-check, and if we are better suited to fix it, we'll take it. Will update the bug this week.

Comment 6 João Vilaça 2023-10-16 08:48:05 UTC
@orevah 

This behavior is totally normal.

"VirtControllerRESTErrorsHigh is when more than 5% of REST calls failed"
"VirtControllerRESTErrorsBurst ... more than 80% of the REST calls failed"

That is correct, but the time frames make all the difference here.
'VirtControllerRESTErrorsHigh' checks requests in the last hour, while
'VirtControllerRESTErrorsBurst' checks requests in the last 5 minutes.

'VirtControllerRESTErrorsBurst' is firing because all requests (or at least most)
are probably failing, because the service is facing catastrophic failures.

'VirtControllerRESTErrorsHigh' is useful to know when the service is mostly
working correctly but some endpoints/requests are failing.
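
To illustrate why the two windows can disagree, here is a minimal sketch in Python. It is not the actual alert rule: the thresholds (5% and 80%) and windows (60 and 5 minutes) are taken from the comments in this bug, while the per-minute traffic numbers and the helper name error_ratio are hypothetical. It also assumes fewer requests are made during the failure period than before it, and it ignores any other conditions the real rules may have.

```python
from typing import List, Tuple

# Thresholds and windows as described in this bug's comments.
HIGH_THRESHOLD, HIGH_WINDOW_MIN = 0.05, 60
BURST_THRESHOLD, BURST_WINDOW_MIN = 0.80, 5


def error_ratio(per_minute: List[Tuple[int, int]], window_min: int) -> float:
    """Return failed/total over the trailing window.

    per_minute holds one (total_requests, failed_requests) pair per minute,
    oldest first, newest last.
    """
    recent = per_minute[-window_min:]
    total = sum(t for t, _ in recent)
    failed = sum(f for _, f in recent)
    return failed / total if total else 0.0


# Hypothetical traffic: 55 healthy minutes at 1000 requests per minute,
# then 5 minutes with only 10 requests per minute, 9 of which fail.
history = [(1000, 0)] * 55 + [(10, 9)] * 5

burst_ratio = error_ratio(history, BURST_WINDOW_MIN)
high_ratio = error_ratio(history, HIGH_WINDOW_MIN)

burst_exceeded = burst_ratio > BURST_THRESHOLD
high_exceeded = high_ratio > HIGH_THRESHOLD

print(f"5-minute ratio:  {burst_ratio:.4f} exceeds 80%: {burst_exceeded}")
print(f"60-minute ratio: {high_ratio:.4f} exceeds 5%:  {high_exceeded}")
```

With these made-up numbers the 5-minute ratio crosses the 80% threshold while the 60-minute ratio stays far below 5%, because the earlier healthy traffic dominates the longer window.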

Comment 7 Krzysztof Majcher 2023-10-17 12:56:11 UTC
Ohad, can you please retest, keeping in mind that

'VirtControllerRESTErrorsHigh' checks requests in the last hour, while
'VirtControllerRESTErrorsBurst' checks requests in the last 5 minutes.

(If the test does not run for a full hour, the result is not deterministic and cannot be verified with certainty.)

Comment 8 Ohad 2023-10-24 16:59:35 UTC
I tested it again and it seems to be working now, closing this bug.

