Bug 2234399
| Summary: | VirtControllerRESTErrorsHigh not fired | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Ohad <orevah> |
| Component: | Metrics | Assignee: | João Vilaça <jvilaca> |
| Status: | CLOSED NOTABUG | QA Contact: | Natalie Gavrielov <ngavrilo> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.14.0 | CC: | acardace, kmajcher, stirabos |
| Target Milestone: | --- | | |
| Target Release: | 4.15.1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-10-24 17:02:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: |
Description
Ohad
2023-08-24 10:07:43 UTC
Created attachment 1985007
Screenshot of the alert metrics at the same time
Hi, it seems this bug should be fixed by the Virt team. Can you fix it? (We can assist if needed.)

Deferring to 4.15 due to capacity, since we're in the blockers-only phase.

@acardace - I thought that since it's a Virt alert it should be on the Virt team, but let me double-check, and if we are better suited to fix it, we'll take it. Will update the bug this week.

@orevah this behavior is totally normal.

"VirtControllerRESTErrorsHigh is when more than 5% of rest calls failed"
"VirtControllerRESTErrorsBurst ... more than 80% of the rest calls failed"

That is correct, but the time frames make all the difference here. 'VirtControllerRESTErrorsHigh' checks requests in the last hour, while 'VirtControllerRESTErrorsBurst' checks requests in the last 5 minutes. 'VirtControllerRESTErrorsBurst' is firing because all requests (or at least most of them) are probably failing, since the service is facing catastrophic failures. 'VirtControllerRESTErrorsHigh' is useful to know when the service is mostly working correctly but some endpoints/requests are failing.

Ohad, can you please retest, keeping in mind that 'VirtControllerRESTErrorsHigh' checks requests in the last hour while 'VirtControllerRESTErrorsBurst' checks requests in the last 5 minutes? (If the test does not run for a full hour, the result cannot be verified with certainty.)

I tested it again and it seems to be working now, closing this bug.
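
For anyone retesting, the effect of the two time frames can be sketched with a small simulation. This is only an illustration, not the actual alert rules: the thresholds (5% and 80%) and windows (1 hour and 5 minutes) are taken from the comments above, while the per-minute traffic numbers and the helper names `failure_ratio` and `evaluate` are made up for the example.

```python
# Illustrative sketch only: the real alerts are evaluated by the monitoring
# stack, not by this code. Thresholds and windows come from the discussion
# above; all traffic figures below are invented.

HIGH_THRESHOLD = 0.05    # "more than 5% of rest calls failed"
HIGH_WINDOW_MIN = 60     # checked over the last hour
BURST_THRESHOLD = 0.80   # "more than 80% of the rest calls failed"
BURST_WINDOW_MIN = 5     # checked over the last 5 minutes


def failure_ratio(samples, window_min):
    """Return failed/total over the last `window_min` per-minute samples.

    Each sample is a (total_requests, failed_requests) tuple for one minute.
    """
    recent = samples[-window_min:]
    total = sum(t for t, _ in recent)
    failed = sum(f for _, f in recent)
    return failed / total if total else 0.0


def evaluate(samples):
    """Report which of the two (simulated) conditions would be met."""
    return {
        "high": failure_ratio(samples, HIGH_WINDOW_MIN) > HIGH_THRESHOLD,
        "burst": failure_ratio(samples, BURST_WINDOW_MIN) > BURST_THRESHOLD,
    }


# Scenario A (hypothetical): 55 healthy minutes at 200 req/min, then a
# 5-minute outage where traffic drops to 50 req/min and 45 of them fail.
# Hourly ratio: 225 / 11250 = 2%; 5-minute ratio: 225 / 250 = 90%.
outage = [(200, 0)] * 55 + [(50, 45)] * 5

# Scenario B (hypothetical): a steady 10% failure rate for the whole hour.
# Both windows see 10%, which exceeds 5% but not 80%.
steady = [(100, 10)] * 60

print(evaluate(outage))  # {'high': False, 'burst': True}
print(evaluate(steady))  # {'high': True, 'burst': False}
```

In scenario A the short outage is diluted by the healthy traffic earlier in the hour, so only the 5-minute condition is met, which matches the behavior reported in this bug.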