Bug 2179991
| Summary: | VirtApiRESTErrorsBurst threshold high | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Ohad <orevah> | ||||
| Component: | Virtualization | Assignee: | ffossemo | ||||
| Status: | ASSIGNED --- | QA Contact: | Kedar Bidarkar <kbidarka> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 4.13.0 | CC: | acardace, kedar.lad, sradco, stirabos | ||||
| Target Milestone: | --- | Flags: | sradco: needinfo? (orevah) sradco: needinfo? (kedar.lad) | ||||
| Target Release: | 4.14.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | Type: | Bug | |||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | |||||||
Description
Ohad
2023-03-20 14:16:40 UTC
We need to drop the evaluation time.

I'm very confused here. Looking at the steps to reproduce the scenario, it appears that virt-api has been left in a non-running state? It's not clear to me what removing its role binding does after it's already running. But the REST API endpoint for this alert is virt-api itself, is it not? Shirly, you mention that we should drop the evaluation time, but it's not clear that will do anything useful. Can you help us understand what needs to be done and why?

I can't really comment on the steps to reproduce. I think this is a question for Ohad. Probably what he was trying to do is get a high percentage of the requests to fail. We need to drop the evaluation time because the expression itself already looks back 5 minutes and checks the percentage of failed requests. If the failure percentage is greater than 80%, then the alert should fire immediately and not wait for another 5m. It is the same as VirtApiRESTErrorsHigh.

Targeting this to CNV 4.15 depending upon the severity and anticipated capacity at this point.

Discussed with Virt Devs, targeting it back to 4.14 Target Version.
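
To illustrate the reasoning about the evaluation time, here is a minimal Python sketch of how a 5-minute lookback combined with a 5-minute wait delays the alert. It is not the actual alert rule: the function name `first_firing_minute`, the one-minute evaluation interval, and the request counts are all assumptions made up for the example. Only the 5-minute lookback, the 80% threshold, and the 5m wait come from the discussion above.

```python
def first_firing_minute(failed, total, window=5, threshold=0.8, for_minutes=0):
    """Return the first minute at which the alert would fire, or None.

    failed[i] and total[i] are hypothetical request counts seen in minute i.
    The condition is: failed / total over the last `window` minutes > threshold.
    The alert fires once the condition has held for `for_minutes` minutes.
    """
    pending_since = None
    for minute in range(len(total)):
        start = max(0, minute - window + 1)
        window_total = sum(total[start:minute + 1])
        window_failed = sum(failed[start:minute + 1])
        condition = window_total > 0 and window_failed / window_total > threshold
        if not condition:
            pending_since = None  # condition broke, so the wait starts over
            continue
        if pending_since is None:
            pending_since = minute  # condition just became true
        if minute - pending_since >= for_minutes:
            return minute
    return None


# Assumed scenario: 100 requests per minute, all succeed for minutes 0-4,
# then every request fails from minute 5 onwards.
total = [100] * 20
failed = [0] * 5 + [100] * 15

without_wait = first_firing_minute(failed, total, for_minutes=0)
with_wait = first_firing_minute(failed, total, for_minutes=5)
print(without_wait, with_wait)
```

In this assumed scenario the lookback window alone already takes several minutes to cross 80% after the failures start, and the extra 5-minute wait is added on top of that, which is the delay the request to drop the evaluation time is aimed at.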