Bug 1414207
| Summary: | Too many ISO refreshing events | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Germano Veit Michel <gveitmic> | ||||
| Component: | ovirt-engine | Assignee: | Tal Nisan <tnisan> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Kevin Alon Goldblatt <kgoldbla> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 4.0.5 | CC: | apinnick, eedri, gveitmic, kgoldbla, lsurette, mkalinin, ratamir, rbalakri, Rhev-m-bugs, shipatil, srevivo, tnisan, ykaul, ylavi | ||||
| Target Milestone: | ovirt-4.2.0 | ||||||
| Target Release: | 4.2.0 | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: |
Previously, the ForceRefreshDomainFilesListByDefault parameter used by virt-viewer was set to "True" by default, sometimes causing a flood of ISO domain refresh events in the logs if many console clients were connected. In the current release, setting this parameter to "False" reduces the number of logged refresh events.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2018-05-15 17:40:52 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
|
Description
Germano Veit Michel
2017-01-18 02:59:00 UTC
Yaniv, what do you think will be the best course of action here? Why is the client requesting a refresh? To allow mounting the ISOs on the client drop-down? It comes from virt-viewer apparently, we have a mechanism to avoid repeating audit logs and suppress them for a given amount of time, sounds OK to you? (In reply to Tal Nisan from comment #4) > It comes from virt-viewer apparently, we have a mechanism to avoid repeating > audit logs and suppress them for a given amount of time, sounds OK to you? Why is this audit needed? Maybe drop it? Wouldn't the client refresh cause a scale issue. Why is the client doing this? Maybe a refresh every few minutes is enough. (In reply to Yaniv Dary from comment #6) > Wouldn't the client refresh cause a scale issue. Why is the client doing > this? > Maybe a refresh every few minutes is enough. But this is exactly what's happening. Every client is asking every 5 minutes. But since we do not respond in a cached response, multiple clients cause a refresh storm. (In reply to Yaniv Kaul from comment #7) > (In reply to Yaniv Dary from comment #6) > > Wouldn't the client refresh cause a scale issue. Why is the client doing > > this? > > Maybe a refresh every few minutes is enough. > But this is exactly what's happening. Every client is asking every 5 > minutes. But since we do not respond in a cached response, multiple clients > cause a refresh storm. Exactly, the problem is not the interval, it's the fact that it's per client and not a general interval (In reply to Tal Nisan from comment #8) > (In reply to Yaniv Kaul from comment #7) > > (In reply to Yaniv Dary from comment #6) > > > Wouldn't the client refresh cause a scale issue. Why is the client doing > > > this? > > > Maybe a refresh every few minutes is enough. > > But this is exactly what's happening. Every client is asking every 5 > > minutes. But since we do not respond in a cached response, multiple clients > > cause a refresh storm. > > Exactly, the problem is not the interval, it's the fact that it's per client > and not a general interval Tal, how do we proceed? The audit log is the problem, or the fact we are actually performing so many refreshes? I reckon the latter. If it's just an audit log issue, I'm far less concerned. As I wrote in comment 4, we can suppress those quite easily but I think the refresh every 5 minutes multiplied by each running guest is the real issue here, some caching mechanism with an expiry time of a few minutes should do the trick (In reply to Tal Nisan from comment #10) > As I wrote in comment 4, we can suppress those quite easily but I think the > refresh every 5 minutes multiplied by each running guest is the real issue > here, some caching mechanism with an expiry time of a few minutes should do > the trick Agreed, please implement. I reproduced the scenario with the following code:
------------------------------------------------------
ovirt-engine-4.1.1-0.1.el7.noarch
rhevm-4.1.1-0.1.el7.noarch
vdsm-4.19.5-1.el7ev.x86_64
Reproduced with the following scenario:
-----------------------------------------------------
1. Created a vm with a disk and started the vm and accessed the console:
>> only one messaged was reported in the engine log every 5 minutes
"Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)"
2. Created 9 more vms and started them
3. Accessed each VM's via the console with about 1 to 2 minutes appart >>> Now a request is reported in the log up to 3 times a minute
Print out from engine log:
----------------------------
2017-02-22 16:51:58,106+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [d23ae7c5-6349-4b25-b6a8-4ae741f1dd71] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:52:21,031+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-12) [de0ade3b-f406-4ea3-ab1e-7ed68220dbfc] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:52:54,095+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-4) [0ec6a6d4-e0ac-4374-839a-b929761703e7] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:53:34,122+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-16) [c0d62762-3786-4393-9dcc-25530fe1769c] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:53:40,781+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-13) [8bd4c7f8-b9a1-4fde-b206-f6372855fb3b] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:53:52,777+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [123a046d-4f82-4ccd-b38c-311d9c0a7577] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:54:54,148+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-15) [6dc78534-75bb-402f-8c5e-ebf5d0503f43] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:55:24,517+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-8) [b0dfac5c-85cf-4f4b-8dac-baff0cb43b05] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:56:58,713+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-15) [5e5822cd-5519-4657-a108-cca63fa30aee] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:57:21,759+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-4) [a65d3c24-a89a-4db1-895c-13b225a7f3a8] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:57:54,817+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-7) [c69e93d5-9c8e-49e4-82be-3851459b1409] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:58:34,791+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-12) [41c6b2c8-4ecb-4e96-a17c-865215d71196] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:58:42,990+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-2) [8cdb28b1-cb80-41a7-b9c1-4ff2506806c1] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
2017-02-22 16:58:53,719+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [3c99063c-a313-49b9-aa78-c290f5061eb7] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Refresh image list succeeded for domain(s): ISO_DOMAIN (All file type)
Created attachment 1256522 [details]
server, engine and vdsm logs
Added logs
4.1.4 is planned as a minimal, fast, z-stream version to fix any open issues we may have in supporting the upcoming EL 7.4. Pushing out anything unrelated, although if there's a minimal/trival, SAFE fix that's ready on time, we can consider introducing it in 4.1.4. The proposed patch is way too delicate for a z-stream. In interests of stability, I propose moving this out to 4.2. INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [No relevant external trackers attached] For more info please contact: rhv-devops INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [Project 'ovirt-engine-api-model'/Component 'ovirt-engine' mismatch] For more info please contact: rhv-devops (In reply to rhev-integ from comment #23) > INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following > reason: > > [Project 'ovirt-engine-api-model'/Component 'ovirt-engine' mismatch] At least one patch was on the correct project. > > For more info please contact: rhv-devops Its in our hands, do we want to allow other projects attached to the bug and move bugs to ON_QA even if it has one patch on the same component? if not one, then how many? If we can agree on a logic, we can work on change it in our script. Tested with the following code:
--------------------------------
ovirt-engine-4.2.0-0.6.el7.noarch
vdsm-4.20.9-1.el7ev.x86_64
Verified with the following scenario:
-----------------------------------------------------
1. Created a vm with a disk and started the vm and accessed the console:
>> only one messaged was reported in the engine log every 5 minutes
"2017-12-12 11:57:41,202+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-2) [6d0552f8-b976-4e01-82c4-a1b20c0b93e3] EVENT_ID: REFRESH_REPOSITORY_IMAGE_LIST_SUCCEEDED(998), Refresh image list succeeded for domain(s): ISO369 (All file type)
"
2. Created 9 more vms and started them
3. Started the console on all 10 vms
After 20 minutes 26 records
After 90 minutes 156 records and then Access 1 more vm
After 180 minutes 166 records
4. Changed the config value on the engine as follows:
engine-config --set ForceRefreshDomainFilesListByDefault=false
[root@kevin4 ~]# engine-config --get ForceRefreshDomainFilesListByDefault
ForceRefreshDomainFilesListByDefault: false version: general
[root@kevin4 ~]# systemctl restart ovirt-engine
5. Waited around 2 hours, access the same 10 consoles >>>>> only 30 records were reported.
Moving to VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488 BZ<2>Jira Resync |