Description of problem: If AvailableUpdatesFinder finds already running process it should not be ERROR level 2016-12-01 11:44:51,707+01 ERROR [org.ovirt.engine.core.bll.host.AvailableUpdatesFinder] (pool-5-thread-4) [4d7f2acc] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process already running. 2016-12-01 11:44:51,708+01 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (pool-5-thread-4) [4d7f2acc] Failed to check if updates are available for host 'dell-r210ii-04' with error message 'Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process already running.' 2016-12-01 11:44:51,708+01 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-2) [f66108c] Correlation ID: f66108c, Call Stack: null, Custom Event ID: -1, Message: User admin@internal-authz logged out. 2016-12-01 11:44:51,712+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-4) [4d7f2acc] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process already running.'. IMO it should not be ERROR but lower level. IIUC our highest level is ERROR (ie. we don't use FATAL, CRIT, EMERG, ALERT etc..) and thus making this as ERROR could confuse monitoring systems which would expend an ERROR message would be real error-problem. Version-Release number of selected component (if applicable): ovirt-engine-backend-4.1.0-0.0.master.20161129231316.git4dae799.el7.centos.noarch How reproducible: 100% Steps to Reproduce: 1. run upgradecheck via api twice with sleep 1 between them 2. 3. Actual results: ERROR message in log Expected results: < ERROR level message in log Additional info:
We do use alert. I find this use-case not that interesting, but I'm leaving the decision here to Moti.
We can reduce the severity of that message from ERROR to WARN.
ovirt-engine-backend-4.1.0.1-0.1.el7.noarch [root@jbelka-vm2 yum.repos.d]# grep -i 'message:.*another refresh process is already running' /var/log/ovirt-engine/engine.log 2017-01-24 11:57:56,214+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-3) [29b090ae] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. 2017-01-24 11:57:57,682+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-4) [1fa56e70] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. 2017-01-24 11:57:59,082+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-5) [2858eb5b] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. 2017-01-24 11:58:00,508+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-6) [22695697] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. [root@jbelka-vm2 yum.repos.d]# grep -i 'WARN.*another refresh process is already running' /var/log/ovirt-engine/engine.log 2017-01-24 11:57:56,209+01 WARN [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-3) [29b090ae] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running. 2017-01-24 11:57:57,674+01 WARN [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-4) [1fa56e70] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running. 2017-01-24 11:57:59,072+01 WARN [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-5) [2858eb5b] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running. 2017-01-24 11:58:00,500+01 WARN [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-6) [22695697] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running.
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.
engine=# select severity,log_type_name,message from audit_log where message like '%Another refresh process is already running%'; severity | log_type_name | message ----------+-------------------------------+------------------------------------------------------------------------------------------------------------------------- 2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. 2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. 2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. 2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'. (4 rows)
Ok, I understand the point: There is a need to separate the event type for the both cases: failure to update as a result of another running upgrade process and any other failure. For that purpose I'll introduce another audit log type for the specific case of another running process with a WARNING severity. The log entries which aren't related to the event-log in the engine.log were already reduced to warn severity.
log entries for already running upgradecheck changed are WARN verified in ovirt-engine-4.1.1.2-0.1.el7.noarch