Bug 1400500 - If AvailableUpdatesFinder finds already running process it should not be ERROR level
Summary: If AvailableUpdatesFinder finds already running process it should not be ERRO...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 4.0.6.2
Hardware: Unspecified
OS: Unspecified
unspecified
low
Target Milestone: ovirt-4.1.1
: 4.1.1
Assignee: Moti Asayag
QA Contact: Lucie Leistnerova
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-12-01 11:51 UTC by Jiri Belka
Modified: 2017-04-21 09:38 UTC (History)
5 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-04-21 09:38:37 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: planning_ack+
masayag: devel_ack+
pstehlik: testing_ack+


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 67671 0 master MERGED engine: Reduce the severity of check-upgrade log 2016-12-01 15:03:55 UTC
oVirt gerrit 71250 0 master MERGED engine: Add event type for updates check already running 2017-01-30 08:58:23 UTC
oVirt gerrit 71371 0 ovirt-engine-4.1 MERGED engine: Add event type for updates check already running 2017-01-30 15:29:03 UTC

Description Jiri Belka 2016-12-01 11:51:54 UTC
Description of problem:

If AvailableUpdatesFinder finds already running process it should not be ERROR level

2016-12-01 11:44:51,707+01 ERROR [org.ovirt.engine.core.bll.host.AvailableUpdatesFinder] (pool-5-thread-4) [4d7f2acc] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process already
 running.
2016-12-01 11:44:51,708+01 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (pool-5-thread-4) [4d7f2acc] Failed to check if updates are available for host 'dell-r210ii-04' with error message 'Failed to refresh 
host 'dell-r210ii-04' packages availability, another refresh process already running.'
2016-12-01 11:44:51,708+01 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-2) [f66108c] Correlation ID: f66108c, Call Stack: null, Custom Event ID: -1, Message: User admin@internal-authz logged out.
2016-12-01 11:44:51,712+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-4) [4d7f2acc] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to
 check for available updates on host dell-r210ii-04 with message 'Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process already running.'.

IMO it should not be ERROR but lower level. IIUC our highest level is ERROR (ie. we don't use FATAL, CRIT, EMERG, ALERT etc..) and thus making this as ERROR could confuse monitoring systems which would expend an ERROR message would be real error-problem.

Version-Release number of selected component (if applicable):
ovirt-engine-backend-4.1.0-0.0.master.20161129231316.git4dae799.el7.centos.noarch

How reproducible:
100%

Steps to Reproduce:
1. run upgradecheck via api twice with sleep 1 between them
2.
3.

Actual results:
ERROR message in log

Expected results:
< ERROR level message in log

Additional info:

Comment 1 Oved Ourfali 2016-12-01 12:19:50 UTC
We do use alert.
I find this use-case not that interesting, but I'm leaving the decision here to Moti.

Comment 2 Moti Asayag 2016-12-01 12:49:15 UTC
We can reduce the severity of that message from ERROR to WARN.

Comment 3 Jiri Belka 2017-01-24 11:01:19 UTC
ovirt-engine-backend-4.1.0.1-0.1.el7.noarch

[root@jbelka-vm2 yum.repos.d]# grep -i 'message:.*another refresh process is already running' /var/log/ovirt-engine/engine.log
2017-01-24 11:57:56,214+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-3) [29b090ae] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
2017-01-24 11:57:57,682+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-4) [1fa56e70] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
2017-01-24 11:57:59,082+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-5) [2858eb5b] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
2017-01-24 11:58:00,508+01 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-6) [22695697] EVENT_ID: HOST_AVAILABLE_UPDATES_FAILED(839), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
[root@jbelka-vm2 yum.repos.d]# grep -i 'WARN.*another refresh process is already running' /var/log/ovirt-engine/engine.log
2017-01-24 11:57:56,209+01 WARN  [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-3) [29b090ae] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running.
2017-01-24 11:57:57,674+01 WARN  [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-4) [1fa56e70] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running.
2017-01-24 11:57:59,072+01 WARN  [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-5) [2858eb5b] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running.
2017-01-24 11:58:00,500+01 WARN  [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker] (pool-5-thread-6) [22695697] Failed to refresh host 'dell-r210ii-04' packages availability, another refresh process is already running.

Comment 4 Red Hat Bugzilla Rules Engine 2017-01-24 11:01:25 UTC
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.

Comment 5 Jiri Belka 2017-01-24 11:03:55 UTC
engine=# select severity,log_type_name,message from audit_log where message like '%Another refresh process is already running%';
 severity |         log_type_name         |                                                         message                                                         
----------+-------------------------------+-------------------------------------------------------------------------------------------------------------------------
        2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
        2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
        2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
        2 | HOST_AVAILABLE_UPDATES_FAILED | Failed to check for available updates on host dell-r210ii-04 with message 'Another refresh process is already running'.
(4 rows)

Comment 6 Moti Asayag 2017-01-25 16:36:42 UTC
Ok, I understand the point: There is a need to separate the event type for the both cases: failure to update as a result of another running upgrade process and any other failure.

For that purpose I'll introduce another audit log type for the specific case of another running process with a WARNING severity.

The log entries which aren't related to the event-log in the engine.log were already reduced to warn severity.

Comment 7 Lucie Leistnerova 2017-03-01 08:19:27 UTC
log entries for already running upgradecheck changed are WARN

verified in ovirt-engine-4.1.1.2-0.1.el7.noarch


Note You need to log in before you can comment on or make changes to this bug.