Bug 1392087

Summary: Heartbeat failure for workers is not reported as ERROR log line but INFO log line
Product: Red Hat CloudForms Management Engine Reporter: Thomas Hennessy <thenness>
Component: ApplianceAssignee: Nick Carboni <ncarboni>
Status: CLOSED CURRENTRELEASE QA Contact: luke couzens <lcouzens>
Severity: low Docs Contact:
Priority: low    
Version: 5.6.0CC: abellott, cpelland, gtanzill, jdeubel, jhardy, jocarter, myoder, ncarboni, obarenbo, saali, simaishi
Target Milestone: GAKeywords: TestOnly, ZStream
Target Release: 5.8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: appliance
Fixed In Version: 5.8.0.0 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1401935 (view as bug list) Environment:
Last Closed: 2017-06-12 17:07:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1401935    
Attachments:
Description Flags
grep of process 26692 containing several examples of problem
none
grep of entire two week logs of wk08 appliance showing scope of inconsistency of log line heartbeat error reporting none

Description Thomas Hennessy 2016-11-04 19:24:08 UTC
Created attachment 1217495 [details]
grep of process 26692 containing several examples of problem

Description of problem: In the following example, a worker process heartbeat is recorded as failing, but the log line type is not ERROR, but INFO.
=====
[----] I, [2016-11-02T05:00:56.641451 #26692:e5798c]  INFO -- : MIQ(MiqVimBrokerWorker::Runner) ID [50000001623777] PID [26692] GUID [f64863c8-a0de-11e6-b77e-005056827e39] Error heartbeating because PG::UniqueViolation: ERROR:  duplicate key value violates unique constraint "replication_set_table_pkey"
DETAIL:  Key (set_id, set_reloid)=(-299507980, 16492) already exists.
=====


Version-Release number of selected component (if applicable):5.6.1.2


How reproducible: FALL-OUT from pglogical heartbeat issues.


Steps to Reproduce:
1.
2.
3.

Actual results:



Expected results:


Additional info: It is expected that a heartbeat failure for any reason would be reported on an ERROR log line and not an INFO log line.  Not reporting this as an error makes this much harder to isolate.

Comment 2 Thomas Hennessy 2016-11-05 02:57:11 UTC
Created attachment 1217537 [details]
grep of entire two week logs of wk08 appliance showing scope of inconsistency of log line heartbeat error reporting

Comment 4 Gregg Tanzillo 2016-11-11 14:05:20 UTC
This issue is resolved by the hot fix provided in BZ 1392087

Comment 6 Gregg Tanzillo 2016-11-16 19:40:10 UTC
PR - https://github.com/ManageIQ/manageiq/pull/12513

Comment 7 Nick Carboni 2016-11-17 14:44:04 UTC
The PR mentioned in comment 6 actually fixes the underlying issue which was causing the log message.

It is possible in the future that other errors in the VimBrokerWorker get logged as info when they should be logged as error, and that is not yet addressed.

Moving this out of POST and changing the priority and severity to low as this is just about the issue of log levels.

Comment 8 Thomas Hennessy 2016-11-17 15:13:58 UTC
Nick,
based on your comment above, I assume that you have disconnected all of the idiot lights on your dashboard, as they are just indicators of a problem and not the real problem.  If I am incorrect in this assumption, allow me to suggest that these indicators of error condition are *not* low priority.

Comment 9 Nick Carboni 2016-11-28 21:22:59 UTC
https://github.com/ManageIQ/manageiq/pull/12889

Comment 11 CFME Bot 2016-12-06 17:15:09 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/7596a5526d99bd4d4001d4bbba1e0975e051dfb8

commit 7596a5526d99bd4d4001d4bbba1e0975e051dfb8
Author:     Nick Carboni <ncarboni>
AuthorDate: Mon Nov 28 15:35:33 2016 -0500
Commit:     Nick Carboni <ncarboni>
CommitDate: Mon Nov 28 16:14:33 2016 -0500

    Use the exit code when stopping the vim broker
    
    This will allow exits on error to log an error message instead of
    an info message.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1392087

 app/models/miq_vim_broker_worker/runner.rb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comment 12 luke couzens 2017-02-28 16:58:34 UTC
Verified in 5.8.0.2