Bug 1392087
Summary: | Heartbeat failure for workers is not reported as ERROR log line but INFO log line | |||
---|---|---|---|---|
Product: | Red Hat CloudForms Management Engine | Reporter: | Thomas Hennessy <thenness> | |
Component: | Appliance | Assignee: | Nick Carboni <ncarboni> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | luke couzens <lcouzens> | |
Severity: | low | Docs Contact: | ||
Priority: | low | |||
Version: | 5.6.0 | CC: | abellott, cpelland, gtanzill, jdeubel, jhardy, jocarter, myoder, ncarboni, obarenbo, saali, simaishi | |
Target Milestone: | GA | Keywords: | TestOnly, ZStream | |
Target Release: | 5.8.0 | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | appliance | |||
Fixed In Version: | 5.8.0.0 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1401935 (view as bug list) | Environment: | ||
Last Closed: | 2017-06-12 17:07:35 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1401935 | |||
Attachments: |
Created attachment 1217537 [details]
grep of entire two week logs of wk08 appliance showing scope of inconsistency of log line heartbeat error reporting
This issue is resolved by the hot fix provided in BZ 1392087 The PR mentioned in comment 6 actually fixes the underlying issue which was causing the log message. It is possible in the future that other errors in the VimBrokerWorker get logged as info when they should be logged as error, and that is not yet addressed. Moving this out of POST and changing the priority and severity to low as this is just about the issue of log levels. Nick, based on your comment above, I assume that you have disconnected all of the idiot lights on your dashboard, as they are just indicators of a problem and not the real problem. If I am incorrect in this assumption, allow me to suggest that these indicators of error condition are *not* low priority. New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/7596a5526d99bd4d4001d4bbba1e0975e051dfb8 commit 7596a5526d99bd4d4001d4bbba1e0975e051dfb8 Author: Nick Carboni <ncarboni> AuthorDate: Mon Nov 28 15:35:33 2016 -0500 Commit: Nick Carboni <ncarboni> CommitDate: Mon Nov 28 16:14:33 2016 -0500 Use the exit code when stopping the vim broker This will allow exits on error to log an error message instead of an info message. https://bugzilla.redhat.com/show_bug.cgi?id=1392087 app/models/miq_vim_broker_worker/runner.rb | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) Verified in 5.8.0.2 |
Created attachment 1217495 [details] grep of process 26692 containing several examples of problem Description of problem: In the following example, a worker process heartbeat is recorded as failing, but the log line type is not ERROR, but INFO. ===== [----] I, [2016-11-02T05:00:56.641451 #26692:e5798c] INFO -- : MIQ(MiqVimBrokerWorker::Runner) ID [50000001623777] PID [26692] GUID [f64863c8-a0de-11e6-b77e-005056827e39] Error heartbeating because PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "replication_set_table_pkey" DETAIL: Key (set_id, set_reloid)=(-299507980, 16492) already exists. ===== Version-Release number of selected component (if applicable):5.6.1.2 How reproducible: FALL-OUT from pglogical heartbeat issues. Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: It is expected that a heartbeat failure for any reason would be reported on an ERROR log line and not an INFO log line. Not reporting this as an error makes this much harder to isolate.