Bug 1440316
| Summary: | Kibana ES Connection Errors | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Steven Walter <stwalter> |
| Component: | Logging | Assignee: | Jeff Cantrill <jcantril> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Xia Zhao <xiazhao> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 3.4.1 | CC: | aos-bugs, bleanhar, javier.ramirez, jgoulding, misalunk, mnapolis, nnosenzo, pdwyer, pportant, pweil, rmeggins |
| Target Milestone: | --- | Keywords: | OpsBlocker |
| Target Release: | 3.4.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Story Points: | --- | | |
| Last Closed: | 2017-05-22 12:40:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
This looks a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1435083. Can you verify that the customer followed the steps from 1435083, and that they are still seeing the issue (e.g. Kibana is unusable) after following those steps?

Installed OpenShift 3.4.1, deployed the latest logging 3.4.1 stack there, and monitored it for 21h. The Kibana web console remained visible with log entries for different projects. Checked the Kibana log: no error similar to "Unable to revive connection: https://logging-es:9200/" was reported (full log attached here). Set to verified.

    # oc get po
    NAME                          READY     STATUS      RESTARTS   AGE
    logging-curator-1-pc4lu       1/1       Running     0          21h
    logging-deployer-1wy4f        0/1       Completed   0          21h
    logging-es-e5cpwacm-1-4fdg2   1/1       Running     0          21h
    logging-fluentd-cpb8a         1/1       Running     0          21h
    logging-kibana-1-q3lam        2/2       Running     10         21h

Images tested with:

    openshift3/logging-deployer        dcee53833a87
    openshift3/logging-kibana          e7b6eb3c6d3c
    openshift3/logging-elasticsearch   fab3ad9b2410
    openshift3/logging-fluentd         10c6287124fc
    openshift3/logging-curator         e8b1bbdfa30f
    openshift3/logging-auth-proxy      f2750505bbf8

    # openshift version
    openshift v3.4.1.18
    kubernetes v1.4.0+776c994
    etcd 3.1.0-rc.0

Created attachment 1277217: kibana log (-c kibana) with the bug fix

Created attachment 1277218: kibana UI with the bug fix
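
The pod listing in the verification comment above can also be checked mechanically. The following is a minimal sketch, not part of the original verification: it parses the `oc get po` text quoted in that comment and reports pods with a non-zero RESTARTS count. The function name `pods_with_restarts` is hypothetical, and the sketch assumes the whitespace-separated column layout shown in the comment. This report does not say whether the Kibana restarts are significant.

```python
# Pod listing copied from the verification comment in this report.
POD_LISTING = """\
NAME                          READY     STATUS      RESTARTS   AGE
logging-curator-1-pc4lu       1/1       Running     0          21h
logging-deployer-1wy4f        0/1       Completed   0          21h
logging-es-e5cpwacm-1-4fdg2   1/1       Running     0          21h
logging-fluentd-cpb8a         1/1       Running     0          21h
logging-kibana-1-q3lam        2/2       Running     10         21h
"""


def pods_with_restarts(listing):
    """Return {pod name: restart count} for pods that restarted at least once.

    Assumes whitespace-separated columns with a header row containing
    NAME and RESTARTS, as in the listing above (hypothetical helper).
    """
    rows = listing.strip().splitlines()
    header = rows[0].split()
    name_idx = header.index("NAME")
    restarts_idx = header.index("RESTARTS")

    restarted = {}
    for row in rows[1:]:
        cols = row.split()
        count = int(cols[restarts_idx])
        if count > 0:
            restarted[cols[name_idx]] = count
    return restarted


print(pods_with_restarts(POD_LISTING))
```

With the listing from this report, only the Kibana pod is returned.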
This fix shipped as part of https://access.redhat.com/errata/RHBA-2017:1236. Unfortunately the bug was not properly tracked: https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/logging-kibana/images/3.4.1-14 We've improved the release automation to catch bugs that move directly from NEW to ON_QA.
Description of problem:

Version-Release number of selected component (if applicable): 3.4.1

How reproducible: Unconfirmed

Actual results:

In console, "Shard Failure: The following shards failed" (no shards listed)

In kibana logs:

    {"type":"log","@timestamp":"2017-04-06T19:45:53+00:00","tags":["status","plugin:elasticsearch","error"],"pid":9,"name":"plugin:elasticsearch","state":"red","message":"Status changed from green to red - Request Timeout after 30000ms","prevState":"green","prevMsg":"Kibana index ready"}
    {"type":"log","@timestamp":"2017-04-06T19:45:56+00:00","tags":["status","plugin:elasticsearch","info"],"pid":9,"name":"plugin:elasticsearch","state":"green","message":"Status changed from red to green - Kibana index ready","prevState":"red","prevMsg":"Request Timeout after 30000ms"}
    ...
    {"type":"log","@timestamp":"2017-04-06T19:51:43+00:00","tags":["warning","elasticsearch"],"pid":9,"message":"Unable to revive connection: https://logging-es:9200/"}
    {"type":"log","@timestamp":"2017-04-06T19:51:46+00:00","tags":["warning","elasticsearch"],"pid":9,"message":"Unable to revive connection: https://logging-es:9200/"}
    {"type":"log","@timestamp":"2017-04-06T19:51:46+00:00","tags":["warning","elasticsearch"],"pid":9,"message":"No living connections"}

Expected results: Running logs

Additional info: This was reported in https://bugzilla.redhat.com/1432250 and a fix was released, but the issue is still occurring in the new image.
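
The Kibana log entries quoted in this report are one JSON object per line, so they can be summarized with a short script. The sketch below is illustrative only: the function name `summarize_kibana_log` is hypothetical, the sample lines are copied from the log excerpt in this report, and the filter criteria (the `warning` and `elasticsearch` tags, `state` equal to `red`) are assumptions based on those lines rather than on a documented Kibana log schema.

```python
import json
from collections import Counter

# Sample lines copied from the Kibana log excerpt in this report.
SAMPLE_LOG = """\
{"type":"log","@timestamp":"2017-04-06T19:45:53+00:00","tags":["status","plugin:elasticsearch","error"],"pid":9,"name":"plugin:elasticsearch","state":"red","message":"Status changed from green to red - Request Timeout after 30000ms","prevState":"green","prevMsg":"Kibana index ready"}
{"type":"log","@timestamp":"2017-04-06T19:45:56+00:00","tags":["status","plugin:elasticsearch","info"],"pid":9,"name":"plugin:elasticsearch","state":"green","message":"Status changed from red to green - Kibana index ready","prevState":"red","prevMsg":"Request Timeout after 30000ms"}
...
{"type":"log","@timestamp":"2017-04-06T19:51:43+00:00","tags":["warning","elasticsearch"],"pid":9,"message":"Unable to revive connection: https://logging-es:9200/"}
{"type":"log","@timestamp":"2017-04-06T19:51:46+00:00","tags":["warning","elasticsearch"],"pid":9,"message":"Unable to revive connection: https://logging-es:9200/"}
{"type":"log","@timestamp":"2017-04-06T19:51:46+00:00","tags":["warning","elasticsearch"],"pid":9,"message":"No living connections"}
"""


def summarize_kibana_log(text):
    """Count Elasticsearch warnings and collect timestamps of red states.

    Returns (Counter of warning messages, list of timestamps at which the
    elasticsearch plugin status became red). Hypothetical helper.
    """
    warnings = Counter()
    red_timestamps = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip elisions ("...") and other non-JSON lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed entries
        tags = entry.get("tags", [])
        if "warning" in tags and "elasticsearch" in tags:
            warnings[entry.get("message", "")] += 1
        if entry.get("state") == "red":
            red_timestamps.append(entry.get("@timestamp"))
    return warnings, red_timestamps


warnings, red_timestamps = summarize_kibana_log(SAMPLE_LOG)
for message, count in warnings.items():
    print(count, message)
print("red at:", red_timestamps)
```

Run against a full log obtained with `-c kibana`, an empty warning counter would correspond to the "no error was reported" observation in the verification comment.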