Bug 1705026 - Gateway timeout when accessing Kibana dashboard
Summary: Gateway timeout when accessing Kibana dashboard
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.10.z
Assignee: Jeff Cantrill
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-05-01 07:20 UTC by Robert Sandu
Modified: 2019-07-02 20:16 UTC (History)
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-02 20:15:26 UTC
Target Upstream Version:


Attachments (Terms of Use)

Description Robert Sandu 2019-05-01 07:20:38 UTC
Description of problem: Kibana login returns fatal error upon login:

```
Fatal Error
Courier Fetch Error: unhandled courier request error: [security_exception] no permissions for indices:data/read/mget
Version: 4.6.4
Build: 10229
Error: unhandled courier request error: [security_exception] no permissions for indices:data/read/mget
```

Version-Release number of selected component (if applicable): v3.10.34

Actual results: Kibana login returns "Error: unhandled courier request error: [security_exception] no permissions for indices:data/read/mget"


Expected results: Kibana login to work properly.


Additional info:

- A similar issue was seen in [1] and solved through an errata in v3.10.15. In this case, however, the customer has v3.10.34 deployed.
- Indices seem healthy.
- Doesn't seem to be related to permissions or the retention time frame: the same error is reproduced by cluster-admin users upon login.

Comment 3 Jeff Cantrill 2019-05-02 15:35:15 UTC
Reviewing the logs, I see several instances of ES nodes dropping out of the ES cluster.

* Can you check the connectivity to the ES nodes to verify the latency [2]?
* Maybe also check the connectivity from Kibana [3]

The gateway timeout can be mitigated by adjusting the request timeout [1].

[1] https://github.com/openshift/origin-aggregated-logging/tree/release-3.10/kibana#configuration-modifications
[2] https://github.com/jcantrill/cluster-logging-tools/blob/master/scripts/check-es-cluster-connectivity
[3] https://github.com/jcantrill/cluster-logging-tools/blob/master/scripts/check-kibana-to-es-connectivity
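The timeout adjustment in [1] boils down to raising Kibana's Elasticsearch request timeout via an environment variable. A minimal sketch of how that might look with `oc`, assuming the stock 3.10 logging deployment name `logging-kibana` and a 300-second timeout (both values are illustrative, not from this bug):

```shell
# Sketch: raise Kibana's ES request timeout (value in milliseconds).
# "dc/logging-kibana" is the default DeploymentConfig name in the 3.10
# logging stack; adjust to match the actual deployment and namespace.
oc set env dc/logging-kibana -n openshift-logging \
    ELASTICSEARCH_REQUESTTIMEOUT=300000

# Confirm the variable landed on the pod spec before the rollout.
oc set env dc/logging-kibana -n openshift-logging --list
```

Setting the variable triggers a new rollout of the Kibana pods, so the change takes effect once they restart.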

Comment 4 Robert Sandu 2019-05-06 09:57:17 UTC
Hi.

Increasing ELASTICSEARCH_REQUESTTIMEOUT seems to make no difference. The customer is still seeing "gateway timeout" errors when accessing the Kibana dashboard.

Attached you can find the outputs of the ES latency test and the Kibana connectivity test.

Comment 13 Jeff Cantrill 2019-07-02 13:25:34 UTC
I believe this will be resolved by https://bugzilla.redhat.com/show_bug.cgi?id=1705589 and subsequent backports to 3.11.  The upstream issues are merged and I am doing the work now to get them built for a production release.

Comment 14 Robert Sandu 2019-07-02 13:28:46 UTC
(In reply to Jeff Cantrill from comment #13)
> I believe this will be resolved by
> https://bugzilla.redhat.com/show_bug.cgi?id=1705589 and subsequent backports
> to 3.11.  The upstream issues are merged and I am doing the work now to get
> them built for a production release.

Hi Jeff.

Which 3.11 z-stream version has been (or will be) backported to?

Thank you.

Comment 15 Jeff Cantrill 2019-07-02 20:15:26 UTC
Closing as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1705589 .  Will backport to 3.11. Please reopen if not resolved.

*** This bug has been marked as a duplicate of bug 1705589 ***

Comment 16 Jeff Cantrill 2019-07-02 20:16:10 UTC
Changing status to WONTFIX as we will resolve for 3.11 but not 3.10.

