Bug 1530157 - Kibana timeout if viewing many (100M) records
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.9.0
Assignee: Jeff Cantrill
QA Contact: Anping Li
URL:
Whiteboard:
Duplicates: 1509025
Depends On:
Blocks: 1538171 1589905
 
Reported: 2018-01-02 07:59 UTC by Shirly Radco
Modified: 2020-12-14 10:57 UTC
CC: 7 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1538171 1589905
Environment:
Last Closed: 2018-08-28 17:42:30 UTC
Target Upstream Version:
Embargoed:


Attachments
logging-20180117_111929.tar.gz (574.55 KB, application/x-gzip) - 2018-01-17 09:32 UTC, Shirly Radco
The kibana debug logs (17.86 KB, text/plain) - 2018-01-22 10:42 UTC, Anping Li


Links
GitHub openshift/origin-aggregated-logging pull 905 (closed): bug 1530157. Configure Kibana timeout via env var - Last Updated: 2020-12-09 14:49:01 UTC

Description Shirly Radco 2018-01-02 07:59:23 UTC
Description of problem:

When I try to view the metrics and logs dashboards for a period longer than about 24 hours, I get the error:
"Visualize: Gateway Timeout More Info OK"

The metrics index contains 175,647,144 docs per day.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Send roughly the same volume of docs (about 175M per day) to the index for a few days.
2. Try to run a report covering more than 24 hours.

Actual results:


Expected results:


Additional info:

Comment 1 Jeff Cantrill 2018-01-15 20:47:51 UTC
Can you clarify which 'log' and 'metrics' dashboards you are referring to?

Comment 2 Jeff Cantrill 2018-01-15 20:51:28 UTC
Can you provide additional information about your environment so we can understand whether your deployment is properly sized for your requirements? Additionally, it would help if you could provide information about the OpenShift cluster (e.g. version) on which you are running.

Comment 3 Shirly Radco 2018-01-16 11:29:13 UTC
Rich, can you please help me provide the required information?

Comment 4 Rich Megginson 2018-01-16 12:58:51 UTC
(In reply to Jeff Cantrill from comment #2)
> Can you provide additional information about your environment so we can
> understand whether your deployment is properly sized for your requirements?
> Additionally, it would help if you could provide information about the
> OpenShift cluster (e.g. version) on which you are running.

@shirly - run https://github.com/openshift/origin-aggregated-logging/blob/master/hack/logging-dump.sh
tar up the resulting files/dirs and attach to this bz
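
For reference, a minimal sketch of those steps (the script path comes from the repo above; the archive name and output glob are illustrative, and this assumes an oc session logged in with access to the logging project):

    # Fetch and run the dump script against the currently logged-in cluster.
    curl -sO https://raw.githubusercontent.com/openshift/origin-aggregated-logging/master/hack/logging-dump.sh
    chmod +x logging-dump.sh
    ./logging-dump.sh
    # Tar up the resulting files/dirs (names are illustrative) and attach to this bz.
    tar czf logging-dump.tar.gz logging-*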

The real problem is that there is no way to tune the Kibana timeout...

Comment 6 Shirly Radco 2018-01-17 09:32:46 UTC
Created attachment 1382317 [details]
logging-20180117_111929.tar.gz

Comment 7 Shirly Radco 2018-01-17 09:34:50 UTC
I uploaded the report Rich recommended.
Please give this a high priority, since users will not be able to view dashboards based on metrics for a period greater than 12 hours, which is a major issue.

Comment 8 Jeff Cantrill 2018-01-18 13:22:55 UTC
@Rich, the posted PR will allow you to modify the request timeout. Pending review of the attached report, this may or may not resolve the issue. I suspect additional performance tuning is required.
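
For illustration, setting the timeout could then look like the following (the DC name logging-kibana, the logging namespace, and the value are assumptions; only the ELASTICSEARCH_REQUESTTIMEOUT variable itself comes from the PR):

    # Raise the Kibana -> Elasticsearch request timeout on the deployment config.
    # Value assumed to be milliseconds (600000 ms = 10 minutes); illustrative only.
    oc set env dc/logging-kibana ELASTICSEARCH_REQUESTTIMEOUT=600000 -n logging
    # Changing the DC env triggers a new rollout, so Kibana restarts with the setting.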

Comment 9 openshift-github-bot 2018-01-18 17:21:30 UTC
Commits pushed to master at https://github.com/openshift/origin-aggregated-logging

https://github.com/openshift/origin-aggregated-logging/commit/969e1302a071ed3549679240892413a37f349297
bug 1530157. Configure Kibana timeout via env var

https://github.com/openshift/origin-aggregated-logging/commit/6f07813f44d0d1a1e86fea9ab2b0f9cfda280fbb
Merge pull request #905 from jcantrill/bz1530157_config_kibana_timeout

Automatic merge from submit-queue.

bug 1530157. Configure Kibana timeout via env var

This PR allows you to modify the Kibana config via env vars

Comment 10 Anping Li 2018-01-22 10:42:00 UTC
Created attachment 1384344 [details]
The kibana debug logs

Kibana reported "Payload timeout must be shorter than socket timeout" when I set ELASTICSEARCH_REQUESTTIMEOUT=1 in the DC.
What is the socket timeout value? What value should be used for ELASTICSEARCH_REQUESTTIMEOUT?


image: logging-kibana/images/v3.9.0-0.22.0.0



    spec:
      containers:
      - env:
        - name: ES_HOST
          value: logging-es
        - name: ES_PORT
          value: "9200"
        - name: DEBUG
          value: "true"
        - name: KIBANA_MEMORY_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: kibana
              divisor: "0"
              resource: limits.memory
        - name: ELASTICSEARCH_REQUESTTIMEOUT
          value: "1"

Comment 11 Jeff Cantrill 2018-01-22 14:02:56 UTC
It looks like you attempted to test the same way I did. The error is explained here [1]; it looks like the value must be greater than 10s.

[1] https://stackoverflow.com/questions/48117400/adding-requesttimeout-causes-kibana-to-fail-at-startup
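
Following that constraint (a hedged reading: the variable appears to map to Kibana's elasticsearch.requestTimeout, which is expressed in milliseconds, so "1" means 1 ms and falls below the ~10s payload timeout), any value comfortably above 10000 should start cleanly, e.g.:

    # 30000 ms (~30s) matches Kibana's stock requestTimeout default and clears the
    # payload/socket check; value and namespace are illustrative.
    oc set env dc/logging-kibana ELASTICSEARCH_REQUESTTIMEOUT=30000 -n logging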

Comment 12 Anping Li 2018-01-23 04:36:57 UTC
Verified that ELASTICSEARCH_REQUESTTIMEOUT can be set via the environment in logging-kibana/images/v3.9.0-0.22.0.0.
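
A rough sketch of that verification (standard oc commands; the namespace and grep patterns are assumptions):

    # Confirm the variable is set on the DC and that the new Kibana pod starts
    # without the "Payload timeout must be shorter than socket timeout" error.
    oc set env dc/logging-kibana --list -n logging | grep ELASTICSEARCH_REQUESTTIMEOUT
    oc rollout status dc/logging-kibana -n logging
    oc logs dc/logging-kibana -c kibana -n logging | grep -i timeout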

Comment 13 Jeff Cantrill 2018-02-13 18:33:29 UTC
*** Bug 1509025 has been marked as a duplicate of this bug. ***

