Bug 1324357 - The delay from logging deployment to when logs show up in kibana is too long
Summary: The delay from logging deployment to when logs show up in kibana is too long
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Luke Meyer
QA Contact: chunchen
URL:
Whiteboard:
Depends On: 1325727
Blocks:
 
Reported: 2016-04-06 07:11 UTC by chunchen
Modified: 2016-09-30 02:16 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-12 16:35:41 UTC
Target Upstream Version:
Embargoed:


Attachments
screenshot for kibana console (97.01 KB, image/png)
2016-04-06 07:12 UTC, chunchen


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2016:1064 0 normal SHIPPED_LIVE Important: Red Hat OpenShift Enterprise 3.2 security, bug fix, and enhancement update 2016-05-12 20:19:17 UTC

Description chunchen 2016-04-06 07:11:25 UTC
Description of problem:
The log view is empty when trying to check the pod logs for the project; please refer to the screenshot in the attachment.

Version-Release number of selected component (if applicable):

Reproduced on both OSE 3.2 and 3.1.

logging-deployment      3.1.1-12            1889baecfc21
logging-fluentd         3.1.1-9             6a4bfd80f3eb
logging-elasticsearch   3.1.1-9             c0901c52554b
logging-kibana          3.1.1-7             3ce38d905617
logging-auth-proxy      latest              3d6792a3aeed

How reproducible:
always

Steps to Reproduce:
1. Log in to OpenShift and create a project:
oc new-project logging
2. Create the supporting service account and deployer secret:
oc create -f - <<API
apiVersion: v1
kind: ServiceAccount
metadata:
 name: logging-deployer
secrets:
- name: logging-deployer
API
oc secrets new logging-deployer nothing=/dev/null

# Log in to the master node and run the commands below:
oadm policy add-role-to-user edit system:serviceaccount:logging:logging-deployer

oadm policy add-scc-to-user privileged system:serviceaccount:logging:aggregated-logging-fluentd

oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:logging:aggregated-logging-fluentd

3. Go back to the oc client command line and make sure you are in the logging project, then run the deployer (hypothetical example values for the variables below are sketched after these steps):
wget https://raw.githubusercontent.com/openshift/origin-aggregated-logging/master/deployment/deployer.yaml
oc process -f deployer.yaml \
  -v IMAGE_PREFIX=${image_prefix},KIBANA_HOSTNAME=${kibana_route},PUBLIC_MASTER_URL=https://${master_dns}:8443,ES_INSTANCE_RAM=1024M,ES_CLUSTER_SIZE=1,IMAGE_VERSION=latest \
  | oc create -f -

4. Wait for the deployer pod to complete.
5. Run "oc process logging-support-template | oc create -f -"
6. After the ES and Kibana pods are running, scale up the fluentd rc:
oc scale rc logging-fluentd-1 --replicas=1
7. Wait for all pods to be running, then check the logs via the Kibana console.
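
For reference, the ${image_prefix}, ${kibana_route} and ${master_dns} variables in step 3 are placeholders. A minimal sketch of setting them, using purely hypothetical example values, plus a couple of sanity checks worth running before looking at the Kibana console (not part of the original reproduction steps):

# Hypothetical example values; substitute your own registry prefix, Kibana route and master DNS name
image_prefix=registry.access.redhat.com/openshift3/
kibana_route=kibana.example.com
master_dns=master.example.com

# Before checking Kibana, confirm the EFK pods are up and fluentd is not logging errors
oc get pods
oc logs logging-fluentd-1-<pod-suffix>   # use the actual fluentd pod name from 'oc get pods'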

Actual results:
No logs are shown in the Kibana console.

Expected results:
Pod logs should be shown in the Kibana console.

Additional info:

Comment 1 chunchen 2016-04-06 07:12:58 UTC
Created attachment 1144116 [details]
screenshot for kibana console

Comment 3 chunchen 2016-04-06 07:44:39 UTC
After I re-deployed the EFK stack and waited about one hour, the logs were shown in the Kibana console. Is that too long a delay before logs show up in the Kibana console?

Comment 4 Luke Meyer 2016-04-06 08:13:58 UTC
This may be normal, though if so we need to document it. I believe fluentd first processes the system /var/log/messages archives before getting to the pod logs. I need to look through the code to confirm that is still true.

Comment 5 Luke Meyer 2016-04-08 15:12:53 UTC
For version 3.1.1, fluentd imports the /var/log/messages archives before the pod logs. Depending on how much is in those archives, this can cause quite a delay before the pod logs are reached. This isn't a regression; it has always worked that way.

For version 3.2 the import order is reversed. So in a typical deployment where you only have a few pods, it will run through their logs quickly before moving on to the system logs that are likely to be much longer. Can you verify the delay is improved with 3.2?

I can add some troubleshooting documentation specific to 3.1 that calls out this problem.
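
As an illustrative aside (not part of the original comment), one way to gauge how long the 3.1.1 import is likely to take is to check how much data sits in the system log archives on a node, since fluentd reads those before the pod logs in that version:

# On a node: rough size and line count of the syslog archives fluentd imports first (3.1.x behavior)
ls -lh /var/log/messages*
wc -l /var/log/messages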

Comment 6 chunchen 2016-04-13 02:42:03 UTC
It's fixed. Tested with the latest images below; the logs now show up in the Kibana console in a timely manner.

openshift3/logging-deployment 3.2.0 3c4f9330894b
openshift3/logging-elasticsearch 3.2.0 f4c2de05eadf
openshift3/logging-fluentd 3.2.0 af009c973eaa
openshift3/logging-kibana 3.2.0 23bf82ad03f8
openshift3/logging-auth-proxy 3.2.0 363e6ee61a08

Comment 8 errata-xmlrpc 2016-05-12 16:35:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064

