Bug 1462584

Summary: [upgrade]Empty pods' log after upgrade to OCP 3.6
Product: OpenShift Container Platform Reporter: Junqi Zhao <juzhao>
Component: Logging Assignee: Jeff Cantrill <jcantril>
Status: CLOSED NOTABUG QA Contact: Xia Zhao <xiazhao>
Severity: low Docs Contact:
Priority: medium    
Version: 3.6.0 CC: aos-bugs, juzhao, nhosoi, pportant
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-25 20:38:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
es log none

Description Junqi Zhao 2017-06-19 01:35:43 UTC
Description of problem:
Deployed logging 3.5 on OCP 3.5.5.25, then upgraded OCP to 3.6.112. After the upgrade, the es and kibana pods' logs are empty.

Version-Release number of selected component (if applicable):
# openshift version
openshift v3.6.112
kubernetes v1.6.1+5115d708d7
etcd 3.2.0

# oc get po
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-sdj1h       1/1       Running   0          5h
logging-es-sxdc14oi-1-mxpcm   1/1       Running   0          5h
logging-fluentd-dc7t4         1/1       Running   0          2d
logging-fluentd-n3dvf         1/1       Running   3          2d
logging-kibana-1-d4bls        2/2       Running   1          5h

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging on OCP 3.5.5.25
2. Upgrade OCP to 3.6.112
3. Check logging pods' log.

Actual results:
3. The es and kibana pods' logs are empty

Expected results:
Logs should be retrievable for all pods

Additional info:

Comment 1 Jeff Cantrill 2017-06-20 02:27:10 UTC
Please provide additional information on how you are retrieving the logs.  The ES log appender was changed [1] so that ES does not degrade due to a logging feedback loop in which it ingests its own logs.

When you say the Kibana logs are empty, do you mean the output of 'oc logs $KIBANA_POD'?

Please provide as many of the relevant details of the logging stack as you can, as identified here: https://github.com/openshift/origin-aggregated-logging/blob/master/docs/issues.md

[1] https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_logging/defaults/main.yml#L90

Comment 2 Junqi Zhao 2017-06-20 06:40:27 UTC
It is a little odd: the kibana pod's log can be viewed in the console, but it contains only a few lines. The es pod's log cannot be viewed in the console, but it can be found under /elasticsearch/logging-es/logs in the es pod; see the attached log.
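
The file location mentioned above can be read without opening a shell in the pod. The following is a minimal sketch, assuming the file is named logging-es.log under that directory (as the directory listing later in this report shows). The function name es_file_log and the default of 50 lines are illustrative only and are not part of any OpenShift tooling.

```shell
# Print the last N lines of the Elasticsearch log file stored inside the pod.
# Only "oc exec ... -- cat" runs in the pod; tail runs on the local machine.
es_file_log() {
    pod="$1"
    lines="${2:-50}"   # hypothetical default, adjust as needed
    oc exec "$pod" -- cat /elasticsearch/logging-es/logs/logging-es.log | tail -n "$lines"
}

# Usage (pod name taken from the "oc get po" output in this report):
#   es_file_log logging-es-sxdc14oi-1-mxpcm 100
```

Piping through a local tail keeps the in-pod command to the same cat invocation used elsewhere in this thread, so nothing beyond cat is assumed to exist in the image.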

Comment 3 Junqi Zhao 2017-06-20 06:41:08 UTC
Created attachment 1289420 [details]
es  log

Comment 4 Noriko Hosoi 2017-08-08 00:08:00 UTC
Hi @Junqi,

Is the symptom reproducible?

Every time you upgrade from 3.5 to 3.6, you see empty log output from "oc logs"?  It's not clear to me the issue was triggered by the upgrade or it had existed before the upgrade...  Did you have a chance to run "oc logs" on 3.5 just before upgrading it to 3.6?

Also, if you can reproduce the problem, could you gather more info?

For instance, please make sure oc logs returns nothing.
$ oc logs $espod
$
Then, could you try oc logs options such as these?
  --limit-bytes=1024: Maximum bytes of logs to return. Defaults to no limit.
  --previous=true: If true, print the logs for the previous instance of the container in a pod if it exists.

$ oc exec $espod -- ls -l /elasticsearch/logging-es/logs
total 32
-rw-r--r--. 1 1000050000 root  8836 Aug  8 00:01 logging-es.log
-rw-r--r--. 1 1000050000 root 19168 Aug  7 23:55 logging-es.log.2017-08-07
....

$ oc exec $espod -- id

$ oc exec $espod -- cat /elasticsearch/logging-es/logs/logging-es.log

Any other inputs would be appreciated.
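
The checks requested above could be gathered in one pass with a small wrapper. This is only a sketch: collect_es_log_info is a hypothetical helper name, and it uses nothing beyond the oc commands and options already listed in this comment.

```shell
# Gather the information requested above for one Elasticsearch pod.
# collect_es_log_info is a hypothetical helper name.
collect_es_log_info() {
    espod="$1"
    logdir=/elasticsearch/logging-es/logs

    echo "== oc logs (first 1024 bytes) =="
    oc logs --limit-bytes=1024 "$espod"

    echo "== oc logs --previous =="
    # oc returns an error when the container has no previous instance.
    oc logs --previous=true "$espod" || echo "(no previous container instance)"

    echo "== log directory listing =="
    oc exec "$espod" -- ls -l "$logdir"

    echo "== container user =="
    oc exec "$espod" -- id

    echo "== logging-es.log =="
    oc exec "$espod" -- cat "$logdir/logging-es.log"
}

# Usage: collect_es_log_info "$espod" > es-log-info.txt
```

Redirecting the output to a file, as in the usage line, gives a single attachment for the bug report.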

Comment 5 Junqi Zhao 2017-08-08 01:26:46 UTC
(In reply to Noriko Hosoi from comment #4)
> Hi @Junqi,
> 
> Is the symptom reproducible?
> 
> Every time you upgrade from 3.5 to 3.6, you see empty log output from "oc
> logs"?  It's not clear to me the issue was triggered by the upgrade or it
> had existed before the upgrade...  Did you have a chance to run "oc logs" on
> 3.5 just before upgrading it to 3.6?

I do not have an environment available right now; I will attach more info the next time I do upgrade testing.

Comment 6 Jeff Cantrill 2017-08-25 20:38:16 UTC
Closing as this is not a bug but by design [1].  This was changed to avoid a feedback loop observed by Peter Portante where ES ingests its own logs and can get into a state where it is no longer usable.

[1] https://github.com/openshift/openshift-ansible/pull/4928
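
To tell this by-design behavior apart from a pod that is producing no output at all, one can compare the console output with the log file inside the pod. The following is a rough sketch, assuming the log path reported earlier in this thread. The function name es_log_destination and its three result strings are made up for illustration.

```shell
# Report where a pod's Elasticsearch output can be found: on the console,
# only in the file inside the pod, or nowhere.
# es_log_destination is a hypothetical name.
es_log_destination() {
    pod="$1"
    # Byte counts; tr strips the padding some wc implementations add.
    console_bytes=$(oc logs "$pod" 2>/dev/null | wc -c | tr -d ' ')
    file_bytes=$(oc exec "$pod" -- cat /elasticsearch/logging-es/logs/logging-es.log 2>/dev/null | wc -c | tr -d ' ')

    if [ "$console_bytes" -gt 0 ]; then
        echo "console"
    elif [ "$file_bytes" -gt 0 ]; then
        echo "file-only"      # the by-design case described in this report
    else
        echo "no-output"      # neither source has content; worth investigating
    fi
}

# Usage: es_log_destination "$espod"
```

A result of file-only matches the situation in this report: an empty 'oc logs' alongside a populated log file.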