Bug 1420219 - No log entry can be found in Kibana UI after deploying logging stacks with ansible
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Rich Megginson
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On: 1421563
Blocks:
Reported: 2017-02-08 08:46 UTC by Junqi Zhao
Modified: 2017-07-24 14:11 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-12 19:01:00 UTC
Target Upstream Version:
Embargoed:


Attachments
Kibana log (89.66 KB, text/plain)
2017-02-08 08:46 UTC, Junqi Zhao
fluentd log (11.39 KB, text/plain)
2017-02-08 08:49 UTC, Junqi Zhao
screenshot of kibana #1: No results found (154.96 KB, image/png)
2017-02-16 08:18 UTC, Xia Zhao
screenshot of kibana #2: fields can be got in Settings tab (154.92 KB, image/png)
2017-02-16 08:19 UTC, Xia Zhao
curator logs (310 bytes, text/plain)
2017-02-16 08:23 UTC, Xia Zhao
fluentd log (4.43 KB, text/plain)
2017-02-16 08:24 UTC, Xia Zhao
es log (31.84 KB, text/plain)
2017-02-16 08:24 UTC, Xia Zhao
kibana log (148.64 KB, text/plain)
2017-02-16 08:25 UTC, Xia Zhao
curator log (310 bytes, text/plain)
2017-02-16 08:25 UTC, Xia Zhao
inventory file I used for logging 3.5.0 deployment (829 bytes, text/plain)
2017-02-17 09:58 UTC, Xia Zhao
The json output of fluentd daemonset (7.39 KB, text/plain)
2017-02-17 10:00 UTC, Xia Zhao
Kibana UI snapshot (144.88 KB, image/png)
2017-02-21 03:22 UTC, Junqi Zhao
Kibana Settings tab snapshot (144.98 KB, image/png)
2017-02-21 03:23 UTC, Junqi Zhao
es pod log (29.29 KB, text/plain)
2017-03-06 09:22 UTC, Junqi Zhao
.all indicies in kibana (148.69 KB, image/png)
2017-03-06 09:23 UTC, Junqi Zhao
.operations indicies in kibana (156.75 KB, image/png)
2017-03-06 09:24 UTC, Junqi Zhao
logging project indicies in kibana (110.79 KB, image/png)
2017-03-06 09:24 UTC, Junqi Zhao
fluentd log, json-file as logging driver (17.79 KB, text/plain)
2017-03-07 06:18 UTC, Junqi Zhao


Links
System: Red Hat Product Errata
ID: RHBA-2017:0903
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: OpenShift Container Platform atomic-openshift-utils bug fix and enhancement
Last Updated: 2017-04-12 22:45:42 UTC

Description Junqi Zhao 2017-02-08 08:46:32 UTC
Created attachment 1248560 [details]
Kibana log

Description of problem:
Fluentd does not collect log entries after the logging stack is deployed with ansible, and no logs can be found in Kibana. The json-file and journald log drivers both show the same issue.


# oc get po
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-3br35       1/1       Running   1          1h
logging-es-j43rhofz-1-qq38v   1/1       Running   0          1h
logging-fluentd-drghr         1/1       Running   0          1h
logging-kibana-1-89tvx        2/2       Running   0          1h


Version-Release number of selected component (if applicable):
image id:
openshift3/logging-elasticsearch    7605f043d232
openshift3/logging-kibana    e0ab09c2cbeb
openshift3/logging-fluentd    47057624ecab
openshift3/logging-auth-proxy    139f7943475e
openshift3/logging-curator    7f034fdf7702

# openshift version
openshift v3.5.0.17+c55cf2b
kubernetes v1.5.2+43a9be4
etcd 3.1.0


How reproducible:
Always

Steps to Reproduce:
1. prepare the inventory file

[oo_first_master]
$master-public-dns ansible_user=root ansible_ssh_user=root ansible_ssh_private_key_file="~/cfile/libra.pem" openshift_public_hostname=$master-public-dns

[oo_first_master:vars]
deployment_type=openshift-enterprise
openshift_release=v3.5.0
openshift_logging_install_logging=true

openshift_logging_kibana_hostname=kibana.$sub-domain
public_master_url=https://$master-public-dns:8443

openshift_logging_image_prefix=registry.ops.openshift.com/openshift3/
openshift_logging_image_version=3.5.0

openshift_logging_namespace=juzhao
openshift_logging_fluentd_use_journal=false/true

2. Run the playbook from a control machine (my laptop) that is not oo_master:
git clone https://github.com/openshift/openshift-ansible
ansible-playbook -vvv -i ~/inventory   playbooks/common/openshift-cluster/openshift_logging.yml

3. Log in to the Kibana UI and look for logs.

Actual results:
Searching the log indices finds only the kibana, operations, and searchguard indices; no log entries can be found in the Kibana UI.
# oc exec logging-curator-1-3br35 -- curator --host logging-es --use_ssl --certificate /etc/curator/keys/ca --client-cert /etc/curator/keys/cert --client-key /etc/curator/keys/key --loglevel ERROR show indices --all-indices
.kibana
.kibana.ef0b7ff169fdc9202e567ce53aa5e17320cb2d7d
.operations.2017.02.08
.searchguard.logging-es-j43rhofz-1-qq38v

PS: The fluentd log contains many warnings with error="no implicit conversion of nil into String"; see the attached fluentd log.
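
The index listing above can also be checked mechanically. The following is a minimal sketch, not part of the original report; `classify_indices` is a hypothetical helper, and the assumption that application-log indices carry a "project." prefix is mine (it matches what a later comment in this bug expects to see).

```python
# Hypothetical helper: group index names from the curator listing.
# Assumption: application-log indices are named with a "project." prefix.
def classify_indices(names):
    groups = {"system": [], "operations": [], "project": [], "other": []}
    for name in names:
        if name.startswith(".kibana") or name.startswith(".searchguard"):
            groups["system"].append(name)
        elif name.startswith(".operations."):
            groups["operations"].append(name)
        elif name.startswith("project."):
            groups["project"].append(name)
        else:
            groups["other"].append(name)
    return groups


# Index names copied from the curator output in this report.
listing = """.kibana
.kibana.ef0b7ff169fdc9202e567ce53aa5e17320cb2d7d
.operations.2017.02.08
.searchguard.logging-es-j43rhofz-1-qq38v"""

groups = classify_indices(listing.splitlines())
if not groups["project"]:
    print("no project indices found")
```

With the listing from this report the "project" group stays empty, which is the symptom being reported.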


Expected results:
Log entries can be found in the Kibana UI.

Additional info:
The Kibana and fluentd logs are attached.

Comment 1 Junqi Zhao 2017-02-08 08:49:02 UTC
Created attachment 1248561 [details]
fluentd log

Comment 2 Jeff Cantrill 2017-02-08 15:20:36 UTC
Possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=1420204

Comment 3 Peter Portante 2017-02-09 21:49:00 UTC
The logs seem to show that fluentd pods can't talk to the ES cluster.  I wonder if this is an under-provisioned, or over-taxed ES deployment.

Comment 4 Rich Megginson 2017-02-09 21:51:04 UTC
There are a lot of warning info "error="no implicit conversion of nil into String"" in fluentd log, see the attached fluentd log.

Unfortunately it doesn't tell us which field, but if the problem is that it is using "time" instead of "@timestamp", you would see the above error.

I think this is related to https://bugzilla.redhat.com/show_bug.cgi?id=1420234 which also has the same symptoms with "time"/"@timestamp".
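
To illustrate the suspected field mismatch: a record that carries "time" but no "@timestamp" has no value for anything that reads "@timestamp". The sketch below only shows the rename itself. It is not the filter's actual implementation, and `rename_time_field` and the sample record are hypothetical.

```python
# Hypothetical illustration of a "time" -> "@timestamp" rename.
# This is not the real fluentd filter; it only shows the field move.
def rename_time_field(record):
    """Return a copy of record with "time" moved to "@timestamp"."""
    out = dict(record)
    if "time" in out and "@timestamp" not in out:
        out["@timestamp"] = out.pop("time")
    return out


# Made-up sample record for illustration only.
sample = {"time": "2017-02-08T08:46:32Z", "message": "hello"}
renamed = rename_time_field(sample)
print(sorted(renamed))
```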

Comment 5 Rich Megginson 2017-02-09 22:42:08 UTC
The "@timestamp" field is missing because the common data model filter is missing.  Here is the fluent.conf as deployed by ansible 3.5:

  @include configs.d/openshift/filter-syslog-record-transform.conf
  @include configs.d/openshift/filter-post-*.conf

It should look like this:

  @include configs.d/openshift/filter-syslog-record-transform.conf
  @include configs.d/openshift/filter-common-data-model.conf
  @include configs.d/openshift/filter-post-*.conf

PR: https://github.com/openshift/openshift-ansible/pull/3323
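
A deployed fluent.conf could be checked for this include with something like the sketch below. The helper names are hypothetical; the two config snippets are the ones quoted in this comment.

```python
def include_targets(conf_text):
    """Return the targets of @include directives, in file order."""
    targets = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@include "):
            targets.append(stripped.split(None, 1)[1])
    return targets


def has_filter_between(conf_text, wanted, before, after):
    """True if `wanted` is included after `before` and ahead of `after`."""
    targets = include_targets(conf_text)
    try:
        i = targets.index(before)
        j = targets.index(wanted)
        k = targets.index(after)
    except ValueError:
        # One of the three includes is absent.
        return False
    return i < j < k


SYSLOG = "configs.d/openshift/filter-syslog-record-transform.conf"
CDM = "configs.d/openshift/filter-common-data-model.conf"
POST = "configs.d/openshift/filter-post-*.conf"

deployed = """
  @include configs.d/openshift/filter-syslog-record-transform.conf
  @include configs.d/openshift/filter-post-*.conf
"""

expected = """
  @include configs.d/openshift/filter-syslog-record-transform.conf
  @include configs.d/openshift/filter-common-data-model.conf
  @include configs.d/openshift/filter-post-*.conf
"""

print(has_filter_between(deployed, CDM, SYSLOG, POST))
print(has_filter_between(expected, CDM, SYSLOG, POST))
```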

Comment 6 Rich Megginson 2017-02-10 01:15:07 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/f9daaa768ec415c2ae83a8c42ae35776d996083e
Bug 1420234 - illegal_argument_exception in Kibana UI.

https://bugzilla.redhat.com/show_bug.cgi?id=1420234
The problem is that the fluent.conf is missing the common data model
filter which renames the "time" field to the "@timestamp" field.

https://github.com/openshift/openshift-ansible/commit/9e92660ed86623385e48c4960f4db681d86a7a57
Merge pull request #3323 from richm/missing-common-data-model-filter

Bug 1420234 - illegal_argument_exception in Kibana UI.

Moving to MODIFIED because fix committed upstream - not sure if QE will test with upstream or will wait for RH openshift-ansible package to contain fix.

Comment 8 Xia Zhao 2017-02-16 06:02:55 UTC
Tested with the dev's repo (to work around the Kibana route issue, bug #1421563), which contains the fix in https://github.com/ewolinetz/openshift-ansible/blob/logging_fix_kibana_routes/roles/openshift_logging/files/fluent.conf#L25. The original issue still reproduces: no log entries are displayed in the Kibana UI.

Comment 9 Xia Zhao 2017-02-16 08:16:43 UTC
More info:

1. No log entries exist in ES when I queried it with index * and ident:*:
# oc exec logging-es-w0uzrr7f-1-64h1v -- curl -s -k --cert    /etc/elasticsearch/secret/admin-cert --key    /etc/elasticsearch/secret/admin-key https://logging-es:9200/*/_count\?q=ident:*  | python -mjson.tool | more
{
    "_shards": {
        "failed": 0,
        "successful": 6,
        "total": 6
    },
    "count": 0
}

2. The indices are listed in the left panel in Kibana, but "No results found" is shown when the UI time filter is set to "this year" (see attached screenshot #1).

3. In the Settings tab, each index shows a field count, e.g. fields(23) for index .operations* (see attached screenshot #2).

4. The fluentd logs show many "temporarily failed to flush the buffer" errors and end with "retry succeeded" messages that include a plugin_id.

5. Fluentd was already configured with "USE_JOURNAL=true", which matches the default docker log driver on the master/node:

# oc rsh logging-fluentd-qcjrk
sh-4.2# env | grep JOUR
USE_JOURNAL=true
JOURNAL_READ_FROM_HEAD=
JOURNAL_SOURCE=


6. The Kibana, ES, and curator logs appear to be fine (attached).
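
For reference, the `_count` response in item 1 can be read as follows: every shard answered and none failed, so the query itself ran cleanly and simply matched no documents. A minimal sketch of that reading, with `summarize_count` as a hypothetical helper:

```python
import json


def summarize_count(response_text):
    """Summarize an Elasticsearch _count response body."""
    body = json.loads(response_text)
    shards = body["_shards"]
    return {
        "count": body["count"],
        "all_shards_ok": shards["failed"] == 0
        and shards["successful"] == shards["total"],
    }


# Response body copied from item 1 above.
response = """
{
    "_shards": {
        "failed": 0,
        "successful": 6,
        "total": 6
    },
    "count": 0
}
"""

summary = summarize_count(response)
print(summary)
```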

Comment 10 Xia Zhao 2017-02-16 08:18:11 UTC
Created attachment 1250796 [details]
screenshot of kibana #1: No results found

Comment 11 Xia Zhao 2017-02-16 08:19:01 UTC
Created attachment 1250797 [details]
screenshot of kibana #2: fields can be got in Settings tab

Comment 12 Xia Zhao 2017-02-16 08:23:12 UTC
Created attachment 1250800 [details]
curator logs

Comment 13 Xia Zhao 2017-02-16 08:24:13 UTC
Created attachment 1250802 [details]
fluentd log

Comment 14 Xia Zhao 2017-02-16 08:24:38 UTC
Created attachment 1250803 [details]
es log

Comment 15 Xia Zhao 2017-02-16 08:25:00 UTC
Created attachment 1250804 [details]
kibana log

Comment 16 Xia Zhao 2017-02-16 08:25:24 UTC
Created attachment 1250805 [details]
curator log

Comment 20 Xia Zhao 2017-02-17 09:58:53 UTC
Created attachment 1251862 [details]
inventory file I used for logging 3.5.0 deployment

Comment 21 Xia Zhao 2017-02-17 10:00:11 UTC
Created attachment 1251864 [details]
The json output of fluentd daemonset

Comment 22 Rich Megginson 2017-02-17 15:24:58 UTC
I'm not sure what's going on with Kibana.  Was this a brand new install, or an upgrade of a previously failed install?  What I see in Kibana is as described in the attached screenshots.  In the Discover tab I see ".all", ".operations", and two namespace indices which are not prefixed by "project.".  In the Settings tab the fields in these indices are not the common data model fields.  All of this means this data/indices/configuration was created by a 3.3 deployment.

If I go into the Settings tab, delete the ".operations" index, and recreate it, and set it as the default, then go back to the Discover tab, everything is working as expected.  I can also go into Settings and add "project.logging.*" and "project.install-test.*", "project.logging.*" etc. and see the data as expected for those indices.

Please confirm if this was an upgrade or fresh install.

Comment 23 Xia Zhao 2017-02-20 02:00:47 UTC
@rmeggins I'm doing a fresh install with this playbook: https://github.com/openshift/openshift-ansible/blob/master/playbooks/common/openshift-cluster/openshift_logging.yml. It seems there is something wrong with the ansible install for logging.

Comment 26 Junqi Zhao 2017-02-21 03:22:56 UTC
Created attachment 1255922 [details]
Kibana UI snapshot

Comment 27 Junqi Zhao 2017-02-21 03:23:57 UTC
Created attachment 1255923 [details]
Kibana Settings tab snapshot

Comment 28 Rich Megginson 2017-02-21 03:37:25 UTC
Which ansible version are you using?  Are you using ansible from a git clone, or are you using some version of the openshift-ansible rpm packages?

Comment 29 Junqi Zhao 2017-02-21 05:19:26 UTC
@rmeggins
installed from rpm packages

ansible version: 2.2.0.0, 

# rpm -qa | grep ansible
ansible-2.2.0.0-3.fc24.noarch

It was installed from our puddle server 'puddles/RHAOS/AtomicOpenShift-errata/3.4/' at Dec. 22, 2016

Comment 30 Junqi Zhao 2017-02-21 06:37:49 UTC
@rmeggins
playbooks are git cloned from https://github.com/openshift/openshift-ansible/ when running the ansible installation.

Comment 31 Rich Megginson 2017-02-21 16:38:27 UTC
(In reply to Junqi Zhao from comment #29)
> @rmeggins
> installed from rpm packages

I don't mean ansible itself, I mean the playbooks used to install openshift ansible.  These are either from git clone (and if you are using a git clone of https://github.com/openshift/openshift-ansible, please indicate which branch you are using, and which commit is the HEAD commit of the branch), or you can get the playbooks from the openshift-ansible RPM package.

> 
> ansible version: 2.2.0.0, 
> 
> # rpm -qa | grep ansible
> ansible-2.2.0.0-3.fc24.noarch
> 
> It was installed from our puddle server
> 'puddles/RHAOS/AtomicOpenShift-errata/3.4/' at Dec. 22, 2016

What is "It"?  The ansible RPM package or something else?

Comment 32 Rich Megginson 2017-02-21 16:38:57 UTC
(In reply to Junqi Zhao from comment #30)
> @rmeggins
> playbooks are git cloned from
> https://github.com/openshift/openshift-ansible/ when running the ansible
> installation.

What branch?  master?  What is the HEAD commit?

Comment 33 Rich Megginson 2017-02-21 20:21:26 UTC
Submitted PR for fix: https://github.com/openshift/openshift-ansible/pull/3444

Comment 34 openshift-github-bot 2017-02-21 20:28:51 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/cf9ad2651915c2c328c4402e410feb4c7abd122f
Bug 1420219 - No log entry can be found in Kibana UI after deploying logging stacks with ansible

https://bugzilla.redhat.com/show_bug.cgi?id=1420219
The elasticsearch config was missing the common data model stanza

https://github.com/openshift/openshift-ansible/commit/8b27f4909d91850dad92e7bcbd660c0b674876c8
Merge pull request #3444 from richm/bug-1420219

Bug 1420219 - No log entry can be found in Kibana UI after deploying logging stacks with ansible

Comment 35 Junqi Zhao 2017-02-22 00:29:52 UTC
(In reply to Rich Megginson from comment #32)
> (In reply to Junqi Zhao from comment #30)
> > @rmeggins
> > playbooks are git cloned from
> > https://github.com/openshift/openshift-ansible/ when running the ansible
> > installation.
> 
> What branch?  master?  What is the HEAD commit?

Yes, I used master branch and git cloned every day in case there are changes in openshift-ansible

Comment 36 Rich Megginson 2017-02-22 01:04:35 UTC
(In reply to Junqi Zhao from comment #35)
> (In reply to Rich Megginson from comment #32)
> > (In reply to Junqi Zhao from comment #30)
> > > @rmeggins
> > > playbooks are git cloned from
> > > https://github.com/openshift/openshift-ansible/ when running the ansible
> > > installation.
> > 
> > What branch?  master?  What is the HEAD commit?
> 
> Yes, I used master branch and git cloned every day in case there are changes
> in openshift-ansible

Thanks.  I think I have identified the problem - see https://bugzilla.redhat.com/show_bug.cgi?id=1420219#c34

Comment 37 Junqi Zhao 2017-02-22 05:54:18 UTC
Tested with the latest puddle and openshift-ansible playbooks; log entries can now be found in the Kibana UI.

Image id:
openshift3/logging-elasticsearch    d715f4d34ad4
openshift3/logging-kibana    e0ab09c2cbeb
openshift3/logging-fluentd    47057624ecab
openshift3/logging-auth-proxy    139f7943475e
openshift3/logging-curator    7f034fdf7702

Comment 38 Junqi Zhao 2017-03-06 09:21:05 UTC
This issue reproduced in today's testing with both the json-file and journald log drivers: log entries are not shown in the Kibana UI.

See the attached pictures: indices for .all and .operations are shown under the Kibana 'Settings -> Indices' tab, but there is no index for user projects, such as the logging project.

The attached ES log shows no update_mapping process for the logging project or other projects.

from https://github.com/openshift/openshift-ansible/pull/3323/files
it is:
@include configs.d/openshift/filter-common-data-model.conf

but now, it is:
@include configs.d/openshift/filter-viaq-data-model.conf

Did this cause the issue?

Comment 39 Junqi Zhao 2017-03-06 09:22:28 UTC
Created attachment 1260322 [details]
es pod log

Comment 40 Junqi Zhao 2017-03-06 09:23:25 UTC
Created attachment 1260324 [details]
.all indicies in kibana

Comment 41 Junqi Zhao 2017-03-06 09:24:01 UTC
Created attachment 1260325 [details]
.operations indicies in kibana

Comment 42 Junqi Zhao 2017-03-06 09:24:57 UTC
Created attachment 1260327 [details]
logging project indicies in kibana

Comment 43 Rich Megginson 2017-03-06 21:41:49 UTC
(In reply to Junqi Zhao from comment #38)
> from https://github.com/openshift/openshift-ansible/pull/3323/files
> it is :
> @include configs.d/openshift/filter-common-data-model.conf
> 
> but now, it is:
> @include configs.d/openshift/filter-viaq-data-model.conf
> 
> Did this cause the issue?

Yes.  Try these builds/images:

12696338 buildContainer (noarch) completed successfully
koji_builds:
  https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=542335
repositories:
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:rhaos-3.5-rhel-7-docker-candidate-20170306162647
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:3.5.0-3
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:3.5.0
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:latest
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:v3.5


commit 32720f438f97a5a8cb20d888708aeb8c90074313
Author: Rich Megginson <rmeggins>
Date:   Mon Mar 6 14:25:35 2017 -0700

    fix typo

commit 0c9e90e4f67df02b015ac3d877c214f5ec969668
Author: Rich Megginson <rmeggins>
Date:   Mon Mar 6 14:12:32 2017 -0700

    [Bug 1420219] No log entry can be found in Kibana UI after deploying logging stacks with ansible
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1420219
    The common data model plugin was renamed to viaq.
    We use the packaged viaq filter rpm now.
    This commit also adds some additional debugging capabilities when
    there are problems with data used to construct index names.
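
A mismatch of this kind (a fluent.conf that includes a filter file under a name the image does not ship) can be spotted by comparing the @include targets against the files actually present. The sketch below is only an illustration: `unmatched_includes` is a hypothetical helper, and the `available` list is made up, on the assumption that the older image still carried the common-data-model file name.

```python
from fnmatch import fnmatch


def unmatched_includes(conf_text, available_files):
    """Return @include targets that match none of the available files."""
    missing = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("@include "):
            continue
        target = stripped.split(None, 1)[1]
        # Targets may contain shell-style wildcards such as filter-post-*.conf.
        if not any(fnmatch(name, target) for name in available_files):
            missing.append(target)
    return missing


conf = """
@include configs.d/openshift/filter-syslog-record-transform.conf
@include configs.d/openshift/filter-viaq-data-model.conf
@include configs.d/openshift/filter-post-*.conf
"""

# Hypothetical file list for an image built before the rename.
available = [
    "configs.d/openshift/filter-syslog-record-transform.conf",
    "configs.d/openshift/filter-common-data-model.conf",
    "configs.d/openshift/filter-post-example.conf",
]

missing = unmatched_includes(conf, available)
print(missing)
```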

Comment 44 Junqi Zhao 2017-03-07 06:16:57 UTC
Log entries can be found in the Kibana UI with both the json-file and journald logging drivers.

Image id:
openshift3/logging-fluentd    8cd33de7939c
openshift3/logging-curator    8cfcb23f26b6
openshift3/logging-elasticsearch    d715f4d34ad4
openshift3/logging-kibana    e0ab09c2cbeb
openshift3/logging-auth-proxy    139f7943475e



With the json-file logging driver there are many warning/error messages in the fluentd log, but they do not seem to affect fluentd's function. I will continue to monitor and, if they persist, will file a low-severity defect. See the attached log file.
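
Comparing the image IDs in this comment with those in comment 37 shows which images changed between the two test runs. A small sketch of that comparison follows; the helper names are hypothetical and the ID lists are copied from the two comments.

```python
def parse_image_ids(text):
    """Parse "name  id" lines into a dict of image name -> image id."""
    ids = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            ids[parts[0]] = parts[1]
    return ids


def changed_images(before, after):
    """Return the sorted names whose image id differs between the two runs."""
    return sorted(name for name in after if before.get(name) != after[name])


comment_37 = parse_image_ids("""
openshift3/logging-elasticsearch    d715f4d34ad4
openshift3/logging-kibana    e0ab09c2cbeb
openshift3/logging-fluentd    47057624ecab
openshift3/logging-auth-proxy    139f7943475e
openshift3/logging-curator    7f034fdf7702
""")

comment_44 = parse_image_ids("""
openshift3/logging-fluentd    8cd33de7939c
openshift3/logging-curator    8cfcb23f26b6
openshift3/logging-elasticsearch    d715f4d34ad4
openshift3/logging-kibana    e0ab09c2cbeb
openshift3/logging-auth-proxy    139f7943475e
""")

changed = changed_images(comment_37, comment_44)
print(changed)
```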

Comment 45 Junqi Zhao 2017-03-07 06:18:00 UTC
Created attachment 1260690 [details]
fluentd log, json-file as logging driver

Comment 46 Rich Megginson 2017-03-08 19:47:58 UTC
I don't think docs are needed - this was not a customer bug, only internal QE.

Comment 48 errata-xmlrpc 2017-04-12 19:01:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903

