Created attachment 1248560 [details]
Kibana log

Description of problem:
Fluentd cannot collect log entries after the logging stacks are deployed with ansible; no logs can be found in Kibana. Both the JSON-FILE and JOURNALD log drivers have the same issue.

# oc get po
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-3br35       1/1       Running   1          1h
logging-es-j43rhofz-1-qq38v   1/1       Running   0          1h
logging-fluentd-drghr         1/1       Running   0          1h
logging-kibana-1-89tvx        2/2       Running   0          1h

Version-Release number of selected component (if applicable):
Image id:
openshift3/logging-elasticsearch   7605f043d232
openshift3/logging-kibana          e0ab09c2cbeb
openshift3/logging-fluentd         47057624ecab
openshift3/logging-auth-proxy      139f7943475e
openshift3/logging-curator         7f034fdf7702

# openshift version
openshift v3.5.0.17+c55cf2b
kubernetes v1.5.2+43a9be4
etcd 3.1.0

How reproducible:
Always

Steps to Reproduce:
1. Prepare the inventory file:

[oo_first_master]
$master-public-dns ansible_user=root ansible_ssh_user=root ansible_ssh_private_key_file="~/cfile/libra.pem" openshift_public_hostname=$master-public-dns

[oo_first_master:vars]
deployment_type=openshift-enterprise
openshift_release=v3.5.0
openshift_logging_install_logging=true
openshift_logging_kibana_hostname=kibana.$sub-domain
public_master_url=https://$master-public-dns:8443
openshift_logging_image_prefix=registry.ops.openshift.com/openshift3/
openshift_logging_image_version=3.5.0
openshift_logging_namespace=juzhao
openshift_logging_fluentd_use_journal=false/true

2. Run the playbook from a control machine (my laptop) which is not oo_master:

git clone https://github.com/openshift/openshift-ansible
ansible-playbook -vvv -i ~/inventory playbooks/common/openshift-cluster/openshift_logging.yml

3. Log in to the Kibana UI and look for logs.

Actual results:
Searching the log indices, only the kibana, operations, and searchguard indices can be found; no log entries are shown in the Kibana UI.
# oc exec logging-curator-1-3br35 -- curator --host logging-es --use_ssl --certificate /etc/curator/keys/ca --client-cert /etc/curator/keys/cert --client-key /etc/curator/keys/key --loglevel ERROR show indices --all-indices
.kibana
.kibana.ef0b7ff169fdc9202e567ce53aa5e17320cb2d7d
.operations.2017.02.08
.searchguard.logging-es-j43rhofz-1-qq38v

PS: There are many warnings of the form error="no implicit conversion of nil into String" in the fluentd log; see the attached fluentd log.

Expected results:
Log entries can be found in the Kibana UI.

Additional info:
Kibana and fluentd logs are attached.
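For reference, the index listing above can be checked mechanically. A minimal sketch (assuming the 3.5 common data model naming, where application logs land in per-project "project.<namespace>.*" indices) classifying the names curator reported:

```ruby
# Index names curator reported above - only internal indices are present.
observed = [
  ".kibana",
  ".kibana.ef0b7ff169fdc9202e567ce53aa5e17320cb2d7d",
  ".operations.2017.02.08",
  ".searchguard.logging-es-j43rhofz-1-qq38v",
]

# In a working deployment we would also expect project.<namespace>.* indices.
project_indices = observed.select { |name| name.start_with?("project.") }
puts "project indices found: #{project_indices.size}"  # prints 0 - the symptom
```

A count of zero project indices means no application logs ever reached Elasticsearch.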
Created attachment 1248561 [details] fluentd log
Possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=1420204
The logs seem to show that the fluentd pods can't talk to the ES cluster. I wonder if this is an under-provisioned or over-taxed ES deployment.
There are many warnings of the form error="no implicit conversion of nil into String" in the fluentd log; see the attached fluentd log. Unfortunately the message doesn't tell us which field, but if the problem is that fluentd is using "time" instead of "@timestamp", you would see exactly this error. I think this is related to https://bugzilla.redhat.com/show_bug.cgi?id=1420234, which has the same "time"/"@timestamp" symptoms.
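That message is a standard Ruby TypeError. A minimal sketch (the record and the index-name construction here are hypothetical, just to show how a missing field produces this exact error text):

```ruby
# A log record missing its "time" field, as a plain hash.
record = { "message" => "hello" }  # record["time"] is nil

begin
  # Concatenating a String with nil raises the TypeError seen in the log.
  index_name = "project.logging." + record["time"]
rescue TypeError => e
  puts e.message  # prints: no implicit conversion of nil into String
end
```

So any code path that string-concatenates a field which turns out to be nil (such as a timestamp that was never renamed) will emit this warning.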
The "@timestamp" field is missing because the common data model filter is missing. Here is the fluent.conf as deployed by ansible 3.5:

@include configs.d/openshift/filter-syslog-record-transform.conf
@include configs.d/openshift/filter-post-*.conf

It should look like this:

@include configs.d/openshift/filter-syslog-record-transform.conf
@include configs.d/openshift/filter-common-data-model.conf
@include configs.d/openshift/filter-post-*.conf

PR: https://github.com/openshift/openshift-ansible/pull/3323
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/f9daaa768ec415c2ae83a8c42ae35776d996083e
Bug 1420234 - illegal_argument_exception in Kibana UI.
https://bugzilla.redhat.com/show_bug.cgi?id=1420234
The problem is that the fluent.conf is missing the common data model filter which renames the "time" field to the "@timestamp" field.

https://github.com/openshift/openshift-ansible/commit/9e92660ed86623385e48c4960f4db681d86a7a57
Merge pull request #3323 from richm/missing-common-data-model-filter
Bug 1420234 - illegal_argument_exception in Kibana UI.

Moving to MODIFIED because the fix was committed upstream - not sure if QE will test with upstream or will wait for the RH openshift-ansible package to contain the fix.
Tested with the dev's repo (to work around the kibana route issue, bug #1421563), which contains the fix in https://github.com/ewolinetz/openshift-ansible/blob/logging_fix_kibana_routes/roles/openshift_logging/files/fluent.conf#L25. The original issue reproduced: still no log entries are displayed in the Kibana UI.
More info:

1. No log entries exist in ES when I queried it with index * and ident:*:

# oc exec logging-es-w0uzrr7f-1-64h1v -- curl -s -k --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key https://logging-es:9200/*/_count\?q=ident:* | python -mjson.tool | more
{
    "_shards": {
        "failed": 0,
        "successful": 6,
        "total": 6
    },
    "count": 0
}

2. The indices are listed in the left panel in Kibana, but "No results found" is shown when the UI time filter is set to "this year" (see attached picture #1).

3. Clicking the Settings tab for each index, the fields column shows counts, e.g. fields(23) for the .operations* index (see attached picture #2).

4. The fluentd log shows many "temporarily failed to flush the buffer" errors, eventually followed by "retry succeeded" messages with the plugin_id.

5. Fluentd was already configured with USE_JOURNAL=true, which matches the default docker log driver on the master/node:

# oc rsh logging-fluentd-qcjrk
sh-4.2# env | grep JOUR
USE_JOURNAL=true
JOURNAL_READ_FROM_HEAD=
JOURNAL_SOURCE=

6. The kibana, es, and curator logs appear to be fine. (They are attached.)
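The _count response in item 1 distinguishes "cluster unreachable" from "nothing indexed". A small sketch (the sample JSON is the response captured above) making that check explicit:

```ruby
require "json"

# Sample _count response, copied from the query output above.
raw = '{"_shards":{"failed":0,"successful":6,"total":6},"count":0}'
response = JSON.parse(raw)

# All shards answered, so the ES cluster itself is reachable and healthy...
reachable = response["_shards"]["failed"].zero?

# ...but zero documents matched ident:*, so fluentd has indexed nothing.
count = response["count"]
puts "reachable=#{reachable} documents=#{count}"
```

This points the investigation at the fluentd-to-ES pipeline rather than at ES itself, consistent with the buffer-flush retries in item 4.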
Created attachment 1250796 [details] screenshot of kibana #1: No results found
Created attachment 1250797 [details] screenshot of kibana #2: fields can be seen in the Settings tab
Created attachment 1250800 [details] curator logs
Created attachment 1250802 [details] fluentd log
Created attachment 1250803 [details] es log
Created attachment 1250804 [details] kibana log
Created attachment 1250805 [details] curator log
Created attachment 1251862 [details] inventory file I used for logging 3.5.0 deployment
Created attachment 1251864 [details] The json output of fluentd daemonset
I'm not sure what's going on with Kibana. Was this a brand new install, or an upgrade of a previously failed install?

What I see in Kibana is as described in the attached screenshots. In the Discover tab I see ".all", ".operations", and two namespace indices which are not prefixed by "project.". In the Settings tab, the fields in these indices are not the common data model fields. All of this means this data/indices/configuration was created by a 3.3 deployment.

If I go into the Settings tab, delete the ".operations" index, recreate it, and set it as the default, then go back to the Discover tab, everything works as expected. I can also go into Settings and add "project.logging.*", "project.install-test.*", etc. and see the data as expected for those indices.

Please confirm whether this was an upgrade or a fresh install.
@rmeggins I'm doing a fresh install with this playbook: https://github.com/openshift/openshift-ansible/blob/master/playbooks/common/openshift-cluster/openshift_logging.yml. It seems there is something wrong with the ansible install for logging.
Created attachment 1255922 [details] Kibana UI snapshot
Created attachment 1255923 [details] Kibana Settings tab snapshot
Which ansible version are you using? Are you using ansible from a git clone, or are you using some version of the openshift-ansible rpm packages?
@rmeggins
ansible was installed from rpm packages, version 2.2.0.0:

# rpm -qa | grep ansible
ansible-2.2.0.0-3.fc24.noarch

It was installed from our puddle server 'puddles/RHAOS/AtomicOpenShift-errata/3.4/' on Dec. 22, 2016.
@rmeggins the playbooks are git cloned from https://github.com/openshift/openshift-ansible/ when we run the ansible installation.
(In reply to Junqi Zhao from comment #29)
> @rmeggins
> installed from rpm packages

I don't mean ansible itself, I mean the playbooks used to install openshift ansible. These are either from a git clone (and if you are using a git clone of https://github.com/openshift/openshift-ansible, please indicate which branch you are using, and which commit is the HEAD commit of that branch), or you can get the playbooks from the openshift-ansible RPM package.

> ansible version: 2.2.0.0,
>
> # rpm -qa | grep ansible
> ansible-2.2.0.0-3.fc24.noarch
>
> It was installed from our puddle server
> 'puddles/RHAOS/AtomicOpenShift-errata/3.4/' at Dec. 22, 2016

What is "It"? The ansible RPM package or something else?
(In reply to Junqi Zhao from comment #30)
> @rmeggins
> playbooks are git cloned from
> https://github.com/openshift/openshift-ansible/ when we running ansible
> installtion.

What branch? master? What is the HEAD commit?
Submitted PR for fix: https://github.com/openshift/openshift-ansible/pull/3444
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/cf9ad2651915c2c328c4402e410feb4c7abd122f
Bug 1420219 - No log entry can be found in Kibana UI after deploying logging stacks with ansible
https://bugzilla.redhat.com/show_bug.cgi?id=1420219
The elasticsearch config was missing the common data model stanza

https://github.com/openshift/openshift-ansible/commit/8b27f4909d91850dad92e7bcbd660c0b674876c8
Merge pull request #3444 from richm/bug-1420219
Bug 1420219 - No log entry can be found in Kibana UI after deploying logging stacks with ansible
(In reply to Rich Megginson from comment #32)
> (In reply to Junqi Zhao from comment #30)
> > @rmeggins
> > playbooks are git cloned from
> > https://github.com/openshift/openshift-ansible/ when we running ansible
> > installtion.
>
> What branch? master? What is the HEAD commit?

Yes, I use the master branch and clone it fresh every day in case there are changes in openshift-ansible.
(In reply to Junqi Zhao from comment #35)
> (In reply to Rich Megginson from comment #32)
> > What branch? master? What is the HEAD commit?
>
> Yes, I used master branch and git cloned every day in case there are changes
> in openshift-ansible

Thanks. I think I have identified the problem - see https://bugzilla.redhat.com/show_bug.cgi?id=1420219#c34
Tested with the latest puddle and openshift-ansible playbooks; log entries can be found in the Kibana UI now.

Image id:
openshift3/logging-elasticsearch   d715f4d34ad4
openshift3/logging-kibana          e0ab09c2cbeb
openshift3/logging-fluentd         47057624ecab
openshift3/logging-auth-proxy      139f7943475e
openshift3/logging-curator         7f034fdf7702
This issue reproduced in today's testing, covering the json-file and journald log drivers; log entries cannot be shown in the Kibana UI now.

See the attached pictures: indices for .all and .operations are shown under the Kibana 'Settings -> Indices' tab, but there is no index for user projects, such as the logging project. The attached ES log shows there is no update_mapping process for the logging project or other projects.

From https://github.com/openshift/openshift-ansible/pull/3323/files it is:
@include configs.d/openshift/filter-common-data-model.conf

but now it is:
@include configs.d/openshift/filter-viaq-data-model.conf

Did this cause the issue?
Created attachment 1260322 [details] es pod log
Created attachment 1260324 [details] .all indicies in kibana
Created attachment 1260325 [details] .operations indicies in kibana
Created attachment 1260327 [details] logging project indicies in kibana
(In reply to Junqi Zhao from comment #38)
> from https://github.com/openshift/openshift-ansible/pull/3323/files
> it is :
> @include configs.d/openshift/filter-common-data-model.conf
>
> but now, it is:
> @include configs.d/openshift/filter-viaq-data-model.conf
>
> Does this caused this issue?

Yes. Try these builds/images:

12696338 buildContainer (noarch) completed successfully
koji_builds:
  https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=542335
repositories:
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:rhaos-3.5-rhel-7-docker-candidate-20170306162647
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:3.5.0-3
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:3.5.0
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:latest
  brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-fluentd:v3.5

commit 32720f438f97a5a8cb20d888708aeb8c90074313
Author: Rich Megginson <rmeggins>
Date:   Mon Mar 6 14:25:35 2017 -0700

    fix typo

commit 0c9e90e4f67df02b015ac3d877c214f5ec969668
Author: Rich Megginson <rmeggins>
Date:   Mon Mar 6 14:12:32 2017 -0700

    [Bug 1420219] No log entry can be found in Kibana UI after deploying logging stacks with ansible
    https://bugzilla.redhat.com/show_bug.cgi?id=1420219
    The common data model plugin was renamed to viaq. We use the packaged viaq filter rpm now. This commit also adds some additional debugging capabilities when there are problems with data used to construct index names.
Log entries can be found in the Kibana UI, covering the JSON-FILE and journald logging drivers.

Image id:
openshift3/logging-fluentd        8cd33de7939c
openshift3/logging-curator        8cfcb23f26b6
openshift3/logging-elasticsearch  d715f4d34ad4
openshift3/logging-kibana         e0ab09c2cbeb
openshift3/logging-auth-proxy     139f7943475e

With the JSON-FILE logging driver there are still a lot of warnings, but they do not seem to affect fluentd functionality. We will continue to monitor; if the warnings persist, we will file a low-severity defect. The logs are in the attached file.
Created attachment 1260690 [details] fluentd log, json-file as logging driver
I don't think docs are needed - this was not a customer bug, only internal QE.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0903