Bug 1610224 - Unable to find container log in Elasticsearch when using cri-o
Summary: Unable to find container log in Elasticsearch when using cri-o
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.11.0
Assignee: Rich Megginson
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-07-31 09:06 UTC by Qiaoling Tang
Modified: 2018-10-11 07:23 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-11 07:22:47 UTC
Target Upstream Version:
Embargoed:




Links
Github openshift/openshift-ansible pull 9379 (closed): Bug 1610224 - Unable to find container log in Elasticsearch when using cri-o (last updated 2020-11-11 07:44:00 UTC)
Red Hat Product Errata RHBA-2018:2652 (last updated 2018-10-11 07:23:06 UTC)

Description Qiaoling Tang 2018-07-31 09:06:12 UTC
Description of problem:
Unable to find container log in Elasticsearch.

Checking the indices in Elasticsearch shows only two indices; there are no project indices.
# oc exec -c elasticsearch logging-es-data-master-p2gllq0e-1-ztjvd -- curl -s -k --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices
green open .operations.2018.07.31 EiWF72TIQXmcwXGW7fpzzA 1 0 325910 0 521.9mb 521.9mb
green open .searchguard           12MeqXjFSVeJt0u2_WFGkA 1 0      5 0    32kb    32kb


Check the fluentd pod:
# oc exec logging-fluentd-qqnjp -- ls /var/log/containers
dockergc-w2tr2_default_dockergc-c41f580c49b7a32a384709ea6fe178b1cc4cd029f109d3cdc0d842cda2bc82d3.log
java-mainclass-1-267lt_project1_java-mainclass-e491c9af9549e3523400fc050acd403b24296a8023422d5300c8d55cbcfea4c2.log
java-mainclass-1-5pbtq_project3_java-mainclass-4060e62914de7f709b6c9a299e243556ea5602d21d8188865573502deeca796a.log
java-mainclass-1-dv7kq_project2_java-mainclass-9eb4f07e27ca85bb29fe95e0cfeb9b34d79967363d55dd77373ac6927e186201.log
logging-fluentd-qqnjp_openshift-logging_fluentd-elasticsearch-cf5203c399101f713be12f216c36af0bbb07703554e9407f13314424c62f7bc9.log
mongodb-1-6qbxx_install-test_mongodb-4b0e72f3aad1ee5a8a6cbccce8325b288e9eb13a2cedc9831a5c1f4c1f0d0504.log
nodejs-mongodb-example-1-build_install-test_git-clone-e38e85826debe347cb8c03cb6e3dbb8bab9f21991ee392f8f268f7db8771b715.log
nodejs-mongodb-example-1-build_install-test_manage-dockerfile-aa01c7f44acfa17c071abfcc8eb693cd886046f7b33985dfc44417bfeb4bdd2e.log
nodejs-mongodb-example-1-build_install-test_sti-build-fbe9cd599a7a853b9d4e9137f0af1e4101641973a3a603fdc2c87dfdd4aa95b7.log
nodejs-mongodb-example-1-nxmc8_install-test_nodejs-mongodb-example-dcefc149e5c95bfa431dd6d511edc1e552bea8ab4ee2acc7b1badaf6831b689f.log
ovs-26kmc_openshift-sdn_openvswitch-96daea220806c9af5a06f4a3cc819edcda9b3f6644a1c4a523f92418a5cd5159.log
prometheus-0_openshift-metrics_alert-buffer-b8229de41d7984dad157bb0234469b86440cb1bba7c871c9c5659cbb2d1ab68a.log
prometheus-0_openshift-metrics_alertmanager-fc9a641b87c16c1db6f60f748a1bc3d81aac0fc6082ea3f1f48d7fa143afe1d7.log
prometheus-0_openshift-metrics_alertmanager-proxy-7ea6994e90b45ab5c7537f8f1914a21e365e50c67f6ac4892e4e46cc7aaa1508.log
prometheus-0_openshift-metrics_alerts-proxy-62e0b79491c4da10c9012768ccd0226390c46cf54f163b2e7163e961368b4f14.log
prometheus-0_openshift-metrics_prom-proxy-66ea7fb452160ed94992c0ca1c26ea5257b68ce8b3a0a78d8b9638a89b109c10.log
prometheus-0_openshift-metrics_prometheus-c776f420a903eb36b67ee84a4a0b9cb0881ce0bca9ad67252bc694031bede51f.log
prometheus-node-exporter-cqhv9_openshift-metrics_node-exporter-2bd3c39d592d8d323bc54fb42e35840b30ec119d6a7b199be3ebf878cc3a29c6.log
registry-console-1-srln4_default_registry-console-ae741f0851313d805d6af8af5acfc55b8f9da5a9d84bfc112cef00911171028b.log
sdn-524v4_openshift-sdn_sdn-56c22717c242e6588e34433fb37561723cbdd8a3568a483648a84e2866d07815.log
sync-8rjd9_openshift-node_sync-92b5bc6fd1182c108929f62872ce34b1168c3d4d1bf428405d178fbfce4f9cd9.log

Version-Release number of selected component (if applicable):
logging-elasticsearch5-v3.11.0-0.10.0.0
logging-fluentd-v3.11.0-0.10.0.0

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging and wait until all pods become ready
2. Create some projects and pods
3. Try to find logs in Elasticsearch

Actual results:
No container logs could be found in Elasticsearch

Expected results:
Container logs can be found in Elasticsearch

Additional info:

Comment 1 Qiaoling Tang 2018-07-31 09:10:15 UTC
Add keyword 'TestBlocker' because most of the test cases are blocked by this issue.

Comment 3 Jeff Cantrill 2018-07-31 12:31:34 UTC
There looks to be an issue with the fluentd pod connecting to Elasticsearch. Please verify the pod is able to curl the ES endpoint [1].

[1] https://gist.github.com/jcantrill/163a0ba40bc441a1bb73fb049aaddad6

I have recently seen, on a 3.10 cluster for example, cases where the ES pod gets the IPs of the other instances but is unable to communicate with them, failing with 'no route to host'.
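
For reference, the check being asked for amounts to a curl from inside the fluentd pod against the Elasticsearch service, using the client certificate the pod already mounts. A minimal sketch, assuming the usual /etc/fluent/keys mount path used by the logging-fluentd image (adjust the paths if your deployment differs):

oc exec <fluentd-pod> -- curl -s -k --cert /etc/fluent/keys/cert --key /etc/fluent/keys/key https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/health

A green/yellow health line means the pod can reach and authenticate to ES; a timeout or 'no route to host' points at the connectivity problem described above.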

Comment 4 Rich Megginson 2018-07-31 14:14:52 UTC
You are using cri-o?
Looks like we have another bug:

2018-07-31 02:36:34 -0400 [warn]: plugin/in_tail.rb:364:block in convert_line_to_event: pattern not match: "2018-07-31T02:36:23.770806551-04:00 stderr F I0731 06:36:23.770760       1 handler.go:153] kube-aggregator: GET \"/api/v1/namespaces/kube-system/configmaps/kube-scheduler\" satisfied by nonGoRestful"

fluentd should not be spewing these. It could be that fluentd is not configured properly for cri-o, or that it could not automatically detect that cri-o is being used.
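
The 'pattern not match' warning above is fluentd's in_tail parser rejecting cri-o formatted lines (timestamp, stream, partial/full tag, message) while it is still configured for the docker json-file format. As a rough illustration of the difference (not the exact configuration shipped with openshift-logging), the cri-o variant of the container log source looks roughly like:

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  tag kubernetes.*
  format /^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.*)$/
  time_format %Y-%m-%dT%H:%M:%S.%N%:z
</source>

whereas with the docker runtime the same source uses format json, which is why lines like the one quoted above fail to parse when cri-o is the runtime but the docker configuration is still in effect.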

Comment 5 Rich Megginson 2018-07-31 14:19:24 UTC
2018-07-31 02:50:37 -0400 [warn]: fluent/output.rb:381:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2018-07-31 02:50:30 -0400 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Can not reach Elasticsearch cluster ({:host=>\"logging-es\", :port=>9200, :scheme=>\"https\", :user=>\"fluentd\", :password=>\"obfuscated\"})!" plugin_id="elasticsearch-apps"

So yes, a connection error.

In addition to the link provided by Jeff:

oc rsh $fluentdpod
getent hosts logging-es
getent hosts logging-es.openshift-logging.svc
getent hosts logging-es.openshift-logging.svc.cluster.local
cat /etc/resolv.conf
getent hosts kubernetes.default.svc.cluster.local

Comment 6 openshift-github-bot 2018-08-01 00:43:05 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/6fc86e69de8737c9134cf3b961a0e58128552286
Bug 1610224 - Unable to find container log in Elasticsearch when using cri-o

https://bugzilla.redhat.com/show_bug.cgi?id=1610224
The automatic detection of cri-o was broken in the merge of the
3.10 code into 3.11.

https://github.com/openshift/openshift-ansible/commit/6ec6503a59cf7d456db2c0565c7cfc59bc7793df
Merge pull request #9379 from richm/logging-detect-crio

Bug 1610224 - Unable to find container log in Elasticsearch when using cri-o
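
For context, the fix restores cri-o auto-detection in the openshift-ansible logging role. Conceptually, detection boils down to checking which runtime each node reports; a hypothetical manual check (not the actual ansible task) would be:

oc get node <node> -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}'

which prints something like cri-o://1.11.x on cri-o nodes and docker://1.13.x on docker nodes, and the fluentd configuration then needs to match the runtime in use.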

Comment 9 Qiaoling Tang 2018-08-23 02:20:25 UTC
Move to VERIFIED. Logs can be found in ES when using cri-o as the container runtime.

Remove keyword 'TestBlocker'.
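
As a concrete verification step, the index listing from the original description can be rerun; with the fix in place, project indices (project.<namespace>.*) should appear alongside .operations and .searchguard:

oc exec -c elasticsearch <es-pod> -- curl -s -k --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key https://logging-es.openshift-logging.svc.cluster.local:9200/_cat/indices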

Comment 11 errata-xmlrpc 2018-10-11 07:22:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652

