Created attachment 1293087 [details]
inventory file used for logging deployment

Description of problem:
After deploying the logging 3.5.0 stack, the es pod is in Error status.

# oc logs -f logging-es-r2zzmmss-1-wqn9r
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms4096m -Xmx4096m'
Exception in thread "main" java.lang.RuntimeException: Unable to load index mapping for io.fabric8.elasticsearch.kibana.mapping.empty. The key was not in the settings or it specified a file that does not exists.
	at io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader.loadMapping(IndexMappingLoader.java:56)
	at io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader.<init>(IndexMappingLoader.java:42)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at <<<guice>>>
	at org.elasticsearch.node.Node.<init>(Node.java:213)
	at org.elasticsearch.node.Node.<init>(Node.java:140)
	at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Checking if Elasticsearch is ready on https://localhost:9200 ....
Version-Release number of selected component (if applicable):
ansible version: openshift-ansible-playbooks-3.5.91-1.git.0.28b3ddb.el7.noarch

# openshift version
openshift v3.5.5.31
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Images tested with:
openshift3/logging-kibana        277c4a616a5a
openshift3/logging-elasticsearch a7989e457354
openshift3/logging-fluentd       c09565262cad
openshift3/logging-curator       0aa259fbc36e
openshift3/logging-auth-proxy    d79212db0381

How reproducible:
Always

Steps to Reproduce:
1. Deploy the logging 3.5.0 stack with the attached inventory file
2. After the ansible execution completes successfully, check the EFK pod status

Actual results:
The es pod is in Error status:
logging     logging-curator-1-d744c       1/1   Running   0   1m
logging     logging-es-r2zzmmss-1-wqn9r   0/1   Error     4   1m
logging     logging-fluentd-bm2l6         1/1   Running   0   1m
logging     logging-fluentd-k0s83         1/1   Running   0   1m
logging     logging-kibana-1-pqj12        2/2   Running   0   1m

Expected results:
es should be running

Additional info:
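For reference, the logging-related portion of such an inventory typically looks like the fragment below (a sketch only; the variable values here are illustrative and not copied from the attached inventory file):

```ini
[OSEv3:vars]
# Enable the EFK logging stack and pin the image version
# (values illustrative; the attached inventory is authoritative)
openshift_logging_install_logging=true
openshift_logging_image_version=v3.5
openshift_logging_es_memory_limit=8Gi
```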
You can manually add the config entry per the attached PR as a workaround.
Do you only see this error in 3.5? Do you see it in 3.6? Just wondering whether this needs to be a blocker for 3.6. Note that the 3.5 images are way behind upstream... we need to update soon...
Fixed the GitHub PR link - it looks like 3.6 already has the correct settings.
The workaround worked fine; the test result was added to https://bugzilla.redhat.com/show_bug.cgi?id=1463046#c5. This issue did not happen with the v3.6 images.
Verified with the latest ansible package; the bug is fixed and es starts up well after the ansible deployment. Setting to VERIFIED.

# rpm -qa | grep ansible
openshift-ansible-docs-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-lookup-plugins-3.5.94-1.git.0.1b33481.el7.noarch
ansible-2.2.3.0-1.el7.noarch
openshift-ansible-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-filter-plugins-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-roles-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-callback-plugins-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-playbooks-3.5.94-1.git.0.1b33481.el7.noarch
@Xia, has version 3.5.94 been released? The latest version of the package in "rhel-7-server-ose-3.5-rpms/x86_64" is 3.5.91. Is there any other workaround to avoid this? Thanks, Jooho Lee.
@jooho lee, manually edit the logging-elasticsearch configmap to add something like:

io.fabric8.elasticsearch.kibana.mapping.empty: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
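Concretely (a sketch assuming the default `logging` namespace; adjust names to your deployment), the entry goes into the `elasticsearch.yml` data of the configmap, which you can open with `oc edit configmap/logging-elasticsearch -n logging`:

```yaml
# Fragment of the elasticsearch.yml data in the logging-elasticsearch configmap.
# Only the io.fabric8...mapping.empty line is the new addition; the path points
# at the index-pattern file shipped in the logging-elasticsearch image.
io.fabric8.elasticsearch.kibana.mapping.empty: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
```

Restart the es pods afterwards so they pick up the updated configmap.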
Thanks @Jeff. After I added the parameter, I hit another issue. There are 3 elasticsearch containers; 2 of them ran well at first, but the third could not start, with the following message:

~~~
2017-07-25 15:52:00 INFO SearchGuardSSLPlugin:84 - Search Guard 2 plugin not available
2017-07-25 15:52:00 INFO SearchGuardPlugin:58 - Clustername: elasticsearch
2017-07-25 15:52:00 INFO SearchGuardPlugin:70 - Node [null] is a transportClient: true/tribeNode: false/tribeNodeClient: false
2017-07-25 15:52:00 INFO plugins:180 - [Tom Corsi] modules [], plugins [search-guard-ssl, search-guard2], sites []
2017-07-25 15:52:00 INFO DefaultSearchGuardKeyStore:423 - Open SSL not available (this is not an error, we simply fallback to built-in JDK SSL) because of java.lang.ClassNotFoundException: org.apache.tomcat.jni.SSL
2017-07-25 15:52:00 INFO DefaultSearchGuardKeyStore:173 - Config directory is /usr/share/java/elasticsearch/config/, from there the key- and truststore files are resolved relatively
2017-07-25 15:52:00 INFO DefaultSearchGuardKeyStore:142 - sslTransportClientProvider:JDK with ciphers [TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384, TLS_DHE_RSA_WITH_AES_256_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_DSS_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_DSS_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_DSS_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_DSS_WITH_AES_128_GCM_SHA256]
2017-07-25 15:52:00 INFO DefaultSearchGuardKeyStore:144 - sslTransportServerProvider:JDK with ciphers [TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384, TLS_DHE_RSA_WITH_AES_256_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_DSS_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_DSS_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_DSS_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_DSS_WITH_AES_128_GCM_SHA256]
2017-07-25 15:52:00 INFO DefaultSearchGuardKeyStore:146 - sslHTTPProvider:null with ciphers []
2017-07-25 15:52:00 INFO DefaultSearchGuardKeyStore:148 - sslTransport protocols [TLSv1.2, TLSv1.1]
2017-07-25 15:52:00 INFO DefaultSearchGuardKeyStore:149 - sslHTTP protocols [TLSv1.2, TLSv1.1]
2017-07-25 15:52:00 INFO transport:99 - [Tom Corsi] Using [com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] as transport, overridden by [search-guard-ssl]
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Cannot retrieve cluster state due to: ClusterService was close during health call. This is not an error, will keep on trying ...
* Try running sgadmin.sh with -icl and -nhnv (If thats works you need to check your clustername as well as hostnames in your SSL certificates)
* If this is not working, try running sgadmin.sh with --diagnose and see diagnose trace log file)
~~~

Moreover, if I delete the 2 es pods that were running well, those 2 containers also fail to start up with the same error message.
Is this a totally different issue from this ticket? Thanks, Jooho Lee.
@joolee, please do not use this issue as a catch-all. Based on the output I see, there is nothing wrong with your cluster. It looks to be in an initialization state, possibly allocating indices.
@Jeff, OK, I think I found another bug related to this, so I am going to file a new ticket. Thanks, Jooho Lee.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1810
FYI: I just upgraded my 3.5 AWS cluster to v3.6 via the official upgrade playbook (all-in-one), and I still had to manually add the io.fabric8.elasticsearch.kibana.mapping.empty: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json entry to the logging-elasticsearch configmap.