Bug 1466626 - Unable to load index mapping for io.fabric8.elasticsearch.kibana.mapping.empty.
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.5.1
Hardware: Unspecified OS: Unspecified
Priority: high Severity: high
Target Milestone: ---
Target Release: 3.5.z
Assigned To: Jeff Cantrill
QA Contact: Xia Zhao
Keywords: Regression
Depends On:
Blocks: 1472144
Reported: 2017-06-30 01:47 EDT by Xia Zhao
Modified: 2017-10-03 10:17 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Property missing from configuration file.
Consequence: Elasticsearch fails to start and generates a large stack trace.
Fix: Modify installer code to create configuration with required property.
Result: Elasticsearch starts as desired.
Story Points: ---
Clone Of:
Cloned To: 1472144
Environment:
Last Closed: 2017-07-27 14:02:01 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
inventory file used for logging deployment (714 bytes, text/plain)
2017-06-30 01:47 EDT, Xia Zhao


External Trackers
Tracker ID Priority Status Summary Last Updated
Github https://github.com/openshift/openshift-ansible/pull/4657 None None None 2017-06-30 13:46 EDT

Description Xia Zhao 2017-06-30 01:47:37 EDT
Created attachment 1293087 [details]
inventory file used for logging deployment

Description of problem:
After deploying the logging 3.5.0 stack, the ES pod is in Error status.

# oc logs -f logging-es-r2zzmmss-1-wqn9r
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms4096m -Xmx4096m'
Exception in thread "main" java.lang.RuntimeException: Unable to load index mapping for io.fabric8.elasticsearch.kibana.mapping.empty.  The key was not in the settings or it specified a file that does not exists.
	at io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader.loadMapping(IndexMappingLoader.java:56)
	at io.fabric8.elasticsearch.plugin.kibana.IndexMappingLoader.<init>(IndexMappingLoader.java:42)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at <<<guice>>>
	at org.elasticsearch.node.Node.<init>(Node.java:213)
	at org.elasticsearch.node.Node.<init>(Node.java:140)
	at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Checking if Elasticsearch is ready on https://localhost:9200 ....

Version-Release number of selected component (if applicable):
ansible version:
openshift-ansible-playbooks-3.5.91-1.git.0.28b3ddb.el7.noarch

# openshift version
openshift v3.5.5.31
kubernetes v1.5.2+43a9be4
etcd 3.1.0

Images tested with:
openshift3/logging-kibana    277c4a616a5a
openshift3/logging-elasticsearch    a7989e457354
openshift3/logging-fluentd    c09565262cad
openshift3/logging-curator    0aa259fbc36e
openshift3/logging-auth-proxy    d79212db0381

How reproducible:
Always

Steps to Reproduce:
1. Deploy logging 3.5.0 stacks with the attached inventory file
2. After the ansible execution completes successfully, check EFK pod status
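Step 2 above can be done with `oc` along these lines (a sketch; it assumes the logging components were deployed to the `logging` project, as in the pod listing below):

```shell
# Check the status of the EFK pods in the logging project
oc get pods -n logging

# If a pod shows Error status, tail its logs to find the cause
# (the pod name here is the one from this report; substitute your own)
oc logs -f logging-es-r2zzmmss-1-wqn9r -n logging
```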

Actual results:
The ES pod is in Error status:
logging        logging-curator-1-d744c          1/1       Running     0          1m
logging        logging-es-r2zzmmss-1-wqn9r      0/1       Error       4          1m
logging        logging-fluentd-bm2l6            1/1       Running     0          1m
logging        logging-fluentd-k0s83            1/1       Running     0          1m
logging        logging-kibana-1-pqj12           2/2       Running     0          1m


Expected results:
The ES pod should be running.

Additional info:
Comment 2 Jeff Cantrill 2017-06-30 13:09:25 EDT
You can manually add the config entry per the attached PR to work around the issue.
Comment 3 Rich Megginson 2017-06-30 13:46:58 EDT
Do you only see this error in 3.5?  Do you see it in 3.6?

Just wondering if this needs to be a blocker for 3.6.  Note that the 3.5 images are way behind upstream . . . we need to update soon . . .
Comment 4 Rich Megginson 2017-06-30 13:47:36 EDT
Fixed the GitHub PR link; it looks like 3.6 already has the correct settings.
Comment 5 Xia Zhao 2017-07-03 01:51:08 EDT
The workaround worked fine; test results were added to https://bugzilla.redhat.com/show_bug.cgi?id=1463046#c5

This issue didn't happen on v3.6 images.
Comment 8 Xia Zhao 2017-07-11 22:55:13 EDT
Verified with the latest ansible package: the bug is fixed, and ES starts up well after the ansible deployment. Setting to verified:

# rpm -qa | grep ansible
openshift-ansible-docs-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-lookup-plugins-3.5.94-1.git.0.1b33481.el7.noarch
ansible-2.2.3.0-1.el7.noarch
openshift-ansible-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-filter-plugins-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-roles-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-callback-plugins-3.5.94-1.git.0.1b33481.el7.noarch
openshift-ansible-playbooks-3.5.94-1.git.0.1b33481.el7.noarch
Comment 9 jooho lee 2017-07-24 23:00:08 EDT
@Xia,

Has version 3.5.94 been released? The latest version of the package in "rhel-7-server-ose-3.5-rpms/x86_64" is 3.5.91.

Is there any other workaround to avoid it?

Thanks,
Jooho Lee.
Comment 10 Jeff Cantrill 2017-07-25 10:55:26 EDT
@jooho lee.  Manually edit the logging-elasticsearch configmap to add something like:

io.fabric8.elasticsearch.kibana.mapping.empty: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
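For anyone applying that workaround, the edit can be made with `oc` roughly as follows (a sketch; the configmap name and property come from the comment above, and the deploymentconfig name is the one from this report — substitute your own, and redeploy so the pods pick up the change):

```shell
# Open the Elasticsearch configmap for editing in the logging project
oc edit configmap/logging-elasticsearch -n logging

# In the elasticsearch.yml data, add the missing property:
#   io.fabric8.elasticsearch.kibana.mapping.empty: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json

# Then redeploy so the ES pods start with the new configuration
oc rollout latest dc/logging-es-r2zzmmss -n logging
```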
Comment 11 jooho lee 2017-07-25 12:56:31 EDT
Thanks @Jeff,

I did that, but ran into a new issue.

After adding the parameter, there are 3 Elasticsearch containers; 2 of them run well at first, but the other one fails to start with the following message:
~~~
2017-07-25 15:52:00 INFO  SearchGuardSSLPlugin:84 - Search Guard 2 plugin not available
2017-07-25 15:52:00 INFO  SearchGuardPlugin:58 - Clustername: elasticsearch
2017-07-25 15:52:00 INFO  SearchGuardPlugin:70 - Node [null] is a transportClient: true/tribeNode: false/tribeNodeClient: false
2017-07-25 15:52:00 INFO  plugins:180 - [Tom Corsi] modules [], plugins [search-guard-ssl, search-guard2], sites []
2017-07-25 15:52:00 INFO  DefaultSearchGuardKeyStore:423 - Open SSL not available (this is not an error, we simply fallback to built-in JDK SSL) because of java.lang.ClassNotFoundException: org.apache.tomcat.jni.SSL
2017-07-25 15:52:00 INFO  DefaultSearchGuardKeyStore:173 - Config directory is /usr/share/java/elasticsearch/config/, from there the key- and truststore files are resolved relatively
2017-07-25 15:52:00 INFO  DefaultSearchGuardKeyStore:142 - sslTransportClientProvider:JDK with ciphers [TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384, TLS_DHE_RSA_WITH_AES_256_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_DSS_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_DSS_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_DSS_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_DSS_WITH_AES_128_GCM_SHA256]
2017-07-25 15:52:00 INFO  DefaultSearchGuardKeyStore:144 - sslTransportServerProvider:JDK with ciphers [TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384, TLS_DHE_RSA_WITH_AES_256_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_DSS_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_DSS_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_RSA_WITH_AES_256_GCM_SHA384, TLS_DHE_DSS_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_DSS_WITH_AES_128_GCM_SHA256]
2017-07-25 15:52:00 INFO  DefaultSearchGuardKeyStore:146 - sslHTTPProvider:null with ciphers []
2017-07-25 15:52:00 INFO  DefaultSearchGuardKeyStore:148 - sslTransport protocols [TLSv1.2, TLSv1.1]
2017-07-25 15:52:00 INFO  DefaultSearchGuardKeyStore:149 - sslHTTP protocols [TLSv1.2, TLSv1.1]
2017-07-25 15:52:00 INFO  transport:99 - [Tom Corsi] Using [com.floragunn.searchguard.ssl.transport.SearchGuardSSLNettyTransport] as transport, overridden by [search-guard-ssl]
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Cannot retrieve cluster state due to: ClusterService was close during health call. This is not an error, will keep on trying ...
   * Try running sgadmin.sh with -icl and -nhnv (If thats works you need to check your clustername as well as hostnames in your SSL certificates)
   * If this is not working, try running sgadmin.sh with --diagnose and see diagnose trace log file)
~~~

Moreover, if I delete the 2 ES pods that were running well, those 2 containers also fail to start up with the same error message.

Is this a totally different issue from this ticket?

Thanks,
Jooho lee.
Comment 12 Jeff Cantrill 2017-07-25 13:30:46 EDT
@jooho lee,

Please do not use this issue as a catch-all. Based on the output I see, there is nothing wrong with your cluster. It looks to be in an initialization state, possibly allocating indices.
Comment 13 jooho lee 2017-07-25 14:44:28 EDT
@Jeff,

Ok, I think I found another bug about it so I am going to file a new ticket.

Thanks,
Jooho Lee.
Comment 15 errata-xmlrpc 2017-07-27 14:02:01 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1810
Comment 16 Roel Hodzelmans 2017-10-03 10:17:18 EDT
FYI: I just upgraded my 3.5 AWS cluster to v3.6 via the official upgrade playbook (all-in-one), and I still had to manually add io.fabric8.elasticsearch.kibana.mapping.empty: /usr/share/elasticsearch/index_patterns/com.redhat.viaq-openshift.index-pattern.json
