Bug 1460883 - [paid][online-stg] Encounter ClusterBlockException inside elasticsearch cluster since no master service available on online-stg env
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Logging
Version: 3.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Assigned To: Jeff Cantrill
QA Contact: Xia Zhao
Depends On:
Blocks:
Reported: 2017-06-12 23:26 EDT by Xia Zhao
Modified: 2017-11-09 13:57 EST
Last Closed: 2017-11-09 13:57:05 EST
Type: Bug

Attachments
oc logs logging-es-8m7ovij5-1-nnjsh -n logging (574.33 KB, text/plain), 2017-06-13 03:35 EDT, Xia Zhao
oc logs logging-es-xacrjjge-1-rgnw8 -n logging (26.85 KB, text/plain), 2017-06-13 03:37 EDT, Xia Zhao
Kibana UI, Unable to connect to Elasticsearch (102.06 KB, image/png), 2017-06-15 02:15 EDT, Junqi Zhao
kibana UI, could be accessed now (195.46 KB, image/png), 2017-06-20 05:31 EDT, Junqi Zhao

Description Xia Zhao 2017-06-12 23:26:04 EDT
Description of problem:
es is in Error status. The es log shows the following exception:
[2017-06-09 17:59:02,921][INFO ][cluster.metadata         ] [Maxwell Dillon] [.operations.2017.06.09] update_mapping [com.redhat.viaq.common]
[2017-06-09 21:14:23,947][WARN ][io.fabric8.elasticsearch.discovery.kubernetes.KubernetesDiscovery] [Maxwell Dillon] not enough master nodes, current nodes: {{Maxwell Dillon}{dsRcSQSmQ4OpulpWYyRniA}{10.131.4.23}{10.131.4.23:9300}{master=true},}
[2017-06-09 21:14:23,947][INFO ][cluster.service          ] [Maxwell Dillon] removed {{Donald Ritter}{z1HIjwq5Qq2zAKeo2u9hdQ}{10.130.4.14}{10.130.4.14:9300}{master=true},}, reason: zen-disco-node-left({Donald Ritter}{z1HIjwq5Qq2zAKeo2u9hdQ}{10.130.4.14}{10.130.4.14:9300}{master=true}), reason(left)
[2017-06-09 21:14:23,988][WARN ][rest.suppressed          ] path: /_bulk, params: {}
ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/2/no master];]
        at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:158)
        at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:144)
        at org.elasticsearch.action.bulk.TransportBulkAction.executeBulk(TransportBulkAction.java:204)
        at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:151)
        at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:71)
        at org.elasticsearch.action.support.TransportAction.doExecute(TransportAction.java:149)
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:172)
        at io.fabric8.elasticsearch.plugin.KibanaUserReindexAction.apply(KibanaUserReindexAction.java:81)
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:170)
        at com.floragunn.searchguard.filter.SearchGuardFilter.apply(SearchGuardFilter.java:169)
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:170)
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:144)
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:85)
        at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:58)
        at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:359)
        at org.elasticsearch.client.FilterClient.doExecute(FilterClient.java:52)
        at org.elasticsearch.rest.BaseRestHandler$HeadersAndContextCopyClient.doExecute(BaseRestHandler.java:88)
        at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:359)
        at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:436)
        at org.elasticsearch.rest.action.bulk.RestBulkAction.handleRequest(RestBulkAction.java:90)
        at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:54)
        at org.elasticsearch.rest.RestController.executeHandler(RestController.java:198)
        at org.elasticsearch.rest.RestController$RestHandlerFilter.process(RestController.java:280)
        at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:261)
        at io.fabric8.elasticsearch.plugin.KibanaUserReindexFilter.process(KibanaUserReindexFilter.java:78)
        at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:264)
        at com.floragunn.searchguard.filter.SearchGuardRestFilter.process(SearchGuardRestFilter.java:65)
        at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:264)
        at io.fabric8.elasticsearch.plugin.acl.DynamicACLFilter.process(DynamicACLFilter.java:197)
        at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:264)
        at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:161)
        at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:153)
        at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:101)
        at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:451)
        ...
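
The "not enough master nodes" warning in the log above is consistent with a master-election quorum check failing after one of the two master-eligible nodes left the cluster. The following is a minimal sketch of that arithmetic. It assumes the common majority rule of floor(N / 2) + 1 master-eligible nodes; the quorum value actually configured on this cluster is not shown in the report, and the function names are hypothetical.

```python
def majority_quorum(master_eligible: int) -> int:
    """Majority of master-eligible nodes: floor(N / 2) + 1 (assumed rule)."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(visible: int, master_eligible: int) -> bool:
    """True when the visible master-eligible nodes reach the assumed quorum."""
    return visible >= majority_quorum(master_eligible)


# Two es nodes appear with master=true in the log; one of them left.
print(majority_quorum(2))      # quorum for a two-node cluster
print(can_elect_master(1, 2))  # only "Maxwell Dillon" remains visible
```

Under this assumption a two-node cluster needs both nodes visible to elect a master, so losing either node would block writes such as the /_bulk request in the stack trace.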

Version-Release number of selected component (if applicable):
online-stg env with OCP master v3.5.5.25

How reproducible:
Always

Steps to Reproduce:
1. Visit the Kibana route directly with proper credentials.

Actual results:
Kibana is in red status with the message "Unable to connect to Elasticsearch at https://logging-es:9200". Checking from the backend shows that the es pods are in Error status.
Output of oc get pod -n logging:

NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-3bd4p       1/1       Running   519        4d
logging-es-8m7ovij5-1-nnjsh   0/1       Error     0          4d
logging-es-xacrjjge-1-rgnw8   0/1       Error     0          4d
logging-fluentd-0kw2x         1/1       Running   3          4d
logging-fluentd-2fzrm         1/1       Running   3          4d
logging-fluentd-69fkl         1/1       Running   3          4d
logging-fluentd-7phpj         1/1       Running   4          4d
logging-fluentd-gmptb         1/1       Running   3          4d
logging-fluentd-htmdq         1/1       Running   4          4d
logging-fluentd-nx59f         1/1       Running   3          4d
logging-kibana-1-kkrvx        2/2       Running   12         4d
logging-kibana-1-wnwvw        2/2       Running   12         4d
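
A listing like the one above can be filtered for pods that are not fully ready. This is a minimal sketch that assumes the whitespace-separated column layout shown here (NAME, READY, STATUS first); the function name is hypothetical and the sample is a subset of the rows above.

```python
from typing import List, Tuple


def not_ready_pods(table: str) -> List[Tuple[str, str]]:
    """Return (name, status) for pods whose READY column is not n/n."""
    lines = [ln for ln in table.splitlines() if ln.strip()]
    result = []
    for line in lines[1:]:  # skip the header row
        name, ready, status = line.split()[:3]
        ready_count, total = ready.split("/")
        if ready_count != total:
            result.append((name, status))
    return result


sample = """\
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-3bd4p       1/1       Running   519        4d
logging-es-8m7ovij5-1-nnjsh   0/1       Error     0          4d
logging-es-xacrjjge-1-rgnw8   0/1       Error     0          4d
logging-kibana-1-kkrvx        2/2       Running   12         4d
"""

for name, status in not_ready_pods(sample):
    print(name, status)
```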

Expected results:
es and Kibana are in green status, and log entries are shown in the logging UI.

Additional info:
The output of the following commands is attached:
oc logs logging-es-8m7ovij5-1-nnjsh -n logging
oc logs logging-es-xacrjjge-1-rgnw8 -n logging
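
The attached es logs are long, so it can help to pull out only the lines related to the loss of the master. The sketch below assumes the bracketed line prefix seen in the excerpt in this description ([timestamp][LEVEL][logger] [node] message); the function name and event labels are hypothetical, and the sample lines are abbreviated with {...} in place of the node details.

```python
import re

# Prefix format assumed from the log excerpt: [ts][LEVEL ][logger ] [node] msg
LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[(?P<logger>[^\]]+)\]\s+"
    r"(?:\[(?P<node>[^\]]+)\]\s+)?(?P<msg>.*)$"
)


def master_loss_events(lines):
    """Return (timestamp, label) pairs for master-loss related log lines."""
    events = []
    last_ts = None  # exception lines carry no timestamp of their own
    for line in lines:
        m = LINE.match(line)
        if m:
            last_ts = m.group("ts")
            msg = m.group("msg")
            if "not enough master nodes" in msg:
                events.append((last_ts, "not_enough_master_nodes"))
            elif "zen-disco-node-left" in msg:
                events.append((last_ts, "node_left"))
        elif "SERVICE_UNAVAILABLE/2/no master" in line:
            events.append((last_ts, "no_master_block"))
    return events


sample = [
    "[2017-06-09 21:14:23,947][WARN ][io.fabric8.elasticsearch.discovery.kubernetes.KubernetesDiscovery] [Maxwell Dillon] not enough master nodes, current nodes: {...}",
    "[2017-06-09 21:14:23,947][INFO ][cluster.service          ] [Maxwell Dillon] removed {...}, reason: zen-disco-node-left({...}), reason(left)",
    "[2017-06-09 21:14:23,988][WARN ][rest.suppressed          ] path: /_bulk, params: {}",
    "ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/2/no master];]",
]

events = master_loss_events(sample)
for ts, label in events:
    print(ts, label)
```

To run it against a saved attachment, pass the file's lines instead of the sample list.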
Comment 1 Xia Zhao 2017-06-13 03:35 EDT
Created attachment 1287181
oc logs logging-es-8m7ovij5-1-nnjsh -n logging
Comment 2 Xia Zhao 2017-06-13 03:37 EDT
Created attachment 1287182
oc logs logging-es-xacrjjge-1-rgnw8 -n logging
Comment 3 Xia Zhao 2017-06-13 03:44:50 EDT
The es image used for testing is:

openshift3/logging-elasticsearch      v3.5                6fed373197a3        5 days ago          399.5 MB
Comment 4 Junqi Zhao 2017-06-15 02:14:38 EDT
Tested again on:
OpenShift Master: v3.5.5.26 (online version 3.5.1.34)
Kubernetes Master: v1.5.2+43a9be4

Kibana is still unable to connect to Elasticsearch at https://logging-es:9200.
Comment 5 Junqi Zhao 2017-06-15 02:15 EDT
Created attachment 1287910
Kibana UI, Unable to connect to Elasticsearch
Comment 6 Junqi Zhao 2017-06-15 02:15:49 EDT
One es pod (logging-es-8m7ovij5-1-nnjsh) is still in Error status.
Output of oc get pod -n logging:

NAME                          READY     STATUS             RESTARTS   AGE
logging-curator-1-3bd4p       0/1       CrashLoopBackOff   855        6d
logging-es-8m7ovij5-1-nnjsh   0/1       Error              0          6d
logging-es-xacrjjge-1-rgnw8   1/1       Running            1          6d
logging-fluentd-0kw2x         1/1       Running            4          6d
logging-fluentd-2fzrm         1/1       Running            4          6d
logging-fluentd-69fkl         1/1       Running            4          6d
logging-fluentd-7phpj         1/1       Running            5          6d
logging-fluentd-gmptb         1/1       Running            4          6d
logging-fluentd-htmdq         1/1       Running            5          6d
logging-fluentd-nx59f         1/1       Running            4          6d
logging-kibana-1-kkrvx        2/2       Running            16         6d
logging-kibana-1-wnwvw        2/2       Running            17         6d
Comment 7 Stefanie Forrester 2017-06-19 16:11:02 EDT
Status shows green now. I didn't make any changes today, but it was probably redeployed with one of the upgrades.
Comment 8 Junqi Zhao 2017-06-20 05:30:26 EDT
Project logs can now be retrieved from the Kibana UI.

Testing environment:
OpenShift Master: v3.5.5.27 (online version 3.5.1.36)
Kubernetes Master: v1.5.2+43a9be4 

Output of oc get pod -n logging:

NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-3bd4p       1/1       Running   1289       11d
logging-es-8m7ovij5-1-nnjsh   1/1       Running   1          11d
logging-es-xacrjjge-1-rgnw8   1/1       Running   1          11d
logging-fluentd-0kw2x         1/1       Running   5          11d
logging-fluentd-2fzrm         1/1       Running   5          11d
logging-fluentd-69fkl         1/1       Running   5          11d
logging-fluentd-7phpj         1/1       Running   6          11d
logging-fluentd-gmptb         1/1       Running   5          11d
logging-fluentd-htmdq         1/1       Running   6          11d
logging-fluentd-nx59f         1/1       Running   5          11d
logging-kibana-1-kkrvx        2/2       Running   35         11d
logging-kibana-1-wnwvw        2/2       Running   35         11d

A screenshot of the Kibana UI is attached.
Comment 9 Junqi Zhao 2017-06-20 05:31 EDT
Created attachment 1289540
kibana UI, could be accessed now
