Bug 1459054
Summary: | Timeout creating SearchGuard index | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Ruben Romero Montes <rromerom>
Component: | Logging | Assignee: | Jeff Cantrill <jcantril>
Status: | CLOSED DUPLICATE | QA Contact: | Xia Zhao <xiazhao>
Severity: | urgent | Docs Contact: |
Priority: | high | |
Version: | 3.4.1 | CC: | aivaras.laimikis, aos-bugs, erich, jcantril, nnosenzo, pdwyer, pportant, tlarsson
Target Milestone: | --- | Target Release: | ---
Hardware: | Unspecified | OS: | Unspecified
Doc Type: | If docs needed, set a value | Story Points: | ---
Last Closed: | 2017-06-27 16:34:11 UTC | Type: | Bug
Regression: | --- | |
Attachments:

- attachment 1285281 [details]: docker inspect
- attachment 1285282 [details]: all_logging
- attachment 1285283 [details]: nodes description
The manual workaround is to initialize SearchGuard from inside each of the three pods:

```
$ oc rsh <logging-es-pod>
# /usr/share/java/elasticsearch/plugins/search-guard-2/tools/sgadmin.sh \
    -cd ${HOME}/sgconfig \
    -i .searchguard.${HOSTNAME} \
    -ks /etc/elasticsearch/secret/searchguard.key \
    -kst JKS \
    -kspass kspass \
    -ts /etc/elasticsearch/secret/searchguard.truststore \
    -tst JKS \
    -tspass tspass \
    -nhnv \
    -icl
```

Alternatively, try closing some old indices manually in order to speed up the initialization.

Closing this as a duplicate, since it is all related to the initialization of the SG index, for which we have a fix that needs to be ported to 3.4.1. We will resolve against bug 1449378. Upstream PR to be backported: https://github.com/openshift/origin-aggregated-logging/pull/469

*** This bug has been marked as a duplicate of bug 1449378 ***
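The "close some old indices" suggestion above can be sketched as follows. This is an illustrative sketch, not part of the original report: the index pattern and the admin certificate paths are assumptions and will differ per cluster (OpenShift aggregated logging typically mounts the Elasticsearch secrets under /etc/elasticsearch/secret/).

```shell
# Hypothetical example: close old project indices from inside an
# Elasticsearch pod. Index pattern and cert paths are assumptions.
oc rsh <logging-es-pod>

# Closed indices no longer hold open shards, so cluster-state updates
# (such as creating the .searchguard index) have less work to process.
curl -s --key    /etc/elasticsearch/secret/admin-key \
        --cert   /etc/elasticsearch/secret/admin-cert \
        --cacert /etc/elasticsearch/secret/admin-ca \
        -XPOST "https://localhost:9200/project.*.2017.05.22/_close"
```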
Created attachment 1285280 [details]
docker logs

Description of problem:
SearchGuard is not able to initialize after a timeout.

```
[2017-06-05 08:15:36,606][INFO ][cluster.routing.allocation] [Crimson] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[project.aes-mbaas-infra.26cf63cd-2b47-11e7-a35d-0acaab79e3f7.2017.05.22][0], [.searchguard.logging-es-xgrcmvev-3-5nb30][0], [project.nagp-il-core-int-01.7c640146-08a0-11e7-8d5d-0610033e8e3f.2017.05.22][0], [.searchguard.logging-es-sm5vnjla-3-55uhm][0]] ...]).
Clustername: logging-es
Clusterstate: YELLOW
Number of nodes: 3
Number of data nodes: 3
.searchguard.logging-es-dyppkops-2-qapt3 index does not exists, attempt to create it ...
[2017-06-05 08:15:46,856][ERROR][com.floragunn.searchguard.auth.BackendRegistry] Not yet initialized
[2017-06-05 08:15:54,250][ERROR][com.floragunn.searchguard.auth.BackendRegistry] Not yet initialized
...
ERR: An unexpected ProcessClusterEventTimeoutException occured: failed to process cluster event (create-index [.searchguard.logging-es-dyppkops-2-qapt3], cause [api]) within 30s
Trace: ProcessClusterEventTimeoutException[failed to process cluster event (create-index [.searchguard.logging-es-dyppkops-2-qapt3], cause [api]) within 30s]
    at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:349)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
...
[2017-06-05 08:25:06,818][INFO ][cluster.routing.allocation] [Crimson] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.searchguard.logging-es-xgrcmvev-3-5nb30][0], [.searchguard.logging-es-xgrcmvev-3-5nb30][0]] ...]).
```

Version-Release number of selected component (if applicable):
openshift3-logging-elasticsearch-3.4.1-26

How reproducible:
Only in the reporting environment.
Steps to Reproduce:
1. Scale all 3 deploymentConfigs down to 0.
2. Scale all 3 deploymentConfigs back up to 1.

Actual results:

```
failed to process cluster event (create-index [.searchguard.logging-es-dyppkops-2-qapt3], cause [api]) within 30s
```

Expected results:
The SearchGuard index is initialized.

Additional info:
Volume type is gp2 with 1500 / 3000 IOPS and 500 GiB of storage; 148 G of data free:

```
/dev/mapper/vg01-data  500G  353G  148G  71% /data
```

Deployment: AWS
Memory: 16 GB
EC2 instances are m4.xlarge for the masters and r4.xlarge for the nodes.
Ensured they have auto_expand_replicas: 2 in the configmap.
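The scale-down/scale-up steps above can be sketched as follows. This is a hypothetical sketch: the deploymentConfig names are taken from the log output and the `logging` namespace is an assumption; both will differ per cluster.

```shell
# Hypothetical sketch of the reproduction steps; dc names and the
# namespace are assumptions and must be adapted to the cluster.
for dc in logging-es-dyppkops logging-es-xgrcmvev logging-es-sm5vnjla; do
  oc scale dc/"$dc" --replicas=0 -n logging
done

# Wait for the pods to terminate, then scale back up:
for dc in logging-es-dyppkops logging-es-xgrcmvev logging-es-sm5vnjla; do
  oc scale dc/"$dc" --replicas=1 -n logging
done
```

The SearchGuard initialization failure appears during the subsequent pod startup, when each pod attempts to create its `.searchguard.<pod-name>` index.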