Bug 1460373

Summary: Red searchguard indices and "Try running sgadmin.sh" message in ES logs
Product: OpenShift Container Platform Reporter: Steven Walter <stwalter>
Component: Logging Assignee: Jeff Cantrill <jcantril>
Status: CLOSED DUPLICATE QA Contact: Xia Zhao <xiazhao>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.5.0 CC: aos-bugs, pportant, pweil
Target Milestone: ---   
Target Release: 3.7.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2017-09-11 18:27:20 UTC Type: Bug

Description Steven Walter 2017-06-09 21:22:40 UTC
Description of problem:
The logging stack started, but the logging-es logs contain:

   * Try running sgadmin.sh with -icl and -nhnv (If thats works you need to check your clustername as well as hostnames in your SSL certificates)
[2017-06-08 14:04:59,526][ERROR][com.floragunn.searchguard.auth.BackendRegistry] Not yet initialized (you may need to run sgadmin)
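
To check how often this error shows up in a captured logging-es log, a small scanner like the following may help. It is only a sketch: the function name `uninitialized_events` and the regular expression are illustrative assumptions based on the two log lines quoted above, not part of Search Guard or OpenShift.

```python
import re

# Assumed layout, inferred from the excerpt above:
# [timestamp][LEVEL][logger] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>[A-Z]+)\s*\]\[(?P<logger>[^\]]+)\]\s*(?P<msg>.*)$"
)


def uninitialized_events(log_text):
    """Return (timestamp, logger) pairs for 'Not yet initialized' errors."""
    events = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # hint lines such as "* Try running sgadmin.sh ..." are skipped
        if match.group("level") == "ERROR" and "Not yet initialized" in match.group("msg"):
            events.append((match.group("ts"), match.group("logger")))
    return events


# Sample input copied from the log excerpt in this report.
SAMPLE_LOG = """\
   * Try running sgadmin.sh with -icl and -nhnv (If thats works you need to check your clustername as well as hostnames in your SSL certificates)
[2017-06-08 14:04:59,526][ERROR][com.floragunn.searchguard.auth.BackendRegistry] Not yet initialized (you may need to run sgadmin)
"""

if __name__ == "__main__":
    for ts, logger in uninitialized_events(SAMPLE_LOG):
        print(ts, logger)
```
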

The indices check shows:


green  open   .searchguard.logging-es-3ec2rtmj-2-d6wzd                                               1   2          4            0    138.4kb         46.1kb
red    open   .searchguard.logging-es-3ec2rtmj-1-evitf                                               1   2                                            

green  open   .searchguard.logging-es-l5nl20ab-1-emkhw                                               1   2          4            0     97.2kb         32.4kb
red    open   .searchguard.logging-es-l5nl20ab-1-88m47                                               1   2                                            
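
For larger clusters, the red .searchguard indices can be picked out of this listing programmatically. The sketch below assumes the column order shown above (health, status, index name, then counts and sizes); the helper names `parse_cat_indices` and `red_searchguard_indices` are hypothetical and exist only for this illustration.

```python
from typing import List, NamedTuple


class IndexRow(NamedTuple):
    health: str
    status: str
    name: str


def parse_cat_indices(text: str) -> List[IndexRow]:
    """Parse whitespace-separated index listing rows; ignore anything else."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A usable row needs at least health, status and index name.
        if len(fields) < 3 or fields[0] not in {"green", "yellow", "red"}:
            continue
        rows.append(IndexRow(fields[0], fields[1], fields[2]))
    return rows


def red_searchguard_indices(text: str) -> List[str]:
    """Names of .searchguard indices whose health column is red."""
    return [
        row.name
        for row in parse_cat_indices(text)
        if row.health == "red" and row.name.startswith(".searchguard.")
    ]


# Sample input copied from the listing in this report (column padding shortened).
SAMPLE = """\
green  open   .searchguard.logging-es-3ec2rtmj-2-d6wzd   1   2   4   0   138.4kb   46.1kb
red    open   .searchguard.logging-es-3ec2rtmj-1-evitf   1   2
green  open   .searchguard.logging-es-l5nl20ab-1-emkhw   1   2   4   0   97.2kb   32.4kb
red    open   .searchguard.logging-es-l5nl20ab-1-88m47   1   2
"""

if __name__ == "__main__":
    for name in red_searchguard_indices(SAMPLE):
        print(name)
```

Red rows carry no document counts or sizes, which is why the parser only requires the first three columns.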


Version-Release number of selected component (if applicable):
v3.5

Jeff Cantrill indicates that the following PR might backport a fix for this issue:
https://github.com/openshift/origin-aggregated-logging/pull/460


Potential workaround:
Scale down the Elasticsearch cluster to 1 node, delete all .searchguard indices, and redeploy.
Is this a viable workaround? Can we clarify whether "redeploy" means running the deployer template with MODE=redeploy, running `oc rollout latest`, or something similar? Or do we just scale up to 3?

Comment 4 Steven Walter 2017-06-29 18:12:52 UTC
Is this bug related to https://bugzilla.redhat.com/show_bug.cgi?id=1449378 ? If so, can we close this one?

Comment 5 Peter Portante 2017-07-12 16:40:48 UTC
Regarding PR #460 referenced in the description, it does not appear that it will fix this issue.

This problem does appear to be another manifestation of #1449378, and I think this can be safely closed as a duplicate of that bug.

Comment 6 Jeff Cantrill 2017-09-11 18:27:20 UTC

*** This bug has been marked as a duplicate of bug 1449378 ***