Bug 1505860

Summary: Ansible playbooks are not aligned with EFK images
Product: OpenShift Container Platform
Reporter: Ruben Romero Montes <rromerom>
Component: Logging
Assignee: Jeff Cantrill <jcantril>
Status: CLOSED DUPLICATE
QA Contact: Anping Li <anli>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.6.0
CC: aos-bugs, jwozniak, pportant, rmeggins
Target Milestone: ---
Keywords: OpsBlocker
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-24 17:18:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ruben Romero Montes 2017-10-24 12:12:32 UTC
Description of problem:
Running the playbooks to upgrade aggregated logging causes several errors because the configuration written by Ansible is out of sync with what the deployed EFK images support.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.173.0.48-1.git.0.1609d30.el7.noarch

logging-curator-v3.6.173.0.21-17
logging-elasticsearch-v3.6.173.0.5-5
logging-fluentd-v3.6.173.0.21-17
logging-kibana-v3.6.173.0.21-22
logging-auth-proxy-v3.6.173.0.21-17

How reproducible:
Always

Steps to Reproduce:
1. yum update on masters
2. ansible-playbook openshift-logging.yml
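For reference, a typical invocation of these steps looks like the following; the inventory and playbook paths are assumptions and vary per installation:

    # Step 1 (on each master): update packages, pulling in the newer
    # openshift-ansible build listed above.
    yum update -y

    # Step 2: re-run the logging playbook. The inventory and playbook
    # paths below are assumptions; adjust to your environment.
    ansible-playbook -i /etc/ansible/hosts \
        /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml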

Actual results:

Error 1. Elasticsearch pod is unable to start

[2017-10-24 11:53:27,760][INFO ][container.run            ] Checking if Elasticsearch is ready on https://localhost:9200
Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]
	at org.elasticsearch.discovery.DiscoveryModule.configure(DiscoveryModule.java:100)
	at <<<guice>>>
	at org.elasticsearch.node.Node.<init>(Node.java:213)
	at org.elasticsearch.node.Node.<init>(Node.java:140)
	at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
This is caused by the cloud/kubernetes discovery section the playbook writes into the ConfigMap:
    cloud:
      kubernetes:
        pod_label: ${POD_LABEL}
        pod_port: 9300
        namespace: ${NAMESPACE}

It still needs to be:
    cloud:
      kubernetes:
        service: ${SERVICE_DNS}
        namespace: ${NAMESPACE}
After changing it back, the Elasticsearch pod starts.
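As a workaround until a rebuilt image is available, the ConfigMap can be reverted by hand. A minimal sketch, assuming the stack runs in the logging namespace with the default logging-elasticsearch ConfigMap name and component=es pod label (adjust to your deployment):

    # Revert the cloud/kubernetes block in the elasticsearch.yml section
    # to the service/namespace form shown above.
    oc edit configmap logging-elasticsearch -n logging

    # Restart the Elasticsearch pods so they pick up the reverted config.
    oc delete pod -l component=es -n logging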

Error 2. Fluentd does not send any logs to Elasticsearch.

This is caused by the addition of the <label @OUTPUT> section to the configmap, while the fluentd image does not yet support this label. The following warnings appear in the fluentd logs:

2017-10-24 07:57:51 -0400 [warn]: no patterns matched tag="journal.system"

The logging-fluentd configmap looks like this:

      @include configs.d/openshift/filter-post-*.conf
    ##
    </label>

    <label @OUTPUT>
    ## matches
      @include configs.d/openshift/output-pre-*.conf

It should still be:

      @include configs.d/openshift/filter-post-*.conf
    ##

    ## matches
      @include configs.d/openshift/output-pre-*.conf

After removing the </label> and <label @OUTPUT> lines, fluentd forwards logs again.
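The equivalent workaround for fluentd, again assuming the logging namespace with the default logging-fluentd ConfigMap name and component=fluentd pod label (adjust to your deployment):

    # Remove the </label> and <label @OUTPUT> lines from the fluent.conf
    # section of the ConfigMap.
    oc edit configmap logging-fluentd -n logging

    # Restart the fluentd pods so they reload the configuration.
    oc delete pod -l component=fluentd -n logging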

Expected results:
Elasticsearch pods to be deployed successfully
Fluentd pods to forward logs to Elasticsearch

Additional info:

Comment 1 Jan Wozniak 2017-10-24 16:20:39 UTC
The ES image with the readiness probe has been waiting for a release for more than a month and a half. This is not a case of the config being out of sync between ansible and the image: the repositories have been in sync since the change merged; the image only needs to be rebuilt and pushed to the registries.

For more information, take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1503563

Comment 2 Jeff Cantrill 2017-10-24 17:18:11 UTC

*** This bug has been marked as a duplicate of bug 1503563 ***