Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1428249

Summary: [IntService_public_324] Elasticsearch stayed at 3.3.1 level and reported "java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'" after logging was upgraded to 3.5.0
Product: OpenShift Container Platform
Reporter: Xia Zhao <xiazhao>
Component: Installer
Assignee: Jeff Cantrill <jcantril>
Status: CLOSED ERRATA
QA Contact: Xia Zhao <xiazhao>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.5.0
CC: aos-bugs, ewolinet, jokerman, mmccomas, rmeggins, xiazhao
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-04-12 19:03:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                                                               Flags
Part of the ansible execution log when upgrade the logging stacks         none
the inventory file used for logging upgrade to 3.5.0                      none
full ansible upgrade log                                                  none

Description Xia Zhao 2017-03-02 06:45:07 UTC
Created attachment 1259013 [details]
Part of the ansible execution log when upgrade the logging stacks

Description of problem:
Upgraded the logging stack from 3.3.1 to 3.5.0 using the ansible scripts. After the ansible run completed successfully, the ES pod stayed at the 3.3.1 level (while all the other components, including curator, kibana and fluentd, were at the 3.5.0 level) and reported "java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'".

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.20-1.git.0.5a5fcd5.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install the logging 3.3.1 stack on an OCP 3.5.0 master, with Elasticsearch attached to a HostPath PV
2. Upgrade the logging stack to 3.5.0 using the ansible scripts (inventory file attached)
3. Check the Elasticsearch status after the upgrade

Actual results:
Elasticsearch stayed at 3.3.1 level and reported "java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'" after logging was upgraded to 3.5.0:

# oc get po
NAME                          READY     STATUS             RESTARTS   AGE
logging-curator-2-l8jb8       1/1       Running            8          44m
logging-deployer-4bjvf        0/1       Completed          0          20h
logging-es-5bd3p6ko-3-rkbt9   0/1       Error              9          26m
logging-fluentd-8k1m2         1/1       Running            0          46m
logging-fluentd-rg7d4         1/1       Running            0          41m
logging-kibana-2-4v0hx        2/2       Running            0          44m

# oc get dc logging-es-5bd3p6ko -o yaml | grep image
        image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:3.3.1
        imagePullPolicy: Always

# oc logs -f logging-es-5bd3p6ko-3-rkbt9
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Des.path.home=/usr/share/java/elasticsearch -Des.config=/usr/share/java/elasticsearch/config/elasticsearch.yml -Xms128M -Xmx512m'
{1.5.2}: Setup Failed ...
- IllegalArgumentException[Could not resolve placeholder 'NAMESPACE']
java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'
    at org.elasticsearch.common.property.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:124)
    at org.elasticsearch.common.property.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:81)
    at org.elasticsearch.common.settings.ImmutableSettings$Builder.replacePropertyPlaceholders(ImmutableSettings.java:1098)
    at org.elasticsearch.node.internal.InternalSettingsPreparer.prepareSettings(InternalSettingsPreparer.java:101)
    at org.elasticsearch.bootstrap.Bootstrap.initialSettings(Bootstrap.java:112)
    at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:183)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)


Expected results:
ES should be running fine after upgrade

Additional info:
Ansible log (part of it) attached
Upgrade inventory file attached

Comment 2 Xia Zhao 2017-03-02 06:55:27 UTC
Created attachment 1259029 [details]
the inventory file used for logging upgrade to 3.5.0

Comment 3 ewolinet 2017-03-02 16:23:54 UTC
Xia,

Can you please provide the entirety of the ansible playbook output? The logic we would need to look at is where we would generate the ES DC template; the portion of the logs you pasted is just the 'oc apply' of them.

Comment 4 Rich Megginson 2017-03-02 19:56:00 UTC
So, in the old deployer/scripts/upgrade.sh there is this function:

# this is required for the upgrade to ES 2.3.5
function update_es_for_235() {

That adds the downward API NAMESPACE var to ES config

Do we still need that in the new ansible es_migration.sh?
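
For context on why the missing variable is fatal: the stack trace in the description comes from placeholder substitution in the ES settings, which fails when a referenced name has no value. The following is a minimal Python sketch of that behaviour, not the Elasticsearch implementation. The `${NAME}` syntax, the `resolve_placeholders` helper and the example setting value are assumptions made for illustration only.

```python
import re

# Matches ${NAME} placeholders; this syntax is assumed for illustration.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(value, env):
    """Replace each ${NAME} in value with env[NAME]; fail if NAME is unset."""
    def _lookup(match):
        name = match.group(1)
        if name not in env:
            # Mirrors the wording of the error seen in the pod log.
            raise ValueError("Could not resolve placeholder '%s'" % name)
        return env[name]

    return _PLACEHOLDER.sub(_lookup, value)


# Hypothetical config value that refers to the pod's namespace.
setting = "${NAMESPACE}"

# With the variable present (e.g. injected through the downward API),
# the value resolves normally.
print(resolve_placeholders(setting, {"NAMESPACE": "logging"}))

# With the variable absent, resolution fails at startup.
try:
    resolve_placeholders(setting, {})
except ValueError as err:
    print(err)
```

If the config refers to NAMESPACE but the running DC never defines that variable, a failure of this kind at startup would be expected.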

Comment 5 ewolinet 2017-03-02 20:02:05 UTC
We already provide that as part of the ES dc we generate:
https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_logging/templates/es.j2#L61-L64

The problem, it seems, is that we didn't even generate an ES DC template...

Comment 6 Jeff Cantrill 2017-03-02 21:41:23 UTC
@eric I wonder if we suffer from the same problem that was fixed for the pvc: https://github.com/openshift/openshift-ansible/pull/3548/files#diff-8484225afbf0539375b973fb21b46838R68

Comment 7 Xia Zhao 2017-03-03 08:38:10 UTC
(In reply to ewolinet from comment #3)
> Xia,
> 
> Can you please provide the entirety of the ansible playbook output? The
> logic we would need to look at is where we would generate the ES DC
> template; the portion of the logs you pasted is just the 'oc apply' of them.

No problem, let me redo the upgrade to provide the full log. I just need a few more hours for the 3.3.1 logging stack to generate log entries on a system that uses the journald log driver. I will attach the log soon.

Comment 8 Xia Zhao 2017-03-03 13:39:35 UTC
Created attachment 1259549 [details]
full ansible upgrade log

Comment 9 Xia Zhao 2017-03-03 13:40:45 UTC
@ewolinet 
The original issue was reproduced and the full ansible upgrade log is attached.

Comment 10 ewolinet 2017-03-03 16:31:15 UTC
Thanks Xia,

I see this in the log:

TASK [openshift_logging : Applying /tmp/openshift-logging-ansible-VcmLKX/templates/logging-logging-es-5a36v5kl-dc.yaml] ***
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/oc_apply.yaml:13
Using module file /usr/lib/python2.7/site-packages/ansible/modules/core/commands/command.py
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r host-8-173-207.host.centralci.eng.rdu2.redhat.com '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo ~/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589 `" && echo ansible-tmp-1488547884.75-30025641072589="` echo ~/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589 `" ) && sleep 0'"'"''
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> PUT /tmp/tmpcn7Mcq TO /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/command.py
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r '[host-8-173-207.host.centralci.eng.rdu2.redhat.com]'
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r host-8-173-207.host.centralci.eng.rdu2.redhat.com '/bin/sh -c '"'"'chmod u+x /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/ /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/command.py && sleep 0'"'"''
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r -tt host-8-173-207.host.centralci.eng.rdu2.redhat.com '/bin/sh -c '"'"'/usr/bin/python /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/command.py; rm -rf "/root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/" > /dev/null 2>&1 && sleep 0'"'"''
ok: [host-8-173-207.host.centralci.eng.rdu2.redhat.com] => {
    "changed": false, 
    "cmd": [
        "/usr/local/bin/oc", 
        "--config=/tmp/openshift-logging-ansible-VcmLKX/admin.kubeconfig", 
        "apply", 
        "-f", 
        "/tmp/openshift-logging-ansible-VcmLKX/templates/logging-logging-es-5a36v5kl-dc.yaml", 
        "-n", 
        "logging"
    ], 
    "delta": "0:00:00.167743", 
    "end": "2017-03-03 13:31:24.028327", 
    "failed": false, 
    "failed_when_result": false, 
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/oc --config=/tmp/openshift-logging-ansible-VcmLKX/admin.kubeconfig apply -f /tmp/openshift-logging-ansible-VcmLKX/templates/logging-logging-es-5a36v5kl-dc.yaml -n logging", 
            "_uses_shell": false, 
            "chdir": null, 
            "creates": null, 
            "executable": null, 
            "removes": null, 
            "warn": true
        }, 
        "module_name": "command"
    }, 
    "rc": 1, 
    "start": "2017-03-03 13:31:23.860584", 
    "warnings": []
}

STDERR:

The DeploymentConfig "logging-es-5a36v5kl" is invalid: 
* spec.template.spec.volumes[2].hostPath: Forbidden: may not specify more than 1 volume type
* spec.template.spec.containers[0].volumeMounts[2].name: Not found: "elasticsearch-storage"


I'll log on and see how the DC was generated

Comment 11 ewolinet 2017-03-03 17:06:41 UTC
So this looks to be because the currently deployed ES DC uses a host mount, whereas the generated DC template for ES uses an emptyDir.

So when the role tries to apply the template, it clobbers the existing definition, and it seems that the DC is then left without an 'elasticsearch-storage' volume definition.

Current DC snippet:
        volumeMounts:
        - mountPath: /etc/elasticsearch/secret
          name: elasticsearch
          readOnly: true
        - mountPath: /usr/share/elasticsearch/config
          name: elasticsearch-config
          readOnly: true
        - mountPath: /elasticsearch/persistent
          name: elasticsearch-storage
      volumes:
      - name: elasticsearch
        secret:
          defaultMode: 420
          secretName: logging-elasticsearch
      - configMap:
          defaultMode: 420
          name: logging-elasticsearch
        name: elasticsearch-config
      - hostPath:
          path: /usr/local/es-storage
        name: elasticsearch-storage

Template snippet:
          volumeMounts:
            - name: elasticsearch
              mountPath: /etc/elasticsearch/secret
              readOnly: true
            - name: elasticsearch-config
              mountPath: /usr/share/java/elasticsearch/config
              readOnly: true
            - name: elasticsearch-storage
              mountPath: /elasticsearch/persistent
      volumes:
        - name: elasticsearch
          secret:
            secretName: logging-elasticsearch
        - name: elasticsearch-config
          configMap:
            name: logging-elasticsearch
        - name: elasticsearch-storage
          emptyDir: {}
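
One way to picture the conflict between the two snippets is with a simplified merge in Python. This is only a sketch under stated assumptions: it is not the algorithm `oc apply` uses, and `merge_volumes_by_name`, `validate` and the `VOLUME_TYPES` tuple are hypothetical helpers. It assumes volumes are merged by name and that keys present only in the live object are left in place.

```python
import copy

# Volume source keys that appear in the two snippets (illustrative subset).
VOLUME_TYPES = ("hostPath", "emptyDir", "secret", "configMap")


def merge_volumes_by_name(live, desired):
    """Overlay desired volumes onto live ones by name, keeping live-only keys."""
    merged = {vol["name"]: copy.deepcopy(vol) for vol in live}
    for vol in desired:
        merged.setdefault(vol["name"], {}).update(copy.deepcopy(vol))
    return list(merged.values())


def validate(volumes):
    """Return an error for every volume that declares more than one source."""
    errors = []
    for index, vol in enumerate(volumes):
        sources = [key for key in VOLUME_TYPES if key in vol]
        if len(sources) > 1:
            errors.append(
                "volumes[%d]: may not specify more than 1 volume type" % index
            )
    return errors


# Volumes from the currently deployed DC.
live = [
    {"name": "elasticsearch", "secret": {"secretName": "logging-elasticsearch"}},
    {"name": "elasticsearch-config", "configMap": {"name": "logging-elasticsearch"}},
    {"name": "elasticsearch-storage", "hostPath": {"path": "/usr/local/es-storage"}},
]

# Volumes from the generated template.
desired = [
    {"name": "elasticsearch", "secret": {"secretName": "logging-elasticsearch"}},
    {"name": "elasticsearch-config", "configMap": {"name": "logging-elasticsearch"}},
    {"name": "elasticsearch-storage", "emptyDir": {}},
]

merged = merge_volumes_by_name(live, desired)
print(validate(merged))
```

Under these assumptions the merged 'elasticsearch-storage' entry carries both hostPath and emptyDir, which corresponds to the first validation error in the STDERR output above.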

Comment 12 Jeff Cantrill 2017-03-08 14:22:28 UTC
Prefer hostMount over other storage if it exists in facts: https://github.com/openshift/openshift-ansible/pull/3596
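
The idea can be sketched in Python as follows. This is not the code from the pull request, which lives in the Ansible role. The function name `choose_es_storage`, the shape of the facts and the `default_source` parameter are all assumptions made for illustration.

```python
def choose_es_storage(existing_volumes, default_source):
    """Pick the volume source for 'elasticsearch-storage' in the generated DC.

    existing_volumes: volume dicts gathered from the deployed DC (hypothetical
    fact layout). default_source: what the template would otherwise use.
    """
    for vol in existing_volumes:
        if vol.get("name") == "elasticsearch-storage" and "hostPath" in vol:
            # Reuse the host mount that is already deployed.
            return {"hostPath": dict(vol["hostPath"])}
    return default_source


# Facts resembling the deployed DC from comment 11.
facts = [
    {"name": "elasticsearch-storage", "hostPath": {"path": "/usr/local/es-storage"}},
]

print(choose_es_storage(facts, {"emptyDir": {}}))
print(choose_es_storage([], {"emptyDir": {}}))
```

With a choice like this, the generated template would keep the same volume type as the deployed DC, so applying it would not mix hostPath with emptyDir.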

Comment 13 Jeff Cantrill 2017-03-09 17:14:56 UTC
1.5 fix in https://github.com/openshift/openshift-ansible/pull/3608

Comment 14 openshift-github-bot 2017-03-10 21:55:12 UTC
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/5e952859247d28abe6d5efb794ff6a1f8639000d
bug 1428249. Use ES hostmount storage if it exists

Comment 16 Xia Zhao 2017-03-14 05:30:57 UTC
blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1431935

Comment 17 Xia Zhao 2017-03-16 08:29:00 UTC
Verified with openshift-ansible-3.5.35-1.git.0.7aa4728.el7.noarch. After the upgrade, I hit the exception described in https://bugzilla.redhat.com/show_bug.cgi?id=1428711, which is not considered a real support case. Setting to verified since the original issue did not reproduce.

Comment 19 errata-xmlrpc 2017-04-12 19:03:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903