Bug 1428249 - [IntService_public_324] Elasticsearch stayed at 3.3.1 level and reported "java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'" after logging was upgraded to 3.5.0
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Jeff Cantrill
QA Contact: Xia Zhao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-03-02 06:45 UTC by Xia Zhao
Modified: 2017-07-24 14:11 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-12 19:03:12 UTC
Target Upstream Version:
Embargoed:


Attachments
Part of the ansible execution log when upgrade the logging stacks (574.32 KB, text/plain)
2017-03-02 06:45 UTC, Xia Zhao
the inventory file used for logging upgrade to 3.5.0 (797 bytes, text/plain)
2017-03-02 06:55 UTC, Xia Zhao
full ansible upgrade log (2.38 MB, text/plain)
2017-03-03 13:39 UTC, Xia Zhao


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0903 0 normal SHIPPED_LIVE OpenShift Container Platform atomic-openshift-utils bug fix and enhancement 2017-04-12 22:45:42 UTC

Description Xia Zhao 2017-03-02 06:45:07 UTC
Created attachment 1259013 [details]
Part of the ansible execution log when upgrade the logging stacks

Description of problem:
Upgraded the logging stack from 3.3.1 to 3.5.0 using the ansible scripts. After the ansible run completed successfully, the ES pod stayed at the 3.3.1 level (while all the other components, including curator, kibana, and fluentd, were at the 3.5.0 level) and reported "java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'".

Version-Release number of selected component (if applicable):
openshift-ansible-3.5.20-1.git.0.5a5fcd5.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install the logging 3.3.1 stack on an OCP 3.5.0 master, and attach elasticsearch to a HostPath PV
2. Upgrade logging stacks to 3.5.0 by using ansible scripts (inventory file attached)
3. Check elasticsearch status post upgrade

Actual results:
Elasticsearch stayed at 3.3.1 level and reported "java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'" after logging was upgraded to 3.5.0:

# oc get po
NAME                          READY     STATUS             RESTARTS   AGE
logging-curator-2-l8jb8       1/1       Running            8          44m
logging-deployer-4bjvf        0/1       Completed          0          20h
logging-es-5bd3p6ko-3-rkbt9   0/1       Error              9          26m
logging-fluentd-8k1m2         1/1       Running            0          46m
logging-fluentd-rg7d4         1/1       Running            0          41m
logging-kibana-2-4v0hx        2/2       Running            0          44m

# oc get dc logging-es-5bd3p6ko -o yaml | grep image
        image: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch:3.3.1
        imagePullPolicy: Always

# oc logs -f logging-es-5bd3p6ko-3-rkbt9
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Des.path.home=/usr/share/java/elasticsearch -Des.config=/usr/share/java/elasticsearch/config/elasticsearch.yml -Xms128M -Xmx512m'
{1.5.2}: Setup Failed ...
- IllegalArgumentException[Could not resolve placeholder 'NAMESPACE']
java.lang.IllegalArgumentException: Could not resolve placeholder 'NAMESPACE'
    at org.elasticsearch.common.property.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:124)
    at org.elasticsearch.common.property.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:81)
    at org.elasticsearch.common.settings.ImmutableSettings$Builder.replacePropertyPlaceholders(ImmutableSettings.java:1098)
    at org.elasticsearch.node.internal.InternalSettingsPreparer.prepareSettings(InternalSettingsPreparer.java:101)
    at org.elasticsearch.bootstrap.Bootstrap.initialSettings(Bootstrap.java:112)
    at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:183)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
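
The trace above shows startup failing while property placeholders in the settings are being replaced. As a rough illustration only (this is not Elasticsearch's implementation, and the config line below is hypothetical), strict placeholder resolution of this kind can be sketched as follows, assuming the config references ${NAMESPACE} and the container environment does not define it:

```python
import re

# Matches ${NAME} style placeholders.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(text, env):
    """Replace each ${NAME} in text with env[NAME]; fail on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise ValueError("Could not resolve placeholder '%s'" % name)
        return env[name]
    return PLACEHOLDER.sub(substitute, text)


# Hypothetical config line, not copied from the real elasticsearch.yml.
config_line = "example.setting: ${NAMESPACE}"

# With the variable defined, the line resolves normally.
print(resolve_placeholders(config_line, {"NAMESPACE": "logging"}))

# Without it, resolution fails in the same way the pod log reports.
try:
    resolve_placeholders(config_line, {})
except ValueError as err:
    print(err)
```

Under that assumption, a pod whose DC never sets NAMESPACE would fail at startup even though the config itself is valid.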


Expected results:
ES should be running fine after upgrade

Additional info:
Ansible log (part of it) attached
Upgrade inventory file attached

Comment 2 Xia Zhao 2017-03-02 06:55:27 UTC
Created attachment 1259029 [details]
the inventory file used for logging upgrade to 3.5.0

Comment 3 ewolinet 2017-03-02 16:23:54 UTC
Xia,

Can you please provide the entirety of the ansible playbook output? The logic we would need to look at is where we would generate the ES DC template, the portion of logs you pasted is just the 'oc apply' of them.

Comment 4 Rich Megginson 2017-03-02 19:56:00 UTC
So, in the old deployer/scripts/upgrade.sh there is this function:

# this is required for the upgrade to ES 2.3.5
function update_es_for_235() {

That function adds the downward API NAMESPACE var to the ES config.

Do we still need that in the new ansible es_migration.sh?

Comment 5 ewolinet 2017-03-02 20:02:05 UTC
We already provide that as part of the ES dc we generate:
https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_logging/templates/es.j2#L61-L64

The problem, it seems, is that we didn't even generate an ES DC template...
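
A quick way to confirm whether a given DC actually carries that variable is to inspect its env list. The sketch below is a hypothetical helper, not part of openshift-ansible; it assumes the variable is named NAMESPACE and is populated through the downward API field metadata.namespace, and it works on the DC as a plain dict (for example, parsed from `oc get dc <name> -o json`):

```python
def has_downward_namespace(dc):
    """Return True if any container sets NAMESPACE from metadata.namespace."""
    pod_spec = dc.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        for var in container.get("env", []):
            field_ref = var.get("valueFrom", {}).get("fieldRef", {})
            if (var.get("name") == "NAMESPACE"
                    and field_ref.get("fieldPath") == "metadata.namespace"):
                return True
    return False


# Illustrative DC fragments; the container layout is assumed, not copied
# from the cluster in this report.
old_dc = {"spec": {"template": {"spec": {"containers": [
    {"name": "elasticsearch", "env": []},
]}}}}
new_dc = {"spec": {"template": {"spec": {"containers": [
    {"name": "elasticsearch", "env": [
        {"name": "NAMESPACE",
         "valueFrom": {"fieldRef": {"fieldPath": "metadata.namespace"}}},
    ]},
]}}}}

print(has_downward_namespace(old_dc))
print(has_downward_namespace(new_dc))
```

A DC that was never regenerated from the new template would be expected to look like old_dc here.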

Comment 6 Jeff Cantrill 2017-03-02 21:41:23 UTC
@eric I wonder if we are hitting the same problem that was fixed for the PVC: https://github.com/openshift/openshift-ansible/pull/3548/files#diff-8484225afbf0539375b973fb21b46838R68

Comment 7 Xia Zhao 2017-03-03 08:38:10 UTC
(In reply to ewolinet from comment #3)
> Xia,
> 
> Can you please provide the entirety of the ansible playbook output? The
> logic we would need to look at is where we would generate the ES DC
> template, the portion of logs you pasted is just the 'oc apply' of them.

No problem, let me redo the upgrade to provide the full log. I just need a few more hours for the 3.3.1 logging system to generate log entries on the journald log driver system. I will attach it soon.

Comment 8 Xia Zhao 2017-03-03 13:39:35 UTC
Created attachment 1259549 [details]
full ansible upgrade log

Comment 9 Xia Zhao 2017-03-03 13:40:45 UTC
@ewolinet 
The original issue was reproduced, and the full ansible upgrade log is attached.

Comment 10 ewolinet 2017-03-03 16:31:15 UTC
Thanks Xia,

I see this in the log:

TASK [openshift_logging : Applying /tmp/openshift-logging-ansible-VcmLKX/templates/logging-logging-es-5a36v5kl-dc.yaml] ***
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging/tasks/oc_apply.yaml:13
Using module file /usr/lib/python2.7/site-packages/ansible/modules/core/commands/command.py
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r host-8-173-207.host.centralci.eng.rdu2.redhat.com '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo ~/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589 `" && echo ansible-tmp-1488547884.75-30025641072589="` echo ~/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589 `" ) && sleep 0'"'"''
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> PUT /tmp/tmpcn7Mcq TO /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/command.py
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r '[host-8-173-207.host.centralci.eng.rdu2.redhat.com]'
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r host-8-173-207.host.centralci.eng.rdu2.redhat.com '/bin/sh -c '"'"'chmod u+x /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/ /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/command.py && sleep 0'"'"''
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> ESTABLISH SSH CONNECTION FOR USER: root
<host-8-173-207.host.centralci.eng.rdu2.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/root/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r -tt host-8-173-207.host.centralci.eng.rdu2.redhat.com '/bin/sh -c '"'"'/usr/bin/python /root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/command.py; rm -rf "/root/.ansible/tmp/ansible-tmp-1488547884.75-30025641072589/" > /dev/null 2>&1 && sleep 0'"'"''
ok: [host-8-173-207.host.centralci.eng.rdu2.redhat.com] => {
    "changed": false, 
    "cmd": [
        "/usr/local/bin/oc", 
        "--config=/tmp/openshift-logging-ansible-VcmLKX/admin.kubeconfig", 
        "apply", 
        "-f", 
        "/tmp/openshift-logging-ansible-VcmLKX/templates/logging-logging-es-5a36v5kl-dc.yaml", 
        "-n", 
        "logging"
    ], 
    "delta": "0:00:00.167743", 
    "end": "2017-03-03 13:31:24.028327", 
    "failed": false, 
    "failed_when_result": false, 
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/oc --config=/tmp/openshift-logging-ansible-VcmLKX/admin.kubeconfig apply -f /tmp/openshift-logging-ansible-VcmLKX/templates/logging-logging-es-5a36v5kl-dc.yaml -n logging", 
            "_uses_shell": false, 
            "chdir": null, 
            "creates": null, 
            "executable": null, 
            "removes": null, 
            "warn": true
        }, 
        "module_name": "command"
    }, 
    "rc": 1, 
    "start": "2017-03-03 13:31:23.860584", 
    "warnings": []
}

STDERR:

The DeploymentConfig "logging-es-5a36v5kl" is invalid: 
* spec.template.spec.volumes[2].hostPath: Forbidden: may not specify more than 1 volume type
* spec.template.spec.containers[0].volumeMounts[2].name: Not found: "elasticsearch-storage"


I'll log on and see how the DC was generated

Comment 11 ewolinet 2017-03-03 17:06:41 UTC
So this looks to be because the currently deployed ES DC uses a host mount, while the generated DC template for ES uses an emptyDir.

So when the role tries to apply the template, it clobbers the existing definition, and it seems that the DC is then left without an 'elasticsearch-storage' volume definition.

Current DC snippet:
        volumeMounts:
        - mountPath: /etc/elasticsearch/secret
          name: elasticsearch
          readOnly: true
        - mountPath: /usr/share/elasticsearch/config
          name: elasticsearch-config
          readOnly: true
        - mountPath: /elasticsearch/persistent
          name: elasticsearch-storage
      volumes:
      - name: elasticsearch
        secret:
          defaultMode: 420
          secretName: logging-elasticsearch
      - configMap:
          defaultMode: 420
          name: logging-elasticsearch
        name: elasticsearch-config
      - hostPath:
          path: /usr/local/es-storage
        name: elasticsearch-storage

Template snippet:
          volumeMounts:
            - name: elasticsearch
              mountPath: /etc/elasticsearch/secret
              readOnly: true
            - name: elasticsearch-config
              mountPath: /usr/share/java/elasticsearch/config
              readOnly: true
            - name: elasticsearch-storage
              mountPath: /elasticsearch/persistent
      volumes:
        - name: elasticsearch
          secret:
            secretName: logging-elasticsearch
        - name: elasticsearch-config
          configMap:
            name: logging-elasticsearch
        - name: elasticsearch-storage
          emptyDir: {}
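
One plausible reading of the two snippets, sketched below, is that the apply merged the volume lists by name, so the hostPath key from the deployed DC survived next to the emptyDir key from the template. This is an assumption about how the merge behaved, not something confirmed in this thread, and the helper names are hypothetical. Only three volume types are checked, to keep the example short:

```python
import copy

# Subset of volume types relevant to this report (assumed, not exhaustive).
VOLUME_TYPES = ("hostPath", "emptyDir", "persistentVolumeClaim")


def merge_volumes(live, desired):
    """Merge two volume lists keyed by 'name'; keys only in live survive."""
    merged = {vol["name"]: copy.deepcopy(vol) for vol in live}
    for vol in desired:
        merged.setdefault(vol["name"], {}).update(copy.deepcopy(vol))
    return list(merged.values())


def volume_types(volume):
    """Return the volume type keys present on a single volume entry."""
    return [kind for kind in VOLUME_TYPES if kind in volume]


live = [{"name": "elasticsearch-storage",
         "hostPath": {"path": "/usr/local/es-storage"}}]
desired = [{"name": "elasticsearch-storage", "emptyDir": {}}]

for vol in merge_volumes(live, desired):
    kinds = volume_types(vol)
    if len(kinds) > 1:
        print("%s: more than 1 volume type: %s" % (vol["name"], ", ".join(kinds)))
```

If the merge did work this way, the merged entry would carry both hostPath and emptyDir, which matches the first validation error in comment 10.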

Comment 12 Jeff Cantrill 2017-03-08 14:22:28 UTC
Prefer the host mount over other storage if it exists in the facts: https://github.com/openshift/openshift-ansible/pull/3596
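
The idea can be sketched roughly as below. This is not the code from the pull request; the function name and the shape of the inputs are hypothetical, and it assumes the gathered facts expose the volumes of the currently deployed DC as a list of dicts:

```python
def choose_es_storage(existing_volumes, default_source):
    """Return the volume source to render for 'elasticsearch-storage'.

    Reuse an existing hostPath definition when one is found; otherwise
    fall back to the source the template would have used.
    """
    for vol in existing_volumes:
        if vol.get("name") == "elasticsearch-storage" and "hostPath" in vol:
            return {"hostPath": dict(vol["hostPath"])}
    return default_source


# Volumes as they might appear in facts gathered from the deployed DC.
deployed = [
    {"name": "elasticsearch", "secret": {"secretName": "logging-elasticsearch"}},
    {"name": "elasticsearch-storage", "hostPath": {"path": "/usr/local/es-storage"}},
]

print(choose_es_storage(deployed, {"emptyDir": {}}))
print(choose_es_storage([], {"emptyDir": {}}))
```

Rendering the template with the reused hostPath means the applied DC and the deployed DC agree on a single volume type, so the apply no longer produces a volume with two types.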

Comment 13 Jeff Cantrill 2017-03-09 17:14:56 UTC
1.5 fix in https://github.com/openshift/openshift-ansible/pull/3608

Comment 14 openshift-github-bot 2017-03-10 21:55:12 UTC
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/5e952859247d28abe6d5efb794ff6a1f8639000d
bug 1428249. Use ES hostmount storage if it exists

Comment 16 Xia Zhao 2017-03-14 05:30:57 UTC
Blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1431935

Comment 17 Xia Zhao 2017-03-16 08:29:00 UTC
Verified with openshift-ansible-3.5.35-1.git.0.7aa4728.el7.noarch. After the upgrade, I hit the exception described in https://bugzilla.redhat.com/show_bug.cgi?id=1428711, which is not considered a real support case. Setting to verified since the original issue did not reproduce.

Comment 19 errata-xmlrpc 2017-04-12 19:03:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0903

