Created attachment 1245241
logging-deployer pod logs and oc describe output

Description of problem:
Logging-deployer fails to execute during OCP 3.5 cluster advanced installation on AWS EC2. This happens after running the Ansible playbook "openshift-ansible/playbooks/byo/config.yml" with openshift_ansible_vars "openshift_hosted_logging_deploy: true".

Error from the logging-deployer pod log in the logging project:

+ oc new-app logging-pvc-dynamic-template -p NAME=logging-es1,SIZE=10Gi
warning: --param no longer accepts comma-separated lists of values. "NAME=logging-es1,SIZE=10Gi" will be treated as a single key-value pair.
error: error processing template "logging/logging-pvc-dynamic-template": Template "logging-pvc-dynamic-template" is invalid: template.parameters[1]: Required value: template.parameters[1]: parameter SIZE is required and must be specified

openshift-ansible variables in yml file:
openshift_hosted_logging_deploy: true
openshift_hosted_logging_deployer_prefix: registry.ops.openshift.com/openshift3/
openshift_hosted_logging_elasticsearch_pvc_size: 25G
openshift_hosted_logging_storage_kind: dynamic
openshift_hosted_logging_deployer_version: 3.5.0
openshift_hosted_logging_elasticsearch_cluster_size: 1

Version-Release number of selected component (if applicable):
openshift v3.5.0.9+e84be2b
kubernetes v1.5.2+43a9be4

How reproducible:
Reproducible

Steps to Reproduce:
1. Install OCP 3.5 Cluster by running ansible playbook "openshift-ansible/playbooks/byo/config.yml"
2. oc project logging
3. oc get pods

Actual results:
# oc get pods
NAME                     READY     STATUS    RESTARTS   AGE
logging-deployer-3g4s6   0/1       Error     0          21h
logging-deployer-xm699   0/1       Error     0          22h

Expected results:
Logging-deployer should run successfully and deploy logging pods

Additional info:
logging-deployer pod logs and oc describe pod outputs are attached.
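The warning in the deployer log points at the mechanism of the failure: because `--param` no longer splits on commas, the whole string ends up as the value of NAME and SIZE is never set. A minimal Python sketch of that behaviour follows. `parse_params` and `missing_required` are hypothetical helpers written for illustration and are not the actual `oc` implementation; the required-parameter list is taken only from the error message.

```python
def parse_params(args):
    """Parse KEY=VALUE strings, one pair per argument (no comma splitting)."""
    params = {}
    for arg in args:
        # Split on the first '=' only; everything after it is the value.
        key, _, value = arg.partition("=")
        params[key] = value
    return params


def missing_required(params, required):
    """Return the required parameter names that were not supplied."""
    return [name for name in required if name not in params]


# Assumption: SIZE is required, as stated by the error message in the log.
REQUIRED = ["SIZE"]

# One comma-joined argument, as the deployer passed it.
joined = parse_params(["NAME=logging-es1,SIZE=10Gi"])
# Two separate arguments, one key-value pair each.
separate = parse_params(["NAME=logging-es1", "SIZE=10Gi"])

print(joined)
print(missing_required(joined, REQUIRED))
print(missing_required(separate, REQUIRED))
```

With one pair per argument, the equivalent invocation would presumably be `oc new-app logging-pvc-dynamic-template -p NAME=logging-es1 -p SIZE=10Gi`.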
As of 3.5, you should be using the 'openshift_logging' role and not openshift_hosted_logging. Ref pending documentation: https://github.com/openshift/openshift-docs/pull/3559
@Walid, can you provide additional info on what is happening here? IMO this issue should be closed, since the logging deployer is not the desired way to deploy the 3.5 logging stack.
I installed a new OCP v3.5.0.14 cluster using the advanced installation method with openshift/openshift-ansible/blob/master/playbooks/byo/config.yml. I set openshift_ansible_var "openshift_logging_install_logging: true". Logging pods did not deploy. It looks like the "openshift-ansible/blob/master/playbooks/common/openshift-cluster/config.yml" playbook includes "openshift_hosted.yml", which still uses the "openshift_hosted_logging" role.
Fix https://github.com/openshift/openshift-ansible/pull/3257 updates the referenced playbook to use the correct role for both logging and metrics.
*** Bug 1419398 has been marked as a duplicate of this bug. ***
After PR https://github.com/openshift/openshift-ansible/pull/3257 merged, it is still failing to deploy logging with BYO install playbook "openshift-ansible/blob/master/playbooks/common/openshift-cluster/config.yml" on OCP version: v3.5.0.17+c55cf2b

Error at:

TASK [openshift_logging : Generating secrets for logging components] ***********
Wednesday 08 February 2017 20:53:03 +0000 (0:00:02.993) 0:12:07.247 ****
failed: [<master-Public-DNS-removed>] (item=kibana) => {"changed": false, "component": "kibana", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}
failed: [<master-Public-DNS-removed>] (item=curator) => {"changed": false, "component": "curator", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}
failed: [<master-Public-DNS-removed>] (item=fluentd) => {"changed": false, "component": "fluentd", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}

Ansible log attached (BYO_install-Ansible_log_20170208)
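The final clause of the message, "must be convertible to a buffer, not StrictUndefined", reads like a base64 encoder being handed an undefined value instead of a string. A minimal sketch of that failure mode in plain Python follows. The `Undefined` class is a hypothetical stand-in for Jinja2's StrictUndefined and `render_secret_data` is an invented helper; neither reproduces the role's actual code, and the root cause in this bug is not confirmed by the report.

```python
import base64


class Undefined:
    """Hypothetical stand-in for a template variable that was never set."""


def render_secret_data(secrets):
    """Build the `data:` lines of a Secret, base64-encoding each value."""
    lines = []
    for s in secrets:
        value = s.get("value", Undefined())
        if isinstance(value, Undefined):
            # Fail early and name the key, instead of a bare encoder error.
            raise ValueError("secret %r has no value" % s["key"])
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        lines.append("  %s: %s" % (s["key"], encoded))
    return "\n".join(lines)


print(render_secret_data([{"key": "ca", "value": "example"}]))

try:
    # Roughly what the template attempts when a secret value is undefined.
    base64.b64encode(Undefined())
except TypeError as exc:
    print("encoder rejected undefined value:", exc)
```

Checking for the undefined value before encoding makes the error name the missing secret, which is usually easier to trace than the encoder's type error.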
Comment 6 should be the same issue as mentioned in BZ#1419838, after the logging role replacement in openshift-ansible-3.5.4-1.git.0.034b615.el7.noarch.rpm.
Still fails per comment 6 (https://bugzilla.redhat.com/show_bug.cgi?id=1417261#c6)
I'm interested in getting more env information as I am unable to reproduce. Locally, my env seems to pass through those tasks without issue. This is what I have:

openshift-ansible commit: a3cbd7333922a1c5941193a31878cda8d85807c4

# rpm -qva | grep ansible
ansible-2.2.0.0-1.el7.noarch

[root@vagrant openshift-ansible]# uname -a
Linux vagrant.172.28.128.4.xip.io 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

[root@vagrant openshift-ansible]# python --version
Python 2.7.5

Can you please provide some additional information regarding your host.
Are you not able to reproduce when running from openshift-ansible/playbooks/byo/config.yml?

The openshift-ansible/playbooks/byo/config.yml playbook was run from a freshly cloned openshift-ansible repo on the jenkins slave, back on Feb 7, with host env info:

$ rpm -qva | grep ansible
ansible-2.2.1.0-2.el7.noarch

$ uname -a
Linux preserve-jenkins-slave-install35.novalocal 4.4.6-301.fc23.x86_64 #1 SMP Wed Mar 30 16:43:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ python --version
Python 2.7.11

Env info from my OCP master host where the EFK stack should deploy:

# rpm -qva | grep ansible
openshift-ansible-callback-plugins-3.5.3-1.git.0.80c2436.el7.noarch
openshift-ansible-playbooks-3.5.3-1.git.0.80c2436.el7.noarch
openshift-ansible-3.5.3-1.git.0.80c2436.el7.noarch
openshift-ansible-roles-3.5.3-1.git.0.80c2436.el7.noarch
openshift-ansible-docs-3.5.3-1.git.0.80c2436.el7.noarch
openshift-ansible-filter-plugins-3.5.3-1.git.0.80c2436.el7.noarch
ansible-2.2.1.0-1.el7.noarch
openshift-ansible-lookup-plugins-3.5.3-1.git.0.80c2436.el7.noarch

# uname -a
Linux 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
#
# python --version
Python 2.7.5

Last commit in Master openshift-ansible repo:

~/openshift-ansible # git log
commit 76d0a4538baa3b59085d6dd57b92ffd145c76f93
Merge: acdedc8 1ea1d19
Author: Scott Dodson <sdodson>
Date: Mon Feb 6 13:33:59 2017 -0500

    Merge pull request #3270 from tbielawa/fix_rhel_sub_path

    Fix RHEL Subscribe std_include path
----------

I am now running openshift-ansible/playbooks/byo/config.yml using a gold image AMI with the updated ansible version ansible-2.2.1.0-2, and with the latest openshift-ansible repo on both the jenkins slave and the AMI for hosts. Will update with results.
Getting new errors in openshift_logging task when re-running openshift-ansible/playbooks/byo/config.yml playbook. This is from jenkins slave running ansible version ansible-2.2.1.0-2 and latest openshift-ansible repo (latest commit 3921f01be97ccfbb54e11666ce3647774c3fdbb9). OCP Gold image AMI used for all nodes is also on ansible version ansible-2.2.1.0-2 and latest openshift-ansible repo:

TASK [openshift_logging : template] ********************************************
Saturday 11 February 2017 13:21:15 +0000 (0:00:00.391) 0:15:24.749 *****
changed: [<Master node public DNS>] => {"changed": true, "checksum": "a5a1bda430be44f982fa9097778b7d35d2e42780", "dest": "/etc/origin/logging/signing.conf", "gid": 0, "group": "root", "md5sum": "449087446670073f2899aac33113350c", "mode": "0644", "owner": "root", "secontext": "system_u:object_r:etc_t:s0", "size": 4263, "src": "/root/.ansible/tmp/ansible-tmp-1486819275.99-196090292088145/source", "state": "file", "uid": 0}

TASK [openshift_logging : include] *********************************************
Saturday 11 February 2017 13:21:17 +0000 (0:00:01.207) 0:15:25.957 *****
fatal: [<Master node public DNS>]: FAILED! => {"failed": true, "msg": "{{ openshift_hosted_logging_hostname | default(kibana.{{openshift.common.dns_domain}}) }}: template error while templating string: expected name or number. String: {{ openshift_hosted_logging_hostname | default(kibana.{{openshift.common.dns_domain}}) }}"}

Please see next comment for link to host env info and logs from both Jenkins slave running openshift-ansible BYO playbook and Master node openshift_logging role tasks.
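The templating error in the log comes from opening a second `{{ ... }}` inside an expression that is already open, which Jinja2 does not allow. Inside an expression, such a default is normally built by concatenation, for example `default('kibana.' ~ openshift.common.dns_domain)`; whether the eventual fix takes exactly that form is not stated in this report. The sketch below is a hypothetical stdlib-only check (`has_nested_expression` is an invented name, not part of Ansible or Jinja2) that flags this kind of nesting in a template string.

```python
def has_nested_expression(template):
    """Return True if a '{{' appears before the enclosing '{{' is closed."""
    depth = 0
    i = 0
    while i < len(template) - 1:
        pair = template[i:i + 2]
        if pair == "{{":
            if depth > 0:
                # A second opener inside an open expression: nested delimiters.
                return True
            depth += 1
            i += 2
        elif pair == "}}":
            depth = max(depth - 1, 0)
            i += 2
        else:
            i += 1
    return False


# The expression from the failing task.
broken = ("{{ openshift_hosted_logging_hostname | "
          "default(kibana.{{openshift.common.dns_domain}}) }}")
# An assumed rewrite using string concatenation inside one expression.
rewritten = ("{{ openshift_hosted_logging_hostname | "
             "default('kibana.' ~ openshift.common.dns_domain) }}")

print(has_nested_expression(broken))
print(has_nested_expression(rewritten))
```

The check is deliberately naive: it does not understand string literals, so a quoted `'{{'` inside an expression would also be flagged.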
Should be fixed as part of https://github.com/openshift/openshift-ansible/pull/3373 which is merged.
BYO config.yml playbook run failed again at task openshift_logging:

TASK [openshift_logging : Generating secrets for logging components] ***********
Thursday 16 February 2017 22:59:20 +0000 (0:00:03.139) 0:13:22.055 *****
failed: [<Master node public DNS>] (item=kibana) => {"changed": false, "component": "kibana", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}
failed: [<Master node public DNS>] (item=curator) => {"changed": false, "component": "curator", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}
failed: [<Master node public DNS>] (item=fluentd) => {"changed": false, "component": "fluentd", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}

# oc version
oc v3.5.0.30-1+35fa4bf
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-1-39.us-west-2.compute.internal:8443
openshift v3.5.0.30-1+35fa4bf
kubernetes v1.5.2+43a9be4

# rpm -qva | grep ansible
openshift-ansible-docs-3.5.8-1.git.0.0e02ef8.el7.noarch
openshift-ansible-filter-plugins-3.5.8-1.git.0.0e02ef8.el7.noarch
openshift-ansible-callback-plugins-3.5.8-1.git.0.0e02ef8.el7.noarch
openshift-ansible-lookup-plugins-3.5.8-1.git.0.0e02ef8.el7.noarch
openshift-ansible-playbooks-3.5.8-1.git.0.0e02ef8.el7.noarch
ansible-2.2.1.0-2.el7.noarch
openshift-ansible-3.5.8-1.git.0.0e02ef8.el7.noarch
openshift-ansible-roles-3.5.8-1.git.0.0e02ef8.el7.noarch

# uname -a
Linux ip-172-31-1-39.us-west-2.compute.internal 3.10.0-514.10.1.el7.x86_64 #1 SMP Mon Jan 30 11:07:00 EST 2017 x86_64 x86_64 x86_64 GNU/Linux

# python --version
Python 2.7.5

I also re-ran the BYO config.yml on the Master host and got the same errors.

Please see next comment for link to host env info, inventory file used, and logs from the BYO config.yml playbook run.
Re-ran the BYO config.yml installer on a newer gold image after pulling the latest openshift-ansible rpms and the latest git repo updates. I am still getting the same errors as above.

TASK [openshift_logging : Generating secrets for logging components] ***********
Saturday 18 February 2017 16:42:48 +0000 (0:00:03.079) 0:12:47.605 *****
failed: [<Master node public DNS>] (item=kibana) => {"changed": false, "component": "kibana", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}
failed: [<Master node public DNS>] (item=curator) => {"changed": false, "component": "curator", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}
failed: [<Master node public DNS>] (item=fluentd) => {"changed": false, "component": "fluentd", "failed": true, "msg": "AnsibleUndefinedVariable: Unable to look up a name or access an attribute in template string (apiVersion: v1\nkind: Secret\nmetadata:\n name: {{secret_name}}\ntype: Opaque\ndata:\n{% for s in secrets %}\n {{s.key}}: {{s.value | b64encode}}\n{% endfor %}\n).\nMake sure your variable name does not contain invalid characters like '-': must be convertible to a buffer, not StrictUndefined"}

Host info, master node:

~/ # rpm -qva | grep ansible
openshift-ansible-callback-plugins-3.5.10-1.git.0.ba66b63.el7.noarch
openshift-ansible-docs-3.5.10-1.git.0.ba66b63.el7.noarch
openshift-ansible-filter-plugins-3.5.10-1.git.0.ba66b63.el7.noarch
openshift-ansible-lookup-plugins-3.5.10-1.git.0.ba66b63.el7.noarch
openshift-ansible-playbooks-3.5.10-1.git.0.ba66b63.el7.noarch
ansible-2.2.1.0-2.el7.noarch
openshift-ansible-3.5.10-1.git.0.ba66b63.el7.noarch
openshift-ansible-roles-3.5.10-1.git.0.ba66b63.el7.noarch

~/openshift-ansible # git rev-parse HEAD
701bd042f01df17d3363c8c31755eafabcf21fa7

~/openshift-ansible # oc version
oc v3.5.0.31-1+d55d08f
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO
openshift v3.5.0.31-1+d55d08f
kubernetes v1.5.2+43a9be4
I am unable to reproduce. The best guess I have is that the jenkins slave is not running the same version of playbooks and tasks. When I stand up a local centos vm, use the vars as listed in #1, and run:

ansible-playbook -i hosts.ose playbooks/byo/config.yml

using a clone of openshift-ansible with commit 3ba41f17f7ea7d05a291a59f4d22b31ba159579b, I end up with a functioning oc cluster and logging stack.
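One way to test the guess that two hosts run different versions is to diff their `rpm -qva | grep ansible` output. A minimal sketch follows, assuming the name-version-release.arch naming seen in the listings in this thread; `parse_rpm_list` and `diff_packages` are hypothetical helpers, and the sample values are the two `ansible` package versions quoted earlier.

```python
def parse_rpm_list(text):
    """Map package name -> 'version-release.arch' from rpm -qa style output."""
    packages = {}
    for line in text.split():
        # The last two '-' separated fields are version and release.arch.
        name, version, release = line.rsplit("-", 2)
        packages[name] = version + "-" + release
    return packages


def diff_packages(left, right):
    """Return {name: (left_version, right_version)} where the hosts differ."""
    names = sorted(set(left) | set(right))
    return {
        n: (left.get(n), right.get(n))
        for n in names
        if left.get(n) != right.get(n)
    }


# Versions quoted in this thread: the developer's VM and the Jenkins slave.
developer_vm = parse_rpm_list("ansible-2.2.0.0-1.el7.noarch")
jenkins_slave = parse_rpm_list("ansible-2.2.1.0-2.el7.noarch")

print(diff_packages(developer_vm, jenkins_slave))
```

A package missing from one host shows up with `None` on that side, so the same diff also reveals rpms installed on only one machine.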
Also made change https://github.com/openshift/openshift-ansible/pull/3430
This looks like it's a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1419838
Still hitting the same errors as in comment 17 when running from the Jenkins slave with openshift-ansible cloned from jcantrill:bz_1417261_quote_secret.

# git rev-parse HEAD
dee01dd82e4acc74512b4b40fb56aa5ea024921f
# git status
# On branch bz_1417261_quote_secret
nothing to commit, working directory clean
#
# git log
commit dee01dd82e4acc74512b4b40fb56aa5ea024921f
Author: Jeff Cantrill <jcantril>
Date: Mon Feb 20 15:01:38 2017 -0500

    bug 1417261. Quote name and secrets in logging templates
---------

I also re-ran the same BYO config.yml inventory that was run on the Jenkins slave on the AWS master node, after cloning the openshift-ansible jcantrill:bz_1417261_quote_secret branch, which was used for PR https://github.com/openshift/openshift-ansible/pull/3430. I got the same error. Both the Jenkins slave and the master node are running ansible version 2.2.1.0-2.el7.
Commit pushed to master at https://github.com/openshift/openshift-ansible https://github.com/openshift/openshift-ansible/commit/cecc1f83edf078842e7ca104f9dc495bf65b1e9b Merge pull request #3430 from jcantrill/bz_1417261_quote_secret bug 1417261. Quote name and secrets in logging templates
Verified the EFK logging stack can now be deployed during BYO config.yml advanced cluster install on AWS EC2.

Openshift-ansible_vars used for logging:
----------------------------------------
openshift_hosted_logging_deploy=true
openshift_hosted_logging_deployer_prefix=registry.ops.openshift.com/openshift3/
openshift_hosted_logging_deployer_version=3.5.0

~/openshift-ansible # oc get pods -n logging
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-nmj6k       1/1       Running   0          9h
logging-es-lu0d4wpc-1-gxxx4   1/1       Running   0          9h
logging-fluentd-3dpt7         1/1       Running   0          9h
logging-fluentd-f9q3z         1/1       Running   0          9h
logging-fluentd-pww4k         1/1       Running   0          9h
logging-fluentd-rxsc0         1/1       Running   0          9h
logging-kibana-1-r2cnl        2/2       Running   0          9h

# oc version
oc v3.5.0.34
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO
openshift v3.5.0.34
kubernetes v1.5.2+43a9be4

~/openshift-ansible # git rev-parse HEAD
77c7b1c264f6a023a61a0656899bc2121921a03c

~/openshift-ansible # git log
commit 77c7b1c264f6a023a61a0656899bc2121921a03c
Merge: a46b63f 659826a
Author: Scott Dodson <sdodson>
Date: Fri Feb 24 16:18:29 2017 -0500

    Merge pull request #3448 from smarterclayton/version

    Prepare for origin moving to OCP version scheme
---------
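The verification in this comment amounts to reading the `oc get pods` table and confirming that every pod is Running with all of its containers ready. A minimal sketch of that check follows, assuming the five-column layout shown in this thread; `unready_pods` is a hypothetical helper and the sample tables are abbreviated copies of the output quoted in this report.

```python
def unready_pods(table):
    """Return names of pods that are not Running or not fully ready."""
    bad = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        name, ready, status = line.split()[:3]
        have, want = ready.split("/")
        if status != "Running" or have != want:
            bad.append(name)
    return bad


# Abbreviated output from the verification comment.
VERIFIED = """\
NAME                          READY     STATUS    RESTARTS   AGE
logging-curator-1-nmj6k       1/1       Running   0          9h
logging-kibana-1-r2cnl        2/2       Running   0          9h
"""

# Output from the original report.
REPORTED = """\
NAME                     READY     STATUS    RESTARTS   AGE
logging-deployer-3g4s6   0/1       Error     0          21h
logging-deployer-xm699   0/1       Error     0          22h
"""

print(unready_pods(VERIFIED))
print(unready_pods(REPORTED))
```

An empty list means every listed pod passed; any names returned are the pods to inspect with `oc describe pod` or `oc logs`.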
Since this never reached customers, I am closing this request.