Created attachment 1242658 [details]
ansible full logs

Description of problem:
Deploying logging with ansible fails at TASK [openshift_logging : create JKS generation pod].

Enabled pod scheduling on the OCP master to work around bug #1415056:

# oc get node
NAME      STATUS                     AGE
$master   Ready                      3h
$node     Ready,SchedulingDisabled   3h

Made sure the jks-cert-gen pod was scheduled on the master node:

# oc get po -o wide -n xiazhao
NAME                 READY     STATUS    RESTARTS   AGE       IP           NODE
jks-cert-gen-b1c8g   0/1       Error     0          17m       10.128.0.2   $master

# oc logs -f jks-cert-gen-b1c8g
+ dir=/etc/origin/logging
+ SCRATCH_DIR=/etc/origin/logging
+ [[ ! -f /etc/origin/logging/system.admin.jks ]]
+ generate_JKS_client_cert system.admin
+ NODE_NAME=system.admin
+ ks_pass=kspass
+ ts_pass=tspass
+ dir=/etc/origin/logging
+ echo Generating keystore and certificate for node system.admin
+ keytool -genkey -alias system.admin -keystore /etc/origin/logging/system.admin.jks -keyalg RSA -keysize 2048 -validity 712 -keypass kspass -storepass kspass -dname 'CN=system.admin, OU=OpenShift, O=Logging'
Generating keystore and certificate for node system.admin
keytool error: java.io.FileNotFoundException: /etc/origin/logging/system.admin.jks (Permission denied)

Checked on the master that the file system.admin.jks did not exist in the directory /etc/origin/logging/:

# ls -al /etc/origin/logging/
total 132
drwxr-xr-x. 2 root root 4096 Jan 19 22:27 .
drwx------. 7 root root 4096 Jan 19 22:23 ..
-rw-r--r--. 1 root root 1196 Jan 19 22:25 02.pem
-rw-r--r--. 1 root root 1196 Jan 19 22:26 03.pem
-rw-r--r--. 1 root root 1196 Jan 19 22:26 04.pem
-rw-r--r--. 1 root root 1184 Jan 19 22:26 05.pem
-rw-r--r--. 1 root root 1050 Jan 19 22:24 ca.crt
-rw-r--r--. 1 root root    0 Jan 19 22:25 ca.crt.srl
-rw-r--r--. 1 root root  301 Jan 19 22:26 ca.db
-rw-r--r--. 1 root root   20 Jan 19 22:26 ca.db.attr
-rw-r--r--. 1 root root   20 Jan 19 22:26 ca.db.attr.old
-rw-r--r--. 1 root root  233 Jan 19 22:26 ca.db.old
-rw-------. 1 root root 1675 Jan 19 22:24 ca.key
-rw-r--r--. 1 root root    3 Jan 19 22:26 ca.serial.txt
-rw-r--r--. 1 root root    3 Jan 19 22:26 ca.serial.txt.old
-rw-r--r--. 1 root root 4679 Jan 19 22:27 generate-jks.sh
-rw-r--r--. 1 root root 2242 Jan 19 22:24 kibana-internal.crt
-rw-------. 1 root root 1679 Jan 19 22:24 kibana-internal.key
-rw-r--r--. 1 root root  321 Jan 19 22:24 server-tls.json
-rw-r--r--. 1 root root 4263 Jan 19 22:24 signing.conf
-rw-r--r--. 1 root root 1184 Jan 19 22:26 system.admin.crt
-rw-r--r--. 1 root root  948 Jan 19 22:26 system.admin.csr
-rw-r--r--. 1 root root 1708 Jan 19 22:26 system.admin.key
-rw-r--r--. 1 root root 1196 Jan 19 22:26 system.logging.curator.crt
-rw-r--r--. 1 root root  960 Jan 19 22:26 system.logging.curator.csr
-rw-r--r--. 1 root root 1704 Jan 19 22:26 system.logging.curator.key
-rw-r--r--. 1 root root 1196 Jan 19 22:25 system.logging.fluentd.crt
-rw-r--r--. 1 root root  960 Jan 19 22:25 system.logging.fluentd.csr
-rw-r--r--. 1 root root 1704 Jan 19 22:25 system.logging.fluentd.key
-rw-r--r--. 1 root root 1196 Jan 19 22:26 system.logging.kibana.crt
-rw-r--r--. 1 root root  960 Jan 19 22:26 system.logging.kibana.csr
-rw-r--r--. 1 root root 1704 Jan 19 22:26 system.logging.kibana.key

Part of the ansible execution log (full log in attachment):

TASK [openshift_logging : create JKS generation pod] ***************************
task path: /home/xiazhao/openshift-ansible/roles/openshift_logging/tasks/generate_certs.yaml:149
Using module file /usr/lib/python2.7/site-packages/ansible/modules/core/commands/command.py
...
<$master-public-dns> ESTABLISH SSH CONNECTION FOR USER: root
<$master-public-dns> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/home/xiazhao/cfile/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/home/xiazhao/.ansible/cp/ansible-ssh-%h-%p-%r -tt $master-public-dns '/bin/sh -c '"'"'/usr/bin/python /root/.ansible/tmp/ansible-tmp-1484890889.35-91809174259402/command.py; rm -rf "/root/.ansible/tmp/ansible-tmp-1484890889.35-91809174259402/" > /dev/null 2>&1 && sleep 0'"'"''
fatal: [$master-public-dns]: FAILED! => {
    "attempts": 5,
    "changed": false,
    "cmd": [
        "oc",
        "--config=/tmp/openshift-logging-ansible-dbkNx8/admin.kubeconfig",
        "get",
        "pod/jks-cert-gen-b1c8g",
        "-o",
        "jsonpath={.status.phase}",
        "-n",
        "xiazhao"
    ],
    "delta": "0:00:01.228427",
    "end": "2017-01-20 00:41:34.030142",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "oc --config=/tmp/openshift-logging-ansible-dbkNx8/admin.kubeconfig get pod/jks-cert-gen-b1c8g -o jsonpath='{.status.phase}' -n xiazhao",
            "_uses_shell": false,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "warn": true
        },
        "module_name": "command"
    },
    "rc": 0,
    "start": "2017-01-20 00:41:32.801715",
    "stderr": "",
    "stdout": "Pending",
    "stdout_lines": [
        "Pending"
    ],
    "warnings": []
}
        to retry, use: --limit @/home/xiazhao/openshift-ansible/playbooks/common/openshift-cluster/openshift_logging.retry

PLAY RECAP *********************************************************************
$master-public-dns : ok=59 changed=1 unreachable=0 failed=1

Version-Release number of selected component (if applicable):
# openshift version
openshift v3.5.0.6+87f6173
kubernetes v1.5.2+43a9be4
etcd 3.1.0-rc.0

How reproducible:
Always

Steps to Reproduce:
1. Prepare the inventory file:

[oo_first_master]
$master-public-dns ansible_user=root ansible_ssh_user=root ansible_ssh_private_key_file="~/cfile/libra.pem" openshift_public_hostname=$master-public-dns

[oo_first_master:vars]
deployment_type=openshift-enterprise
openshift_release=v3.5.0
openshift_logging_install_logging=true
openshift_logging_kibana_hostname=kibana.$sub-domain
public_master_url=https://$master-public-dns:8443
openshift_logging_image_prefix=registry.ops.openshift.com/openshift3/
openshift_logging_image_version=3.5.0
openshift_logging_namespace=xiazhao

2. Run the playbook from a control machine (my laptop) which is not oo_master:

git clone https://github.com/openshift/openshift-ansible
ansible-playbook -vvv -i ~/inventory playbooks/common/openshift-cluster/openshift_logging.yml

Actual results:
Failed at TASK [openshift_logging : create JKS generation pod]

Expected results:
Should complete successfully

Additional info:
Full ansible log attached
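For anyone reproducing this by hand: the failing task in the log polls the pod phase with `oc get pod/... -o jsonpath={.status.phase}` and gives up after 5 attempts while the phase is still "Pending". Below is a minimal Python sketch of that kind of polling loop. It is not the role's actual implementation: the function names, the `delay` parameter, the "Succeeded" target phase, and the early exit on "Failed" are assumptions made for illustration. Only the `oc` arguments are taken from the log.

```python
import subprocess
import time


def oc_pod_phase(pod, namespace):
    """Query the pod phase with the same oc arguments that appear in the log."""
    result = subprocess.run(
        ["oc", "get", "pod/{}".format(pod),
         "-o", "jsonpath={.status.phase}", "-n", namespace],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_phase(pod, namespace, wanted="Succeeded", attempts=5, delay=1.0,
                   get_phase=oc_pod_phase):
    """Poll until the pod reports `wanted`; return (ok, last_phase).

    Assumption: "Failed" is treated as terminal, so polling stops early.
    `get_phase` is injectable so the loop can be exercised without a cluster.
    """
    phase = None
    for attempt in range(attempts):
        phase = get_phase(pod, namespace)
        if phase == wanted:
            return True, phase
        if phase == "Failed":
            return False, phase
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next attempt
    return False, phase
```

With a real cluster, `wait_for_phase("jks-cert-gen-b1c8g", "xiazhao")` would run the query up to 5 times. A pod that never leaves "Pending" yields `(False, "Pending")`, which matches the shape of the failure reported above.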
WIP PR to fix: https://github.com/openshift/openshift-ansible/pull/3135
I believe this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1415056 (same root cause).
*** Bug 1415056 has been marked as a duplicate of this bug. ***
The above PR has been merged.
Tested following xiazhao's steps from my local desktop. The playbook fails with the error "Aborting, target uses selinux but python bindings (libselinux-python) aren't installed!" and the jks-cert pod cannot be generated. The libselinux-python package is already installed on both the desktop and the master.

<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1485237195.37-77603840609488/ > /dev/null 2>&1 && sleep 0'
fatal: [ec2-52-204-85-177.compute-1.amazonaws.com -> localhost]: FAILED! => {
    "changed": true,
    "failed": true,
    "invocation": {
        "module_args": {
            "backup": false,
            "content": null,
            "delimiter": null,
            "dest": "/tmp/openshift-logging-ansible-QTRjW5/signing.conf",
            "directory_mode": null,
            "follow": true,
            "force": true,
            "group": null,
            "mode": null,
            "original_basename": "signing.conf.j2",
            "owner": null,
            "regexp": null,
            "remote_src": null,
            "selevel": null,
            "serole": null,
            "setype": null,
            "seuser": null,
            "src": "/root/.ansible/tmp/ansible-tmp-1485237195.37-77603840609488/source",
            "unsafe_writes": null,
            "validate": null
        }
    },
    "msg": "Aborting, target uses selinux but python bindings (libselinux-python) aren't installed!"
}
        to retry, use: --limit @/home/fedora/openshift-ansible/playbooks/common/openshift-cluster/openshift_logging.retry

Ansible log attached.
Created attachment 1243830 [details] ansible log -20170124
Isn't the proper solution to install libselinux-python per the instructions for using ansible? http://docs.ansible.com/ansible/intro_installation.html#managed-node-requirements
Checked libselinux-python on the master:

$ rpm -qa | grep libselinux-python
libselinux-python-2.5-6.el7.x86_64

Checked libselinux-python on my desktop:

$ rpm -qa | grep libselinux-python
libselinux-python3-2.5-3.fc24.x86_64

libselinux-python and libselinux-python3 are different packages, so I installed libselinux-python on my desktop and ran the ansible playbook again. The error "Aborting, target uses selinux but python bindings (libselinux-python) aren't installed!" no longer appears.
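Since the failing task was delegated to localhost (`-> localhost` in the log), what matters is whether the interpreter Ansible uses on the control machine can import the bindings, not only which RPMs are listed. A minimal sketch for checking that directly, assuming the bindings are exposed as a module named `selinux` and that the script is run with the same interpreter Ansible uses (for example `/usr/bin/python`). The helper name `has_module` is made up for illustration.

```python
import sys


def has_module(name):
    """Return True if `name` can be imported by the interpreter running this script."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Assumption: the libselinux Python bindings provide a module called "selinux".
    status = "available" if has_module("selinux") else "missing"
    print("{}: selinux bindings {}".format(sys.executable, status))
```

Running it once per interpreter on the control machine should show which one is missing the bindings. The sketch avoids Python 3 only syntax so that it can also run under Python 2.7.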
Ran ansible again after installing libselinux-python on my desktop, and system.admin.jks is now present. This bug is fixed, although there are other bugs that need to be filed. Setting it to VERIFIED and closing it.

# ls -al /etc/origin/logging/
total 140
drwxr-xr-x. 2 root root 4096 Jan 25 00:12 .
drwx------. 7 root root 4096 Jan 24 21:57 ..
-rw-r--r--. 1 root root 1196 Jan 24 21:59 02.pem
-rw-r--r--. 1 root root 1196 Jan 24 21:59 03.pem
-rw-r--r--. 1 root root 1196 Jan 24 22:00 04.pem
-rw-r--r--. 1 root root 1184 Jan 24 22:00 05.pem
-rw-r--r--. 1 root root 1050 Jan 24 21:57 ca.crt
-rw-r--r--. 1 root root    0 Jan 24 21:59 ca.crt.srl
-rw-r--r--. 1 root root  301 Jan 24 22:00 ca.db
-rw-r--r--. 1 root root   20 Jan 24 22:00 ca.db.attr
-rw-r--r--. 1 root root   20 Jan 24 22:00 ca.db.attr.old
-rw-r--r--. 1 root root  233 Jan 24 22:00 ca.db.old
-rw-------. 1 root root 1675 Jan 24 21:57 ca.key
-rw-r--r--. 1 root root    3 Jan 24 22:00 ca.serial.txt
-rw-r--r--. 1 root root    3 Jan 24 22:00 ca.serial.txt.old
-rw-r--r--. 1 root root 3768 Jan 25 00:12 elasticsearch.jks
-rw-r--r--. 1 root root 2242 Jan 24 21:58 kibana-internal.crt
-rw-------. 1 root root 1679 Jan 24 21:58 kibana-internal.key
-rw-r--r--. 1 root root 3979 Jan 25 00:12 logging-es.jks
-rw-r--r--. 1 root root  321 Jan 24 21:58 server-tls.json
-rw-r--r--. 1 root root 4263 Jan 24 21:57 signing.conf
-rw-r--r--. 1 root root 1184 Jan 24 22:00 system.admin.crt
-rw-r--r--. 1 root root  948 Jan 24 22:00 system.admin.csr
-rw-r--r--. 1 root root 3701 Jan 25 00:12 system.admin.jks
-rw-r--r--. 1 root root 1704 Jan 24 22:00 system.admin.key
-rw-r--r--. 1 root root 1196 Jan 24 22:00 system.logging.curator.crt
-rw-r--r--. 1 root root  960 Jan 24 22:00 system.logging.curator.csr
-rw-r--r--. 1 root root 1704 Jan 24 22:00 system.logging.curator.key
-rw-r--r--. 1 root root 1196 Jan 24 21:59 system.logging.fluentd.crt
-rw-r--r--. 1 root root  960 Jan 24 21:59 system.logging.fluentd.csr
-rw-r--r--. 1 root root 1704 Jan 24 21:59 system.logging.fluentd.key
-rw-r--r--. 1 root root 1196 Jan 24 21:59 system.logging.kibana.crt
-rw-r--r--. 1 root root  960 Jan 24 21:59 system.logging.kibana.csr
-rw-r--r--. 1 root root 1708 Jan 24 21:59 system.logging.kibana.key
-rw-r--r--. 1 root root  797 Jan 25 00:12 truststore.jks
Created attachment 1244163 [details] system.admin.jks is under /etc/origin/logging
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3049