Description of problem:
Installer failed when creating CFME with an external NFS server, at the task "Ensure the CFME App PV is created". The error was: 'openshift_management_nfs_server' is undefined

Version-Release number of the following components:
openshift-ansible-3.7.0-0.167.0.git.0.0e34535.el7.noarch.rpm

How reproducible:

Steps to Reproduce:
1. Add the following options to the ansible inventory file, then run playbooks/byo/config.yml:

openshift_management_install_management=true
openshift_management_app_template=miq-template
openshift_management_storage_class=nfs_external
openshift_management_storage_nfs_external_hostname=openshift-144.x.com
openshift_management_storage_nfs_base_dir=/nfsshare/cfme-test

Actual results:

TASK [openshift_management : Check if the CFME DB PV has been created] *********
Friday 20 October 2017  06:17:20 +0000 (0:00:02.023)       1:05:31.660 ********
ok: [openshift-128.lab.sjc.redhat.com] => {"changed": false, "failed": false, "results": {"cmd": "/usr/bin/oc get pv miq-db -o json -n openshift-management", "results": [{}], "returncode": 0, "stderr": "Error from server (NotFound): persistentvolumes \"miq-db\" not found\n", "stdout": ""}, "state": "list"}

TASK [openshift_management : Ensure the CFME App PV is created] ****************
Friday 20 October 2017  06:17:22 +0000 (0:00:02.040)       1:05:33.700 ********
fatal: [openshift-128.lab.sjc.redhat.com]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'openshift_management_nfs_server' is undefined\n\nThe error appears to have been in '/home/slave5/workspace/Launch-Environment-Flexy/private-openshift-ansible/roles/openshift_management/tasks/storage/create_nfs_pvs.yml': line 47, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Ensure the CFME App PV is created\n  ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'openshift_management_nfs_server' is undefined"}
        to retry, use: --limit @/home/slave5/workspace/Launch-Environment-Flexy/private-openshift-ansible/playbooks/byo/config.retry

Expected results:
The playbook completes and the CFME PVs are created on the external NFS server.

Additional info:
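For context, the failing task renders an NFS-backed PersistentVolume definition for the CFME app. A minimal sketch of such a PV follows; the PV name, capacity, and access mode are illustrative assumptions, and only the variable name, the base directory, and the miq-app subdirectory come from this report:

```yaml
# Hypothetical sketch of the NFS-backed PV the failing task templates.
# Names and sizes are illustrative, not the role's actual values.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: miq-app
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    # This is where the undefined variable would be substituted:
    server: "{{ openshift_management_nfs_server }}"
    path: /nfsshare/cfme-test/miq-app
```

Because `openshift_management_nfs_server` is never set when only the external-hostname variables are given, templating this definition raises AnsibleUndefinedVariable.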
It's strange that you got that far before it errored; the validation steps should have noticed something was wrong. I just reproduced this error on my local workstation. It should be a quick fix.
Confirmed. I have identified the issue and the fix will be coming shortly.
Fix submitted in my other cleanup PR for CFME https://github.com/openshift/openshift-ansible/pull/5793
Still reproducible with openshift-ansible-3.7.0-0.178.0.git.0.27a1039.el7.noarch.rpm. PR https://github.com/openshift/openshift-ansible/pull/5793 not merged yet.
PR is merged. Please try again.
$ git tag --contains ac62ea0066934877f94e99bda6ec53a9c03ababb
openshift-ansible-3.7.0-0.178.2
openshift-ansible-3.7.0-0.182.0
openshift-ansible-3.7.0-0.183.0
openshift-ansible-3.7.0-0.184.0
openshift-ansible-3.7.0-0.185.0
openshift-ansible-3.7.0-0.186.0
openshift-ansible-3.7.0-0.187.0
openshift-ansible-3.7.0-0.188.0
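For anyone unfamiliar with the check above: `git tag --contains <commit>` lists every tag whose history includes the given commit, which is how you tell which builds picked up a merged fix. A minimal sketch in a throwaway repository (all repo, tag, and commit names here are made up for illustration):

```shell
# Demonstrate `git tag --contains <commit>` in a scratch repository.
# Nothing here refers to openshift-ansible itself.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=t -c user.email=t@e commit -q --allow-empty -m "initial"
git tag v0.9   # tagged before the fix landed: will NOT be listed
git -c user.name=t -c user.email=t@e commit -q --allow-empty -m "the fix"
fix=$(git rev-parse HEAD)
git tag v1.0   # first tag whose history includes the fix
git -c user.name=t -c user.email=t@e commit -q --allow-empty -m "later work"
git tag v1.1   # later tags include it too
# Lists v1.0 and v1.1, but not v0.9:
git tag --contains "$fix"
```

So once a fix commit shows up under `--contains`, every listed tag (and any RPM built from it) should carry the change.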
Tried with openshift-ansible-3.7.0-0.188.0.git.0.aebb674.el7.noarch.rpm, still fails as below:

...
TASK [openshift_management : Ensure we save the external NFS server] ***********
Wednesday 01 November 2017  04:13:17 +0000 (0:00:00.033)       1:05:17.982 ****
ok: [openshift-119.lab.sjc.redhat.com] => {"ansible_facts": {"openshift_management_nfs_server": "openshift-144.lab.sjc.redhat.com"}, "changed": false, "failed": false}

TASK [openshift_management : Failed NFS server detection] **********************
Wednesday 01 November 2017  04:13:17 +0000 (0:00:00.049)       1:05:18.031 ****
fatal: [openshift-119.lab.sjc.redhat.com]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'oo_nfs_to_config'\n\nThe error appears to have been in '/home/slave6/workspace/Launch-Environment-Flexy/private-openshift-ansible/roles/openshift_management/tasks/storage/nfs_server.yml': line 23, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Failed NFS server detection\n  ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'dict object' has no attribute 'oo_nfs_to_config'"}
Reproduced this successfully. The error is now caused by my use of an undefined variable in the error message itself. It's funny, because that task wouldn't even have run.
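For illustration, this failure mode can be sketched as a `fail` task whose `msg` references `groups['oo_nfs_to_config']`. The following is a hypothetical reconstruction of the pattern, not the actual contents of nfs_server.yml:

```yaml
# Hypothetical reconstruction of the buggy pattern: rendering the error
# message itself requires templating, so an undefined name inside `msg`
# raises AnsibleUndefinedVariable ('dict object' has no attribute
# 'oo_nfs_to_config') when the inventory defines no such group.
- name: Failed NFS server detection
  fail:
    msg: >-
      Unable to detect an NFS server; NFS hosts in inventory were:
      {{ groups['oo_nfs_to_config'] }}
  when: openshift_management_nfs_server is not defined
```

The fix is to make the message reference only names guaranteed to exist, e.g. via a `default()` filter or by dropping the lookup from the message.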
I have pushed a fix for this bug https://github.com/openshift/openshift-ansible/pull/5974
Tried this with openshift-ansible-3.7.0-0.196.0.git.0.27cd7ec.el7.noarch, since PR#5974 has been merged in. Deploying CFME with an external NFS server via the installation playbook is working well now. Will move this bug to verified once it is changed to ON_QA, thanks!

With the following parameters set in the ansible inventory file:

openshift_management_install_beta=true
openshift_management_app_template=miq-template
openshift_management_storage_class=nfs_external
openshift_management_storage_nfs_external_hostname=openshift-x.x.com
openshift_management_storage_nfs_base_dir=/nfsshare/cfme-test

Run the cfme deployment playbook:

ansible-playbook -i host -v /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-management/config.yml

After the playbook finished, check the cfme pod status:

[root@host-192-168-2-144 ~]# oc get pod
NAME                 READY     STATUS    RESTARTS   AGE
httpd-1-pd4g9        1/1       Running   0          17m
manageiq-0           1/1       Running   0          17m
memcached-1-q26sh    1/1       Running   0          17m
postgresql-1-xw59h   1/1       Running   0          17m

Check the PVs used in the miq and psql pods and the data created in the NFS directory:

[root@host-192-168-2-144 ~]# oc rsh manageiq-0
sh-4.2# df -h
Filesystem                                      Size  Used Avail Use% Mounted on
overlay                                          59G  7.0G   52G  12% /
tmpfs                                           3.9G     0  3.9G   0% /dev
tmpfs                                           3.9G     0  3.9G   0% /sys/fs/cgroup
openshift-x.x.com:/nfsshare/cfme-test/miq-app   8.8G  1.5G  7.4G  17% /persistent
/dev/mapper/rhel-root                            59G  7.0G   52G  12% /etc/hosts
shm                                              64M     0   64M   0% /dev/shm
tmpfs                                           3.9G   16K  3.9G   1% /run/secrets/kubernetes.io/serviceaccount
sh-4.2# ls /persistent/
server-data/  server-deploy/
sh-4.2# ls /persistent/*
/persistent/server-data:
var
/persistent/server-deploy:
backup  log

[root@host-192-168-2-144 ~]# oc rsh postgresql-1-xw59h
sh-4.2$ df -h
Filesystem                                                   Size  Used Avail Use% Mounted on
overlay                                                       59G  6.9G   52G  12% /
tmpfs                                                        3.9G     0  3.9G   0% /dev
tmpfs                                                        3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/rhel-root                                         59G  6.9G   52G  12% /etc/hosts
shm                                                           64M  4.0K   64M   1% /dev/shm
openshift-144.lab.sjc.redhat.com:/nfsshare/cfme-test/miq-db  8.8G  1.5G  7.4G  17% /var/lib/pgsql/data
tmpfs                                                        3.9G   16K  3.9G   1% /run/secrets/kubernetes.io/serviceaccount
sh-4.2$ ls /var/lib/pgsql/data/*
PG_VERSION  pg_clog       pg_hba.conf    pg_logical    pg_replslot   pg_stat      pg_tblspc    postgresql.auto.conf  postmaster.pid
base        pg_commit_ts  pg_ident.conf  pg_multixact  pg_serial     pg_stat_tmp  pg_twophase  postgresql.conf
global      pg_dynshmem   pg_log         pg_notify     pg_snapshots  pg_subtrans  pg_xlog      postmaster.opts
Gaoyun, thanks for verifying it!
Woo Hoo!
Marking this bug as verified per Comment 15.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days