Description of problem:

The latest OpenShift 3.11 deployment fails at the task "TASK [openshift_persistent_volumes : Create PersistentVolumes]" with the latest Ansible version when NFS storage variables are used for the registry in the inventory. The workaround for this issue is to downgrade Ansible to 2.7. The "package_availability" check also needs to be disabled, otherwise the playbook will fail.

These were the customer's words:

~~~
Also note we needed to disable package check .. This has to do with the new version of bind being advertised by Red Hat, bind-utils-9.11.4-16.P2.el7_8.3.x86_64. This new version is missing bind-libs-9.11.4-16.P2.el7_8.3.x86_64 and bind-libs-lite-9.11.4-16.P2.el7_8.2.x86_64. When the openshift ansible installer requests a yum update during install, it fails the playbook with a reference to the missing bind-libs-9.11.4-16.P2.el7_8.3.x86_64 and bind-libs-lite-9.11.4-16.P2.el7_8.2.x86_64. These 2 rpms are not available while the bind-utils was posted March 18 2020 ..

openshift_disable_check=package_availability
~~~

Version-Release number of selected component (if applicable):
OpenShift 3.11
Ansible 2.9

How reproducible:
Every time

Steps to Reproduce:
1. Provide the NFS host and storage variables in the inventory for the registry.

[OSEv3:children]
masters
nodes
etcd
nfs

openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_nfs_directory=/exports
openshift_hosted_registry_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=10Gi

[nfs]
bug-infra.example.com

2. Run the deployment playbook.

Actual results:

TASK [openshift_persistent_volumes : Create PersistentVolumes] *************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_persistent_volumes/tasks/pv.yml:9
fatal: [aygarg-bug-master.example.com]: FAILED! => {
    "changed": false,
    "cmd": [
        "oc",
        "create",
        "-f",
        "/tmp/openshift-ansible-aLObFLa/persistent-volumes.yml",
        "--config=/tmp/openshift-ansible-aLObFLa/admin.kubeconfig"
    ],
    "delta": "0:00:00.242917",
    "end": "2020-05-15 10:11:38.860266",
    "failed_when_result": true,
    "invocation": {
        "module_args": {
            "_raw_params": "oc create -f /tmp/openshift-ansible-aLObFLa/persistent-volumes.yml --config=/tmp/openshift-ansible-aLObFLa/admin.kubeconfig\n",
            "_uses_shell": false,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2020-05-15 10:11:38.617349",
    "stderr": "The PersistentVolume \"registry-volume\" is invalid: spec: Required value: must specify a volume type",
    "stderr_lines": [
        "The PersistentVolume \"registry-volume\" is invalid: spec: Required value: must specify a volume type"
    ],
    "stdout": "",
    "stdout_lines": []
}

PLAY RECAP *****************************************************************************************************************************
bug-compute.example.com : ok=132 changed=64 unreachable=0 failed=0 skipped=148 rescued=0 ignored=0
bug-infra.example.com : ok=145 changed=72 unreachable=0 failed=0 skipped=152 rescued=0 ignored=0
bug-master.example.com : ok=462 changed=203 unreachable=0 failed=1 skipped=747 rescued=0 ignored=0
localhost : ok=12 changed=0 unreachable=0 failed=0 skipped=4 rescued=0 ignored=0

INSTALLER STATUS ***********************************************************************************************************************
Initialization : Complete (0:00:29)
Health Check : Complete (0:00:12)
Node Bootstrap Preparation : Complete (0:04:57)
etcd Install : Complete (0:01:06)
NFS Install : Complete (0:00:19)
Master Install : Complete (0:05:58)
Master Additional Install : Complete (0:02:07)
Node Join : Complete (0:00:49)
Hosted Install : In Progress (0:00:07)
    This phase can be restarted by running: playbooks/openshift-hosted/config.yml

Failure summary:

  1. Hosts:    bug-master.example.com
     Play:     Create Hosted Resources - persistent volumes
     Task:     Create PersistentVolumes
     Message:  non-zero return code

Expected results:
Installation should complete.

Additional info:

The issue was reproduced with the following packages, installed by enabling the repos mentioned in the official documentation.

# rpm -qa | grep -i ansible
openshift-ansible-playbooks-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-roles-3.11.216-1.git.0.085486a.el7.noarch
ansible-2.9.9-1.el7ae.noarch
openshift-ansible-docs-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-3.11.216-1.git.0.085486a.el7.noarch

After the deployment failed, I uninstalled the cluster, ran the prerequisites again, and downgraded the Ansible version. I also added this inventory variable:

openshift_disable_check=package_availability

# subscription-manager repos --enable='rhel-7-server-ansible-2.7-rpms'
# yum downgrade ansible-2.7*
# rpm -qa | grep -i ansible
openshift-ansible-playbooks-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-roles-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-docs-3.11.216-1.git.0.085486a.el7.noarch
ansible-2.7.18-1.el7ae.noarch
openshift-ansible-3.11.216-1.git.0.085486a.el7.noarch

After the downgrade, the installation succeeded.
We need to fix this. Can you provide the failed persistent-volumes.yml from an Ansible 2.9 run in your environment, and one generated with Ansible 2.7? You can grab these from the tmp directory (e.g., /tmp/openshift-ansible-aLObFLa/persistent-volumes.yml in the example from the description).

In the meantime, since there's a workaround, we're lowering the severity. We'd like to fix this in an upcoming sprint.
Hello Brenton,

Thanks for looking into this.

This is the persistent-volumes.yml file from the /tmp directory.

~~~
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: "registry-volume"
    labels:
  spec:
    capacity:
      storage: "10Gi"
    accessModes:
    - ReadWriteMany
    claimName:
      registry-claim
      ...
    claimRef:
      name: registry-claim
      namespace: default
~~~
(In reply to aygarg from comment #3)
> Hello Brenton,
>
> Thanks for looking into this.
>
> This is the persistent-volumes.yml file from the /tmp directory.
>
> ~~~
> apiVersion: v1
> kind: List
> items:
> - apiVersion: v1
>   kind: PersistentVolume
>   metadata:
>     name: "registry-volume"
>     labels:
>   spec:
>     capacity:
>       storage: "10Gi"
>     accessModes:
>     - ReadWriteMany
>     claimName:
>       registry-claim
>       ...
>     claimRef:
>       name: registry-claim
>       namespace: default
> ~~~

This persistent volume file is from an Ansible 2.9 run. I will provide the other one, generated with Ansible 2.7, in some time.
To further investigate this bug we will need the full verbose ansible log output as well as a complete inventory. There are previous tasks that set up variables used during execution of the failed task and there are inventory variables that may be defined which could also affect this task. Additionally, the failure message indicated in the description is from `oc` when it is parsing the generated pv list. Knowing the installed version of `oc` would also be helpful.
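For context on the `oc` message: the `spec` in the file from comment #3 carries `capacity`, `accessModes`, `claimName` and `claimRef`, but no key that names a volume source such as `nfs`. The Python sketch below shows that kind of check in miniature. The function name `volume_types_in`, the short `KNOWN_VOLUME_TYPES` set (a small subset of Kubernetes volume sources) and the sample `nfs` values are assumptions for illustration; this is not how `oc` or the API server implements validation.

```python
# Illustrative subset of Kubernetes PersistentVolume source fields.
# Assumption: the real API accepts many more volume types than these.
KNOWN_VOLUME_TYPES = {"nfs", "hostPath", "iscsi", "glusterfs", "cephfs", "rbd", "fc"}


def volume_types_in(spec):
    """Return the sorted volume-source keys present in a PV spec dict."""
    return sorted(key for key in spec if key in KNOWN_VOLUME_TYPES)


# Approximation of the spec keys in the file from the Ansible 2.9 run
# (comment #3): there is no volume source among them.
broken_spec = {
    "capacity": {"storage": "10Gi"},
    "accessModes": ["ReadWriteMany"],
    "claimName": "registry-claim",
    "claimRef": {"name": "registry-claim", "namespace": "default"},
}

# Hypothetical well-formed spec. The server and path values are assumed from
# the inventory in the description, not copied from a real generated file.
expected_spec = {
    "capacity": {"storage": "10Gi"},
    "accessModes": ["ReadWriteMany"],
    "nfs": {"server": "bug-infra.example.com", "path": "/exports/registry"},
    "claimRef": {"name": "registry-claim", "namespace": "default"},
}

if __name__ == "__main__":
    for name, spec in (("broken", broken_spec), ("expected", expected_spec)):
        found = volume_types_in(spec)
        status = "ok" if found else "must specify a volume type"
        print(name, found, status)
```

An empty result for the 2.9 file matches the "must specify a volume type" error, which points at the templating step rather than at `oc` itself.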
Thank you for continuing to use Red Hat OpenShift. As part of a wider bug review, this bug has been evaluated and we have determined that at this time we do not plan to progress it. As such, we will be closing this bug. If you have need for continued assistance on this issue, please reopen the bug with additional context on why it needs to be reconsidered.
I have determined that the ordering of the keys processed when templating persistent-volume.yml is not consistent between Ansible 2.7 and 2.9.

Ansible:
2.7.18 -> ['nfs', 'claimName']
2.9.14 -> [u'claimName', u'nfs']

This causes an incorrectly templated persistent-volume.yml. Submitting a patch to correctly handle the templating regardless of the order of the keys.
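The ordering difference can be illustrated with a short Python sketch. It assumes the template chooses the volume type by taking the first key of the storage dictionary, which is consistent with the key lists above but is not confirmed in this report. `first_key`, `volume_type_key` and the sample `nfs` values are hypothetical names for illustration, and this is not the submitted patch.

```python
def first_key(storage):
    """Mimic a template that assumes the volume type is the first key."""
    return list(storage.keys())[0]


def volume_type_key(storage, non_volume_keys=("claimName",)):
    """Pick the volume type key without relying on key order.

    Assumption: every key other than those in non_volume_keys names a
    volume type, and exactly one such key is present.
    """
    candidates = [key for key in storage if key not in non_volume_keys]
    if len(candidates) != 1:
        raise ValueError("expected exactly one volume type, got %r" % candidates)
    return candidates[0]


# Hypothetical NFS settings, used only to give the dictionaries a payload.
nfs_settings = {"server": "bug-infra.example.com", "path": "/exports/registry"}

# Same content, different key order (Python 3.7+ dicts keep insertion order).
storage_27 = {"nfs": nfs_settings, "claimName": "registry-claim"}
storage_29 = {"claimName": "registry-claim", "nfs": nfs_settings}

if __name__ == "__main__":
    for label, storage in (("2.7 order", storage_27), ("2.9 order", storage_29)):
        print(label, "first key:", first_key(storage),
              "| order-independent:", volume_type_key(storage))
```

With the 2.9 ordering the first key is `claimName`, so a first-key lookup would emit `claimName` where the volume type belongs, which matches the file in comment #3. Selecting the key by name gives the same answer for both orderings.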
Reproduced this bug with openshift-ansible-3.11.216-1.git.0.085486a.el7.noarch + ansible-2.9.9-1.el7ae.

openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_nfs_options="*(rw,root_squash,sync,no_wdelay)"
openshift_hosted_registry_storage_nfs_directory=/var/lib/exports
openshift_hosted_registry_storage_volume_name=regpv
openshift_hosted_registry_storage_access_modes=["ReadWriteMany"]
openshift_hosted_registry_storage_volume_size=17G

TASK [openshift_persistent_volumes : Create PersistentVolumes] *****************
Friday 23 October 2020 15:25:49 +0800 (0:00:00.909) 0:14:26.954 ********
fatal: [ci-vm-10-0-150-46.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "cmd": ["oc", "create", "-f", "/tmp/openshift-ansible-ugSIrs9/persistent-volumes.yml", "--config=/tmp/openshift-ansible-ugSIrs9/admin.kubeconfig"], "delta": "0:00:00.220239", "end": "2020-10-23 03:25:50.626290", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2020-10-23 03:25:50.406051", "stderr": "The PersistentVolume \"regpv-volume\" is invalid: spec: Required value: must specify a volume type", "stderr_lines": ["The PersistentVolume \"regpv-volume\" is invalid: spec: Required value: must specify a volume type"], "stdout": "", "stdout_lines": []}

Verified this bug with openshift-ansible-3.11.307-1.git.0.83cdf01.el7.noarch + ansible-2.9.9-1.el7ae.

TASK [openshift_persistent_volumes : Create PersistentVolumes] *****************
Friday 23 October 2020 15:56:40 +0800 (0:00:00.998) 0:16:43.627 ********
changed: [ci-vm-10-0-151-94.hosted.upshift.rdu2.redhat.com] => {"changed": true, "cmd": ["oc", "create", "-f", "/tmp/openshift-ansible-Gbt2faU/persistent-volumes.yml", "--config=/tmp/openshift-ansible-Gbt2faU/admin.kubeconfig"], "delta": "0:00:00.265438", "end": "2020-10-23 03:56:40.251101", "failed_when_result": false, "rc": 0, "start": "2020-10-23 03:56:39.985663", "stderr": "", "stderr_lines": [], "stdout": "persistentvolume/regpv-volume created", "stdout_lines": ["persistentvolume/regpv-volume created"]}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 3.11.317 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4430