Bug 1836157 - Deployment failing at TASK [openshift_persistent_volumes : Create PersistentVolumes] with latest ansible 2.9
Summary: Deployment failing at TASK [openshift_persistent_volumes : Create PersistentV...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Russell Teague
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-05-15 09:32 UTC by aygarg
Modified: 2024-03-25 15:56 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Due to changes between Ansible 2.7 and 2.9 the order of keys when templating is now an alphabetized list. Updating the template to not rely on the first item being the correct item to template.
Clone Of:
Environment:
Last Closed: 2020-11-12 10:08:21 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift openshift-ansible pull 12250 0 None closed Bug 1836157: roles/openshift_peristent_volumes: Correct persistent-volume.yml template 2020-12-17 20:56:19 UTC
Red Hat Knowledge Base (Solution) 5175091 0 None None None 2020-06-23 01:06:33 UTC
Red Hat Product Errata RHBA-2020:4430 0 None None None 2020-11-12 10:08:55 UTC

Description aygarg 2020-05-15 09:32:58 UTC
Description of problem:

The latest OpenShift 3.11 deployment fails at "TASK [openshift_persistent_volumes : Create PersistentVolumes]" with the latest Ansible version (2.9) when NFS storage variables are set for the registry in the inventory. The workaround is to downgrade Ansible to 2.7.

The "package_availability" check also needs to be disabled, otherwise the playbook fails. In the customer's words:

~~~
Also note we needed to disable package check .. This has to do the new version  version of bind being advertised by Red Hat bind-utils-9.11.4-16.P2.el7_8.3.x86_64
This new version is missing the bind-libs-9.11.4-16.P2.el7_8.3.x86_64 and bind-libs-lite-9.11.4-16.P2.el7_8.2.x86_64 . When the openshift  anisble installer  requests an yum update during install it fails the playbook with a reference to the missing bind-libs-9.11.4-16.P2.el7_8.3.x86_64 and bind-libs-lite-9.11.4-16.P2.el7_8.2.x86_64 .
These 2 rpm's are not available while the bind-utils was posted March 18 2020 ..

openshift_disable_check=package_availability
~~~

Version-Release number of selected component (if applicable):
OpenShift 3.11
Ansible 2.9

How reproducible:
Every time

Steps to Reproduce:
1. Provide the NFS host and storage variables in the inventory for registry.

[OSEv3:children]
masters
nodes
etcd
nfs

openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_nfs_directory=/exports
openshift_hosted_registry_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=10Gi

[nfs]
bug-infra.example.com


2. Run the deployment playbook.


Actual results:

TASK [openshift_persistent_volumes : Create PersistentVolumes] *************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_persistent_volumes/tasks/pv.yml:9
fatal: [aygarg-bug-master.example.com]: FAILED! => {
    "changed": false, 
    "cmd": [
        "oc", 
        "create", 
        "-f", 
        "/tmp/openshift-ansible-aLObFLa/persistent-volumes.yml", 
        "--config=/tmp/openshift-ansible-aLObFLa/admin.kubeconfig"
    ], 
    "delta": "0:00:00.242917", 
    "end": "2020-05-15 10:11:38.860266", 
    "failed_when_result": true, 
    "invocation": {
        "module_args": {
            "_raw_params": "oc create -f /tmp/openshift-ansible-aLObFLa/persistent-volumes.yml --config=/tmp/openshift-ansible-aLObFLa/admin.kubeconfig\n", 
            "_uses_shell": false, 
            "argv": null, 
            "chdir": null, 
            "creates": null, 
            "executable": null, 
            "removes": null, 
            "stdin": null, 
            "stdin_add_newline": true, 
            "strip_empty_ends": true, 
            "warn": true
        }
    }, 
    "msg": "non-zero return code", 
    "rc": 1, 
    "start": "2020-05-15 10:11:38.617349", 
    "stderr": "The PersistentVolume \"registry-volume\" is invalid: spec: Required value: must specify a volume type", 
    "stderr_lines": [
        "The PersistentVolume \"registry-volume\" is invalid: spec: Required value: must specify a volume type"
    ], 
    "stdout": "", 
    "stdout_lines": []
}

PLAY RECAP *****************************************************************************************************************************
bug-compute.example.com : ok=132  changed=64   unreachable=0    failed=0    skipped=148  rescued=0    ignored=0   
bug-infra.example.com : ok=145  changed=72   unreachable=0    failed=0    skipped=152  rescued=0    ignored=0   
bug-master.example.com : ok=462  changed=203  unreachable=0    failed=1    skipped=747  rescued=0    ignored=0   
localhost                  : ok=12   changed=0    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0   


INSTALLER STATUS ***********************************************************************************************************************
Initialization              : Complete (0:00:29)
Health Check                : Complete (0:00:12)
Node Bootstrap Preparation  : Complete (0:04:57)
etcd Install                : Complete (0:01:06)
NFS Install                 : Complete (0:00:19)
Master Install              : Complete (0:05:58)
Master Additional Install   : Complete (0:02:07)
Node Join                   : Complete (0:00:49)
Hosted Install              : In Progress (0:00:07)
	This phase can be restarted by running: playbooks/openshift-hosted/config.yml


Failure summary:


  1. Hosts:    bug-master.example.com
     Play:     Create Hosted Resources - persistent volumes
     Task:     Create PersistentVolumes
     Message:  non-zero return code


Expected results:
Installation should complete


Additional info:
The issue was reproduced with the following packages installed, after enabling the repos mentioned in the official documentation.

# rpm -qa | grep -i ansible
openshift-ansible-playbooks-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-roles-3.11.216-1.git.0.085486a.el7.noarch
ansible-2.9.9-1.el7ae.noarch
openshift-ansible-docs-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-3.11.216-1.git.0.085486a.el7.noarch

After the deployment failed, I uninstalled the cluster, ran the prerequisites again, and downgraded Ansible.

Also add the following inventory variable:
--> openshift_disable_check=package_availability


# subscription-manager repos --enable='rhel-7-server-ansible-2.7-rpms'
# yum downgrade ansible-2.7*

# rpm -qa | grep -i ansible
openshift-ansible-playbooks-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-roles-3.11.216-1.git.0.085486a.el7.noarch
openshift-ansible-docs-3.11.216-1.git.0.085486a.el7.noarch
ansible-2.7.18-1.el7ae.noarch
openshift-ansible-3.11.216-1.git.0.085486a.el7.noarch

After the downgrade, the installation succeeded.

Comment 1 Brenton Leanhardt 2020-05-18 13:41:34 UTC
We need to fix this.  Can you provide a failed persistent-volumes.yml in your environment from an Ansible 2.9 run and one generated with Ansible 2.7?

You can grab these from the tmp directory (eg, /tmp/openshift-ansible-aLObFLa/persistent-volumes.yml in the example from the description)

In the meantime, since there's a workaround we're lowering the severity.  We'd like to fix this in an upcoming sprint.

Comment 3 aygarg 2020-05-18 14:37:38 UTC
Hello Brenton,

Thanks for looking into this.

This is the persistent-volumes.yml file from the /tmp directory.

~~~
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: "registry-volume"
    labels:
  spec:
    capacity:
      storage: "10Gi"
    accessModes:
    - ReadWriteMany
    claimName:
      registry-claim
      ...
    claimRef:
      name: registry-claim
      namespace: default
~~~

Comment 4 aygarg 2020-05-18 14:45:47 UTC
(In reply to aygarg from comment #3)
This persistent-volumes.yml file is from an Ansible 2.9 run. I will provide the one generated with Ansible 2.7 later.

Comment 19 Russell Teague 2020-09-15 15:53:10 UTC
To further investigate this bug we will need the full verbose ansible log output as well as a complete inventory.  There are previous tasks that set up variables used during execution of the failed task and there are inventory variables that may be defined which could also affect this task.  Additionally, the failure message indicated in the description is from `oc` when it is parsing the generated pv list.  Knowing the installed version of `oc` would also be helpful.
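The `oc` message in the description ("must specify a volume type") is consistent with the persistent-volumes.yml from comment 3, where `spec` has no volume source such as `nfs`. The following is only a rough sketch of that check, not what `oc` or the installer actually does. The `missing_volume_source` helper and the `KNOWN_VOLUME_SOURCES` subset are hypothetical names chosen for illustration, and the dict mirrors only the keys visible in the file from comment 3.

```python
# Illustrative subset of PersistentVolume source keys (an assumption, not the full API list).
KNOWN_VOLUME_SOURCES = {"nfs", "hostPath", "glusterfs", "iscsi", "cephfs", "rbd"}


def missing_volume_source(pv_list):
    """Return names of PersistentVolumes whose spec has no known volume source key."""
    names = []
    for item in pv_list.get("items", []):
        if item.get("kind") != "PersistentVolume":
            continue
        spec = item.get("spec") or {}
        # A usable PV needs a volume source key (e.g. "nfs") in its spec.
        if not KNOWN_VOLUME_SOURCES & set(spec):
            names.append(item["metadata"]["name"])
    return names


# Dict form of the file from comment 3 (Ansible 2.9 run); keys only, values abbreviated.
rendered_29 = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        {
            "apiVersion": "v1",
            "kind": "PersistentVolume",
            "metadata": {"name": "registry-volume"},
            "spec": {
                "capacity": {"storage": "10Gi"},
                "accessModes": ["ReadWriteMany"],
                "claimName": "registry-claim",  # rendered where a volume source was expected
                "claimRef": {"name": "registry-claim", "namespace": "default"},
            },
        }
    ],
}

print(missing_volume_source(rendered_29))
```

Running a check like this against both the 2.7 and 2.9 output would show quickly whether the difference is in the rendered file rather than in `oc`.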

Comment 24 Stephen Cuppett 2020-10-07 20:53:31 UTC
Thank you for continuing to use Red Hat OpenShift.  As part of a wider bug review, this bug has been evaluated and we have determined that at this time we do not plan to progress it.  As such, we will be closing this bug.  If you have need for continued assistance on this issue, please reopen the bug with additional context on why it needs to be reconsidered.

Comment 26 Russell Teague 2020-10-12 15:22:01 UTC
I have determined the ordering of the keys processed when templating persistent-volume.yml is not consistent between 2.7 and 2.9.

Ansible:
2.7.18 -> ['nfs', 'claimName']
2.9.14 -> [u'claimName', u'nfs']

This causes an incorrectly templated persistent-volume.yml.

Submitting a patch to correctly handle the templating regardless of the order of the keys.
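To illustrate the ordering problem in plain Python: the sketch below is not the actual Jinja template or the submitted patch. The function names and the `nfs` values are hypothetical; only the two key orders come from the observation above. It shows why picking "the first key" gives different results for the two orders, and one order-independent alternative.

```python
from collections import OrderedDict

# Hypothetical volume data; only the key names 'nfs' and 'claimName' come from this bug.
NFS_SOURCE = {"server": "nfs.example.com", "path": "/exports/registry"}

# Key order as observed with Ansible 2.7.18.
volume_27 = OrderedDict([("nfs", NFS_SOURCE), ("claimName", "registry-claim")])
# Key order as observed with Ansible 2.9.14 (alphabetized).
volume_29 = OrderedDict([("claimName", "registry-claim"), ("nfs", NFS_SOURCE)])


def first_key_kind(volume):
    """Fragile: treats whichever key comes first as the volume type."""
    return next(iter(volume))


def volume_kind(volume, non_source_keys=("claimName",)):
    """Order-independent: skip keys that are known not to be a volume type."""
    kinds = [key for key in volume if key not in non_source_keys]
    if len(kinds) != 1:
        raise ValueError("expected exactly one volume type, got %r" % (kinds,))
    return kinds[0]


for label, volume in (("2.7", volume_27), ("2.9", volume_29)):
    print(label, first_key_kind(volume), volume_kind(volume))
```

With the 2.9 ordering the first-key approach selects `claimName` instead of `nfs`, which matches the `claimName:` entry seen under `spec` in the file from comment 3.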

Comment 31 Johnny Liu 2020-10-23 09:02:57 UTC
Reproduced this bug with openshift-ansible-3.11.216-1.git.0.085486a.el7.noarch + ansible-2.9.9-1.el7ae.

openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_nfs_options="*(rw,root_squash,sync,no_wdelay)"
openshift_hosted_registry_storage_nfs_directory=/var/lib/exports
openshift_hosted_registry_storage_volume_name=regpv
openshift_hosted_registry_storage_access_modes=["ReadWriteMany"]
openshift_hosted_registry_storage_volume_size=17G

TASK [openshift_persistent_volumes : Create PersistentVolumes] *****************
Friday 23 October 2020  15:25:49 +0800 (0:00:00.909)       0:14:26.954 ******** 
fatal: [ci-vm-10-0-150-46.hosted.upshift.rdu2.redhat.com]: FAILED! => {"changed": false, "cmd": ["oc", "create", "-f", "/tmp/openshift-ansible-ugSIrs9/persistent-volumes.yml", "--config=/tmp/openshift-ansible-ugSIrs9/admin.kubeconfig"], "delta": "0:00:00.220239", "end": "2020-10-23 03:25:50.626290", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2020-10-23 03:25:50.406051", "stderr": "The PersistentVolume \"regpv-volume\" is invalid: spec: Required value: must specify a volume type", "stderr_lines": ["The PersistentVolume \"regpv-volume\" is invalid: spec: Required value: must specify a volume type"], "stdout": "", "stdout_lines": []}


Verified this bug with openshift-ansible-3.11.307-1.git.0.83cdf01.el7.noarch + ansible-2.9.9-1.el7ae.

TASK [openshift_persistent_volumes : Create PersistentVolumes] *****************
Friday 23 October 2020  15:56:40 +0800 (0:00:00.998)       0:16:43.627 ******** 

changed: [ci-vm-10-0-151-94.hosted.upshift.rdu2.redhat.com] => {"changed": true, "cmd": ["oc", "create", "-f", "/tmp/openshift-ansible-Gbt2faU/persistent-volumes.yml", "--config=/tmp/openshift-ansible-Gbt2faU/admin.kubeconfig"], "delta": "0:00:00.265438", "end": "2020-10-23 03:56:40.251101", "failed_when_result": false, "rc": 0, "start": "2020-10-23 03:56:39.985663", "stderr": "", "stderr_lines": [], "stdout": "persistentvolume/regpv-volume created", "stdout_lines": ["persistentvolume/regpv-volume created"]}

Comment 34 errata-xmlrpc 2020-11-12 10:08:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 3.11.317 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4430

