Bug 1590138 - [CNS] Run standalone deployment playbook with glusterfs_registry group failed due to AnsibleUndefinedVariable
Summary: [CNS] Run standalone deployment playbook with glusterfs_registry group failed due to AnsibleUndefinedVariable
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.z
Assignee: Jose A. Rivera
QA Contact: Qin Ping
URL:
Whiteboard:
Depends On:
Blocks: 1625090
 
Reported: 2018-06-12 06:18 UTC by Wenkai Shi
Modified: 2019-01-23 21:56 UTC
CC: 5 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1625090 (view as bug list)
Environment:
Last Closed: 2019-01-23 21:56:20 UTC
Target Upstream Version:
Embargoed:



Description Wenkai Shi 2018-06-12 06:18:23 UTC
Description of problem:
Running the standalone deployment playbook with the glusterfs_registry group fails due to AnsibleUndefinedVariable.

Version-Release number of the following components:
openshift-ansible-3.10.0-0.64.0.git.20.48df973.el7

How reproducible:
100%

Steps to Reproduce:
1. Deploy an OCP cluster with three extra nodes (for deploying CNS later)
# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml

2. Once the deployment succeeds, modify the inventory file to prepare for running the CNS standalone deployment playbook.
# cat hosts
...
[OSEv3:children]
masters
nodes
etcd
glusterfs_registry
...
[OSEv3:vars]
openshift_hosted_registry_storage_kind=glusterfs
openshift_hosted_registry_storage_glusterfs_swap=true
...
[glusterfs_registry]
qe-weshi-sl-cnsr-gn-1.0611-iq6.qe.rhcloud.com glusterfs_devices="['/dev/vsda']"
qe-weshi-sl-cnsr-gn-2.0611-iq6.qe.rhcloud.com glusterfs_devices="['/dev/vsda']"
qe-weshi-sl-cnsr-gn-3.0611-iq6.qe.rhcloud.com glusterfs_devices="['/dev/vsda']"
...

3. Execute the standalone deployment playbook
# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/registry.yml

Actual results:
# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/registry.yml
...
TASK [openshift_persistent_volumes : Deploy PersistentVolumeClaim definitions] ***************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_persistent_volumes/tasks/pvc.yml:2
Tuesday 12 June 2018  11:27:08 +0800 (0:00:00.040)       0:05:13.296 ********** 
fatal: [qe-weshi-sl-cnsr-master-etcd-1.0611-iq6.qe.rhcloud.com]: FAILED! => {"changed": false, "failed": true, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'annotations'"}

Expected results:
The playbook should complete successfully here.

Additional info:
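For context, this class of error comes from a Jinja2 expression dereferencing a key that is missing from a dict. A minimal illustration (hypothetical play and variable names, not the actual role code):

---
- hosts: localhost
  gather_facts: false
  vars:
    claim:
      name: registry-claim    # note: no 'annotations' key
  tasks:
  # Reproduces the failure: "'dict object' has no attribute 'annotations'"
  - debug:
      msg: "{{ claim.annotations }}"
  # A guarded lookup succeeds by falling back to an empty dict
  - debug:
      msg: "{{ claim.annotations | default({}) }}"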

Comment 1 Scott Dodson 2018-06-12 12:50:32 UTC
Jose,

Is this a 3.10 blocker?

Comment 2 Jose A. Rivera 2018-06-12 13:34:38 UTC
It might be, yes.

The "swap" variable is not officially supported, as we've never had a chance to really verify it. The registry.yml playbook will still create a GlusterFS volume for the registry, but if the registry already exists then the process of changing the registry to use that new volume is a manual one. The instructions are found in the CNS documentation, chapter 10.2 step 18:

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html-single/container-native_storage_for_openshift_container_platform/#idm139668589543152

Wenkai, can you try again without that variable? If the error still occurs can you attach a -vvv log of the run?

Comment 3 Wenkai Shi 2018-06-13 02:31:50 UTC
(In reply to Jose A. Rivera from comment #2)
> It might be, yes.
> 
> The "swap" variable is not officially supported, as we've never had a chance
> to really verify it. The registry.yml playbook will still create a GlusterFS
> volume for the registry, but if the registry already exists then the process
> of changing the registry to use that new volume is a manual one. The
> instructions are found in the CNS documentation, chapter 10.2 step 18:
> 
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/
> html-single/container-native_storage_for_openshift_container_platform/
> #idm139668589543152
> 
> Wenkai, can you try again without that variable? If the error still occurs
> can you attach a -vvv log of the run?

I've tried it without the "swap" variable and it works well. But with no "swap" variable set, the docker registry's back-end storage cannot be set to the GlusterFS cluster defined in the [glusterfs_registry] group.

Comment 4 Wenkai Shi 2018-06-13 03:01:29 UTC
By the way, the same issue occurs in the CRS case.

Comment 5 Jose A. Rivera 2018-06-13 14:54:04 UTC
Okay, that makes sense.

So you're saying that without the "swap" variable the installation completed successfully? If so, we still need "openshift_hosted_registry_storage_kind=glusterfs" so that the GlusterFS volume gets configured with the fsGroup of the registry pods.

Can you verify if that works correctly? A simple check would be to mount the GlusterFS volume after the installer completes and check that the GID on the root of the volume matches the fsGroup of the registry pods.
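
Something like the following ad-hoc play could do that check (just a sketch; the volume name and registry pod label are assumptions, adjust for your environment):

---
- hosts: oo_first_master
  gather_facts: false
  tasks:
  # Volume name is an assumption; check 'gluster volume list' for the real one.
  # Remember to unmount afterwards (state: unmounted).
  - name: Temporarily mount the registry GlusterFS volume
    mount:
      src: "{{ groups['glusterfs_registry'][0] }}:/glusterfs-registry-volume"
      path: /mnt/registry-check
      fstype: glusterfs
      state: mounted

  - name: Read the GID on the root of the volume
    stat:
      path: /mnt/registry-check
    register: vol_root

  - name: Read the fsGroup of the registry pods
    command: >
      oc get pod -n default -l deploymentconfig=docker-registry
      -o jsonpath={.items[0].spec.securityContext.fsGroup}
    register: fs_group

  # These two values should match.
  - debug:
      msg: "volume GID={{ vol_root.stat.gid }}, registry fsGroup={{ fs_group.stdout }}"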

Comment 6 Wenkai Shi 2018-06-14 02:23:18 UTC
(In reply to Jose A. Rivera from comment #5)
> Okay, that makes sense.
> 
> So you're saying that without the "swap" variable the installation completed
> successfully? If so, we still need
> "openshift_hosted_registry_storage_kind=glusterfs" so that the GlusterFS
> volume gets configured with the fsGroup of the registry pods.
> 
> Can you verify if that works correctly? A simple check would be to mount the
> GlusterFS volume after the installer completes and check that the GID on the
> root of the volume matches the fsGroup of the registry pods.

While executing the standalone playbook [1], I do have parameter [2] set. What do you mean by "mount the GlusterFS volume after the installer completes"? Without parameter [3] set, there is no PVC/PV at all (see the quick check after the notes below). Shouldn't it be mounted by the installer?

[1]. /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/registry.yml
[2]. openshift_hosted_registry_storage_kind=glusterfs
[3]. openshift_hosted_registry_storage_glusterfs_swap=true
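
(For reference, a quick way to confirm that no PV/PVC exists, assuming the registry lives in the "default" namespace:)

---
- hosts: oo_first_master
  gather_facts: false
  tasks:
  - name: List PVs and registry PVCs (expected to be empty in this state)
    command: oc get pv,pvc -n default
    register: result
  - debug:
      var: result.stdout_lines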

Comment 8 Jose A. Rivera 2018-06-20 21:56:44 UTC
Sorry for the delay, I've been traveling at an event.

I think I managed to figure out the issue. I've submitted the following PR to hopefully resolve this: https://github.com/openshift/openshift-ansible/pull/8885

Comment 10 Scott Dodson 2018-09-17 18:35:55 UTC
Not aware of any backport for this to 3.10.

Comment 11 Scott Dodson 2018-09-17 18:37:45 UTC
https://github.com/openshift/openshift-ansible/pull/9934 was backported to release-3.10

That backport is in openshift-ansible-3.10.45-1 and later.

Comment 12 Jose A. Rivera 2018-09-17 18:38:15 UTC
Whoops, forgot to backport. Backport PR: https://github.com/openshift/openshift-ansible/pull/10111

Comment 13 Wenkai Shi 2018-09-18 08:35:22 UTC
Checked with version openshift-ansible-3.10.47-1.git.0.95bc2d2.el7_5.noarch; the code isn't merged in this build.

Comment 14 Jose A. Rivera 2018-09-19 13:02:08 UTC
PR was merged in 3.10.48.

Comment 16 Wenkai Shi 2018-10-10 03:01:09 UTC
Checked with version openshift-ansible-3.10.51-1.git.0.44a646c.el7.noarch; the code has been merged, but this issue still appears.

# cat /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/private/registry.yml
---
- import_playbook: setup_nodes.yml

- import_playbook: gluster_hosts.yml

- import_playbook: gluster_main.yml

- name: Create persistent volumes
  hosts: oo_first_master
  roles:
  - role: openshift_persistent_volumes
    when:
    - openshift_hosted_registry_storage_kind | default(none) == 'glusterfs' or openshift_hosted_registry_storage_glusterfs_swap

- import_playbook: ../../openshift-hosted/private/openshift_hosted_registry.yml

- import_playbook: ../../openshift-hosted/private/openshift_hosted_wait_for_pods.yml

- import_playbook: ../../openshift-hosted/private/openshift_hosted_registry_storage.yml

# rpm -q openshift-ansible
openshift-ansible-3.10.51-1.git.0.44a646c.el7.noarch

# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/registry.yml
...
TASK [openshift_persistent_volumes : Deploy PersistentVolumeClaim definitions] ***************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_persistent_volumes/tasks/pvc.yml:2
Wednesday 10 October 2018  10:47:24 +0800 (0:00:00.039)       0:04:59.549 ***** 
fatal: [ec2-******.compute-1.amazonaws.com]: FAILED! => {"changed": false, "failed": true, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'annotations'"}
...
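
One fragile spot visible in the excerpt above: the second operand of the when expression, openshift_hosted_registry_storage_glusterfs_swap, has no default() guard, so the expression itself is undefined whenever the swap variable is unset and the first operand is false. A defensively guarded version would look like this (a sketch only, not the fix that was merged):

- name: Create persistent volumes
  hosts: oo_first_master
  roles:
  - role: openshift_persistent_volumes
    when:
    # default(false) keeps the expression defined when the swap variable is unset
    - openshift_hosted_registry_storage_kind | default(none) == 'glusterfs' or
      (openshift_hosted_registry_storage_glusterfs_swap | default(false))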

Comment 20 Jose A. Rivera 2019-01-23 21:56:20 UTC
Given that there should be no new installations happening with openshift-ansible 3.10.x, and that this bug does not seem to have come up in a while, I'm closing it.

