Description of problem:
CNS as docker registry backend storage installation failed with 'dict object' has no attribute 'fsGroup' error.

Version-Release number of the following components:
openshift-ansible-3.10.0-0.30.0.git.0.4f02952.el7

How reproducible:
100%

Steps to Reproduce:
1. Install OCP, CNS as docker-registry backend storage.
# cat hosts
...
openshift_hosted_registry_storage_kind=glusterfs
...
[glusterfs_registry]
gulsterfs1.example.com glusterfs_devices="['/dev/vsda']"
glusterfs2.example.com glusterfs_devices="['/dev/vsda']"
glusterfs3.example.com glusterfs_devices="['/dev/vsda']"
...
2.
3.

Actual results:
# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
...
# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
...
TASK [openshift_hosted : Determine registry fsGroup] ***************************
Friday 27 April 2018 05:53:37 -0400 (0:00:32.866) 0:19:49.765 **********
fatal: [master.example.com]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'fsGroup'\n\nThe error appears to have been in '/home/slave3/workspace/Launch Environment Flexy/private-openshift-ansible/roles/openshift_hosted/tasks/storage/glusterfs.yml': line 24, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Determine registry fsGroup\n ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'dict object' has no attribute 'fsGroup'"}
...
Failure summary:

  1. Hosts:    master.example.com
     Play:     Poll for hosted pod deployments
     Task:     Determine registry fsGroup
     Message:  The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'fsGroup'

               The error appears to have been in '/usr/share/ansible/openshift-ansible/roles/openshift_hosted/tasks/storage/glusterfs.yml': line 24, column 3, but may
               be elsewhere in the file depending on the exact syntax problem.

               The offending line appears to be:

               - name: Determine registry fsGroup
                 ^ here

               exception type: <class 'ansible.errors.AnsibleUndefinedVariable'>
               exception: 'dict object' has no attribute 'fsGroup'
...

Expected results:
Should pass here.

Additional info:
Could you provide your full inventory file as well as the output of "oc get -o yaml" and "oc describe" for one of the registry pods?
Finally had a good look at this... is this consistently reproducible? Does it reproduce in 3.9 or 3.7?
(In reply to Jose A. Rivera from comment #6)
> Finally had a good look at this... is this consistently reproducible? Does
> it reproduce in 3.9 or 3.7?

Tried with openshift-ansible-3.7.44-1.git.0.dbb912c.el7 and openshift-ansible-3.9.27-1.git.0.52e35b5.el7; it does not reproduce. Tried again with openshift-ansible-3.10.0-0.31.0.git.0.9f771c7.el7; it reproduces.
Created attachment 1430335 [details] oc get pods
Created attachment 1430336 [details] oc get -o yaml pod docker-registry-1-5dfcb
Created attachment 1430337 [details] oc describe pod docker-registry-1-5dfcb
Created attachment 1430338 [details] ansible-playbook -i inventory.ini playbooks/deploy_cluster.yml
Created attachment 1430341 [details] inventory.ini
The same problem. See attachments.

# oc version
oc v3.10.0-alpha.0+428152d-936
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://openshift.guillen.io:8443
openshift v3.10.0-alpha.0+428152d-936
kubernetes v1.10.0+b81c8f8
Sorry for the delay, I've been traveling a lot lately. Similarly sorry to ask for more info, but could you try again on 3.9 (which should succeed) and grab the output of "oc get -o yaml" and "oc describe" for one of the registry pods? I currently don't have immediate access to an OCP environment to test myself. Mostly I'm looking to see if 3.9 reports an "fsGroup" field. Scott, any immediate ideas if something has changed in the hosted registry from 3.9 to 3.10 that might explain this?
I was afraid of this. In the 3.9 output, the registry pod has:

  securityContext:
    fsGroup: 1000000000

But in the 3.10 output the securityContext field is blank. I'll try to look into this, hopefully someone else can chime in with more wisdom in the meantime.
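To illustrate why a blank securityContext produces this exact error, here is a minimal Python sketch. It assumes the failing task reads spec.securityContext.fsGroup from the registry pod; the pod dictionaries and both helper names are hypothetical stand-ins modelled on the two outputs compared in this comment, not openshift-ansible code.

```python
# Hypothetical pod fragments: 3.9 reports fsGroup, 3.10 has an empty securityContext.
pod_39 = {"spec": {"securityContext": {"fsGroup": 1000000000}}}
pod_310 = {"spec": {"securityContext": {}}}


def strict_fs_group(pod):
    """Strict lookup: fails when fsGroup is absent, like the undefined-variable error."""
    return pod["spec"]["securityContext"]["fsGroup"]


def tolerant_fs_group(pod, default=None):
    """Tolerant lookup: returns `default` when securityContext or fsGroup is missing."""
    security_context = pod.get("spec", {}).get("securityContext") or {}
    return security_context.get("fsGroup", default)


if __name__ == "__main__":
    print(strict_fs_group(pod_39))
    print(tolerant_fs_group(pod_310))
    try:
        strict_fs_group(pod_310)
    except KeyError as exc:
        print("missing key:", exc)
```

The strict lookup mirrors the failure seen on 3.10, while the tolerant lookup shows one possible way a caller could guard against the missing field. Whether a default is appropriate here depends on why fsGroup is absent in the first place.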
Is the following configured in 3.9 but not 3.10 on the nodes in /etc/origin/node/node-config.yaml?

volumeConfig:
  localQuota:
    perFSGroup

I'm not aware of any changes in the actual registry deployment.
Alexi, Any ideas on why the registry would have fsGroup set in 3.9 but not 3.10?
I don't know. We didn't change it.
Scott, I checked the node-config templates and the volumeConfig stanza is the same between 3.9 and 3.10. Any other ideas?
There were some volumeConfig changes that were recently added to 3.10 to account for changes in Origin. Let's retest with the latest openshift-ansible.

https://github.com/openshift/openshift-ansible/pull/8450

This is included in openshift-ansible-3.10.0-0.51.0 and later.
Verified with version openshift-ansible-3.10.0-0.54.0.git.0.537c485.el7; the error no longer appears.
I also tried it and it works as expected with the version openshift-ansible-3.10.0-0.58.0-2-g73079a70f