Bug 1528688

Summary: Unable to deploy CNS if storage disks are initialized
Product: OpenShift Container Platform
Reporter: raffaele spazzoli <rspazzol>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED DUPLICATE
QA Contact: Wenkai Shi <weshi>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.7.1
CC: aos-bugs, jialiu, jokerman, mmccomas
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2017-12-26 06:21:11 UTC
Type: Bug

Description raffaele spazzoli 2017-12-22 19:01:06 UTC
If the storage disks are already initialized (for example, by a previously aborted install), heketi does not overwrite them unless the -ff option is used.
It looks like that option can be controlled from the Ansible installer by setting the following:

openshift_storage_glusterfs_wipe: true

When I set that value, it fails with the following:

TASK [openshift_storage_glusterfs : Unlabel any existing GlusterFS nodes] *****************************************************************************************************************************************
task path: /tmp/git/casl-ansible/openshift-ansible/roles/openshift_storage_glusterfs/tasks/glusterfs_deploy.yml:19
Friday 22 December 2017  18:55:58 +0000 (0:00:02.835)       0:04:49.891 ******* 
changed: [master-1.env1.casl.raffa.com] => (item=infranode-0.env1.casl.raffa.com) => {"changed": true, "failed": false, "item": "infranode-0.env1.casl.raffa.com", "results": {"cmd": "/bin/oc label node infranode-0.env1.casl.raffa.com region-", "results": {}, "returncode": 0}, "state": "absent"}
ok: [master-1.env1.casl.raffa.com] => (item=master-1.env1.casl.raffa.com) => {"changed": false, "failed": false, "item": "master-1.env1.casl.raffa.com", "state": "absent"}
ok: [master-1.env1.casl.raffa.com] => (item=master-2.env1.casl.raffa.com) => {"changed": false, "failed": false, "item": "master-2.env1.casl.raffa.com", "state": "absent"}
ok: [master-1.env1.casl.raffa.com] => (item=master-0.env1.casl.raffa.com) => {"changed": false, "failed": false, "item": "master-0.env1.casl.raffa.com", "state": "absent"}
fatal: [master-1.env1.casl.raffa.com]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'openshift'\n\nThe error appears to have been in '/tmp/git/casl-ansible/openshift-ansible/roles/openshift_storage_glusterfs/tasks/glusterfs_deploy.yml': line 19, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Unlabel any existing GlusterFS nodes\n  ^ here\n\nexception type: <class 'ansible.errors.AnsibleUndefinedVariable'>\nexception: 'dict object' has no attribute 'openshift'"}

I think the problem is related to the following piece of code:

- name: Unlabel any existing GlusterFS nodes
  oc_label:
    name: "{{ hostvars[item].openshift.node.nodename }}"
    kind: node
    state: absent
    labels: "{{ glusterfs_nodeselector | lib_utils_oo_dict_to_list_of_dict }}"
  with_items: "{{ groups.all }}"
  when: "'openshift' in hostvars[item] and glusterfs_wipe"

An inventory file can contain hosts that are not nodes, so this expression may fail for those hosts:
{{ hostvars[item].openshift.node.nodename }}
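
To check which inventory hosts are affected, a diagnostic task along these lines could be run first. This is only a sketch: the task name and message are made up for illustration, it reuses the same membership test as the `when` clause quoted above, and it has not been run against this environment.

```yaml
# Hypothetical diagnostic task, not part of openshift-ansible.
# Prints every inventory host whose hostvars carry no 'openshift' key,
# i.e. the hosts for which hostvars[item].openshift.node.nodename
# cannot be resolved.
- name: Report inventory hosts without an 'openshift' entry in hostvars
  debug:
    msg: "{{ item }} has no 'openshift' entry in hostvars"
  with_items: "{{ groups.all }}"
  when: "'openshift' not in hostvars[item]"
```

Any host listed by this task is in the inventory but has not had the openshift facts set on it.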

I think the code should iterate only over the nodes:
with_items: "{{ groups.nodes }}"
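
Applied to the task quoted earlier, the suggested change would look roughly like this. It is an untested sketch that assumes the inventory defines a `nodes` group; everything except the `with_items` line is copied unchanged from the task above, and the fix tracked in the duplicate bug may differ.

```yaml
- name: Unlabel any existing GlusterFS nodes
  oc_label:
    name: "{{ hostvars[item].openshift.node.nodename }}"
    kind: node
    state: absent
    labels: "{{ glusterfs_nodeselector | lib_utils_oo_dict_to_list_of_dict }}"
  # Changed: loop over the nodes group only instead of groups.all,
  # so hosts that are not OpenShift nodes are never looked up.
  with_items: "{{ groups.nodes }}"
  when: "'openshift' in hostvars[item] and glusterfs_wipe"
```

The existing `when` guard is kept so that the only behavioral difference is the set of hosts the loop visits.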

Comment 1 Wenkai Shi 2017-12-26 03:09:24 UTC
According to the documentation [1], these block devices must be bare (e.g. have no data and not be marked as LVM PVs), and they will be formatted.

[1]. https://github.com/openshift/openshift-ansible/blob/master/inventory/hosts.glusterfs.native.example#L45-L46

However, the attached log suggests that the task code itself has a problem.

Comment 2 Wenkai Shi 2017-12-26 06:21:11 UTC

*** This bug has been marked as a duplicate of bug 1523681 ***