1657172 – Failures in ceph-ansible while adding new ceph-volume based OSD's to existing migrated cluster (ceph-disk to ceph-volume migrated cluster)

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	Ceph-Ansible
Sub Component:
Version:	3.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	rc
Target Release:	3.2
Assignee:	Sébastien Han
QA Contact:	ceph-qe-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-12-07 10:37 UTC by Ramakrishnan Periyasamy
Modified:	2018-12-10 08:44 UTC
CC List:	6 users
Fixed In Version:
Doc Type:
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-12-07 11:09:21 UTC
Embargoed:
Dependent Products:

Attachments
ansible logs (1.20 MB, text/plain) 2018-12-07 10:37 UTC, Ramakrishnan Periyasamy

Description Ramakrishnan Periyasamy 2018-12-07 10:37:01 UTC

Created attachment 1512430
ansible logs

Description of problem:
ceph-ansible fails when adding new ceph-volume based OSDs to an existing cluster whose OSDs were migrated from ceph-disk to ceph-volume. The cluster runs on bare-metal RHEL.

Playbook command: "ansible-playbook site.yml --limit osds -vvv". The playbook ran to completion, but the summary reported failures on the old OSD nodes:

2018-12-07 14:51:59,207 p=54053 u=ubuntu |  failed: [cephqe-node3] (item=/dev/sdb) => {
    "changed": false,
    "cmd": [
        "ceph-disk",
        "activate",
        "--dmcrypt",
        "/dev/sdb1"
    ],
    "delta": "0:00:00.103753",
    "end": "2018-12-07 14:51:59.188476",
    "invocation": {
        "module_args": {
            "_raw_params": "ceph-disk activate --dmcrypt \"/dev/sdb1\"",
            "_uses_shell": false,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "warn": true
        }
    },
    "item": "/dev/sdb",
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2018-12-07 14:51:59.084723",
    "stderr": "get_dmcrypt_key: no `ceph_fsid` found falling back to 'ceph' for cluster name\nceph-disk: Error: unknown key-management-mode None",
    "stderr_lines": [
        "get_dmcrypt_key: no `ceph_fsid` found falling back to 'ceph' for cluster name",
        "ceph-disk: Error: unknown key-management-mode None"
    ],
    "stdout": "",
    "stdout_lines": []
}
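
The key fields of the failed task result above can be pulled out programmatically when triaging a large log. The following is a minimal sketch, assuming the result has been copied out of the log as JSON; `RESULT` is a trimmed copy of the output above and `summarize_failure` is a hypothetical helper, not part of Ansible or ceph-ansible.

```python
import json

# Trimmed copy of the failed-task result from the log above.
RESULT = json.loads(r'''
{
    "changed": false,
    "cmd": ["ceph-disk", "activate", "--dmcrypt", "/dev/sdb1"],
    "item": "/dev/sdb",
    "msg": "non-zero return code",
    "rc": 1,
    "stderr_lines": [
        "get_dmcrypt_key: no `ceph_fsid` found falling back to 'ceph' for cluster name",
        "ceph-disk: Error: unknown key-management-mode None"
    ]
}
''')


def summarize_failure(result):
    """Return (tool, item, rc, last stderr line) for one failed task result.

    Hypothetical helper for illustration only.
    """
    tool = result["cmd"][0]
    stderr_lines = result.get("stderr_lines") or [""]
    return tool, result["item"], result["rc"], stderr_lines[-1]


print(summarize_failure(RESULT))
```

The summary shows that the failing tool is ceph-disk, not ceph-volume, which matches the "Actual results" reported below.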

The cluster was initially created with ceph-disk, filestore-based OSDs using the OSD scenarios below:
[osds]
cephqe-node3 dedicated_devices="['/dev/sdd','/dev/nvme0n1']" devices="['/dev/sdb','/dev/sdc']" osd_scenario="non-collocated" dmcrypt="true"
cephqe-node4 devices="['/dev/sdb','/dev/sdc','/dev/nvme0n1']" osd_scenario="collocated" dmcrypt="true"
cephqe-node5 osd_auto_discovery='true' osd_scenario="collocated"

After cluster creation, some I/O was started and the OSDs were converted from ceph-disk to ceph-volume using the "ceph-volume simple scan" and "ceph-volume simple activate" commands.
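
For reference, the two migration steps for a single OSD can be sketched as below. This is a hedged illustration, not the exact commands run in this report: the data partition, OSD id, and OSD fsid are hypothetical placeholders, `migration_commands` is an illustrative helper, and nothing is executed. "ceph-volume simple scan" records the OSD's metadata as a JSON file under /etc/ceph/osd/, and "ceph-volume simple activate" then starts the OSD from that file.

```python
def migration_commands(data_partition, osd_id, osd_fsid):
    """Build the argv lists for scanning and activating one ceph-disk OSD.

    Hypothetical helper: it only assembles the commands, it does not run them.
    """
    scan = ["ceph-volume", "simple", "scan", data_partition]
    activate = ["ceph-volume", "simple", "activate", str(osd_id), osd_fsid]
    return [scan, activate]


# Placeholder values; the real id and fsid come from the scanned OSD.
for argv in migration_commands(
    "/dev/sdb1", 0, "00000000-0000-0000-0000-000000000000"
):
    print(" ".join(argv))
```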

More ceph-volume based OSDs were then added to the existing cluster, with the Ansible inventory file changed as below:
[osds]
cephqe-node3 dedicated_devices="['/dev/sdd','/dev/nvme0n1']" devices="['/dev/sdb','/dev/sdc']" osd_scenario="non-collocated" dmcrypt="true"
cephqe-node4 devices="['/dev/sdb','/dev/sdc','/dev/nvme0n1']" osd_scenario="collocated" dmcrypt="true"
cephqe-node5 osd_auto_discovery='true' osd_scenario="collocated"
cephqe-node6 osd_scenario="lvm" lvm_volumes="[{'data':'data_lv1','data_vg':'data_vg','journal':'journal_lv1','journal_vg':'journal_vg'},{'data':'data_lv2','data_vg':'data_vg','journal':'journal_lv2','journal_vg':'journal_vg'},{'data':'data_lv3','data_vg':'data_vg','journal':'journal_lv3','journal_vg':'journal_vg'}]" osd_objectstore="filestore" dmcrypt="True"
cephqe-node8 osd_scenario="lvm" lvm_volumes="[{'data':'/dev/sdb','journal':'journal_lv1','journal_vg':'journal_vg'},{'data':'/dev/sdc','journal':'journal_lv2','journal_vg':'journal_vg'},{'data':'/dev/nvme0n1','journal':'journal_lv3','journal_vg':'journal_vg'}]" osd_objectstore="filestore" dmcrypt="True"
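
The inventory above still declares the migrated nodes with the "collocated" and "non-collocated" scenarios. In ceph-ansible 3.2 those scenarios are driven by ceph-disk, while "lvm" is driven by ceph-volume, which appears consistent with the failing "ceph-disk activate" task on cephqe-node3. The sketch below lists the hosts that would still take the ceph-disk path. It assumes a trimmed copy of the inventory (the long device and lvm_volumes variables are dropped for brevity), and the helper names are hypothetical.

```python
import shlex

# Trimmed copy of the [osds] section above.
INVENTORY = """\
[osds]
cephqe-node3 devices="['/dev/sdb','/dev/sdc']" osd_scenario="non-collocated" dmcrypt="true"
cephqe-node4 devices="['/dev/sdb','/dev/sdc','/dev/nvme0n1']" osd_scenario="collocated" dmcrypt="true"
cephqe-node5 osd_auto_discovery='true' osd_scenario="collocated"
cephqe-node6 osd_scenario="lvm" osd_objectstore="filestore" dmcrypt="True"
cephqe-node8 osd_scenario="lvm" osd_objectstore="filestore" dmcrypt="True"
"""

# Scenarios that ceph-ansible 3.2 handles with ceph-disk.
CEPH_DISK_SCENARIOS = {"collocated", "non-collocated"}


def scenario_by_host(inventory_text):
    """Map each host to its osd_scenario host variable (None if unset)."""
    hosts = {}
    for line in inventory_text.splitlines():
        tokens = shlex.split(line)
        # Skip blank lines and group headers such as [osds].
        if not tokens or tokens[0].startswith("["):
            continue
        host_vars = dict(token.split("=", 1) for token in tokens[1:])
        hosts[tokens[0]] = host_vars.get("osd_scenario")
    return hosts


def ceph_disk_hosts(inventory_text):
    """Return the hosts whose scenario still relies on ceph-disk."""
    scenarios = scenario_by_host(inventory_text)
    return sorted(h for h, s in scenarios.items() if s in CEPH_DISK_SCENARIOS)


print(ceph_disk_hosts(INVENTORY))
```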

Version-Release number of selected component (if applicable):
ansible-2.6.10-1.el7ae.noarch
ceph-ansible-3.2.0-0.1.rc8.el7cp.noarch
ceph version 12.2.8-50.el7cp (e6f2204cc48f4018ab290488d75a262827cfb872) luminous (stable)

How reproducible:
1/1

Steps to Reproduce:
1. Steps are in the description.

Actual results:
The Ansible playbook reports ceph-disk related failures.

Expected results:
There should not be any failures.

Additional info:
The Ansible inventory needs to be changed for the migrated OSDs before adding new OSDs to the cluster, or this case needs to be handled in the code.

Comment 4 seb 2018-12-07 11:09:21 UTC

At this point, what you're trying to achieve is not expected to work. The playbook will fail at this task. This work is targeted for 4.0 (full transition from ceph-disk to ceph-volume), as per https://bugzilla.redhat.com/show_bug.cgi?id=1656468.
