Bug 1564214 - [ceph-ansible] : osd scenario -lvm : playbook failing when initiated second time
Status: ON_QA
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 3.2
Assigned To: leseb
QA Contact: ceph-qe-bugs
Docs Contact: Erin Donnelly
Depends On:
Blocks: 1557269
Reported: 2018-04-05 12:52 EDT by Vasishta
Modified: 2018-11-01 15:28 EDT
CC: 13 users

See Also:
Fixed In Version: RHEL: ceph-ansible-3.2.0-0.1.rc1.el7cp Ubuntu: ceph-ansible_3.2.0~rc1-2redhat1
Doc Type: Known Issue
Doc Text:
.It is not possible to expand a cluster using the `osd_scenario: lvm` option
`ceph-ansible` is not idempotent when deploying OSDs with `ceph-volume` and the `lvm_volumes` configuration option. Therefore, a cluster deployed with the `lvm` `osd_scenario` option cannot be expanded directly. To work around this issue, remove the existing OSDs from the `lvm_volumes` configuration option so that the playbook does not try to recreate them when deploying the new OSDs. Cluster expansion then succeeds as expected and creates the new OSDs.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
File contains contents of ansible-playbook log (873.44 KB, text/plain)
2018-04-05 12:52 EDT, Vasishta
File contains contents ansible-playbook log (476.98 KB, text/plain)
2018-04-09 00:53 EDT, Vasishta


External Trackers
Tracker ID Priority Status Summary Last Updated
Ceph Project Bug Tracker 23140 None None None 2018-04-05 16:29 EDT

Description Vasishta 2018-04-05 12:52:13 EDT
Created attachment 1417857 [details]
File contains contents of ansible-playbook log

Description of problem:
When the playbook was rerun to add nodes, the task 'create filestore osds with dedicated journal' failed because it tried to create OSDs on logical volumes and disk partitions that are already in use by existing OSDs.

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.28-1.el7cp.noarch

How reproducible:
Always (1/1)

Steps to Reproduce:
1. Configure ceph-ansible to deploy a Ceph cluster with at least one OSD that uses a logical volume for data and a disk partition for the journal
2. Once the cluster is up, rerun the playbook

Actual results:
TASK [ceph-osd : use ceph-volume to create filestore osds with dedicated journals] tries to create an OSD on a logical volume and a disk partition that are already in use by another OSD

Expected results:
The task should be skipped for devices that already host OSDs
Comment 3 Andrew Schoen 2018-04-05 15:26:21 EDT
The PRs that fix this have not been backported to the stable-3.0 branch. However, even if they were you could not use a partition or raw device for 'data' and expect the playbook to be idempotent until https://github.com/ceph/ceph/pull/20620 makes it into a release.
Comment 4 Ken Dreyer (Red Hat) 2018-04-05 16:29:08 EDT
That PR 20620 will be in Ceph v12.2.5 upstream.
Comment 5 Vasishta 2018-04-06 02:29:57 EDT
(In reply to Vasishta from comment #0)
 
> Description of problem:
> When playbook was initiated to add nodes, the task 'create filestore osds
> with dedicated journal' failed trying to create OSD on lvs and disk
> partitions which are being used by existing OSDs.
> 

With this issue, users cannot successfully add new nodes to a cluster whose OSDs have data on logical volumes and journals on disk partitions.
Comment 6 Harish NV Rao 2018-04-06 05:09:54 EDT
(In reply to Ken Dreyer (Red Hat) from comment #4)
> That PR 20620 will be in Ceph v12.2.5 upstream.

@Ken, does that mean we will not have the fix for this in z2?

As per comment 5, this bug limits the ability to expand the cluster. Is there a way we can get the fix into z2?
Comment 9 Vasishta 2018-04-09 00:53 EDT
Created attachment 1419115 [details]
File contains contents ansible-playbook log

I am not able to expand the cluster even when both data and journal are on logical volumes.

It fails while running the same task, which, as per my understanding, should have been skipped.

$ cat /usr/share/ceph-ansible/group_vars/osds.yml | egrep -v ^# | grep -v ^$
---
dummy:
copy_admin_key: true
osd_scenario: lvm
lvm_volumes:
   - data: data1
     data_vg: d_vg
     journal: journal1
     journal_vg: j_vg
   - data: data2
     data_vg: d_vg
     journal: journal2
     journal_vg: j_vg
   - data: data3
     data_vg: d_vg
     journal: journal3
     journal_vg: j_vg
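
The documented workaround is to drop already-deployed OSDs from `lvm_volumes` before rerunning the playbook, so only the new devices are listed. A minimal sketch of what the `osds.yml` above might look like when adding one more OSD; the `data4`/`journal4` names are illustrative assumptions, not taken from this report:

```yaml
---
dummy:
copy_admin_key: true
osd_scenario: lvm
# Existing OSDs (data1-3 / journal1-3) removed from the list so the
# playbook does not try to recreate them on the rerun.
lvm_volumes:
   - data: data4        # hypothetical new data logical volume
     data_vg: d_vg
     journal: journal4  # hypothetical new journal logical volume
     journal_vg: j_vg
```

With this configuration the rerun only touches the new logical volumes, so cluster expansion succeeds and creates the new OSD.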
Comment 11 leseb 2018-04-19 04:55:37 EDT
I am not sure I fully understand what happened here; Andrew knows the ceph-ansible code and this BZ better than I do.

Andrew, could you please fill out the Doc Text field for me?
Thanks
Comment 12 Vasu Kulkarni 2018-10-30 14:36:25 EDT
We have to make rerunning the playbook idempotent here; we rely on that for other add/remove operations.
