Created attachment 1347282 [details]
ceph-ansible playbook log

Description of problem:
Deployment of containers with the OSD scenario set to "lvm" fails.

Version-Release number of selected component (if applicable):
[admin@magna003 ceph-ansible]$ rpm -qa | grep ansible
ceph-ansible-3.0.8-1.el7cp.noarch
ansible-2.4.0.0-5.el7.noarch
[admin@magna003 ceph-ansible]$

How reproducible:
2/2

Steps to Reproduce:
1. Created an LV cache volume on the OSD nodes using the commands below.
   a. pvcreate /dev/sdb1 /dev/sdc1
   b. vgcreate data_vg /dev/sdb1 /dev/sdc1
   c. lvcreate -L 400G -n slowdisk data_vg /dev/sdb1
   d. lvcreate -L 100G -n cachedisk data_vg /dev/sdc1
   e. lvcreate -L 2G -n metadisk data_vg /dev/sdc1
   f. lvconvert --type cache-pool /dev/data_vg/cachedisk --poolmetadata /dev/data_vg/metadisk
   g. lvconvert --type cache data_vg/slowdisk --cachepool data_vg/cachedisk
2. In osds.yml, set osd_scenario to "lvm".
3. The container deployment fails.

P.S. The /dev/sdd1 partition was used for the journal.

TASK [ceph-defaults : resolve device link(s)] *********************************************************
fatal: [magna030]: FAILED! => {"failed": true, "msg": "'devices' is undefined"}

Actual results:
The deployment fails.

Expected results:
The deployment should succeed.

Additional info:
[ubuntu@magna003 ceph-ansible]$ cat group_vars/osds.yml | egrep -v ^# | grep -v ^$
---
dummy:
osd_scenario: lvm #"{{ 'collocated' if journal_collocation or dmcrytpt_journal_collocation else 'non-collocated' if raw_multi_journal or dmcrypt_dedicated_journal else 'dummy' }}" # backward compatibility with stable-2.2, will disappear in stable 3.1
lvm_volumes:
  - data: slowdisk
    journal: /dev/sdd1
    data_vg: data_vg

=====

[ubuntu@magna003 ceph-ansible]$ cat group_vars/all.yml | egrep -v ^# | grep -v ^$
---
dummy:
fetch_directory: ~/ceph-ansible-keys
ceph_origin: distro
ceph_repository: rhcs
monitor_interface: eno1
public_network: 10.8.128.0/21
ceph_docker_image: "rhcs"
ceph_docker_image_tag: "ceph-3.0-rhel-7-docker-candidate-82532-20171102231218"
ceph_docker_registry: "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888"
containerized_deployment: True #"{{ True if mon_containerized_deployment or osd_containerized_deployment or mds_containerized_deployment or rgw_containerized_deployment else False }}" # backward compatibility with stable-2.2, will disappear in stable 3.1
[ubuntu@magna003 ceph-ansible]$

=====

[ubuntu@magna003 ceph-ansible]$ cat /etc/ansible/hosts
[mons]
magna030
magna056
magna084

[osds]
magna030
magna056
magna084

[mgrs]
magna030
magna056
magna084
[ubuntu@magna003 ceph-ansible]$
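For clarity, here is a minimal sketch of the two variable shapes involved (variable names follow the ceph-ansible 3.x group_vars shown above): the lvm scenario relies on lvm_volumes and deliberately leaves 'devices' undefined, which is what the ceph-defaults fact task trips over.

# Minimal sketch, assuming ceph-ansible 3.x variable names (see group_vars above).
# osd_scenario: lvm consumes pre-created LVs via lvm_volumes and defines no 'devices' list.
osd_scenario: lvm
lvm_volumes:
  - data: slowdisk        # logical volume holding the OSD data (the cached LV created above)
    data_vg: data_vg      # volume group containing that LV
    journal: /dev/sdd1    # journal on a raw partition

# By contrast, the collocated/non-collocated scenarios define 'devices', which is the
# variable the "resolve device link(s)" task in ceph-defaults expects to be present:
# osd_scenario: collocated
# devices:
#   - /dev/sdb
#   - /dev/sdc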
Not sure I can help with the container portion of this. I can assist with the ceph-volume implementation though.
Hi, there are tasks that set some facts in the ceph-defaults role which should be skipped when using osd_scenario: lvm. It's fixed upstream; we are waiting for the CI to merge the commit into master.
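As an illustration only (a hypothetical sketch, not the actual upstream commit), the fix amounts to guarding the ceph-defaults fact-setting tasks so they are skipped for the lvm scenario, e.g.:

# Hypothetical sketch of the guard; the real upstream patch may differ in detail.
- name: resolve device link(s)
  command: readlink -f {{ item }}
  changed_when: false
  register: devices_canonicalized            # hypothetical register name
  with_items: "{{ devices | default([]) }}"
  when:
    - osd_scenario != 'lvm'                  # lvm uses lvm_volumes, not devices
    - devices is defined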
In any case, even once the error mentioned here is fixed, you are going to hit another issue (such as no OSDs coming up): as far as I know, the lvm scenario is not yet supposed to work with containerized deployments.
Since RHOSP 13 now depends on RHCS 3.0z2, can we get this support into that z-stream build? Background: we need ceph-volume support in Ceph containers, particularly for bluestore. Bluestore is critical to RHCS performance improvements, and ceph-volume appears to be critical for deploying bluestore, as explained in the upstream documentation. http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/ This is particularly true for increasingly common all-flash configurations, where you need multiple OSDs per NVMe device (or lose >= 40% of throughput). Only osd_scenario=lvm allows this functionality today. https://mojo.redhat.com/groups/product-performance-scale-community-of-practice/blog/2018/01/31/bluestore-on-all-nvm-rados-bench
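To make the use case concrete, here is a hypothetical group_vars sketch (LV and VG names invented for illustration) of carving two bluestore OSDs out of a single NVMe device with the lvm scenario:

# Hypothetical example only: two bluestore OSDs on one NVMe device.
# The LVs nvme_vg/osd0 and nvme_vg/osd1 would be created beforehand with lvcreate.
osd_objectstore: bluestore
osd_scenario: lvm
lvm_volumes:
  - data: osd0
    data_vg: nvme_vg
  - data: osd1
    data_vg: nvme_vg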
Hi John, this patch has been backported to 3.0 (available since 3.0.11) and to 3.1 (since v3.1.0beta2). Note that these backports do not add ceph-volume support in containers.
Guillaume, what additional changes in ceph-ansible and ceph-container are needed for this BZ?
Moving this bug to the VERIFIED state. In containers, the lvm osd_scenario works; verified with build ceph-ansible-3.2.0-0.1.rc3.el7cp.noarch.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0020