Description of problem:

A customer is testing OSD migration from filestore to bluestore following the documentation:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/framework_for_upgrades_13_to_16.1/osd-migration-from-filestore-to-bluestore

However, they observe that their OSDs are still using filestore after the command to trigger the migration completes successfully.

~~~
[stack@undercloud-0 ~]$ openstack overcloud external-upgrade run --tags ceph_fstobs -e ceph_ansible_limit=ceph-0 | tee oc-fstobs.log
...
Success
~~~

~~~
[root@controller-0 ~]# podman exec -it ceph-mon-controller-0 sh -c "ceph -f json osd metadata" | jq -c '.[] | select(.hostname == "ceph-0") | ["host", .hostname, "osd_id", .id, "objectstore", .osd_objectstore]'
["host","ceph-0","osd_id",0,"objectstore","filestore"]
["host","ceph-0","osd_id",1,"objectstore","filestore"]
...
~~~

Version-Release number of selected component (if applicable):
ansible-role-tripleo-modify-image-1.2.1-0.20200804085623.1dffa21.el8ost.noarch
ansible-tripleo-ipa-0.2.1-1.20200813093411.3bb3c53.el8ost.noarch
ansible-tripleo-ipsec-9.2.1-0.20200311073016.0c8693c.el8ost.noarch
openstack-tripleo-common-11.4.1-1.20200914165651.el8ost.noarch
openstack-tripleo-common-containers-11.4.1-1.20200914165651.el8ost.noarch
openstack-tripleo-heat-templates-11.3.2-1.20200914170156.el8ost.noarch
openstack-tripleo-image-elements-10.6.2-0.20200528043425.7dc0fa1.el8ost.noarch
openstack-tripleo-puppet-elements-11.2.2-0.20200701163410.432518a.el8ost.noarch
openstack-tripleo-validations-11.3.2-1.20200914170825.el8ost.noarch
puppet-tripleo-11.5.0-1.20200914161840.f716ef5.el8ost.noarch
python3-tripleoclient-12.3.2-1.20200914164928.el8ost.noarch
python3-tripleoclient-heat-installer-12.3.2-1.20200914164928.el8ost.noarch
python3-tripleo-common-11.4.1-1.20200914165651.el8ost.noarch
tripleo-ansible-0.5.1-1.20200914163925.el8ost.noarch
ceph-ansible-4.0.31-1.el8cp.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy OSP16.1 + OCS4 with filestore
2. Follow the documentation to migrate the OSDs from filestore to bluestore

Actual results:
OSDs keep using filestore even after successful command execution

Expected results:
OSDs use bluestore after successful command execution

Additional info:
It turned out that filestore-to-bluestore.yaml skipped the migration steps.

/var/lib/mistral/4eacc9bf-622b-43cc-9301-c8c1f6e328b6/ceph-ansible/ceph_ansible_command.log
~~~
Running /var/lib/mistral/4eacc9bf-622b-43cc-9301-c8c1f6e328b6/ceph-ansible/ceph_ansible_command.sh
...
2020-11-27 11:42:08,293 p=143980 u=root n=ansible | ok: [ceph-0] => {"ansible_facts": {"current_objectstore": "bluestore"}, "changed": false}
2020-11-27 11:42:08,343 p=143980 u=root n=ansible | TASK [warn user about osd already using bluestore] *****************************
2020-11-27 11:42:08,343 p=143980 u=root n=ansible | Friday 27 November 2020 11:42:08 +0900 (0:00:00.075)       0:00:05.484 *******
2020-11-27 11:42:08,368 p=143980 u=root n=ansible | ok: [ceph-0] => {
    "msg": "WARNING: ceph-0 is already using bluestore. Skipping all tasks."
}
...
~~~

This is because the playbook has logic to skip the migration when osd_objectstore is "bluestore":
https://github.com/ceph/ceph-ansible/tree/stable-4.0/infrastructure-playbooks/filestore-to-bluestore.yml

~~~
- hosts: "{{ osd_group_name }}"
  become: true
  serial: 1
  vars:
    delegate_facts_host: true
  tasks:
    - name: gather and delegate facts
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: True
      with_items: "{{ groups[mon_group_name] }}"
      run_once: true
      when: delegate_facts_host | bool

    - import_role:
        name: ceph-defaults

    - name: set_fact current_objectstore
      set_fact:
        current_objectstore: '{{ osd_objectstore }}'

    - name: warn user about osd already using bluestore
      debug:
        msg: 'WARNING: {{ inventory_hostname }} is already using bluestore. Skipping all tasks.'
      when: current_objectstore == 'bluestore'
~~~

Our current documentation says that we should set "osd_objectstore: bluestore" in CephAnsibleDisksConfig, and this is what causes the issue. Even if we remove that line, bluestore seems to be the default value in ceph-ansible, so I'm afraid removing it alone does not solve the issue.
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/framework_for_upgrades_13_to_16.1/osd-migration-from-filestore-to-bluestore#migrating-OSDs-from-FileStore-to-BlueStore
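To make the failure mode explicit: the playbook's `current_objectstore` fact is set from the *configured* `osd_objectstore` variable, not from what the OSDs actually run on disk. The sketch below (plain Python, not ceph-ansible code) models that skip condition and shows why the documented setting short-circuits the migration:

```python
# Sketch of the skip decision in filestore-to-bluestore.yml (illustrative,
# not ceph-ansible code). The playbook reads the configured variable, not
# the real on-disk objectstore, so the OSD's actual state never matters.

def should_skip_migration(configured_objectstore: str) -> bool:
    """Equivalent of the playbook's: when: current_objectstore == 'bluestore'"""
    return configured_objectstore == "bluestore"

# The docs tell users to set "osd_objectstore: bluestore" before migrating,
# so every host is skipped even though its OSDs still run filestore:
assert should_skip_migration("bluestore") is True

# The migration tasks only run when the variable still says "filestore":
assert should_skip_migration("filestore") is False
```

This is why the ansible run reports "Success" while changing nothing: every migration task is gated on a variable that the documented configuration forces to "bluestore".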
I don't think this is a documentation BZ. This sounds like an engineering BZ.
(In reply to Dan Macpherson from comment #2)
> I don't think this is a documentation BZ. This sounds like an engineering BZ.

One possible solution without a code change would be to set "osd_objectstore: filestore" before running the migration command and reset the parameter to bluestore after the migration completes. We can use "openstack overcloud deploy --stack-only" to change the parameter without triggering the actual deployment steps.

However, I tend to agree with you that this is an engineering BZ, and the above parameter settings should ideally be handled in tripleo. Do you want me to change the assigned component to tripleo-heat-templates (or a different package if we have a better one), or can we get some insights from the Ceph squad before moving this BZ?
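For illustration, the workaround could look roughly like this (a sketch only: the environment file name is hypothetical, the "..." stands for the deployment's existing environment files, and the exact parameter layout should be checked against the deployed templates):

~~~
# objectstore-override.yaml -- hypothetical environment file
parameter_defaults:
  CephAnsibleDisksConfig:
    osd_objectstore: filestore
~~~

~~~
# Update the stack parameters without running the deployment steps
$ openstack overcloud deploy --stack-only --templates \
    -e ... \
    -e objectstore-override.yaml

# Run the migration with the corrected parameter
$ openstack overcloud external-upgrade run --tags ceph_fstobs -e ceph_ansible_limit=ceph-0

# Afterwards, set osd_objectstore back to bluestore in the environment
# file and re-run "openstack overcloud deploy --stack-only" once more
~~~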
Thanks Takashi

Hi @Francesco. Could you take a look and advise how best to address this BZ, please? I agree with Dan that it seems more like an engineering BZ. Thanks :)
Thanks for the updated note and clear explanation Takashi. Here is the update: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/framework_for_upgrades_13_to_16.1/index?lb_target=production#OSD-migration-from-filestore-to-bluestore
Thanks Naomi,

The updated version looks good to me!