Bug 1902153 - OSD migration from filestore to bluestore is not invoked properly
Summary: OSD migration from filestore to bluestore is not invoked properly
Keywords:
Status: CLOSED DUPLICATE of bug 1875777
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: ndeevy
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-27 05:53 UTC by Takashi Kajinami
Modified: 2024-03-25 17:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-02 09:02:09 UTC
Target Upstream Version:
Embargoed:


Description Takashi Kajinami 2020-11-27 05:53:47 UTC
Description of problem:

A customer is testing OSD migration from filestore to bluestore by following the documentation:
 https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/framework_for_upgrades_13_to_16.1/osd-migration-from-filestore-to-bluestore

However, they observe that their OSDs are still using filestore after the command that triggers the migration completes successfully.

~~~
[stack@undercloud-0 ~]$ openstack overcloud external-upgrade run --tags ceph_fstobs -e ceph_ansible_limit=ceph-0| tee oc-fstobs.log
...
Success
~~~

~~~
[root@controller-0 ~]# podman exec -it ceph-mon-controller-0 sh -c "ceph -f json osd metadata" | jq -c '.[] | select(.hostname == "ceph-0") | ["host", .hostname, "osd_id", .id, "objectstore", .osd_objectstore]'
["host","ceph-0","osd_id",0,"objectstore","filestore"]
["host","ceph-0","osd_id",1,"objectstore","filestore"]
...
~~~
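
For scripted checks, the jq filter above can also be expressed in Python. This is a minimal sketch, not part of the original report: it assumes the JSON from `ceph -f json osd metadata` is a list of objects carrying `id`, `hostname`, and `osd_objectstore` keys (as the jq filter implies). The function name and the `ceph-1` sample entry are hypothetical.

```python
import json
from typing import Dict


def objectstore_by_osd(metadata_json: str, hostname: str) -> Dict[int, str]:
    """Map OSD id -> objectstore backend for the OSDs on one host."""
    entries = json.loads(metadata_json)
    return {
        entry["id"]: entry["osd_objectstore"]
        for entry in entries
        if entry.get("hostname") == hostname
    }


# Sample shaped like the output in this report; all other metadata fields
# are omitted and the ceph-1 entry is made up for illustration.
sample = json.dumps([
    {"id": 0, "hostname": "ceph-0", "osd_objectstore": "filestore"},
    {"id": 1, "hostname": "ceph-0", "osd_objectstore": "filestore"},
    {"id": 2, "hostname": "ceph-1", "osd_objectstore": "bluestore"},
])

result = objectstore_by_osd(sample, "ceph-0")
# OSDs on the host that have not been migrated yet.
not_migrated = sorted(osd for osd, store in result.items() if store != "bluestore")
print(not_migrated)  # prints [0, 1]
```

An empty `not_migrated` list for a host would indicate that every OSD reported for it is on bluestore.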

Version-Release number of selected component (if applicable):


ansible-role-tripleo-modify-image-1.2.1-0.20200804085623.1dffa21.el8ost.noarch
ansible-tripleo-ipa-0.2.1-1.20200813093411.3bb3c53.el8ost.noarch
ansible-tripleo-ipsec-9.2.1-0.20200311073016.0c8693c.el8ost.noarch
openstack-tripleo-common-11.4.1-1.20200914165651.el8ost.noarch
openstack-tripleo-common-containers-11.4.1-1.20200914165651.el8ost.noarch
openstack-tripleo-heat-templates-11.3.2-1.20200914170156.el8ost.noarch
openstack-tripleo-image-elements-10.6.2-0.20200528043425.7dc0fa1.el8ost.noarch
openstack-tripleo-puppet-elements-11.2.2-0.20200701163410.432518a.el8ost.noarch
openstack-tripleo-validations-11.3.2-1.20200914170825.el8ost.noarch
puppet-tripleo-11.5.0-1.20200914161840.f716ef5.el8ost.noarch
python3-tripleoclient-12.3.2-1.20200914164928.el8ost.noarch
python3-tripleoclient-heat-installer-12.3.2-1.20200914164928.el8ost.noarch
python3-tripleo-common-11.4.1-1.20200914165651.el8ost.noarch
tripleo-ansible-0.5.1-1.20200914163925.el8ost.noarch

ceph-ansible-4.0.31-1.el8cp.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy OSP16.1 + OCS4 with filestore
2. Follow the documentation to migrate osd from filestore to bluestore

Actual results:
OSD keeps using filestore even after successful command execution

Expected results:
OSD uses bluestore after successful command execution

Additional info:

Comment 1 Takashi Kajinami 2020-11-27 05:57:27 UTC
It turned out that filestore-to-bluestore.yml skipped the migration steps.

/var/lib/mistral/4eacc9bf-622b-43cc-9301-c8c1f6e328b6/ceph-ansible/ceph_ansible_command.log
~~~
Running /var/lib/mistral/4eacc9bf-622b-43cc-9301-c8c1f6e328b6/ceph-ansible/ceph_ansible_command.sh
...
2020-11-27 11:42:08,293 p=143980 u=root n=ansible | ok: [ceph-0] => {"ansible_facts": {"current_objectstore": "bluestore"}, "changed": false}
2020-11-27 11:42:08,343 p=143980 u=root n=ansible | TASK [warn user about osd already using bluestore] *****************************
2020-11-27 11:42:08,343 p=143980 u=root n=ansible | Friday 27 November 2020  11:42:08 +0900 (0:00:00.075)       0:00:05.484 ******* 
2020-11-27 11:42:08,368 p=143980 u=root n=ansible | ok: [ceph-0] => {
    "msg": "WARNING: ceph-0 is already using bluestore. Skipping all tasks."
}
...
~~~

This is because the playbook has logic to skip the migration when osd_objectstore is "bluestore":

https://github.com/ceph/ceph-ansible/tree/stable-4.0/infrastructure-playbooks/filestore-to-bluestore.yml
~~~
- hosts: "{{ osd_group_name }}"
  become: true
  serial: 1
  vars:
    delegate_facts_host: true
  tasks:
    - name: gather and delegate facts
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: True
      with_items: "{{ groups[mon_group_name] }}"
      run_once: true
      when: delegate_facts_host | bool

    - import_role:
        name: ceph-defaults

    - name: set_fact current_objectstore
      set_fact:
        current_objectstore: '{{ osd_objectstore }}'

    - name: warn user about osd already using bluestore
      debug:
        msg: 'WARNING: {{ inventory_hostname }} is already using bluestore. Skipping all tasks.'
      when: current_objectstore == 'bluestore'
~~~

Our current documentation says to set "osd_objectstore: bluestore" in
CephAnsibleDiskConfig, and this causes the issue.

Even if we remove that line, bluestore appears to be the default value in ceph-ansible,
so I'm afraid that would not solve the issue.

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/framework_for_upgrades_13_to_16.1/osd-migration-from-filestore-to-bluestore#migrating-OSDs-from-FileStore-to-BlueStore
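
The failure mode described in this comment can be modelled in a few lines. This is an illustrative sketch only, not ceph-ansible code: it mirrors the `when: current_objectstore == 'bluestore'` condition from the playbook excerpt, where `current_objectstore` is copied from the configured `osd_objectstore` variable rather than read from the running OSDs. Both function names are hypothetical.

```python
from typing import Iterable


def playbook_skips_host(configured_objectstore: str) -> bool:
    """Mirror the playbook condition: the host is skipped when the
    *configured* osd_objectstore variable is already 'bluestore'."""
    return configured_objectstore == "bluestore"


def migration_silently_skipped(configured_objectstore: str,
                               actual_backends: Iterable[str]) -> bool:
    """True when the playbook would skip a host whose OSDs are in fact
    still running on filestore."""
    still_filestore = any(b == "filestore" for b in actual_backends)
    return playbook_skips_host(configured_objectstore) and still_filestore


# Situation in this report: the variable says bluestore, the OSDs say filestore.
print(migration_silently_skipped("bluestore", ["filestore", "filestore"]))  # prints True
# With the variable set to filestore the playbook would not skip the host.
print(migration_silently_skipped("filestore", ["filestore", "filestore"]))  # prints False
```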

Comment 2 Dan Macpherson 2020-11-27 12:51:27 UTC
I don't think this is a documentation BZ. This sounds like an engineering BZ.

Comment 3 Takashi Kajinami 2020-11-30 15:02:06 UTC
(In reply to Dan Macpherson from comment #2)
> I don't think this is a documentation BZ. This sounds like an engineering BZ.

One possible solution without a code change would be to set
 osd_objectstore: filestore
before running the migration command, and to reset the parameter to bluestore after the migration completes.
We can use "openstack overcloud deploy --stack-only" to change the parameter without
triggering the actual deployment steps.
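
As a rough illustration of the two parameter states this workaround toggles between, the sketch below renders the override as environment-file text. It is an assumption-laden example, not a tested procedure: the `render_override` helper is hypothetical, only the `CephAnsibleDiskConfig` and `osd_objectstore` names come from this report, and a real environment file would presumably also need to carry whatever other `CephAnsibleDiskConfig` keys the deployment already sets.

```python
def render_override(objectstore: str) -> str:
    """Render a minimal parameter_defaults override for osd_objectstore.

    Hypothetical helper: other CephAnsibleDiskConfig keys used by a real
    deployment are deliberately left out of this sketch.
    """
    if objectstore not in ("filestore", "bluestore"):
        raise ValueError("unexpected objectstore: %r" % objectstore)
    return (
        "parameter_defaults:\n"
        "  CephAnsibleDiskConfig:\n"
        "    osd_objectstore: %s\n" % objectstore
    )


# State to apply before running the migration command.
before_migration = render_override("filestore")
# State to restore once the migration has completed.
after_migration = render_override("bluestore")
print(before_migration)
```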

However, I tend to agree with you that this is an engineering BZ, and the above parameter settings
should ideally be handled in tripleo.

Do you want me to change the assigned component to tripleo-heat-templates (or a different package
if there is a better one), or can we get some insight from the Ceph squad before moving this BZ?

Comment 4 ndeevy 2020-12-01 09:03:12 UTC
Thanks Takashi

Hi @Francesco. Could you take a look and advise how best to address this BZ please? I agree with Dan that it seems more like an engineering BZ.

Thanks :)

Comment 14 Takashi Kajinami 2020-12-04 13:19:20 UTC
Thanks Naomi,

The updated version looks good to me !

