Description of problem:

The previous bug I raised, BZ 1585482, was about the exact same error but with OSP-12 + RHCS 3.0. I was advised that RHCS 3.0 is not supported/tested with OSP-12, so I repeated the experiment with OSP-13 and RHCS 3.0, which is a supported/tested combination. After deploying OSP-13 using OSPd, the OpenStack deployment itself completed cleanly, but the Ceph OSDs are down.

## Ceph OSDs are DOWN

[heat-admin@controller-0 ~]$ ceph -s
  cluster:
    id:     ce7bd88c-6a9c-11e8-a882-2047478ccfaa
    health: HEALTH_WARN
            no active mgr

  services:
    mon: 1 daemons, quorum controller-0
    mgr: no daemons active
    osd: 60 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:

## OSDs are flapping (the container IDs and uptimes change between two consecutive `docker ps` runs)

[heat-admin@ceph-storage-0 ~]$ sudo docker ps
CONTAINER ID   IMAGE                                                   COMMAND            CREATED          STATUS                  PORTS   NAMES
f259f333d2a9   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   1 second ago     Up Less than a second           ceph-osd-ceph-storage-0-sdf
55f519615632   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   2 seconds ago    Up 2 seconds                    ceph-osd-ceph-storage-0-sdg
ac148f97e891   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   4 seconds ago    Up 4 seconds                    ceph-osd-ceph-storage-0-sdj
55ba302877ee   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   6 seconds ago    Up 6 seconds                    ceph-osd-ceph-storage-0-sdd
89c860a5b291   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   8 seconds ago    Up 7 seconds                    ceph-osd-ceph-storage-0-sdk
6be986fc3049   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   10 seconds ago   Up 9 seconds                    ceph-osd-ceph-storage-0-sdh
ff233ee5104a   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   15 seconds ago   Up 14 seconds                   ceph-osd-ceph-storage-0-sde
a8f8f9a97e3e   192.168.120.1:8787/rhosp13-beta/openstack-cron:latest   "kolla_start"      2 hours ago      Up 2 hours                      logrotate_crond

[heat-admin@ceph-storage-0 ~]$ sudo docker ps
CONTAINER ID   IMAGE                                                   COMMAND            CREATED          STATUS                  PORTS   NAMES
c93cb93e6794   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   1 second ago     Up Less than a second           ceph-osd-ceph-storage-0-sdj
1c4c8e94b7f4   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   3 seconds ago    Up 1 second                     ceph-osd-ceph-storage-0-sdd
82e9ca7b314e   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   5 seconds ago    Up 3 seconds                    ceph-osd-ceph-storage-0-sdk
2299a9229ee7   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   7 seconds ago    Up 5 seconds                    ceph-osd-ceph-storage-0-sdh
7cb42e826804   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   12 seconds ago   Up 10 seconds                   ceph-osd-ceph-storage-0-sde
04ff29e7ec5c   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   13 seconds ago   Up 12 seconds                   ceph-osd-ceph-storage-0-sdm
6d400d81f198   192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest         "/entrypoint.sh"   15 seconds ago   Up 14 seconds                   ceph-osd-ceph-storage-0-sdi
a8f8f9a97e3e   192.168.120.1:8787/rhosp13-beta/openstack-cron:latest   "kolla_start"      2 hours ago      Up 2 hours                      logrotate_crond

## Logs from journalctl -u ceph-osd@<HDD>

Jun 12 17:01:27 ceph-storage-0 systemd[1]: Started Ceph OSD.
Jun 12 17:01:28 ceph-storage-0 ceph-osd-run.sh[762534]: Error response from daemon: No such container: expose_partitions_sdm
Jun 12 17:01:32 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:32 /entrypoint.sh: static: does not generate config
Jun 12 17:01:33 ceph-storage-0 ceph-osd-run.sh[762534]: main_activate: path = /dev/sdm1
Jun 12 17:01:34 ceph-storage-0 ceph-osd-run.sh[762534]: get_dm_uuid: get_dm_uuid /dev/sdm1 uuid path is /sys/dev/block/8:193/dm/uuid
Jun 12 17:01:34 ceph-storage-0 ceph-osd-run.sh[762534]: command: Running command: /usr/sbin/blkid -o udev -p /dev/sdm1
Jun 12 17:01:34 ceph-storage-0 ceph-osd-run.sh[762534]: command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdm1
Jun 12 17:01:34 ceph-storage-0 ceph-osd-run.sh[762534]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: mount: Mounting /dev/sdm1 on /var/lib/ceph/tmp/mnt.81tYoO with options noatime,inode64
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdm1 /var/lib/ceph/tmp/mnt.81tYoO
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.81tYoO
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: activate: Cluster uuid is ce7bd88c-6a9c-11e8-a882-2047478ccfaa
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: activate: Cluster name is ceph
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: activate: OSD uuid is fed4806b-b65c-4cbf-8b6a-9ae2399875b6
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: activate: OSD id is 42
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: activate: Initializing OSD...
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.81tYoO/activate.monmap
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: got monmap epoch 1
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 42 --monmap /var/lib/ceph/tmp/mnt.81tYoO/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.81tYoO --osd-uuid fed4806b-b65c-4cbf-8b6a-9ae2399875b6 --set
Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:35.994819 7fc24e71ed80 -1 bluestore(/var/lib/ceph/tmp/mnt.81tYoO/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.81tYoO/block fsid b8ceb78b-4766-4cc3-8496-7bed005d3769 does not match our fsid fed4806b-b
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:36.251825 7fc24e71ed80 -1 bluestore(/var/lib/ceph/tmp/mnt.81tYoO) mkfs fsck found fatal error: (5) Input/output error
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:36.251860 7fc24e71ed80 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:36.251978 7fc24e71ed80 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.81tYoO: (5) Input/output error
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: mount_activate: Failed to activate
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: unmount: Unmounting /var/lib/ceph/tmp/mnt.81tYoO
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.81tYoO
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: Traceback (most recent call last):
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/sbin/ceph-disk", line 9, in <module>
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5735, in run
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     main(sys.argv[1:])
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5686, in main
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     args.func(args)
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3776, in main_activate
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     reactivate=args.reactivate,
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3539, in mount_activate
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     (osd_id, cluster) = activate(path, activate_key_template, init)
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3716, in activate
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     keyring=keyring,
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3168, in mkfs
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     '--setgroup', get_ceph_group(),
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 566, in command_check_call
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     return subprocess.check_call(arguments)
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:   File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]:     raise CalledProcessError(retcode, cmd)
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'42', '--monmap', '/var/lib/ceph/tmp/mnt.81tYoO/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.81tYoO', '--osd-uuid', u'fe
Jun 12 17:01:36 ceph-storage-0 systemd[1]: ceph-osd: main process exited, code=exited, status=1/FAILURE
Jun 12 17:01:36 ceph-storage-0 docker[787214]: Error response from daemon: No such container: ceph-osd-ceph-storage-0-sdm
Jun 12 17:01:36 ceph-storage-0 systemd[1]: Unit ceph-osd entered failed state.
Jun 12 17:01:36 ceph-storage-0 systemd[1]: ceph-osd failed.

Version-Release number of selected component (if applicable):

(undercloud) [stack@refarch-r220-02 ~]$ rpm -qa | grep -i openstack
openstack-ironic-common-10.1.2-3.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-14.el7ost.noarch
openstack-mistral-common-6.0.2-1.el7ost.noarch
openstack-nova-api-17.0.3-0.20180420001138.el7ost.noarch
openstack-nova-scheduler-17.0.3-0.20180420001138.el7ost.noarch
puppet-openstack_extras-12.4.1-0.20180413042250.2634296.el7ost.noarch
openstack-nova-compute-17.0.3-0.20180420001138.el7ost.noarch
openstack-heat-api-10.0.1-0.20180411125639.825731d.el7ost.noarch
openstack-mistral-executor-6.0.2-1.el7ost.noarch
openstack-swift-object-2.17.1-0.20180314165245.caeeb54.el7ost.noarch
openstack-tripleo-common-containers-8.6.1-7.el7ost.noarch
openstack-selinux-0.8.14-5.el7ost.noarch
openstack-keystone-13.0.1-0.20180420194847.7bd6454.el7ost.noarch
openstack-neutron-openvswitch-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
openstack-swift-proxy-2.17.1-0.20180314165245.caeeb54.el7ost.noarch
openstack-heat-common-10.0.1-0.20180411125639.825731d.el7ost.noarch
openstack-ironic-conductor-10.1.2-3.el7ost.noarch
openstack-mistral-engine-6.0.2-1.el7ost.noarch
openstack-nova-placement-api-17.0.3-0.20180420001138.el7ost.noarch
openstack-nova-common-17.0.3-0.20180420001138.el7ost.noarch
openstack-swift-container-2.17.1-0.20180314165245.caeeb54.el7ost.noarch
openstack-tripleo-common-8.6.1-7.el7ost.noarch
python-openstackclient-lang-3.14.1-1.el7ost.noarch
openstack-heat-engine-10.0.1-0.20180411125639.825731d.el7ost.noarch
openstack-ironic-api-10.1.2-3.el7ost.noarch
openstack-ironic-inspector-7.2.1-0.20180409163359.2435d97.el7ost.noarch
openstack-tempest-18.0.0-2.el7ost.noarch
openstack-mistral-api-6.0.2-1.el7ost.noarch
openstack-tripleo-validations-8.4.1-4.el7ost.noarch
openstack-glance-16.0.1-2.el7ost.noarch
openstack-swift-account-2.17.1-0.20180314165245.caeeb54.el7ost.noarch
openstack-neutron-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-2.el7ost.noarch
openstack-heat-api-cfn-10.0.1-0.20180411125639.825731d.el7ost.noarch
openstack-ironic-staging-drivers-0.9.0-4.el7ost.noarch
python2-openstacksdk-0.11.3-1.el7ost.noarch
python2-openstackclient-3.14.1-1.el7ost.noarch
openstack-tripleo-ui-8.3.1-2.el7ost.noarch
openstack-zaqar-6.0.1-1.el7ost.noarch
puppet-openstacklib-12.4.0-0.20180329042555.4b30e6f.el7ost.noarch
openstack-neutron-common-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
openstack-neutron-ml2-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
openstack-tripleo-image-elements-8.0.1-1.el7ost.noarch
openstack-nova-conductor-17.0.3-0.20180420001138.el7ost.noarch
(undercloud) [stack@refarch-r220-02 ~]$ rpm -qa | grep -i ceph-ansible
ceph-ansible-3.1.0-0.1.beta8.el7cp.noarch
(undercloud) [stack@refarch-r220-02 ~]$

How reproducible:
Deploy OSP-13 with RHCS 3.

Steps to Reproduce:
1. Create an OSP-13 undercloud
2. Deploy an OSP-13 overcloud using OSPd and let OSPd deploy RHCS 3.0

Actual results:

Jun 12 17:01:35 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:35.994819 7fc24e71ed80 -1 bluestore(/var/lib/ceph/tmp/mnt.81tYoO/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.81tYoO/block fsid b8ceb78b-4766-4cc3-8496-7bed005d3769 does not match our fsid fed4806b-b
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:36.251825 7fc24e71ed80 -1 bluestore(/var/lib/ceph/tmp/mnt.81tYoO) mkfs fsck found fatal error: (5) Input/output error
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:36.251860 7fc24e71ed80 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: 2018-06-12 17:01:36.251978 7fc24e71ed80 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.81tYoO: (5) Input/output error
Jun 12 17:01:36 ceph-storage-0 ceph-osd-run.sh[762534]: mount_activate: Failed to activate

Expected results:
Both the OSP and Ceph clusters should be up and running, and I should be able to launch Nova VMs backed by Ceph.

Additional info:
Ceph parameters that I am using:
https://github.com/ksingh7/OSP-12_RHCS_Deployment_Guide/blob/master/templates-part-2-test/ceph-config-bluestore.yaml
BTW, Ironic is configured to clean the nodes, so every time a node moves from the Manage to the Available state it is cleaned by Ironic.
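Since the bluestore error above complains about a leftover fsid label on the block device, it may be worth verifying that cleaning really leaves the OSD disks without stale partitions or signatures. A minimal, read-only check sketch (run on the ceph-storage node; /dev/sdm is just the example device from the logs above):

sudo lsblk /dev/sdm                              # list any leftover partitions on the OSD disk
sudo wipefs -n /dev/sdm /dev/sdm1 /dev/sdm2      # -n = dry run; report residual filesystem/partition signatures
sudo sgdisk -p /dev/sdm                          # print the GPT to see whether old ceph-disk partitions survived cleaning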
If we use an inventory like the following (which TripleO set up for me) with ceph-ansible alone, can we reproduce this problem?

[root@undercloud ansible-mistral-actionA6bbkK]# cat inventory.yaml
all:
  vars:
    admin_secret: AQBDCTNbAAAAABAA9FqF71dP9ASdoCkO4eipRA==
    ceph_conf_overrides:
      global:
        bluestore block db size: 67108864
        bluestore block size: 5368709120
        bluestore block wal size: 134217728
        bluestore fsck on mount: true
        enable experimental unrecoverable data corrupting features: bluestore rocksdb
        osd_pool_default_pg_num: 32
        osd_pool_default_pgp_num: 32
        osd_pool_default_size: 1
        rgw_keystone_accepted_roles: 'Member, admin'
        rgw_keystone_admin_domain: default
        rgw_keystone_admin_password: j2CwCGDHbWMw2NnWbasAFZkjR
        rgw_keystone_admin_project: service
        rgw_keystone_admin_user: swift
        rgw_keystone_api_version: 3
        rgw_keystone_implicit_tenants: 'true'
        rgw_keystone_revocation_interval: '0'
        rgw_keystone_url: 'http://192.168.24.14:5000'
        rgw_s3_auth_use_keystone: 'true'
    ceph_docker_image: ceph/daemon
    ceph_docker_image_tag: v3.0.3-stable-3.0-luminous-centos-7-x86_64
    ceph_docker_registry: 192.168.24.1:8787
    ceph_mgr_docker_extra_env: -e MGR_DASHBOARD=0
    ceph_origin: distro
    ceph_osd_docker_cpu_limit: 1
    ceph_osd_docker_memory_limit: 5g
    ceph_stable: true
    cephfs: cephfs
    cephfs_data: manila_data
    cephfs_metadata: manila_metadata
    cephfs_pools:
    - {name: manila_data, pgs: 128}
    - {name: manila_metadata, pgs: 128}
    cluster: ceph
    cluster_network: 192.168.24.0/24
    containerized_deployment: true
    dedicated_devices: [/dev/vdd, /dev/vdd]
    devices: [/dev/vdb, /dev/vdc]
    docker: true
    fetch_directory: /tmp/file-mistral-actionYYpjBm
    fsid: 14c53142-79bd-11e8-9ec3-006063b643f8
    generate_fsid: false
    ip_version: ipv4
    ireallymeanit: 'yes'
    keys:
    - {key: AQBDCTNbAAAAABAAnD6DBTlqB3S/spEWZLqpkg==, mgr_cap: allow *, mode: '0600', mon_cap: allow r, name: client.openstack, osd_cap: 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool='}
    - {key: AQBDCTNbAAAAABAA0DBho/DrE/c4mhcESAaDsQ==, mds_cap: allow *, mgr_cap: allow *, mode: '0600', mon_cap: 'allow r, allow command \"auth del\", allow command \"auth caps\", allow command \"auth get\", allow command \"auth get-or-create\"', name: client.manila, osd_cap: allow rw}
    - {key: AQBDCTNbAAAAABAAYkwKO/QNv6venujH9OYheA==, mgr_cap: allow *, mode: '0600', mon_cap: allow rw, name: client.radosgw, osd_cap: allow rwx}
    monitor_address_block: 192.168.24.0/24
    monitor_secret: AQBDCTNbAAAAABAAMF8wCW5TBMEoZuViTecbuQ==
    ntp_service_enabled: false
    openstack_config: true
    openstack_keys:
    - {key: AQBDCTNbAAAAABAAnD6DBTlqB3S/spEWZLqpkg==, mgr_cap: allow *, mode: '0600', mon_cap: allow r, name: client.openstack, osd_cap: 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool='}
    - {key: AQBDCTNbAAAAABAA0DBho/DrE/c4mhcESAaDsQ==, mds_cap: allow *, mgr_cap: allow *, mode: '0600', mon_cap: 'allow r, allow command \"auth del\", allow command \"auth caps\", allow command \"auth get\", allow command \"auth get-or-create\"', name: client.manila, osd_cap: allow rw}
    - {key: AQBDCTNbAAAAABAAYkwKO/QNv6venujH9OYheA==, mgr_cap: allow *, mode: '0600', mon_cap: allow rw, name: client.radosgw, osd_cap: allow rwx}
    openstack_pools:
    - {application: rbd, name: images, pg_num: 32, rule_name: replicated_rule}
    - {application: rbd, name: backups, pg_num: 32, rule_name: replicated_rule}
    - {application: rbd, name: vms, pg_num: 32, rule_name: replicated_rule}
    - {application: rbd, name: volumes, pg_num: 32, rule_name: replicated_rule}
    osd_objectstore: bluestore
    osd_scenario: non-collocated
    pools: []
    public_network: 192.168.24.0/24
    user_config: true
clients:
  hosts:
    192.168.24.6: {}
mdss:
  hosts:
    192.168.24.11: {}
    192.168.24.12: {}
    192.168.24.8: {}
mgrs:
  hosts:
    192.168.24.11: {}
    192.168.24.12: {}
    192.168.24.8: {}
mons:
  hosts:
    192.168.24.11: {}
    192.168.24.12: {}
    192.168.24.8: {}
nfss:
  hosts: {}
osds:
  hosts:
    192.168.24.13: {}
    192.168.24.17: {}
    192.168.24.9: {}
rbdmirrors:
  hosts: {}
rgws:
  hosts: {}
[root@undercloud ansible-mistral-actionA6bbkK]#
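For anyone trying that reproduction, a rough sketch of driving ceph-ansible directly with the generated inventory (the paths and playbook name are assumptions based on the RHCS ceph-ansible packaging; adjust to your install):

# copy the TripleO-generated inventory next to the packaged playbooks
cp inventory.yaml /usr/share/ceph-ansible/
cd /usr/share/ceph-ansible
cp site-docker.yml.sample site-docker.yml    # containerized playbook, matching containerized_deployment: true
ansible-playbook -vv -i inventory.yaml site-docker.yml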
In case it's useful, I'm seeing almost exactly the same problem.
*** Bug 1585482 has been marked as a duplicate of this bug. ***
Bluestore is working in this ceph-ansible example:
https://github.com/ceph/ceph-ansible/tree/master/tests/functional/centos/7/bs-osds-container
Use it to figure out what is being passed incorrectly.
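One possible way to do that comparison locally (only the test directory path comes from the link above; everything else here is an assumption):

git clone https://github.com/ceph/ceph-ansible
ls ceph-ansible/tests/functional/centos/7/bs-osds-container/
# pull out the osd/bluestore-related settings the passing test uses...
grep -rn 'osd_scenario\|osd_objectstore\|dedicated_devices\|devices' \
    ceph-ansible/tests/functional/centos/7/bs-osds-container/
# ...and diff them against the 'all: vars:' section of the TripleO-generated inventory.yaml above.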
We should try this next. It should work with Luminous.

parameter_defaults:
  CephAnsiblePlaybookVerbosity: 1
  CephAnsibleEnvironmentVariables:
    ANSIBLE_SSH_RETRIES: '6'
  CephAnsibleDisksConfig:
    devices:
      - /dev/vdb
      - /dev/vdc
    dedicated_devices:
      - /dev/vdd
      - /dev/vdd
  CephAnsibleExtraConfig:
    osd_scenario: non-collocated
    osd_objectstore: bluestore
    ceph_osd_docker_memory_limit: 5g
    ceph_osd_docker_cpu_limit: 1
  CephConfigOverrides:
    bluestore block db size: 67108864
    bluestore block wal size: 134217728
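To try it, the snippet above would be saved as an extra environment file and layered onto the deploy command, roughly like this (the file path is a placeholder, and the rest of the -e files stand in for whatever the original deployment already used):

openstack overcloud deploy --templates \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e ~/templates/ceph-bluestore.yaml
# (plus the network, roles, and other -e environment files from the original deployment)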
(In reply to John Fulton from comment #17)
> We should try this next. It should work with Luminous.

https://review.openstack.org/#/c/547682/
Status update:

An OSP13 deployment which passes 4 additional THT parameters [0] to request bluestore and uses the ceph container rhceph-3-rhel7:3-9 [1] hits the race condition documented in bz 1608946, and the deployment fails. Changing to the rhceph-3-rhel7:3-11 [2] container also results in a failed deployment, but with 13 of the 15 requested OSDs running; the other two OSDs failed because of the same race condition. I don't think 3-11 vs 3-9 was significant, as they ship the same version of ceph-disk. The issue seems to be the ceph-disk race condition (1608946).

In theory you could deploy, get some percentage of OSDs, work around the race condition until you have 100% of your OSDs, and then update the deployment so that it finishes and you have OSP13 + bluestore. The workarounds would be the same as the ones documented for a different ceph-disk race condition; it comes down to repeating until you succeed because you win the race. An example is in:

https://bugzilla.redhat.com/show_bug.cgi?id=1494543

The real fix is to avoid this race condition when using bluestore the same way the above was fixed to avoid the race using filestore, thus I'm marking this as a duplicate of 1608946, which focuses just on ceph-disk, not ceph-ansible's deployment of it. This bug also mentions OSP13, but OSP13 doesn't support bluestore.

[0] https://review.openstack.org/#/c/547682/3/ci/environments/scenario001-multinode-containers.yaml
[1] https://access.redhat.com/containers/#/registry.access.redhat.com/rhceph/rhceph-3-rhel7/images/3-9
[2] https://access.redhat.com/containers/#/registry.access.redhat.com/rhceph/rhceph-3-rhel7/images/3-11

*** This bug has been marked as a duplicate of bug 1608946 ***
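For reference, a rough sketch of that retry-style workaround on an affected ceph-storage node (unit and device names follow the ceph-osd@<HDD> pattern from the logs above; this is just the "repeat until you win the race" approach, not a fix):

sudo systemctl list-units 'ceph-osd@*' --state=failed    # find the OSD units that lost the race
sudo systemctl restart ceph-osd@sdm                      # retry one of them (sdm is an example device)
sudo docker ps | grep ceph-osd-$(hostname -s)            # confirm the OSD container stays up this time
# then, from a monitor node, check the up/in OSD count:
ceph -s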
Do not incorrectly conclude that OSP13 will not work with Bluestore as a result of this bug. This bug is about RHCS 3.1 not supporting Bluestore. If you hit this Ceph (not OpenStack) bug, just use RHCS 3.2. You can use RHCS 3.2 with Bluestore, and the OpenStack documentation has been updated to recommend that you do so:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/deploying_an_overcloud_with_containerized_red_hat_ceph/index#using_bluestore_in_ceph_3_2_and_later
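If it helps, a sketch of pointing the OSP13 deployment at an RHCS 3.2 container image (DockerCephDaemonImage is the OSP13 parameter for the Ceph container image; the registry host and tag below are placeholders, use whichever rhceph-3-rhel7 tag ships RHCS 3.2 in your local registry):

cat > ~/templates/ceph-container-rhcs32.yaml <<'EOF'
parameter_defaults:
  # placeholder registry/tag; substitute the RHCS 3.2 image you have mirrored
  DockerCephDaemonImage: 192.168.120.1:8787/rhceph/rhceph-3-rhel7:latest
EOF
# include this file with -e in the overcloud deploy command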
(In reply to John Fulton from comment #21)
> The real fix is to avoid this race condition when using bluestore the same
> way the above was fixed to avoid the race using filestore, thus I'm marking
> this as a duplicate of 1608946 which focuses just on ceph-disk, not
> ceph-ansible's deployment of it. This bug also mentions OSP13, but OSP13
> doesn't support bluestore.

This should say "OSP13 doesn't support bluestore WITH RHCS 3.1". It was true in July 2018 because RHCS 3.2 was not yet released.