Bug 1585482

Summary: OSPd unable to deploy RHCS 3.0 (Bluestore); Error: bluestore mkfs fsck found fatal Input/output error
Product: Red Hat OpenStack
Component: openstack-tripleo
Version: 12.0 (Pike)
Hardware: x86_64
OS: Linux
Severity: urgent
Priority: unspecified
Status: CLOSED DUPLICATE
Reporter: karan singh <karan>
Assignee: Giulio Fidente <gfidente>
QA Contact: Arik Chernetsky <achernet>
CC: bschmaus, gfidente, karan, mburns
Target Milestone: ---
Target Release: ---
Type: Bug
Last Closed: 2018-06-07 18:01:14 UTC

Description karan singh 2018-06-03 12:37:34 UTC
Description of problem:

I am trying to deploy OpenStack and Ceph (Bluestore OSDs) using OSPd. Once I issue the openstack overcloud deploy command, OSPd deploys the OpenStack cluster without any issues; however, it fails to deploy the Ceph Bluestore OSDs cleanly. FYI: both OSP and Ceph services are containerized.

I know OSP-12 can't deploy RHCS 3.0 out of the box (because of the introduction of ceph-mgr), so I patched the Mistral workflow to deploy ceph-mgr, and it worked fine after that. With this patched version I was able to deploy OSP-12 and RHCS 3.0 (Filestore OSDs) successfully.

Now I am repeating the exercise (for benchmarking) with OSP-12 and RHCS 3.0 Bluestore OSDs, but OSPd, or most likely ceph-ansible, is unable to bring up the OSD containers cleanly. It throws the following error:



Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.132957 7f20866f4d80 -1 bluestore(/var/lib/ceph/tmp/mnt.dSWS2P/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.dSWS2P/block fsid 20e0ad03-8201-40ec-9af5-12866177d140 does not match our fsid 18302dac-dfc7-4639-8a59-ba2b8afca3a4
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.390462 7f20866f4d80 -1 bluestore(/var/lib/ceph/tmp/mnt.dSWS2P) mkfs fsck found fatal error: (5) Input/output error
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.390496 7f20866f4d80 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.390608 7f20866f4d80 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.dSWS2P: (5) Input/output error
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: mount_activate: Failed to activate
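
The fsid mismatch above suggests that the block partition already carries a bluestore label written by an earlier prepare attempt, so the new mkfs refuses it. A minimal sketch of how that could be confirmed, assuming /dev/sdd as in the journal output and that the Ceph tools are reachable on the node (in a containerized deployment they may only be available inside the rhceph container image):

# Assumption: /dev/sdd1 is the ceph-disk data partition and its "block"
# symlink points at the bluestore block partition (commonly /dev/sdd2).
sudo mount /dev/sdd1 /mnt && ls -l /mnt/block && sudo umount /mnt

# Print the bluestore label already present on that partition; its osd uuid
# and fsid can be compared with the two uuids reported in the mkfs error above.
sudo ceph-bluestore-tool show-label --dev /dev/sdd2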

Version-Release number of selected component (if applicable):

$ sudo docker images | grep -i ceph
192.168.120.1:8787/rhceph/rhceph-3-rhel7                       latest              0eabd5d89d8b        3 weeks ago         593 MB
$

(undercloud) [stack@refarch-r220-03 templates]$ less container_images.yaml | grep -i ceph
- imagename: registry.access.redhat.com/rhceph/rhceph-3-rhel7:latest
(undercloud) [stack@refarch-r220-03 templates]$


(undercloud) [stack@refarch-r220-03 ~]$ rpm -qa | grep -i tripleo
openstack-tripleo-puppet-elements-7.0.5-1.el7ost.noarch
openstack-tripleo-heat-templates-7.0.9-8.el7ost.noarch
puppet-tripleo-7.4.8-5.el7ost.noarch
openstack-tripleo-image-elements-7.0.3-1.el7ost.noarch
openstack-tripleo-common-7.6.9-3.el7ost.noarch
python-tripleoclient-7.3.8-1.el7ost.noarch
openstack-tripleo-ui-7.4.7-1.el7ost.noarch
openstack-tripleo-validations-7.4.6-1.el7ost.noarch
openstack-tripleo-common-containers-7.6.9-3.el7ost.noarch
(undercloud) [stack@refarch-r220-03 ~]$


(undercloud) [stack@refarch-r220-03 ~]$ rpm -qa | grep -i ceph
puppet-ceph-2.4.2-1.el7ost.noarch
ceph-ansible-3.0.33-1.el7cp.noarch
(undercloud) [stack@refarch-r220-03 ~]$

How reproducible:

Always (I have tried 4 times with 0 successes)

Steps to Reproduce:
1. Deploy OSP-12 and Ceph (Bluestore OSDs) using image tag: 12.0-20180519.1


Actual results:

## Output from the openstack overcloud deploy command shows that the OpenStack deployment was successful

2018-06-03 10:29:26Z [overcloud.AllNodesDeploySteps.R220ComputeDeployment_Step5]: CREATE_COMPLETE  state changed
2018-06-03 10:42:24Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step5.0]: SIGNAL_IN_PROGRESS  Signal: deployment ad16bee5-11e3-472a-9a18-fe14784e612d succeeded
2018-06-03 10:42:25Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step5.0]: CREATE_COMPLETE  state changed
2018-06-03 10:42:25Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step5]: CREATE_COMPLETHost 172.21.1.157 not found in /home/stack/.ssh/known_hosts
E  Stack CREATE completed successfully
2018-06-03 10:42:25Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step5]: CREATE_COMPLETE  state changed
2018-06-03 10:42:25Z [overcloud.AllNodesDeploySteps.R220ComputeExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:26Z [overcloud.AllNodesDeploySteps.ControllerExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:27Z [overcloud.AllNodesDeploySteps.CephStorageExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:27Z [overcloud.AllNodesDeploySteps.R630ComputeExtraConfigPost]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:29Z [overcloud.AllNodesDeploySteps.R220ComputeExtraConfigPost]: CREATE_COMPLETE  state changed
2018-06-03 10:42:29Z [overcloud.AllNodesDeploySteps.ControllerExtraConfigPost]: CREATE_COMPLETE  state changed
2018-06-03 10:42:29Z [overcloud.AllNodesDeploySteps.CephStorageExtraConfigPost]: CREATE_COMPLETE  state changed
2018-06-03 10:42:29Z [overcloud.AllNodesDeploySteps.R630ComputeExtraConfigPost]: CREATE_COMPLETE  state changed
2018-06-03 10:42:29Z [overcloud.AllNodesDeploySteps.R220ComputePostConfig]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:29Z [overcloud.AllNodesDeploySteps.CephStoragePostConfig]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:30Z [overcloud.AllNodesDeploySteps.R630ComputePostConfig]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:30Z [overcloud.AllNodesDeploySteps.ControllerPostConfig]: CREATE_IN_PROGRESS  state changed
2018-06-03 10:42:31Z [overcloud.AllNodesDeploySteps.R220ComputePostConfig]: CREATE_COMPLETE  state changed
2018-06-03 10:42:31Z [overcloud.AllNodesDeploySteps.R630ComputePostConfig]: CREATE_COMPLETE  state changed
2018-06-03 10:42:31Z [overcloud.AllNodesDeploySteps.CephStoragePostConfig]: CREATE_COMPLETE  state changed
2018-06-03 10:42:31Z [overcloud.AllNodesDeploySteps.ControllerPostConfig]: CREATE_COMPLETE  state changed
2018-06-03 10:42:31Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  Stack CREATE completed successfully
2018-06-03 10:42:31Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  state changed
2018-06-03 10:42:31Z [overcloud]: CREATE_COMPLETE  Stack CREATE completed successfully

 Stack overcloud CREATE_COMPLETE

Overcloud Endpoint: http://172.21.1.157:5000/v2.0
Overcloud Deployed


## Heat stack creation was successful

(undercloud) [stack@refarch-r220-03 ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| dc15dd8a-5e12-4b79-94d0-0fb982bb33c9 | overcloud  | 2b1b9b2392804931ace583f8b00d80f7 | CREATE_COMPLETE | 2018-06-03T09:15:08Z | None         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
(undercloud) [stack@refarch-r220-03 ~]$

## From controller-0, ceph -s shows 60 OSDs, but none of them are up and in

[heat-admin@controller-0 ~]$ sudo ceph -s
  cluster:
    id:     b8a0918c-5d05-11e8-962f-2047478cce5e
    health: HEALTH_WARN
            Reduced data availability: 160 pgs inactive

  services:
    mon: 1 daemons, quorum controller-0
    mgr: controller-0(active)
    osd: 60 osds: 0 up, 0 in

  data:
    pools:   5 pools, 160 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:     100.000% pgs unknown
             160 unknown

[heat-admin@controller-0 ~]$

[heat-admin@controller-0 ~]$ sudo ceph osd stat
60 osds: 0 up, 0 in
[heat-admin@controller-0 ~]$


## ceph-ansible logs from Mistral show that the ceph-ansible run was successful, without any failures

2018-06-03 06:04:50,706 p=20038 u=mistral |  PLAY RECAP *********************************************************************
2018-06-03 06:04:50,706 p=20038 u=mistral |  192.168.120.10             : ok=131  changed=17   unreachable=0    failed=0
2018-06-03 06:04:50,706 p=20038 u=mistral |  192.168.120.11             : ok=50   changed=3    unreachable=0    failed=0
2018-06-03 06:04:50,706 p=20038 u=mistral |  192.168.120.14             : ok=73   changed=7    unreachable=0    failed=0
2018-06-03 06:04:50,706 p=20038 u=mistral |  192.168.120.16             : ok=50   changed=3    unreachable=0    failed=0
2018-06-03 06:04:50,706 p=20038 u=mistral |  192.168.120.18             : ok=73   changed=7    unreachable=0    failed=0
2018-06-03 06:04:50,706 p=20038 u=mistral |  192.168.120.19             : ok=50   changed=3    unreachable=0    failed=0
2018-06-03 06:04:50,706 p=20038 u=mistral |  192.168.120.20             : ok=73   changed=7    unreachable=0    failed=0
2018-06-03 06:04:50,707 p=20038 u=mistral |  192.168.120.21             : ok=50   changed=3    unreachable=0    failed=0
2018-06-03 06:04:50,707 p=20038 u=mistral |  192.168.120.6              : ok=73   changed=7    unreachable=0    failed=0
2018-06-03 06:04:50,707 p=20038 u=mistral |  192.168.120.7              : ok=76   changed=7    unreachable=0    failed=0
2018-06-03 06:04:50,707 p=20038 u=mistral |  192.168.120.8              : ok=50   changed=3    unreachable=0    failed=0
2018-06-03 06:04:50,707 p=20038 u=mistral |  192.168.120.9              : ok=53   changed=3    unreachable=0    failed=0



## Output of journalctl -u ceph-osd@<HDD>

Jun 03 10:19:19 ceph-storage-0 systemd[1]: Starting Ceph OSD...
Jun 03 10:19:19 ceph-storage-0 docker[803923]: Error response from daemon: No such container: ceph-osd-ceph-storage-0-sdd
Jun 03 10:19:19 ceph-storage-0 docker[803962]: Error response from daemon: No such container: ceph-osd-ceph-storage-0-sdd
Jun 03 10:19:19 ceph-storage-0 systemd[1]: Started Ceph OSD.
Jun 03 10:19:20 ceph-storage-0 ceph-osd-run.sh[803994]: Error response from daemon: No such container: expose_partitions_sdd
Jun 03 10:19:22 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:22  /entrypoint.sh: static: does not generate config
Jun 03 10:19:22 ceph-storage-0 ceph-osd-run.sh[803994]: main_activate: path = /dev/sdd1
Jun 03 10:19:24 ceph-storage-0 ceph-osd-run.sh[803994]: get_dm_uuid: get_dm_uuid /dev/sdd1 uuid path is /sys/dev/block/8:49/dm/uuid
Jun 03 10:19:24 ceph-storage-0 ceph-osd-run.sh[803994]: command: Running command: /usr/sbin/blkid -o udev -p /dev/sdd1
Jun 03 10:19:24 ceph-storage-0 ceph-osd-run.sh[803994]: command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdd1
Jun 03 10:19:24 ceph-storage-0 ceph-osd-run.sh[803994]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: mount: Mounting /dev/sdd1 on /var/lib/ceph/tmp/mnt.dSWS2P with options noatime,inode64
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdd1 /var/lib/ceph/tmp/mnt.dSWS2P
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.dSWS2P
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: activate: Cluster uuid is b8a0918c-5d05-11e8-962f-2047478cce5e
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: activate: Cluster name is ceph
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: activate: OSD uuid is 18302dac-dfc7-4639-8a59-ba2b8afca3a4
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: activate: OSD id is 10
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: activate: Initializing OSD...
Jun 03 10:19:25 ceph-storage-0 ceph-osd-run.sh[803994]: command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.dSWS2P/activate.monmap
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: got monmap epoch 1
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 10 --monmap /var/lib/ceph/tmp/mnt.dSWS2P/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.dSWS2P --osd-uuid 18302dac-dfc7-4639-8a59-ba2b8afca3a4 --setuser ceph --setgroup disk
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.132957 7f20866f4d80 -1 bluestore(/var/lib/ceph/tmp/mnt.dSWS2P/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.dSWS2P/block fsid 20e0ad03-8201-40ec-9af5-12866177d140 does not match our fsid 18302dac-dfc7-4639-8a59-ba2b8afca3a4
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.390462 7f20866f4d80 -1 bluestore(/var/lib/ceph/tmp/mnt.dSWS2P) mkfs fsck found fatal error: (5) Input/output error
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.390496 7f20866f4d80 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: 2018-06-03 10:19:26.390608 7f20866f4d80 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.dSWS2P: (5) Input/output error
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: mount_activate: Failed to activate
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: unmount: Unmounting /var/lib/ceph/tmp/mnt.dSWS2P
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.dSWS2P
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: Traceback (most recent call last):
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/sbin/ceph-disk", line 9, in <module>
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5735, in run
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: main(sys.argv[1:])
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5686, in main
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: args.func(args)
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3776, in main_activate
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: reactivate=args.reactivate,
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3539, in mount_activate
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: (osd_id, cluster) = activate(path, activate_key_template, init)
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3716, in activate
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: keyring=keyring,
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3168, in mkfs
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: '--setgroup', get_ceph_group(),
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 566, in command_check_call
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: return subprocess.check_call(arguments)
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: raise CalledProcessError(retcode, cmd)
Jun 03 10:19:26 ceph-storage-0 ceph-osd-run.sh[803994]: subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'10', '--monmap', '/var/lib/ceph/tmp/mnt.dSWS2P/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.dSWS2P', '--osd-uuid', u'18302dac-dfc7-4639-8a59-ba2b8afca3a4', '--setuser', 'ceph', '--setgroup', 'disk']' returned non-zero exit status 1
Jun 03 10:19:26 ceph-storage-0 systemd[1]: ceph-osd: main process exited, code=exited, status=1/FAILURE
Jun 03 10:19:26 ceph-storage-0 docker[811685]: Error response from daemon: No such container: ceph-osd-ceph-storage-0-sdd
Jun 03 10:19:26 ceph-storage-0 systemd[1]: Unit ceph-osd entered failed state.
Jun 03 10:19:26 ceph-storage-0 systemd[1]: ceph-osd failed.
[root@ceph-storage-0 ~]#
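
If the label mismatch comes from leftover data of a previous deployment, a minimal sketch of clearing the disk manually before a redeploy, assuming /dev/sdd as in the log (destructive; only for disks that are meant to be re-provisioned):

# Remove the partition table, filesystem signatures and the first few MB of
# the device so that no stale bluestore label survives (destroys all data!).
sudo sgdisk --zap-all /dev/sdd
sudo wipefs --all /dev/sdd
sudo dd if=/dev/zero of=/dev/sdd bs=1M count=10 oflag=direct
sudo partprobe /dev/sdd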


Expected results:

The OpenStack and Ceph clusters should be up and running without failures, and it should be possible to launch instances on top of them.

Additional info:

## My OSPd ceph-ansible configuration is here:

https://github.com/ksingh7/OSP-12_RHCS_Deployment_Guide/blob/master/templates-part-2-test/ceph-config-bluestore.yaml


## If anyone is interested, I could also share a screen session to troubleshoot this LIVE with me.

Comment 1 karan singh 2018-06-03 20:33:20 UTC
Based on an email discussion with John Fulton, I retried this after performing node cleanup:

1. Set automated_clean=true in ironic.conf, restart services
2. Set nodes to manage, then set them back to available. Ironic performed node cleanup before setting the nodes to available (a sketch of the CLI steps is below).
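
A sketch of the equivalent CLI steps, assuming the node names from the listing below and the default ironic.conf location on the undercloud:

# 1. In /etc/ironic/ironic.conf, section [conductor]: automated_clean = true
sudo systemctl restart openstack-ironic-conductor

# 2. Cycle a node through manage -> provide so Ironic runs cleaning before
#    marking it available again (repeat for each node; name is an example).
openstack baremetal node manage r730xd-01
openstack baremetal node provide r730xd-01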

(undercloud) [stack@refarch-r220-03 ~]$ openstack baremetal node list
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name      | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
| 90065bea-6a60-4c94-833d-d9c7843fb735 | r630-01   | None          | power on    | clean wait         | False       |
| 5866f163-f48c-4a5a-8c81-c59041f31322 | r630-02   | None          | power on    | clean wait         | False       |
| 0c6a190e-a14f-4bf0-a8c1-09535311819c | r630-03   | None          | power on    | clean wait         | False       |
| 29b252d7-da76-483c-a834-829f7b3c3144 | r220-01   | None          | power on    | clean wait         | False       |
| 5581154b-9992-419d-9916-971e299f0673 | r220-08   | None          | power on    | clean wait         | False       |
| a51790ee-25ac-4400-ac7b-03de2024b9f4 | r220-09   | None          | power on    | clean wait         | False       |
| 9f60fa30-fad2-4d28-8759-be7a1480bb98 | r220-10   | None          | power on    | clean wait         | False       |
| 58cfae5f-c650-45a2-ad95-c23c1f71eada | r730xd-01 | None          | power off   | cleaning           | False       |
| dda77bfa-291a-40e7-acfb-5c1fae6cf561 | r730xd-02 | None          | power off   | cleaning           | False       |
| 87072bba-765a-4e21-83c3-1ec3d07d07b6 | r730xd-03 | None          | power off   | cleaning           | False       |
| b26921d4-7a33-4815-9069-d0430eb773fe | r730xd-04 | None          | power off   | cleaning           | False       |
| 8c4dfe4f-d0e2-4876-9509-42ae3d29fa27 | r730xd-05 | None          | power off   | cleaning           | False       |
+--------------------------------------+-----------+---------------+-------------+--------------------+-------------+
(undercloud) [stack@refarch-r220-03 ~]$


3. Triggered OSPd to deploy OSP and Ceph, but I am still encountering the exact same issue:

[heat-admin@controller-0 ~]$ sudo ceph -s
  cluster:
    id:     b8a0918c-5d05-11e8-962f-2047478cce5e
    health: HEALTH_WARN
            Reduced data availability: 160 pgs inactive
 
  services:
    mon: 1 daemons, quorum controller-0
    mgr: controller-0(active)
    osd: 60 osds: 0 up, 0 in <<<< No OSD are UP/IN
 
  data:
    pools:   5 pools, 160 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:     100.000% pgs unknown
             160 unknown
 
[heat-admin@controller-0 ~]$
 
OSD logs
-----------
 
Jun 03 20:10:46 ceph-storage-1 docker[106640]: Error response from daemon: No such container: ceph-osd-ceph-storage-1-sdh
Jun 03 20:10:46 ceph-storage-1 docker[106673]: Error response from daemon: No such container: ceph-osd-ceph-storage-1-sdh
Jun 03 20:10:46 ceph-storage-1 systemd[1]: Started Ceph OSD.
Jun 03 20:10:47 ceph-storage-1 ceph-osd-run.sh[106706]: Error response from daemon: No such container: expose_partitions_sdh
Jun 03 20:10:49 ceph-storage-1 ceph-osd-run.sh[106706]: 2018-06-03 20:10:49  /entrypoint.sh: static: does not generate config
Jun 03 20:10:49 ceph-storage-1 ceph-osd-run.sh[106706]: main_activate: path = /dev/sdh1
Jun 03 20:10:51 ceph-storage-1 ceph-osd-run.sh[106706]: get_dm_uuid: get_dm_uuid /dev/sdh1 uuid path is /sys/dev/block/8:113/dm/uuid
Jun 03 20:10:51 ceph-storage-1 ceph-osd-run.sh[106706]: command: Running command: /usr/sbin/blkid -o udev -p /dev/sdh1
Jun 03 20:10:51 ceph-storage-1 ceph-osd-run.sh[106706]: command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdh1
Jun 03 20:10:51 ceph-storage-1 ceph-osd-run.sh[106706]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: mount: Mounting /dev/sdh1 on /var/lib/ceph/tmp/mnt.VwM4I_ with options noatime,inode64
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sdh1 /var/lib/ceph/tmp/mnt.VwM4I_
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.VwM4I_
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: activate: Cluster uuid is b8a0918c-5d05-11e8-962f-2047478cce5e
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: activate: Cluster name is ceph
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: activate: OSD uuid is 03b5dab0-ff57-4645-a9b2-4c106777833e
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: activate: OSD id is 35
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: activate: Initializing OSD...
Jun 03 20:10:52 ceph-storage-1 ceph-osd-run.sh[106706]: command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.VwM4I_/activate.monmap
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: got monmap epoch 1
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 35 --monmap /var/lib/ceph/tmp/mnt.VwM4I_/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.VwM4I_ --osd-uuid 03b5dab0-ff57-4645-a9b2-4c106777833e --setuser ceph --setgroup disk
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: 2018-06-03 20:10:53.220245 7f322512ad80 -1 bluestore(/var/lib/ceph/tmp/mnt.VwM4I_/block) _check_or_set_bdev_label bdev /var/lib/ceph/tmp/mnt.VwM4I_/block fsid a1a13449-f826-4899-8a8c-eb8c905ab9c6 does not match our fsid 03b5dab0-ff57-4645-a9b2-4c106777833e
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: 2018-06-03 20:10:53.475778 7f322512ad80 -1 bluestore(/var/lib/ceph/tmp/mnt.VwM4I_) mkfs fsck found fatal error: (5) Input/output error
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: 2018-06-03 20:10:53.475812 7f322512ad80 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: 2018-06-03 20:10:53.475912 7f322512ad80 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.VwM4I_: (5) Input/output error
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: mount_activate: Failed to activate
Jun 03 20:10:53 ceph-storage-1 ceph-osd-run.sh[106706]: unmount: Unmounting /var/lib/ceph/tmp/mnt.VwM4I_