Created attachment 1510792 [details]
Ansible log file.

Description of problem:
The ceph-disk dedicated_devices (non-collocated) scenario fails to start the OSD service in containers. The issue was observed on multiple machines.

/etc/ansible/hosts file data:

[osds]
cephqe-node3 dedicated_devices="['/dev/nvme0n1','/dev/sdd']" devices="['/dev/sdb','/dev/sdc']" osd_scenario="non-collocated" dmcrypt="True" osd_objectstore="filestore"
cephqe-node4 dedicated_devices="['/dev/nvme0n1','/dev/sdd']" devices="['/dev/sdb','/dev/sdc']" osd_scenario="non-collocated" osd_objectstore="filestore"

The OSD nodes repeatedly fail to start their OSD daemons.

journalctl error messages on the OSD nodes
-------------------------------------------
Dec 03 09:45:47 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: 2018-12-03 09:45:47 /entrypoint.sh: static: does not generate config
Dec 03 09:45:47 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: /dev/nvme0n1p1
Dec 03 09:45:48 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: main_activate: path = /dev/sdb1
Dec 03 09:45:48 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
Dec 03 09:45:48 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/sbin/blkid -o udev -p /dev/sdb1
Dec 03 09:45:48 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdb1
Dec 03 09:45:48 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
Dec 03 09:45:49 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: mount: Mounting /dev/sdb1 on /var/lib/ceph/tmp/mnt.3fi6L2 with options noatime,largeio,inode64,swalloc
Dec 03 09:45:49 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,largeio,inode64,swalloc -- /dev/sdb1 /var/lib/ceph/tmp/mnt.3fi6L2
Dec 03 09:45:49 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.3fi6L2
Dec 03 09:45:49 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: activate: Cluster uuid is aade34f6-53bb-43fa-8d9b-d5a51a4be74d
Dec 03 09:45:49 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: activate: Cluster name is ceph
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: activate: OSD uuid is a23a3aa3-8205-44d0-b4ce-a768f40ea602
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: activate: OSD id is 6
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup init
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/bin/ceph-detect-init --default sysvinit
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: activate: Marking with init system none
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.3fi6L2/none
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.3fi6L2/none
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: activate: ceph osd.6 data dir is ready at /var/lib/ceph/tmp/mnt.3fi6L2
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: move_mount: Moving mount to final location...
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command_check_call: Running command: /bin/mount -o noatime,largeio,inode64,swalloc -- /dev/sdb1 /var/lib/ceph/osd/ceph-6
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: command_check_call: Running command: /bin/umount -l -- /var/lib/ceph/tmp/mnt.3fi6L2
Dec 03 09:45:50 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:51 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:52 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:53 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:54 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:55 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:56 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:57 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:58 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:45:59 cephqe-node4.lab.eng.blr.redhat.com ceph-osd-run.sh[120149]: Waiting for /dev/sdb2 to show up
Dec 03 09:46:00 cephqe-node4.lab.eng.blr.redhat.com systemd[1]: ceph-osd: main process exited, code=exited, status=124/n/a
Dec 03 09:46:00 cephqe-node4.lab.eng.blr.redhat.com docker[171709]: Error response from daemon: No such container: ceph-osd-cephqe-node4-sdb
Dec 03 09:46:00 cephqe-node4.lab.eng.blr.redhat.com systemd[1]: Unit ceph-osd entered failed state.
Dec 03 09:46:00 cephqe-node4.lab.eng.blr.redhat.com systemd[1]: ceph-osd failed.
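In the log above the container entrypoint finishes activating the data partition (/dev/sdb1, osd.6) and then loops on "Waiting for /dev/sdb2 to show up" until the unit exits with status 124, which is the conventional exit code of a timed-out command. In this non-collocated layout the journal partitions were created on the dedicated devices rather than on the data disk (see the lsblk output below), so a second partition on /dev/sdb never appears. A minimal way to confirm which partitions actually exist on the affected node; device names are taken from the inventory above and will differ on other hosts:

    # List the data disks and the dedicated journal devices.
    lsblk /dev/sdb /dev/sdc /dev/nvme0n1 /dev/sdd
    # Show the GPT entries ceph-disk created (sgdisk is in the gdisk package).
    sgdisk --print /dev/sdb
    sgdisk --print /dev/nvme0n1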
[ubuntu@cephqe-node4 ~]$ lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                           8:0    0   279G  0 disk
├─sda1                        8:1    0     1G  0 part /boot
└─sda2                        8:2    0   278G  0 part
  ├─rhel_cephqe--node4-root 253:0    0    50G  0 lvm  /
  ├─rhel_cephqe--node4-swap 253:1    0     4G  0 lvm  [SWAP]
  └─rhel_cephqe--node4-home 253:2    0   224G  0 lvm  /home
sdb                           8:16   0   1.1T  0 disk
└─sdb1                        8:17   0   1.1T  0 part
sdc                           8:32   0   1.1T  0 disk
└─sdc1                        8:33   0   1.1T  0 part
sdd                           8:48   0 223.1G  0 disk
└─sdd1                        8:49   0     5G  0 part
nvme0n1                     259:0    0 931.5G  0 disk
└─nvme0n1p1                 259:1    0     5G  0 part
[ubuntu@cephqe-node4 ~]$

Version-Release number of selected component (if applicable):
ceph version 12.2.8-44.el7cp (31b25bc7bd626437e6b3da93b2de79640d8ccf61) luminous (stable)
ceph-ansible-3.2.0-0.1.rc5.el7cp.noarch
ansible-2.6.8-1.el7ae.noarch

How reproducible:
4/4

Steps to Reproduce:
1. Start the ceph-ansible playbook with the ceph-disk dedicated_devices (non-collocated) OSD scenario; a sketch of a typical invocation is included under Additional info below.

Actual results:
OSD creation failed.

Expected results:
OSD creation should succeed.

Additional info:
Ansible log file is attached.
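For reference, a minimal sketch of how such a containerized run is typically driven. It assumes the inventory shown in the description is at /etc/ansible/hosts and that ceph-ansible was installed from the RPM (adjust the path for a git checkout); this is an illustration, not a transcript of the exact commands used:

    # Run on the Ansible admin node.
    cd /usr/share/ceph-ansible
    # Containerized deployments in ceph-ansible 3.x use the docker site playbook.
    cp site-docker.yml.sample site-docker.yml
    ansible-playbook site-docker.yml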
The fix is included in https://github.com/ceph/ceph-ansible/releases/tag/v3.2.0rc8
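A quick way to check whether a given admin node already carries a build containing this change (the report above was filed against ceph-ansible-3.2.0-0.1.rc5.el7cp):

    rpm -q ceph-ansible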
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020