Description of problem:
With some specific disk configurations, the Ceph cluster is created correctly, but some OSDs are not visible in USM.

I have a cluster with the following disk configuration on two nodes (all disks are configured as SSD [1]):
- NODE3: 8 spare disks (sizes: 11G, 11G, 11G, 1T, 1T, 1T, 1T, 1T)
- NODE4: 8 spare disks (sizes: 6G, 11G, 16G, 100G, 1T, 1T, 1T, 1T)

NOTE: NODE1 and NODE2 have a different disk schema and 6 OSDs were properly created on each of them.

Cluster creation succeeds, but there is a problem with adding 2 OSDs on each of the above mentioned nodes:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OSD addition failed for [NODE3:map[/dev/vdh:/dev/vdd] NODE3:map[/dev/vdi:/dev/vdb] NODE4:map[/dev/vdf:/dev/vdb] NODE4:map[/dev/vdg:/dev/vdc]]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

And I don't see those problematic OSDs on the OSD summary page (Clusters->"CLUSTER"->OSDs).

But when I check directly on the nodes and in Ceph, all OSDs were created properly and are there:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[root@NODE3 ~]# ceph-disk list
/dev/vda :
 /dev/vda1 other, swap
 /dev/vda2 other, xfs, mounted on /
/dev/vdb :
 /dev/vdb1 ceph journal, for /dev/vdf1
 /dev/vdb2 ceph journal, for /dev/vdi1
/dev/vdc :
 /dev/vdc1 ceph journal, for /dev/vde1
/dev/vdd :
 /dev/vdd1 ceph journal, for /dev/vdg1
 /dev/vdd2 ceph journal, for /dev/vdh1
/dev/vde :
 /dev/vde1 ceph data, active, cluster TestClusterA, osd.12, journal /dev/vdc1
/dev/vdf :
 /dev/vdf1 ceph data, active, cluster TestClusterA, osd.15, journal /dev/vdb1
/dev/vdg :
 /dev/vdg1 ceph data, active, cluster TestClusterA, osd.13, journal /dev/vdd1
/dev/vdh :
 /dev/vdh1 ceph data, active, cluster TestClusterA, osd.14, journal /dev/vdd2
/dev/vdi :
 /dev/vdi1 ceph data, active, cluster TestClusterA, osd.16, journal /dev/vdb2

[root@NODE3 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda    253:0    0   20G  0 disk
├─vda1 253:1    0    2G  0 part
└─vda2 253:2    0   18G  0 part /
vdb    253:16   0   11G  0 disk
├─vdb1 253:17   0    5G  0 part
└─vdb2 253:18   0    5G  0 part
vdc    253:32   0   11G  0 disk
└─vdc1 253:33   0    5G  0 part
vdd    253:48   0   11G  0 disk
├─vdd1 253:49   0    5G  0 part
└─vdd2 253:50   0    5G  0 part
vde    253:64   0    1T  0 disk
└─vde1 253:65   0 1024G  0 part /var/lib/ceph/osd/TestClusterA-12
vdf    253:80   0    1T  0 disk
└─vdf1 253:81   0 1024G  0 part /var/lib/ceph/osd/TestClusterA-15
vdg    253:96   0    1T  0 disk
└─vdg1 253:97   0 1024G  0 part /var/lib/ceph/osd/TestClusterA-13
vdh    253:112  0    1T  0 disk
└─vdh1 253:113  0 1024G  0 part /var/lib/ceph/osd/TestClusterA-14
vdi    253:128  0    1T  0 disk
└─vdi1 253:129  0 1024G  0 part /var/lib/ceph/osd/TestClusterA-16

[root@NODE4 ~]# ceph-disk list
/dev/vda :
 /dev/vda1 other, swap
 /dev/vda2 other, xfs, mounted on /
/dev/vdb :
 /dev/vdb1 ceph journal, for /dev/vdf1
/dev/vdc :
 /dev/vdc2 ceph journal, for /dev/vdg1
 /dev/vdc1 ceph journal, for /dev/vdh1
/dev/vdd :
 /dev/vdd2 ceph journal, for /dev/vde1
 /dev/vdd1 ceph journal, for /dev/vdi1
/dev/vde :
 /dev/vde1 ceph data, active, cluster TestClusterA, osd.21, journal /dev/vdd2
/dev/vdf :
 /dev/vdf1 ceph data, active, cluster TestClusterA, osd.17, journal /dev/vdb1
/dev/vdg :
 /dev/vdg1 ceph data, active, cluster TestClusterA, osd.19, journal /dev/vdc2
/dev/vdh :
 /dev/vdh1 ceph data, active, cluster TestClusterA, osd.18, journal /dev/vdc1
/dev/vdi :
 /dev/vdi1 ceph data, active, cluster TestClusterA, osd.20, journal /dev/vdd1

[root@NODE4 ~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda    253:0    0   20G  0 disk
├─vda1 253:1    0    2G  0 part
└─vda2 253:2    0   18G  0 part /
vdb    253:16   0    6G  0 disk
└─vdb1 253:17   0    5G  0 part
vdc    253:32   0   11G  0 disk
├─vdc1 253:33   0    5G  0 part
└─vdc2 253:34   0    5G  0 part
vdd    253:48   0   16G  0 disk
├─vdd1 253:49   0    5G  0 part
└─vdd2 253:50   0    5G  0 part
vde    253:64   0  100G  0 disk
└─vde1 253:65   0  100G  0 part /var/lib/ceph/osd/TestClusterA-21
vdf    253:80   0    1T  0 disk
└─vdf1 253:81   0 1024G  0 part /var/lib/ceph/osd/TestClusterA-17
vdg    253:96   0    1T  0 disk
└─vdg1 253:97   0 1024G  0 part /var/lib/ceph/osd/TestClusterA-19
vdh    253:112  0    1T  0 disk
└─vdh1 253:113  0 1024G  0 part /var/lib/ceph/osd/TestClusterA-18
vdi    253:128  0    1T  0 disk
└─vdi1 253:129  0 1024G  0 part /var/lib/ceph/osd/TestClusterA-20
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Ceph also knows about all OSDs (6 OSDs per NODE1 and NODE2, 5 OSDs per NODE3 and NODE4):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[root@MON1 ~]# ceph --cluster TestClusterA osd stat
     osdmap e204: 22 osds: 22 up, 22 in
            flags sortbitwise
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
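As an additional cross-check (suggested here, not part of the original verification), the per-host placement of the OSDs can be listed on the MON node:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Suggested cross-check: show the CRUSH tree with OSDs grouped under their hosts.
# With the configuration above this should list 6 OSDs under NODE1 and NODE2
# and 5 OSDs under NODE3 and NODE4.
ceph --cluster TestClusterA osd tree
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~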
Version-Release number of selected component (if applicable):

USM Server (RHEL 7.2):
ceph-ansible-1.0.5-31.el7scon.noarch
ceph-installer-1.0.14-1.el7scon.noarch
rhscon-ceph-0.0.33-1.el7scon.x86_64
rhscon-core-0.0.34-1.el7scon.x86_64
rhscon-core-selinux-0.0.34-1.el7scon.noarch
rhscon-ui-0.0.48-1.el7scon.noarch

Ceph MON node (RHEL 7.2):
calamari-server-1.4.7-1.el7cp.x86_64
ceph-base-10.2.2-24.el7cp.x86_64
ceph-common-10.2.2-24.el7cp.x86_64
ceph-mon-10.2.2-24.el7cp.x86_64
ceph-selinux-10.2.2-24.el7cp.x86_64
libcephfs1-10.2.2-24.el7cp.x86_64
python-cephfs-10.2.2-24.el7cp.x86_64
rhscon-agent-0.0.15-1.el7scon.noarch
rhscon-core-selinux-0.0.34-1.el7scon.noarch

Ceph OSD node (RHEL 7.2):
ceph-base-10.2.2-24.el7cp.x86_64
ceph-common-10.2.2-24.el7cp.x86_64
ceph-osd-10.2.2-24.el7cp.x86_64
ceph-selinux-10.2.2-24.el7cp.x86_64
libcephfs1-10.2.2-24.el7cp.x86_64
python-cephfs-10.2.2-24.el7cp.x86_64
rhscon-agent-0.0.15-1.el7scon.noarch
rhscon-core-selinux-0.0.34-1.el7scon.noarch

How reproducible:
100%

Steps to Reproduce:
1. Prepare nodes for a USM cluster and on some nodes prepare/use the following disks:
   - nodeA: 8 SSD disks (11G, 11G, 11G, 1T, 1T, 1T, 1T, 1T)
   - nodeB: 8 SSD disks (6G, 11G, 16G, 100G, 1T, 1T, 1T, 1T)
   The disks were created via `qemu-img create -f qcow2 ${IMAGES_PATH}${NODE_NAME}-${DISK_NUMBER}.img ${SIZE}` and configured as SSD [1] (see the preparation sketch under Additional info).
2. Create a Ceph cluster via USM, using a 5GB journal.
3. Check the "Create Cluster" task and the OSD summary page (Clusters->"CLUSTER"->OSDs).
4. Check the number and list of OSDs in Ceph.
   On the MON node:
   # ceph --cluster TestClusterA osd stat
   On the OSD node:
   # lsblk
   # ceph-disk list

Actual results:
Some OSDs were properly created, but USM reports them as Failed and they are not visible there.

Expected results:
All properly created OSDs are visible in USM.

Additional info:
[1] `echo 0 > /sys/block/vdX/queue/rotational`
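To make step 1 of the reproduction concrete, here is a minimal sketch of the disk preparation for one node. It only combines the two commands documented above; IMAGES_PATH, NODE_NAME, the image directory, and the /dev/vd[b-i] device glob are illustrative assumptions, and attaching the images to the VM (e.g. via libvirt) is left out:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# On the hypervisor: create the backing images (command documented above;
# path and node name are example values, not taken from the report).
IMAGES_PATH=/var/lib/libvirt/images/
NODE_NAME=nodeA
SIZES=(11G 11G 11G 1T 1T 1T 1T 1T)   # disk sizes for nodeA from step 1

DISK_NUMBER=0
for SIZE in "${SIZES[@]}"; do
    DISK_NUMBER=$((DISK_NUMBER + 1))
    qemu-img create -f qcow2 "${IMAGES_PATH}${NODE_NAME}-${DISK_NUMBER}.img" "${SIZE}"
done

# Inside the guest, after the disks show up as /dev/vdb ... /dev/vdi:
# mark them as non-rotational so they are reported as SSDs (footnote [1]).
for DEV in /sys/block/vd[b-i]; do
    echo 0 > "${DEV}/queue/rotational"
done
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~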
Tested on:

USM Server (RHEL 7.2):
ceph-ansible-1.0.5-31.el7scon.noarch
ceph-installer-1.0.14-1.el7scon.noarch
rhscon-ceph-0.0.37-1.el7scon.x86_64
rhscon-core-0.0.37-1.el7scon.x86_64
rhscon-core-selinux-0.0.37-1.el7scon.noarch
rhscon-ui-0.0.51-1.el7scon.noarch

Ceph MON (RHEL 7.2):
calamari-server-1.4.7-1.el7cp.x86_64
ceph-base-10.2.2-30.el7cp.x86_64
ceph-common-10.2.2-30.el7cp.x86_64
ceph-mon-10.2.2-30.el7cp.x86_64
ceph-selinux-10.2.2-30.el7cp.x86_64
libcephfs1-10.2.2-30.el7cp.x86_64
python-cephfs-10.2.2-30.el7cp.x86_64
rhscon-agent-0.0.16-1.el7scon.noarch
rhscon-core-selinux-0.0.37-1.el7scon.noarch

Ceph OSD (RHEL 7.2):
ceph-base-10.2.2-30.el7cp.x86_64
ceph-common-10.2.2-30.el7cp.x86_64
ceph-osd-10.2.2-30.el7cp.x86_64
ceph-selinux-10.2.2-30.el7cp.x86_64
libcephfs1-10.2.2-30.el7cp.x86_64
python-cephfs-10.2.2-30.el7cp.x86_64
rhscon-agent-0.0.16-1.el7scon.noarch
rhscon-core-selinux-0.0.37-1.el7scon.noarch

With the same disk configuration as described in Comment 0, all created OSDs are visible in USM and none of them is marked as Failed.
Moving to VERIFIED as per Comment 2.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2016:1754