Description of problem
======================

When a ceph cluster is created via RHSC 2.0, two independent cluster
hierarchies can be found in its CRUSH cluster map.

Version-Release
===============

On RHSC 2.0 server:

rhscon-ceph-0.0.27-1.el7scon.x86_64
rhscon-core-0.0.28-1.el7scon.x86_64
rhscon-core-selinux-0.0.28-1.el7scon.noarch
rhscon-ui-0.0.42-1.el7scon.noarch
ceph-ansible-1.0.5-23.el7scon.noarch
ceph-installer-1.0.12-3.el7scon.noarch

On Ceph Storage nodes:

rhscon-agent-0.0.13-1.el7scon.noarch
ceph-osd-10.2.2-5.el7cp.x86_64

How reproducible
================

100 %

Steps to Reproduce
==================

1. Install RHSC 2.0 following the documentation.
2. Accept a few nodes for the ceph cluster.
3. Create a new ceph cluster named 'alpha'.
4. Check the CRUSH cluster map.

Actual results
==============

There are 2 independent cluster hierarchies in the cluster map:

~~~
# ceph -c /etc/ceph/alpha.conf osd tree
ID  WEIGHT  TYPE NAME                                                UP/DOWN REWEIGHT PRIMARY-AFFINITY
-10 0.03998 root general
 -6 0.00999     host mbukatov-usm1-node2.os1.phx2.redhat.com-general
  1 0.00999         osd.1                                                 up  1.00000          1.00000
 -7 0.00999     host mbukatov-usm1-node3.os1.phx2.redhat.com-general
  2 0.00999         osd.2                                                 up  1.00000          1.00000
 -8 0.00999     host mbukatov-usm1-node1.os1.phx2.redhat.com-general
  0 0.00999         osd.0                                                 up  1.00000          1.00000
 -9 0.00999     host mbukatov-usm1-node4.os1.phx2.redhat.com-general
  3 0.00999         osd.3                                                 up  1.00000          1.00000
 -1       0 root default
 -2       0     host mbukatov-usm1-node1
 -3       0     host mbukatov-usm1-node2
 -4       0     host mbukatov-usm1-node3
 -5       0     host mbukatov-usm1-node4
~~~

In this case:

* the 1st hierarchy has a root with ID -10 (named "general")
* the 2nd hierarchy has a root with ID -1 (named "default")

Expected results
================

There is only a single cluster hierarchy in the cluster map, so that the
output would look something like this:

~~~
# ceph -c /etc/ceph/alpha.conf osd tree
ID  WEIGHT  TYPE NAME                                                UP/DOWN REWEIGHT PRIMARY-AFFINITY
-10 0.03998 root general
 -6 0.00999     host mbukatov-usm1-node2.os1.phx2.redhat.com-general
  1 0.00999         osd.1                                                 up  1.00000          1.00000
 -7 0.00999     host mbukatov-usm1-node3.os1.phx2.redhat.com-general
  2 0.00999         osd.2                                                 up  1.00000          1.00000
 -8 0.00999     host mbukatov-usm1-node1.os1.phx2.redhat.com-general
  0 0.00999         osd.0                                                 up  1.00000          1.00000
 -9 0.00999     host mbukatov-usm1-node4.os1.phx2.redhat.com-general
  3 0.00999         osd.3                                                 up  1.00000          1.00000
~~~

Additional info
===============

For details about the CRUSH map, see:
http://docs.ceph.com/docs/master/rados/operations/crush-map/
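For reference, the duplicated roots can also be confirmed by decompiling the CRUSH map directly with the standard ceph/crushtool commands; a minimal sketch (the /tmp file paths are arbitrary):

~~~
# extract the binary CRUSH map from the 'alpha' cluster
ceph -c /etc/ceph/alpha.conf osd getcrushmap -o /tmp/alpha-crushmap.bin

# decompile it into a readable text form
crushtool -d /tmp/alpha-crushmap.bin -o /tmp/alpha-crushmap.txt

# list the root buckets defined in the map; both "general" and
# "default" are expected to show up as independent roots
grep '^root ' /tmp/alpha-crushmap.txt
~~~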
I have a few additional questions here:

1) Why do we have 2 cluster hierarchies in the crush map? Is it intentional
   or just a remnant of some action during cluster setup?

2) Why does each hierarchy use a different bucket naming scheme for hosts?

3) Which component created each hierarchy?

4) Why does only one hierarchy have OSDs attached while the other doesn't?

That said, based on my current understanding of the CRUSH cluster map, I don't
think it makes any sense to have 2 hierarchies in the cluster map like that.

Without a clear purpose, RHSC 2.0 shouldn't create such a complicated and
error prone configuration.
(In reply to Martin Bukatovic from comment #1)
> I have a few additional questions here:
>
> 1) Why do we have 2 cluster hierarchies in the crush map? Is it intentional
>    or just a remnant of some action during cluster setup?

You will see as many hierarchies as there are valid storage profiles
applicable to the cluster.

> 2) Why does each hierarchy use a different bucket naming scheme for hosts?

As mentioned above, it is based on the storage profile.

> 3) Which component created each hierarchy?

It will be created by the ceph provider after the cluster is created.

> 4) Why does only one hierarchy have OSDs attached while the other doesn't?

This was due to the earlier implementation, where the default tree was
ignored. Also, calamari won't allow an OSD to be present in two different
hierarchies; hence the original tree will be empty once all the OSDs are
moved from the default root to the others.

> That said, based on my current understanding of the CRUSH cluster map, I don't
> think it makes any sense to have 2 hierarchies in the cluster map like that.
>
> Without a clear purpose, RHSC 2.0 shouldn't create such a complicated and
> error prone configuration.

This is implemented based on the requirement that pools are to be created on
top of specific OSDs grouped by the storage profile.

As part of this patch, the issue of ignoring the default hierarchy has been
fixed. It is still possible to see a blank default root if the user moves all
the OSDs out of default.
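To illustrate how a pool ends up restricted to one per-profile hierarchy, here is a minimal sketch using the stock Jewel-era CLI; the rule name, pool name, PG counts and ruleset id below are hypothetical examples, and RHSC / the ceph provider performs an equivalent step on its own:

~~~
# create a replicated CRUSH rule whose placement starts at the
# "general" root and spreads replicas across hosts under it
ceph osd crush rule create-simple general_rule general host

# create a pool bound to that rule, so its data only lands on OSDs
# that belong to the "general" hierarchy
ceph osd pool create rbd_general 64 64 replicated general_rule

# alternatively, point an existing pool at the rule by ruleset id
ceph osd crush rule dump general_rule
ceph osd pool set rbd_general crush_ruleset 1
~~~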
Tested on:
rhscon-core-0.0.36-1.el7scon.x86_64
rhscon-ui-0.0.50-1.el7scon.noarch
rhscon-ceph-0.0.36-1.el7scon.x86_64
rhscon-core-selinux-0.0.36-1.el7scon.noarch

I have several active storage profiles:

~~~
# ceph osd tree
ID  WEIGHT  TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
-11 0.00999 root test
-10 0.00999     host dhcp-126-101-test
  0 0.00999         osd.0                      up  1.00000          1.00000
 -9 0.02998 root ec_test
 -6 0.00999     host dhcp-126-103-ec_test
  4 0.00999         osd.4                      up  1.00000          1.00000
 -7 0.00999     host dhcp-126-102-ec_test
  3 0.00999         osd.3                      up  1.00000          1.00000
 -8 0.00999     host dhcp-126-105-ec_test
  6 0.00999         osd.6                      up  1.00000          1.00000
 -1 0.03998 root default
 -2 0.00999     host dhcp-126-101
  1 0.00999         osd.1                      up  1.00000          1.00000
 -3 0.00999     host dhcp-126-102
  2 0.00999         osd.2                      up  1.00000          1.00000
 -4 0.00999     host dhcp-126-103
  5 0.00999         osd.5                      up  1.00000          1.00000
 -5 0.00999     host dhcp-126-105
  7 0.00999         osd.7                      up  1.00000          1.00000
~~~

It works as it should: pools can be created and data can be stored, both from
the GUI and from the CLI. However, because of the multiple hierarchies, ceph
statistics are not correct. In some places it seems that ceph counts all OSDs
as available for a pool.
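A sketch of how the reported capacity can be cross-checked against the per-root layout using only the stock ceph CLI (no RHSC-specific tooling assumed):

~~~
# per-bucket utilization grouped by CRUSH hierarchy; each root
# (test, ec_test, default) is listed with its own OSDs and weights
ceph osd df tree

# per-pool view; MAX AVAIL is derived from the OSDs reachable via the
# pool's CRUSH rule, so a pool pinned to a small root should report
# less free space than the cluster-wide total in the GLOBAL section
ceph df detail
~~~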
(In reply to Nishanth Thomas from comment #2)
> (In reply to Martin Bukatovic from comment #1)
> > 1) Why do we have 2 cluster hierarchies in the crush map? Is it intentional
> >    or just a remnant of some action during cluster setup?
>
> You will see as many hierarchies as there are valid storage profiles
> applicable to the cluster.

Ok, we will assume this is the intended RHSC 2.0 design.

(In reply to Lubos Trilety from comment #4)
> Tested on:
> rhscon-core-0.0.36-1.el7scon.x86_64
> rhscon-ui-0.0.50-1.el7scon.noarch
> rhscon-ceph-0.0.36-1.el7scon.x86_64
> rhscon-core-selinux-0.0.36-1.el7scon.noarch
>
> I have several active storage profiles:
>
> # ceph osd tree
> ID  WEIGHT  TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -11 0.00999 root test
> -10 0.00999     host dhcp-126-101-test
>   0 0.00999         osd.0                      up  1.00000          1.00000
>  -9 0.02998 root ec_test
>  -6 0.00999     host dhcp-126-103-ec_test
>   4 0.00999         osd.4                      up  1.00000          1.00000
>  -7 0.00999     host dhcp-126-102-ec_test
>   3 0.00999         osd.3                      up  1.00000          1.00000
>  -8 0.00999     host dhcp-126-105-ec_test
>   6 0.00999         osd.6                      up  1.00000          1.00000
>  -1 0.03998 root default
>  -2 0.00999     host dhcp-126-101
>   1 0.00999         osd.1                      up  1.00000          1.00000
>  -3 0.00999     host dhcp-126-102
>   2 0.00999         osd.2                      up  1.00000          1.00000
>  -4 0.00999     host dhcp-126-103
>   5 0.00999         osd.5                      up  1.00000          1.00000
>  -5 0.00999     host dhcp-126-105
>   7 0.00999         osd.7                      up  1.00000          1.00000
>
> It works as it should: pools can be created and data can be stored, both from
> the GUI and from the CLI. However, because of the multiple hierarchies, ceph
> statistics are not correct. In some places it seems that ceph counts all OSDs
> as available for a pool.

Since Nishanth stated that the current design is to maintain a dedicated
cluster hierarchy for each storage profile, I consider this behaviour correct,
and so I think it's ok to validate this BZ.

That said, personally I'm not completely sure the current design of the
storage profiles feature makes sense. But that would be a starting point for
another discussion (and another BZ).
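For completeness: if an administrator did want to tidy away an emptied default tree (as in the original description, where the host buckets under root "default" hold no OSDs), a minimal sketch could look like the following, assuming no CRUSH rule still references the default root. `ceph osd crush remove` refuses to delete a bucket that still contains items, so the attempt is safe:

~~~
# remove the empty host buckets that hang off the unused default root
ceph osd crush remove mbukatov-usm1-node1
ceph osd crush remove mbukatov-usm1-node2
ceph osd crush remove mbukatov-usm1-node3
ceph osd crush remove mbukatov-usm1-node4

# then remove the now-empty default root itself
ceph osd crush remove default
~~~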
Based on the information provided in comment 4 and comment 5, moving to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2016:1754