Description of problem:
Running the enable_stretch_mode command twice in a row returns two different messages.

[root@ceph-bharath-1603163288685-node1-monmgrinstaller cephuser]# /bin/ceph mon enable_stretch_mode ceph-bharath-1603163288685-node6-mon stretch_rule datacenter
Error EINVAL: the 2 datacenterinstances in the cluster have differing weights 5372 and 4062 but stretch mode currently requires they be the same!
[root@ceph-bharath-1603163288685-node1-monmgrinstaller cephuser]# /bin/ceph mon enable_stretch_mode ceph-bharath-1603163288685-node6-mon stretch_rule datacenter
stretch mode currently committing

The first run returns the EINVAL error above. The second run of the same command instead reports that stretch mode is currently committing.

Version-Release number of selected component (if applicable):
[root@ceph-bharath-1603163288685-node1-monmgrinstaller cephuser]# ceph versions
{
    "mon": {
        "ceph version 14.2.11-55.el8cp (a88999020b8767a4c384efbc8f9c061e95e78051) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.11-55.el8cp (a88999020b8767a4c384efbc8f9c061e95e78051) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.11-55.el8cp (a88999020b8767a4c384efbc8f9c061e95e78051) nautilus (stable)": 28
    },
    "mds": {},
    "overall": {
        "ceph version 14.2.11-55.el8cp (a88999020b8767a4c384efbc8f9c061e95e78051) nautilus (stable)": 32
    }
}

How reproducible:

Steps to Reproduce:
/bin/ceph config set osd osd_crush_update_on_start false
/bin/ceph osd crush move osd.0 host=host1-1 datacenter=site1
/bin/ceph osd crush move osd.1 host=host1-2 datacenter=site1
/bin/ceph osd crush move osd.2 host=host2-1 datacenter=site2
/bin/ceph osd crush move osd.3 host=host2-2 datacenter=site2
/bin/ceph osd getcrushmap > crush.map.bin
/bin/crushtool -d crush.map.bin -o crush.map.txt
cat <<EOF >> crush.map.txt
rule stretch_rule {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take site1
        step chooseleaf firstn 2 type host
        step emit
        step take site2
        step chooseleaf firstn 2 type host
        step emit
}
EOF
/bin/crushtool -c crush.map.txt -o crush2.map.bin
/bin/ceph osd setcrushmap -i crush2.map.bin
/bin/ceph mon set election_strategy connectivity
/bin/ceph mon set_location a datacenter=site1
/bin/ceph mon set_location ceph-bharath-1601914071234-node6-mon datacenter=site1
/bin/ceph mon set_location ceph-bharath-1601914071234-node1-monmgrinstaller datacenter=site2
/bin/ceph mon set_location ceph-bharath-1601914071234-node2-mon datacenter=site3
/bin/ceph osd pool create test_stretch1 8 8 replicated
/bin/ceph mon enable_stretch_mode ceph-bharath-1601914071234-node2-mon stretch_rule datacenter

Actual results:
1. The first run returns an error message.
2. The cluster nevertheless starts committing into stretch mode.

Expected results:
The command should show a proper message.

Additional info:
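For readers unfamiliar with the rule in the steps to reproduce: stretch_rule takes each of site1 and site2 in turn and picks two hosts (and one OSD under each) per site. The following is only a toy Python model of that selection, not CRUSH itself. Real CRUSH chooses pseudo-randomly by weight, whereas this sketch takes hosts in sorted order. The `TOPOLOGY` dict and the `toy_stretch_rule` function are hypothetical names introduced for illustration; the OSD and host placement comes from the `ceph osd crush move` commands above.

```python
# Topology as set up by the "ceph osd crush move" commands in the report.
TOPOLOGY = {
    "site1": {"host1-1": ["osd.0"], "host1-2": ["osd.1"]},
    "site2": {"host2-1": ["osd.2"], "host2-2": ["osd.3"]},
}


def toy_stretch_rule(topology, sites=("site1", "site2"), hosts_per_site=2):
    """Toy model of stretch_rule (hypothetical helper, not a Ceph API).

    For each site, pick up to `hosts_per_site` hosts and one OSD under each.
    Hosts and OSDs are taken in sorted order purely to keep the result
    deterministic; real CRUSH selection is pseudo-random and weight-aware.
    """
    chosen = []
    for site in sites:                                    # step take <site>
        hosts = sorted(topology[site])[:hosts_per_site]   # chooseleaf firstn 2 type host
        for host in hosts:
            chosen.append(sorted(topology[site][host])[0])  # one leaf (OSD) per host
        # step emit: this site's picks are appended to the result
    return chosen


print(toy_stretch_rule(TOPOLOGY))
```

With the four OSDs placed as in the report, the toy model selects one OSD on every host, two per datacenter.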
Ah yep, this code was erroneously committing changes when it was meant to be testing validity, so things ended up in a weird half-state. Patch in progress upstream.
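The failure pattern described in this comment can be sketched in a few lines of Python. This is not the Ceph monitor code and does not reflect the actual upstream patch; `MonState`, `weights_error`, `enable_buggy`, and `enable_fixed` are hypothetical names, the error strings are abbreviated, and assigning the two reported weights to site1 and site2 is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class MonState:
    """Hypothetical stand-in for the monitor's pending (staged) state."""
    stretch_pending: bool = False


def weights_error(weights):
    """Return an error string if the bucket weights differ, else None."""
    if len(set(weights.values())) > 1:
        return "Error EINVAL: differing weights"
    return None


def enable_buggy(state, weights):
    """Buggy pattern: the change is staged before validation has finished."""
    if state.stretch_pending:
        return "stretch mode currently committing"
    state.stretch_pending = True      # side effect during the validity check
    err = weights_error(weights)
    if err:
        return err                    # error is returned, but the flag stays set
    return "ok"


def enable_fixed(state, weights):
    """Fixed pattern: validate first, stage the change only on success."""
    if state.stretch_pending:
        return "stretch mode currently committing"
    err = weights_error(weights)
    if err:
        return err                    # nothing was modified
    state.stretch_pending = True
    return "ok"


# Weights from the error message; which site has which weight is assumed.
weights = {"site1": 5372, "site2": 4062}

buggy = MonState()
print(enable_buggy(buggy, weights))   # first run: EINVAL
print(enable_buggy(buggy, weights))   # second run: "currently committing"

fixed = MonState()
print(enable_fixed(fixed, weights))   # first run: EINVAL
print(enable_fixed(fixed, weights))   # second run: EINVAL again, state untouched
```

In the buggy variant the second call sees the leftover flag from the failed first call, which matches the two different messages in the report. In the fixed variant a failed validation leaves no trace, so repeated calls give the same answer.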
(In reply to Greg Farnum from comment #1)
> Ah yep, this code was erroneously committing changes when it was meant to be
> testing validity, so things ended up in a weird half-state. Patch in
> progress upstream.

Any updates? Can you post a link to the upstream patch?
Fixed in ceph-4.2-rhel-patches branch.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:0081