Description of problem:
An error message is displayed when a Mon is removed, even though the cluster was not disrupted.

Version-Release number of selected component (if applicable):
ceph version 10.2.3-13.el7cp

How reproducible:
Always

Steps to Reproduce:
1. Stop the Monitor service on a Monitor node:
   sudo systemctl stop ceph-mon@<monitor_hostname>
2. Remove the monitor from the cluster:
   sudo ceph mon remove <mon_id>

Actual results:
$ sudo ceph mon remove magna072 --cluster master
Error EINVAL: removing mon.magna072 at 10.8.128.72:6789/0, there will be 3 monitors

Additional info:
The election process after removing the mon appears to run normally:

2016-11-12 16:22:24.326599 7fd960177700 0 mon.magna030@0(leader) e4 handle_command mon_command({"prefix": "mon remove", "name": "magna072"} v 0) v1
2016-11-12 16:22:24.326627 7fd960177700 0 log_channel(audit) log [INF] : from='client.? 10.8.128.30:0/673982027' entity='client.admin' cmd=[{"prefix": "mon remove", "name": "magna072"}]: dispatch
2016-11-12 16:22:24.395750 7fd95de70700 0 -- 10.8.128.30:6789/0 >> 10.8.128.111:6789/0 pipe(0x7fd974160800 sd=13 :33906 s=2 pgs=7 cs=1 l=0 c=0x7fd973d3aa00).fault, initiating reconnect
2016-11-12 16:22:24.396399 7fd95df71700 0 -- 10.8.128.30:6789/0 >> 10.8.128.111:6789/0 pipe(0x7fd9746b9400 sd=13 :6789 s=0 pgs=0 cs=0 l=0 c=0x7fd9742cb080).accept connect_seq 0 vs existing 2 state connecting
2016-11-12 16:22:24.396414 7fd95df71700 0 -- 10.8.128.30:6789/0 >> 10.8.128.111:6789/0 pipe(0x7fd9746b9400 sd=13 :6789 s=0 pgs=0 cs=0 l=0 c=0x7fd9742cb080).accept peer reset, then tried to connect to us, replacing
2016-11-12 16:22:24.412214 7fd960177700 0 log_channel(cluster) log [INF] : mon.magna030 calling new monitor election
2016-11-12 16:22:24.412264 7fd960177700 1 mon.magna030@0(electing).elector(34) init, last seen epoch 34
2016-11-12 16:22:24.435800 7fd960177700 0 log_channel(cluster) log [INF] : mon.magna030@0 won leader election with quorum 0,1,2
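Since the command prints "Error EINVAL" even though the election log looks healthy, one way to confirm whether the monitor really left the monmap is to inspect the output of `ceph mon dump --format json`. The following is only a minimal sketch, not part of the original report. It assumes the JSON contains a `mons` list whose entries carry a `name` field. The helper names (`monitor_names`, `monitor_removed`, `fetch_mon_dump`) and the sample monmap are hypothetical and purely illustrative.

```python
import json
import subprocess


def monitor_names(mon_dump_json):
    """Return the monitor names found in a `ceph mon dump --format json` document."""
    monmap = json.loads(mon_dump_json)
    # Assumption: monitors are listed under "mons", each with a "name" key.
    return [mon["name"] for mon in monmap.get("mons", [])]


def monitor_removed(mon_dump_json, mon_id):
    """True if mon_id no longer appears in the monmap."""
    return mon_id not in monitor_names(mon_dump_json)


def fetch_mon_dump(cluster="ceph"):
    """Run the ceph CLI and return the monmap as a JSON string.

    Assumption: the ceph CLI is on PATH and an admin keyring is readable.
    """
    cmd = ["ceph", "--cluster", cluster, "mon", "dump", "--format", "json"]
    return subprocess.check_output(cmd).decode()


if __name__ == "__main__":
    # Illustrative monmap only; "mon-b" and "mon-c" are made-up names,
    # not taken from the cluster in this report.
    sample = json.dumps({
        "epoch": 4,
        "mons": [
            {"rank": 0, "name": "magna030"},
            {"rank": 1, "name": "mon-b"},
            {"rank": 2, "name": "mon-c"},
        ],
        "quorum": [0, 1, 2],
    })
    print(monitor_removed(sample, "magna072"))
```

On a live cluster, `monitor_removed(fetch_mon_dump("master"), "magna072")` would apply the same check to the cluster named in the report.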
This is a known issue; the fix is pending backport upstream. See http://tracker.ceph.com/issues/17725
https://github.com/ceph/ceph/pull/11999
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0514.html