1271227 – Monitor thrashing causing the Cluster in a dead state

Summary: Monitor thrashing causing the Cluster in a dead state

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	Documentation
Sub Component:
Version:	1.3.1
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	rc
Target Release:	1.3.1
Assignee:	ceph-docs@redhat.com
QA Contact:	ceph-qe-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-10-13 12:20 UTC by Tanay Ganguly
Modified:	2016-09-20 01:50 UTC (History)
CC List:	11 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-12-18 09:59:28 UTC
Embargoed:
Dependent Products:

Attachments
Current Leader Mon Log (2.00 MB, text/plain) 2015-10-13 12:20 UTC, Tanay Ganguly

Description Tanay Ganguly 2015-10-13 12:20:10 UTC

Created attachment 1082405
Current Leader Mon Log

Description of problem:
After repeatedly adding and removing monitors (including adding new hosts as monitors), my cluster is now in an unusable state and keeps returning this error message:

"2015-10-13 23:01:25.332621 7fac6c50c700  0 -- :/1034060 >> 10.70.44.42:6789/0 pipe(0x7fac5c0008c0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fac5c0136c0).fault"

This happens when running any ceph command.

Version-Release number of selected component (if applicable):
ceph-0.94.3-2.el7cp.x86_64

How reproducible:
NA

Steps to Reproduce:
1. Started the test with 1 monitor.
2. Added 2 more monitors.
3. Killed 1 monitor and added a new host, which acted as a new monitor.
4. Added 2 more monitors. (At this point I had 5 monitors, the cluster was healthy, and the monitors were in quorum.)
5. Destroyed the leader monitor:
ceph-deploy mon destroy Node-Name


NOTE: I/O was running on the cluster the whole time.

Actual results:
After that, I was unable to perform any operation on the cluster; it went into an unusable state.

Expected results:
The 4 remaining monitors should be enough to form a quorum.
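
For reference, the quorum arithmetic behind this expectation can be sketched as below. Monitors need a strict majority of the monitors listed in the monmap to form a quorum. The helper names `quorum_size` and `has_quorum` are made up for illustration and are not part of any Ceph tool.

```python
def quorum_size(monmap_size: int) -> int:
    """Smallest strict majority of the monitors listed in the monmap."""
    if monmap_size < 1:
        raise ValueError("a monmap needs at least one monitor")
    return monmap_size // 2 + 1


def has_quorum(monmap_size: int, monitors_up: int) -> bool:
    """True when the monitors that are up form a strict majority."""
    return monitors_up >= quorum_size(monmap_size)


# 5 monitors in the monmap and the leader goes away: 4 of 5 are still up.
print(has_quorum(5, 4))
# Leader also removed from the monmap: 4 of the remaining 4 are up.
print(has_quorum(4, 4))
```

Either way the surviving monitors can form a quorum, which matches the "quorum 0,1,2,3" line in the monitor log; the failure reported here is on the client side.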

Additional info:
I tried the same steps on my 1.3.0 cluster with the same monitor configuration, and it worked fine.

From another monitor's log I can see that the health is OK and there are 4 monitors:
==============================================================================
2015-10-13 21:55:14.927477 7f4bde543700  0 log_channel(cluster) log [INF] : mon.cephqe3@0 won leader election with quorum 0,1,2,3
2015-10-13 21:55:14.933411 7f4bde543700  0 log_channel(cluster) log [INF] : HEALTH_OK
2015-10-13 21:55:14.934109 7f4bde543700  0 log_channel(cluster) log [WRN] : mon.3 10.70.44.56:6789/0 clock skew 23.1797s > max 0.05s
2015-10-13 21:55:14.934184 7f4bde543700  0 log_channel(cluster) log [WRN] : mon.2 10.70.44.54:6789/0 clock skew 7.02393s > max 0.05s
2015-10-13 21:55:14.952291 7f4bde543700  0 log_channel(cluster) log [INF] : monmap e6: 4 mons at {cephqe10=10.70.44.54:6789/0,cephqe11=10.70.44.56:6789/0,cephqe3=10.70.44.40:6789/0,cephqe7=10.70.44.48:6789/0}
===============================================================================
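
As a rough illustration, the monmap line in this log can be parsed to compare the live monitor addresses with the address in the fault message (10.70.44.42). The regular expression is an assumption based only on the single log line shown here, not on a documented log format, and `parse_monmap_line` is a hypothetical helper.

```python
import re

# Monmap line from the log above, written on a single line.
MONMAP_LINE = (
    "monmap e6: 4 mons at {cephqe10=10.70.44.54:6789/0,"
    "cephqe11=10.70.44.56:6789/0,cephqe3=10.70.44.40:6789/0,"
    "cephqe7=10.70.44.48:6789/0}"
)


def parse_monmap_line(line: str) -> dict:
    """Return {monitor name: IP address} from a 'monmap eN: ...' log line."""
    match = re.search(r"mons at \{(.*?)\}", line)
    if match is None:
        raise ValueError("no monmap entry found in line")
    monitors = {}
    for entry in match.group(1).split(","):
        name, addr = entry.split("=", 1)
        monitors[name] = addr.split(":", 1)[0]  # drop ':port/nonce'
    return monitors


monitors = parse_monmap_line(MONMAP_LINE)
print(sorted(monitors))
# Is the address from the client fault message part of the monmap?
print("10.70.44.42" in monitors.values())
```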

Attaching the current leader mon log

Comment 3 Kefu Chai 2015-10-14 07:37:45 UTC

the ceph.conf on cephqe3 looks like

# cat /etc/ceph/ceph.conf
[global]
osd crush location hook = /usr/bin/calamari-crush-location
fsid = 3461ab41-2b16-4e45-a350-902fe73ea98a
mon_initial_members = cephqe4
mon_host = 10.70.44.42
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true

The conf file only lists cephqe4 (10.70.44.42), which is the address in the fault message, so this is the expected behaviour of the "ceph" CLI and the monitors. But I am not sure whether it is expected from ceph-deploy's perspective. As per Tanay, he installed cephqe4 using "ceph-deploy install --mon <mon>", added the other monitors using "ceph-deploy mon add <mon>", and then took cephqe4 offline.

It seems we don't update ceph.conf when adding a new monitor using ceph-deploy. @Alfredo, is this expected behaviour? Looking at ceph-deploy/hosts/common.py, mon_add() does update the conf file, but it does not seem to rewrite "monmap", "mon host", or "mon addr" in the "mon[.*]" and "global" sections. The ceph CLI tries to build its initial monmap by reading those variables from the config file.

What if a user starts a cluster with only one monitor, adds more of them, and then takes the very first monitor offline? That leaves the user with a conf file containing an outdated monmap, and the CLI will not be able to connect to the cluster with it.
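
A minimal sketch of that failure mode, assuming the client can only seed its initial monmap from the addresses under mon_host: it reads the conf file quoted above and checks whether any listed address still belongs to a live monitor. `conf_mon_hosts` and `reachable_seeds` are hypothetical helpers, the live addresses are taken from monmap e6 in the description, and the sketch only handles the underscore spelling of the option.

```python
import configparser
import re

# Relevant part of the ceph.conf quoted above.
CEPH_CONF = """\
[global]
fsid = 3461ab41-2b16-4e45-a350-902fe73ea98a
mon_initial_members = cephqe4
mon_host = 10.70.44.42
"""

# Monitor addresses from monmap e6 in the description.
LIVE_MONITORS = {"10.70.44.54", "10.70.44.56", "10.70.44.40", "10.70.44.48"}


def conf_mon_hosts(conf_text: str) -> list:
    """Return the addresses listed under mon_host in the [global] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    raw = parser.get("global", "mon_host", fallback="")
    return [host for host in re.split(r"[,\s]+", raw) if host]


def reachable_seeds(conf_text: str, live: set) -> list:
    """Addresses in mon_host that still belong to a live monitor."""
    return [host for host in conf_mon_hosts(conf_text) if host in live]


print(conf_mon_hosts(CEPH_CONF))
print(reachable_seeds(CEPH_CONF, LIVE_MONITORS))
```

An empty second list means the conf file gives the client no live monitor to contact, which is the situation on cephqe3.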

Comment 4 Alfredo Deza 2015-10-14 12:10:08 UTC

ceph-deploy has no way to *update* values in a configuration file arbitrarily. When creating or adding a mon, the same configuration file that exists locally in the CWD is written to the remote host.

This seems to me like an advanced usage scenario, where we shouldn't expect ceph-deploy to be able to keep up with the effort of updating and synchronizing the configuration file.

Comment 5 Kefu Chai 2015-10-14 14:44:39 UTC

thank you, Alfredo!

Tanay, if you believe that we should update the documentation to address this issue, could you update the ticket accordingly? Thanks.

Comment 6 Tanay Ganguly 2015-10-15 06:59:12 UTC

Hi Kefu/Alfredo,

How can we document this? As discussed earlier, this can easily happen at a customer site: a customer who starts with a single initial Mon node later decides to expand the Mon cluster and adds a few more.

Then, if something goes wrong with the original Mon node, we will hit this issue.

Workaround:
We need to manually edit ceph.conf and remove the old entry.

Can we make a change in the code to avoid this workaround?

Comment 7 Ken Dreyer (Red Hat) 2015-10-20 15:38:33 UTC

To clarify: it sounds like we're saying the "Adding a Monitor" documentation should be updated to say "When adding a new monitor host, you should also add it to the 'mon initial members' configuration option in ceph.conf". Right?

The alternative is updating ceph-deploy to dynamically insert new monitors into ceph.conf on the fly. Unfortunately that ceph-deploy change is probably not going to happen any time soon :(

Comment 8 Federico Lucifredi 2015-10-21 16:42:11 UTC

Let's make it documentation per comment #7.

Comment 10 John Poelstra 2015-10-28 16:13:17 UTC

Approved at program meeting on Oct 28, 2015.

Comment 12 Tanay Ganguly 2015-10-29 10:28:59 UTC

John,

This change only talks about removing the old Mon reference from the ceph.conf file.

In addition to this, once a new Mon has been added to the cluster, we also need to add its IP address and hostname to the ceph.conf file.

Thanks,
Tanay

Comment 13 Kefu Chai 2015-10-29 10:53:03 UTC

> We need to manually change the ceph.conf and remove the old entry.

And, more importantly, to add the new one: the client needs to build the initial monmap so that it can contact at least one live monitor and get the latest monmap, osdmap and other important cluster information.
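
The manual edit discussed here (remove the old monitor, add the new ones) can be sketched as follows. This is only an illustration under assumptions: `update_monitor_lists` is a hypothetical helper, not something ceph-deploy provides, and rewriting the file through Python's configparser drops comments and lowercases option names, so hand-editing ceph.conf remains the straightforward route.

```python
import configparser
import io


def update_monitor_lists(conf_text, add=None, remove=None):
    """Return conf_text with mon_initial_members and mon_host updated.

    add and remove map monitor hostname -> IP address.
    """
    add = add or {}
    remove = remove or {}
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)

    def rewrite(option, to_add, to_remove):
        raw = parser.get("global", option, fallback="")
        items = [i.strip() for i in raw.split(",") if i.strip()]
        items = [i for i in items if i not in to_remove]  # drop old entries
        items += [i for i in to_add if i not in items]    # append new ones
        parser.set("global", option, ", ".join(items))

    rewrite("mon_initial_members", list(add), set(remove))
    rewrite("mon_host", list(add.values()), set(remove.values()))

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


ORIGINAL = """\
[global]
mon_initial_members = cephqe4
mon_host = 10.70.44.42
"""

updated = update_monitor_lists(
    ORIGINAL,
    add={"cephqe3": "10.70.44.40", "cephqe7": "10.70.44.48"},
    remove={"cephqe4": "10.70.44.42"},
)
print(updated)
```

The hostnames and addresses in the usage example come from the conf file and monmap quoted earlier in this bug; which monitors to list in a real cluster is up to the administrator.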

Comment 16 John Wilkins 2015-11-02 19:04:08 UTC

See: 

https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-administration-guide/commit/7bc27031c1ba98ea6902e0797753bb5582c5e237

https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-administration-guide/commit/47cf365e16b63b121255177fd30fffb56705a94b

Comment 19 John Wilkins 2015-11-18 17:49:47 UTC

See https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-administration-guide/commit/4fafbb099ae526515558a53816501183f9166016

Comment 20 Tanay Ganguly 2015-11-18 19:15:48 UTC

John,

As mentioned in comment 17, we need to add both of these to the documentation:

mon_initial_members
AND
mon_host

Another minor change, at the start of the Important section:
If are adding a monitor to a cluster that has only one monitor

Possible fixes:
If you are adding a monitor to a cluster that has only one monitor
If adding a monitor to a cluster that has only one monitor

Comment 21 John Wilkins 2015-11-19 00:35:36 UTC

See https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-administration-guide/commit/80ec873960e71e111f920488b405adfe16cea042

Comment 22 Tanay Ganguly 2015-11-19 09:01:48 UTC

Marking this Bug as Verified.

Comment 23 Anjana Suparna Sriram 2015-12-18 09:59:28 UTC

Fixed for 1.3.1 Release.
