Bug 1278822

Summary: 1.3.1: "Managing Cluster Size" doc needs changes in add monitor section
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Harish NV Rao <hnallurv>
Component: Documentation
Assignee: ceph-docs <ceph-docs>
Status: CLOSED CURRENTRELEASE
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: unspecified
Priority: unspecified
Version: 1.3.1
CC: flucifre, kchai, kdreyer, ngoswami, shmohan
Target Milestone: rc
Target Release: 1.3.1
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-12-18 10:00:16 UTC
Type: Bug

Description Harish NV Rao 2015-11-06 13:45:16 UTC
Description of problem:

While testing bugs 1231203 and 1271227, QE encountered a few issues when adding a monitor that require documentation fixes.

The document in question: https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-administration-guide/blob/v1.3/cluster-size.adoc. Please modify it according to the information provided below.

--------------------------------------------------------

Section "Add monitor" using ceph-deploy: 

Before invoking the 'ceph-deploy mon add' command:
a) Modify ceph.conf so that the mon_initial_members and mon_host parameters include the new monitor:

mon_initial_members = <existing-mon1>,<existing-mon2>,<NEW-mon3>
mon_host = <IP of existing-mon1>,<IP of existing-mon2>,<IP of NEW-mon3>

b) Distribute the updated ceph.conf file to the other nodes.
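
Step a) can be scripted. The following is only a minimal sketch, assuming the two parameters live in the [global] section as comma-separated lists; the function name add_monitor and the sample names and addresses are hypothetical and not part of ceph-deploy or any Ceph tool.

```python
import configparser
import io


def add_monitor(conf_text, new_name, new_ip):
    """Return ceph.conf text with the new monitor appended to both lists."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    section = parser["global"]
    for key, value in (("mon_initial_members", new_name), ("mon_host", new_ip)):
        # Split the existing comma-separated list and drop stray whitespace.
        items = [v.strip() for v in section.get(key, "").split(",") if v.strip()]
        if value not in items:
            items.append(value)
        section[key] = ",".join(items)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical monitor names and documentation-range addresses.
sample = "[global]\nmon_initial_members = mon1,mon2\nmon_host = 192.0.2.1, 192.0.2.2\n"
print(add_monitor(sample, "mon3", "192.0.2.3"))
```

Note that configparser drops comments when it rewrites the file, so this approach only suits a ceph.conf without comments worth keeping. Distributing the result (step b) is left to whatever mechanism the site already uses.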

---------------------------------------------------------

Section "manual" for adding mon: 

1. Before adding a monitor to the cluster manually:

a) Modify ceph.conf so that the mon_initial_members and mon_host parameters include the new monitor:

mon_initial_members = <existing-mon1>,<existing-mon2>,<NEW-mon3>
mon_host = <IP of existing-mon1>,<IP of existing-mon2>,<IP of NEW-mon3>

b) Add the pid file entry under [global], followed by a section for the new monitor:

pid file = /var/run/ceph/$name.pid
[mon.<mon-node>]
host = <mon-node>

c) Distribute the updated ceph.conf file to the other nodes.

2. Step 6 (ceph mon add <mon-id> <ip>[:<port>]) should be removed. Executing this command hangs, and the cluster becomes unusable.

3. Step 7 should be "sudo ceph-mon -i {mon-id} --public-addr {ip:port} --pid-file /var/run/ceph/mon.{mon-id}.pid"

-----------------------------------------------------------------

A sample ceph.conf file:

(Note: magna037 -> initial monitor, magna089 -> manually added mon, magna040 -> added via ceph-deploy mon add)

[cephuser@magna006 ceph-config]$ cat ceph.conf
[global]
fsid = fae42f30-434e-4f4e-b57a-156af84037ad

mon_initial_members = magna037,magna089,magna040
mon_host = 10.b.c.37, 10.x.y.89,10.m.n.40
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true

osd_pool_default_size = 2
osd_pool_default_min_size = 1

pid file = /var/run/ceph/$name.pid
[mon.magna089]
host = magna089


Version-Release number of selected component (if applicable): 1.3.1 RHEL 7.1



Comment 2 John Wilkins 2015-11-06 19:11:14 UTC
See https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-administration-guide/commit/02d08faee60e45a4b3ffbcb9b3d09e39af8a3e17

We need confirmation from Kefu Chai. mon_initial_members isn't supposed to be required unless it is part of an initial quorum. However, settings like mon_addr are no longer present in config_opts.h, and the default for pid is a null string now.

We need to test this out on v1.2.3 as well.

Comment 3 Harish NV Rao 2015-11-06 19:19:32 UTC
Hi Kefu,

Can you please provide confirmation as requested in comment 2?

Regards,
Harish

Comment 4 Harish NV Rao 2015-11-06 19:23:06 UTC
Hi John,

This line - "Also, ensure you have pid file = /var/run/ceph/$name.pid set in the [global] section of your Ceph configuration file." - belongs in the "manual" section, because it does not apply when adding a monitor with ceph-deploy.

Harish

Comment 5 Kefu Chai 2015-11-09 05:29:49 UTC
>We need confirmation from Kefu Chai. mon_initial_members isn't supposed to be required unless it is part of an initial quorum. However, settings like mon_addr are no longer present in config_opts.h, and the default for pid is a null string now.

> mon_host was specified with host name. However,
> https://bugzilla.redhat.com/show_bug.cgi?id=1278822 is calling for it to be
> an IP address. Since, mon_addr is no longer a setting, do we use
> public_addr in a ceph.conf [mon.name] section, or specify the IP address in
> mon_host?

tl;dr: if we are adding a new monitor, we can simply use public_addr in the [mon.name]
section on the side of the newly added monitor. however, the setting in the doc[1]

> [mon.<hostname>]
> mon_host = <hostname>
> public_addr = {iport}

does not look quite right, because mon_host is a global setting for monitors. we should
probably put it in the "[mon]" section instead, so this would be better:

> [mon]
> mon_host = <hostname>
> [mon.<hostname>]
> public_addr = {iport}

long story:

mon_host can be a list of DNS-resolvable hostnames (separated by "," or ";" or " "),
or a list of ip addresses. the address list is used to build up the initial monmap.
if we fail to create the monmap using mon_host, mon_addr kicks in.

mon_addr is still involved in the process of building the initial monmap if the
"monmap" file is not specified in ceph.conf or it is not readable. in that case,
we collect every "mon addr" from all "[mon.*]" sections, hoping to get the address of
each monitor, and build up the monmap from the collected addresses.

if the monid (yes, it's the <monid> in the command options "--id <monid>" or
"-i <monid>") exists in the monmap, we look up the address for the monitor in the
monmap and check the address found against "mon_addr". if they are
different, we complain about the inconsistency.

if the booting monitor is not listed in the monmap, we must specify either
"public_network" or "public_addr", so that it can get an ip address to bind to.
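
To make the lookup order above easier to follow, here is a rough sketch in Python. It only paraphrases the description in this comment and is not Ceph's actual code: the function names, the dict-based monmap, and the choice to prefer public_addr over public_network when both are set are all assumptions made for illustration.

```python
import re
import warnings


def split_mon_host(value):
    """Split a mon_host value on the separators named above: ',', ';' or ' '."""
    return [item for item in re.split(r"[,; ]+", value.strip()) if item]


def resolve_bind_address(mon_id, monmap, mon_addr=None,
                         public_addr=None, public_network=None):
    """Return (source, value) saying where the bind address comes from."""
    if mon_id in monmap:
        address = monmap[mon_id]
        # A differing mon_addr is reported, but the monmap entry is used.
        if mon_addr is not None and mon_addr != address:
            warnings.warn(f"mon_addr {mon_addr} differs from monmap address {address}")
        return ("monmap", address)
    # The monitor is not in the monmap: one of these settings is required.
    if public_addr is not None:
        return ("public_addr", public_addr)
    if public_network is not None:
        return ("public_network", public_network)
    raise ValueError(f"mon.{mon_id}: set public_addr or public_network")


# Hypothetical monitor id and documentation-range address.
print(split_mon_host("mon1, mon2;mon3 mon4"))
print(resolve_bind_address("new", {}, public_addr="192.0.2.3:6789"))
```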

on the client's side, we just need to make sure the client is able to
contact at least one of the monitors listed in mon. so the mon_host list and
mon_addr matter on the client's side. what to do depends on how we maintain
the ceph.conf files. if we want minimal entries for the client side
and the monitor (server) side, we can just keep an almost-updated mon_host entry
on the client side, which makes sure that the client is able to talk to
one member of the quorum. but if we want a single copy of ceph.conf that is
used on all nodes, including clients and servers, the better
way is probably to add/remove a section for each newly added/removed daemon and
let the client and daemon build the monmap by collecting the addresses
from the mon sections.


as to adding the new monitor to mon_initial_members: it's only needed if
it's to be part of the initial quorum. we have this option to enforce that a monitor should be considered when forming a quorum. and this setting is only used on monitor side, not the client side.

Comment 6 John Wilkins 2015-11-09 21:51:28 UTC
Made some modifications to clarify.

Comment 7 Federico Lucifredi 2015-11-10 00:05:34 UTC
Kefu, can you please look at John's edits? If you are +1, QE will proceed to test right away.

Comment 8 Kefu Chai 2015-11-10 04:34:32 UTC
@John, sorry for confusing you.

- in the manual steps, we also need to have
  [mon.{mon-id}]
  host = {mon-id}
- on the client's side[1], we have two options:
  1. maintain an almost updated monmap using mon.mon_host
   [mon]
   mon host = {ip:port} {ip:port} {new-ip:port}
  2. put the new monitor in its own new section
   # but since we are more likely to have "mon host", that setting overrides this one
   # when we are building the initial monmap.
   [mon.{mon-id}]
   mon addr = {ip:port}


so, if we want to use a single copy of ceph.conf for both the client and server side, and at the same time minimize the duplicated settings for client and server, we can have the following ceph.conf when adding a monitor:

- manual steps

[global]
pid file = /var/run/ceph/$name.pid
[mon]
mon host = {hostname}:{hostname}:{new-hostname}
[mon.{new-mon-id}]
host = {new-hostname}


- ceph-deploy steps

[global]
# if we are using an updated ceph-deploy, we don't need this for ceph-deploy steps
# see https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c60 and 
#     https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c63 and
#     https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c64, will confirm this 
# with Ken at end of this comment
pid file = /var/run/ceph/$name.pid
[mon]
mon host = {hostname}:{hostname}:{new-hostname}



@Ken, could you help confirm that the fix (https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c64) for ceph-deploy will be in 1.3.1?

---
[1] client means monitor, mds, osd, and the cluster clients. whoever would like to talk to a monitor is a monitor client. the settings on the client's side apply to both the ceph-deploy and manual steps, as the difference between the ceph-deploy and manual steps resides only on the server side.

Comment 9 Harish NV Rao 2015-11-10 12:04:47 UTC
Hi John,

I am moving this defect to Assigned state so that you can incorporate any changes if needed based on above comment.

Regards,
Harish

Comment 10 Harish NV Rao 2015-11-10 12:28:36 UTC
(In reply to Kefu Chai from comment #8)
> @John, sorry for confusing you.
> 
> - in the manual steps, we need also have
>   [mon.{mon-id}]
>   host = {mon-id}
> - on the client's side[1], we have two options:
>   1. maintain an almost updated monmap using mon.mon_host
>    [mon]
>    mon host = {ip:port} {ip:port} {new-ip:port}
>   2. put the new monitor in it's own new section
>    # but since we are more likely to have "mon host", that setting overrides
> this one
>    # we are building the initial monmap.
>    [mon.{mon-id}]
>    mon addr = {ip:port}
> 

Instead of the options mentioned above (say, Option-1), can we document only the option mentioned below (say, Option-2)? I feel that documenting both might confuse both doc writers and users. Could we go with Option-2 alone as the preferred option? (I am not sure which of the two is used more at customer sites.)

> 
> so, if we want to use a single copy of ceph.conf for both client and server
> side, and in the meanwhile, we want to minimize the duplicated settings for
> client and server, we can have following ceph.conf when adding a monitor:
> 


> - manual steps
> 
> [global]
> pid file = /var/run/ceph/$name.pid
> [mon]
> mon host = {hostname}:{hostname}:{new-hostname}
> [mon.{new-mon-id}]
> host = {new-hostname}
> 
> 
> - ceph-deploy steps
> 
> [global]
> # if we are using an updated ceph-deploy, we don't need this for ceph-deploy
> steps
> # see https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c60 and 
> #     https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c63 and
> #     https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c64, will confirm
> this 
> # with Ken at end of this comment
> pid file = /var/run/ceph/$name.pid

Please note that 'ceph-deploy mon add' also works without adding "pid file = /var/run/ceph/$name.pid".

> [mon]
> mon host = {hostname}:{hostname}:{new-hostname}
> 
> 
> 
> @Ken, could you help confirm that the fix
> (https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c64) for ceph-deploy
> will be in 1.3.1?
> 
> ---
> [1] client means monitor, mds, osd, and the cluster clients. whoever would
> like to talk to a monitor is a monitor client. the settings on client's side
> apply to both ceph-deploy and manual steps. as the difference between
> ceph-deploy and manual steps only reside in the server side.

Comment 11 Kefu Chai 2015-11-10 17:11:26 UTC
> Could we just go with only Option-2 as the preferred option? 

if we don't have "mon host", then the answer is: yes, we could.

> Please note that 'ceph-deploy mon add' works without adding "pid file = /var/run/ceph/$name.pid" also. 

as i said, it depends on the version of ceph-deploy you are using.

Comment 12 Ken Dreyer (Red Hat) 2015-11-10 17:43:25 UTC
(In reply to Kefu Chai from comment #8)
> @Ken, could you help confirm that the fix
> (https://bugzilla.redhat.com/show_bug.cgi?id=1231203#c64) for ceph-deploy
> will be in 1.3.1?

Yes, that fix has been cherry-picked and will ship in RHCS 1.3.1 as ceph-deploy 1.5.27.3.

Comment 14 Kefu Chai 2015-11-11 04:20:03 UTC
john, i see you are listing 


[mon.{mon-id}]
host = {mon-id}
public_addr = {ip:port}

as an alternative to 

[mon]
mon_host = {mon-ip:port} {mon-ip:port} {new-mon-ip:port}

but they are not.

here, public_addr helps the monitor figure out what ip address (and port) to bind to. but mon_host helps to build the monmap, and the monmap in turn is looked up by the monitor to find out its address and port to bind to. as we know, the monmap is used by both the client and server side. so, if the user pushes

[mon.{mon-id}]
host = {mon-id}
public_addr = {ip:port}

to a monitor client, the client will not be updated with the latest monmap if its "mon_host" is very outdated and does not include any monitor in the quorum.


maybe we can simply put:

 [mon]
 mon host = {ip:port} {ip:port} {new-ip:port}

 [mon.{mon-id}]
 host = {mon-id}


otherwise we might need to explain to user what monmap stands for, and the priority of mon_host and mon_addr when an initial monmap is built.
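
As an illustration of the combined settings proposed in this comment, the fragment can be generated from a list of monitor addresses. This is only a sketch: the helper name render_monitor_conf and the example id and addresses are hypothetical, and the output simply follows the "[mon]" plus "[mon.{mon-id}]" layout shown above.

```python
def render_monitor_conf(mon_addrs, new_mon_id):
    """Render the [mon] and [mon.{mon-id}] sections for ceph.conf."""
    lines = [
        "[mon]",
        # Space-separated ip:port entries, with the new monitor included.
        "mon host = " + " ".join(mon_addrs),
        "",
        f"[mon.{new_mon_id}]",
        f"host = {new_mon_id}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical addresses from the documentation range; 6789 is the usual monitor port.
print(render_monitor_conf(
    ["192.0.2.1:6789", "192.0.2.2:6789", "192.0.2.3:6789"], "mon3"))
```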

Comment 15 Harish NV Rao 2015-11-12 09:27:52 UTC
Moving this back to doc team to incorporate the change proposed by Kefu in comment 14

Comment 16 John Wilkins 2015-11-12 17:07:55 UTC
https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-administration-guide/commit/b04405115b493d37f3d2dfe37fd3fd9229442b1b


Kefu, 

For now, I'm principally interested in ensuring that the example provided is correct. However, your comment: "otherwise we might need to explain to user what monmap stands for, and the priority of mon_host and mon_addr when an initial monmap is built." is something we should write up for the next release. In fact, I would like to go over the monitor configuration reference and turn it into more of a "how-to" guide for configuring monitors.

Comment 17 Kefu Chai 2015-11-13 01:50:05 UTC
thanks John, the new doc looks good to me. i was trying to unfold all the internals that might help us understand why i suggested the "mon.mon_host" + "mon.{mon-id}.host" settings.

Comment 18 shylesh 2015-11-17 17:10:41 UTC
@Harish,

I covered the following scenarios 

1. add mon, remove mon, again add mon - manual
2. add mon, remove mon, again add mon - automated
3. add automated, remove manual, add automated
4. add new mons - manual and automated, fail leader mon and others one by one in each case.
5. Add mon from ceph-deploy and delete it from ceph-deploy
6. Add mon from ceph-deploy and delete it manually
7. Add mon manually and delete it manually
8. Add mon manually and delete it from ceph-deploy ---> https://bugzilla.redhat.com/show_bug.cgi?id=1282484

so the document looks ok.

One more catch: if you install a mon manually, you can't control it through /etc/init.d/ceph. I am not sure whether there is a bug for this. Otherwise, the document looks ok to me. Let me know if anything else needs to be covered.

Comment 19 shylesh 2015-11-18 15:05:49 UTC
(In reply to shylesh from comment #18)

After modifying ceph.conf properly, I was able to manage the mon daemons through "service ceph {start|stop|status}".

So marking this bug as verified.

Comment 20 Anjana Suparna Sriram 2015-12-18 10:00:16 UTC
Fixed for 1.3.1 Release.