Description of problem:
With a cluster configured as 1 Mon + 1 Mgr, 4 MDSs, 4 OSD nodes, and 4 clients, 3 MDSs become active after a fresh cluster setup, with only 1 standby. Previous behavior was 1 active MDS and 3 standbys.

Version-Release number of selected component (if applicable):
ceph: ceph-12.2.5-12.el7cp
os: Red Hat Enterprise Linux Server release 7.5 (Maipo)

How reproducible:
Always

Steps to Reproduce:
1. Set up a Ceph 3.1 cluster with the configuration mentioned above.
2. Check ceph health; it should report 1 active MDS with 3 standbys.

Actual results:
  id:     897b8555-f20a-48d5-9d8a-4cb03ea26ee3
  health: HEALTH_WARN
          too few PGs per OSD (20 < min 30)

  services:
    mon: 1 daemons, quorum ceph-sshreeka-run125-node1-monmgrinstaller
    mgr: ceph-sshreeka-run125-node1-monmgrinstaller(active)
    mds: cephfs-3/3/3 up {0=ceph-sshreeka-run125-node6-mds=up:active,1=ceph-sshreeka-run125-node4-mds=up:active,2=ceph-sshreeka-run125-node3-mds=up:active}, 1 up:standby
    osd: 12 osds: 12 up, 12 in

  data:
    pools:   3 pools, 80 pgs
    objects: 57 objects, 4870 bytes
    usage:   1298 MB used, 346 GB / 347 GB avail
    pgs:     80 active+clean

Expected results:
Should have 1 active MDS.

Additional info:
Ansible log: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1527140513806/ceph_ansible_0.log
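For anyone who wants to bring such a cluster back to a single active MDS by hand, here is a minimal sketch, assuming a Luminous (12.2.x) cluster and a filesystem named cephfs as in the output above. This is a manual workaround sketch, not the ceph-ansible fix:

  # Check how many active MDS daemons the monitors are targeting
  ceph fs get cephfs | grep max_mds

  # Lower the target back to a single active MDS
  ceph fs set cephfs max_mds 1

  # On Luminous, shrinking max_mds does not stop ranks that are already
  # active; deactivate the extra ranks explicitly, highest rank first
  ceph mds deactivate cephfs:2
  ceph mds deactivate cephfs:1

  # Verify: expect "cephfs-1/1/1 up" with the remaining daemons as standbys
  ceph mds stat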
Shreekara, would you please provide the ceph.conf from the nodes hosting the MDS services? Looking at the docs, it seems this is a multi-MDS configuration: http://docs.ceph.com/docs/master/cephfs/multimds/

Either way, I think Patrick would be best suited to advise on and configure the default behavior. It appears this is expected for Luminous (RHCS 3) and was not for Jewel (RHCS 2); see https://github.com/ceph/ceph-ansible/commit/683bec9eb231cfcd97e93eaee25a4f5b0e9f76ab#diff-f840ae8f0b27425aa50c964de6b8a46fR23
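For context, a hedged sketch of the Jewel-vs-Luminous difference (filesystem name cephfs assumed): on Jewel the active-MDS count was seeded from a ceph.conf option, while on Luminous max_mds is a property of the filesystem map and is set through the CLI, which is why it would not show up in ceph.conf at all:

  # Jewel (RHCS 2): seeded from ceph.conf at filesystem creation, e.g.
  #   [mds]
  #   max_mds = 1
  # Luminous (RHCS 3): stored in the FSMap and set per filesystem, e.g.
  ceph fs set cephfs max_mds 3    # illustrates the effect observed in this bug
  ceph fs get cephfs              # shows the filesystem's current max_mds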
Hi Gregory, here are links to the pasted contents of the ceph.conf files:
http://pastebin.test.redhat.com/597636
http://pastebin.test.redhat.com/597639
http://pastebin.test.redhat.com/597640
http://pastebin.test.redhat.com/597641
https://github.com/ceph/ceph-ansible/pull/2719 was tagged back in ceph-ansible v3.1.0rc7. I'm setting Fixed In Version to the latest available ceph-ansible NVR.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2819