Bug 1299978 - Cluster is not achieving active + clean state on setting 'osd_crush_update_on_start = false' during installation
Summary: Cluster is not achieving active + clean state on setting 'osd_crush_update_on_start = false' during installation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Documentation
Version: 1.3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 1.3.2
Assignee: ceph-docs@redhat.com
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks: 1249045
 
Reported: 2016-01-19 16:22 UTC by Rachana Patel
Modified: 2016-03-01 08:23 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-01 08:23:09 UTC
Target Upstream Version:



Description Rachana Patel 2016-01-19 16:22:42 UTC
Description of problem:
======================
Installed Ceph via CDN and set 'osd_crush_update_on_start = false' as described in the installation doc. The cluster was unable to achieve the 'active+clean' state.

[racpatel@magna048 ~]$ sudo ceph -s
    cluster c9cf8beb-861e-4aba-a1a2-2734623502cf
     health HEALTH_WARN
            64 pgs stuck inactive
            64 pgs stuck unclean
            too few PGs per OSD (21 < min 30)
     monmap e1: 1 mons at {magna090=10.8.128.90:6789/0}
            election epoch 1, quorum 0 magna090
     osdmap e23: 3 osds: 3 up, 3 in
      pgmap v43: 64 pgs, 1 pools, 0 bytes data, 0 objects
            100656 kB used, 2778 GB / 2778 GB avail
                  64 creating
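As an aside, the "too few PGs per OSD (21 < min 30)" figure in the output above is consistent with a rough estimate of total PGs divided by OSD count (64 / 3 ≈ 21). This is only an illustrative sketch; the exact check Ceph performs also considers pool replica counts once PGs are mapped.

```python
# Rough sketch of the "PGs per OSD" estimate behind the HEALTH_WARN line.
# Assumption: the warning compares (total PGs / number of OSDs) against a
# minimum threshold (mon_pg_warn_min_per_osd, default 30); the real Ceph
# check is more involved once PGs are actually mapped to OSDs.
def pgs_per_osd(num_pgs: int, num_osds: int) -> int:
    return num_pgs // num_osds

# Values from the 'ceph -s' output above: 64 PGs, 3 OSDs.
print(pgs_per_osd(64, 3))  # 21, below the default minimum of 30
```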



Version-Release number of selected component (if applicable):
============================================================
0.94.5-1.el7cp.x86_64

 

How reproducible:
=================
always


Steps to Reproduce:
===================
1. Installed Ceph via CDN, following the doc 'https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-installation-guide-rhel/blob/5689dfb78e7c07b15ae2b442c298ac314f591622/quick-ceph-deploy.adoc'

2. Modified the Ceph config file and set 'osd_crush_update_on_start = false'

3. After adding the OSDs, verified the cluster state; it never achieved the 'active+clean' state:

[racpatel@magna048 ~]$ sudo ceph -s
    cluster c9cf8beb-861e-4aba-a1a2-2734623502cf
     health HEALTH_WARN
            64 pgs stuck inactive
            64 pgs stuck unclean
            too few PGs per OSD (21 < min 30)
     monmap e1: 1 mons at {magna090=10.8.128.90:6789/0}
            election epoch 1, quorum 0 magna090
     osdmap e23: 3 osds: 3 up, 3 in
      pgmap v43: 64 pgs, 1 pools, 0 bytes data, 0 objects
            100656 kB used, 2778 GB / 2778 GB avail
                  64 creating


[c1@magna048 ceph-config]$ sudo ceph osd tree
ID WEIGHT TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1      0 root default                                   
 0      0 osd.0             up  1.00000          1.00000 
 1      0 osd.1             up  1.00000          1.00000 
 2      0 osd.2             up  1.00000          1.00000 



[racpatel@magna048 ~]$ sudo ceph pg dump
dumped all in format plain
version 43
stamp 2016-01-18 08:37:05.761242
last_osdmap_epoch 23
last_pg_scan 1
full_ratio 0.95
nearfull_ratio 0.85
pg_stat	objects	mip	degr	misp	unf	bytes	log	disklog	state	state_stamp	v	reported	up	up_primary	acting	acting_primary	last_scrub	scrub_stamp	last_deep_scrub	deep_scrub_stamp
0.22	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788131	0'0	2016-01-15 18:02:08.788131
0.21	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788130	0'0	2016-01-15 18:02:08.788130
0.20	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788130	0'0	2016-01-15 18:02:08.788130
0.1f	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788130	0'0	2016-01-15 18:02:08.788130
0.1e	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788129	0'0	2016-01-15 18:02:08.788129
0.1d	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788129	0'0	2016-01-15 18:02:08.788129
0.1c	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788129	0'0	2016-01-15 18:02:08.788129
0.1b	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.1a	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.19	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.18	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.17	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788127	0'0	2016-01-15 18:02:08.788127
0.16	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788127	0'0	2016-01-15 18:02:08.788127
0.15	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788127	0'0	2016-01-15 18:02:08.788127



0.25	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788137	0'0	2016-01-15 18:02:08.788137
0.24	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788137	0'0	2016-01-15 18:02:08.788137
0.23	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788136	0'0	2016-01-15 18:02:08.788136
pool 0	0	0	0	0	0	0	0	0
 sum	0	0	0	0	0	0	0	0
osdstat	kbused	kbavail	kb	hb in	hb out
0	33552	971010736	971044288	[]	[]
1	33552	971010736	971044288	[0,2]	[]
2	33552	971010736	971044288	[0]	[]
 sum	100656	2913032208	2913132864



4. Removed 'osd_crush_update_on_start = false' from the config file, restarted all daemons, and checked the 'osd tree' output; the hierarchy was not built.

5. Created the CRUSH hierarchy using 'ceph osd crush move'; the resulting tree:
[racpatel@magna100 ~]$ sudo ceph osd tree
ID WEIGHT  TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 2.69998 root default                                        
-4 0.89999     host magna106                                   
 1 0.89999         osd.1          up  1.00000          1.00000 
-2 0.89999     host magna101                                   
 0 0.89999         osd.0          up  1.00000          1.00000 
-3 0.89999     host magna117                                   


The cluster then achieved the 'active+clean' state.
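For reference, the manual hierarchy in step 5 can be built with commands along these lines. This is an illustrative sketch: the host names and weights are taken from the 'ceph osd tree' output above, and the same pattern would be repeated for each host/OSD pair.

```
# Create a host bucket, move it under the default root, then place the
# OSD under its host with a CRUSH weight (values from the tree above).
ceph osd crush add-bucket magna101 host
ceph osd crush move magna101 root=default
ceph osd crush create-or-move osd.0 0.89999 host=magna101
```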




Actual results:
==============
With 'osd_crush_update_on_start = false' set during installation, the cluster does not achieve the 'active+clean' state.



Additional info:
=================
Performed the installation on another setup without setting 'osd_crush_update_on_start = false' in the config file; there, the OSD tree showed the proper hierarchy and the cluster achieved the 'active+clean' state.
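For context, the option in question is a ceph.conf setting; during the failing install it was set roughly as below (section placement shown here is the usual convention, as an illustration). With it set to false, newly started OSDs do not register themselves in the CRUSH map, so with no pre-built hierarchy the OSDs sit at weight 0 under no host, PGs cannot map to any OSD, and they stay in 'creating', matching the 'osd tree' and 'pg dump' output above.

```ini
# ceph.conf fragment (illustrative). When false, OSD daemons do not update
# their CRUSH location/weight on startup; the hierarchy must be built by hand.
[osd]
osd_crush_update_on_start = false
```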

Comment 2 Ken Dreyer (Red Hat) 2016-01-19 18:43:45 UTC
This change was added to the docs in bug 1249045. Maybe we should revert that change? I'm blocking that bug with this one.

Comment 4 Hemanth Kumar 2016-02-03 08:40:36 UTC
The section describing "osd_crush_update_on_start" has been removed from the doc now.
Verified the doc; moving to the verified state.

