Bug 1299978 - Cluster is not achieving active + clean state on setting 'osd_crush_update_on_start = false' during installation
Status: CLOSED CURRENTRELEASE
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Documentation
Version: 1.3.2
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 1.3.2
Assigned To: ceph-docs@redhat.com
QA Contact: ceph-qe-bugs
Blocks: 1249045
Reported: 2016-01-19 11:22 EST by Rachana Patel
Modified: 2016-03-01 03:23 EST
CC List: 7 users

Doc Type: Bug Fix
Last Closed: 2016-03-01 03:23:09 EST
Type: Bug


Description Rachana Patel 2016-01-19 11:22:42 EST
Description of problem:
======================
Installed Ceph via CDN and set 'osd_crush_update_on_start = false' as described in the installation guide. The cluster was unable to achieve the 'active+clean' state.

[racpatel@magna048 ~]$ sudo ceph -s
    cluster c9cf8beb-861e-4aba-a1a2-2734623502cf
     health HEALTH_WARN
            64 pgs stuck inactive
            64 pgs stuck unclean
            too few PGs per OSD (21 < min 30)
     monmap e1: 1 mons at {magna090=10.8.128.90:6789/0}
            election epoch 1, quorum 0 magna090
     osdmap e23: 3 osds: 3 up, 3 in
      pgmap v43: 64 pgs, 1 pools, 0 bytes data, 0 objects
            100656 kB used, 2778 GB / 2778 GB avail
                  64 creating



Version-Release number of selected component (if applicable):
============================================================
0.94.5-1.el7cp.x86_64

 

How reproducible:
=================
always


Steps to Reproduce:
===================
1. Installed Ceph via CDN, following the installation guide:
https://gitlab.cee.redhat.com/jowilkin/red-hat-ceph-storage-installation-guide-rhel/blob/5689dfb78e7c07b15ae2b442c298ac314f591622/quick-ceph-deploy.adoc
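
For reference, the quick-deploy flow in that guide looks roughly like the following. This is a sketch, not the exact commands run here: the OSD hostnames and disk devices (osd-host-1, /dev/sdb, and so on) are hypothetical; magna090 is the monitor seen in the output below.

ceph-deploy new magna090
ceph-deploy install magna090 osd-host-1 osd-host-2 osd-host-3
ceph-deploy mon create-initial
ceph-deploy osd prepare osd-host-1:/dev/sdb osd-host-2:/dev/sdb osd-host-3:/dev/sdb
ceph-deploy osd activate osd-host-1:/dev/sdb1 osd-host-2:/dev/sdb1 osd-host-3:/dev/sdb1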

2. Modified the Ceph configuration file and set 'osd_crush_update_on_start = false', as in the snippet below.
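
A minimal sketch of that change as it would appear in ceph.conf (placing it in the [osd] section is an assumption; the [global] section works as well):

[osd]
osd_crush_update_on_start = false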

3. After adding the OSDs, verified the cluster state; it never achieved the 'active+clean' state:

[racpatel@magna048 ~]$ sudo ceph -s
    cluster c9cf8beb-861e-4aba-a1a2-2734623502cf
     health HEALTH_WARN
            64 pgs stuck inactive
            64 pgs stuck unclean
            too few PGs per OSD (21 < min 30)
     monmap e1: 1 mons at {magna090=10.8.128.90:6789/0}
            election epoch 1, quorum 0 magna090
     osdmap e23: 3 osds: 3 up, 3 in
      pgmap v43: 64 pgs, 1 pools, 0 bytes data, 0 objects
            100656 kB used, 2778 GB / 2778 GB avail
                  64 creating
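
To narrow down which PGs are stuck and why, the standard diagnostics apply; none of these commands assume anything beyond a working admin keyring:

sudo ceph health detail            # lists each stuck PG and its state
sudo ceph pg dump_stuck inactive   # shows only the PGs that never became active
sudo ceph osd tree                 # exposes the missing host buckets and zero weights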


[c1@magna048 ceph-config]$ sudo ceph osd tree
ID WEIGHT TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1      0 root default                                   
 0      0 osd.0             up  1.00000          1.00000 
 1      0 osd.1             up  1.00000          1.00000 
 2      0 osd.2             up  1.00000          1.00000 



[racpatel@magna048 ~]$ sudo ceph pg dump
dumped all in format plain
version 43
stamp 2016-01-18 08:37:05.761242
last_osdmap_epoch 23
last_pg_scan 1
full_ratio 0.95
nearfull_ratio 0.85
pg_stat	objects	mip	degr	misp	unf	bytes	log	disklog	state	state_stamp	v	reported	up	up_primary	acting	acting_primary	last_scrub	scrub_stamp	last_deep_scrub	deep_scrub_stamp
0.22	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788131	0'0	2016-01-15 18:02:08.788131
0.21	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788130	0'0	2016-01-15 18:02:08.788130
0.20	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788130	0'0	2016-01-15 18:02:08.788130
0.1f	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788130	0'0	2016-01-15 18:02:08.788130
0.1e	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788129	0'0	2016-01-15 18:02:08.788129
0.1d	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788129	0'0	2016-01-15 18:02:08.788129
0.1c	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788129	0'0	2016-01-15 18:02:08.788129
0.1b	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.1a	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.19	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.18	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788128	0'0	2016-01-15 18:02:08.788128
0.17	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788127	0'0	2016-01-15 18:02:08.788127
0.16	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788127	0'0	2016-01-15 18:02:08.788127
0.15	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788127	0'0	2016-01-15 18:02:08.788127



0.25	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788137	0'0	2016-01-15 18:02:08.788137
0.24	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788137	0'0	2016-01-15 18:02:08.788137
0.23	0	0	0	0	0	0	0	0	creating	0.000000	0'0	0:0	[]	-1	[]	-1	0'0	2016-01-15 18:02:08.788136	0'0	2016-01-15 18:02:08.788136
pool 0	0	0	0	0	0	0	0	0
 sum	0	0	0	0	0	0	0	0
osdstat	kbused	kbavail	kb	hb in	hb out
0	33552	971010736	971044288	[]	[]
1	33552	971010736	971044288	[0,2]	[]
2	33552	971010736	971044288	[0]	[]
 sum	100656	2913032208	2913132864



4. Removed 'osd_crush_update_on_start = false' from the configuration file, restarted all daemons, and checked the output of 'ceph osd tree'; the hierarchy was still not built.
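
On this release (0.94.x on RHEL 7) the daemons are managed through the sysvinit wrapper, so the restart in this step would have been along these lines (the exact service handling used is an assumption):

sudo /etc/init.d/ceph restart
# or equivalently:
sudo service ceph restart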

5. Created the CRUSH hierarchy manually using 'ceph osd crush move', after which the tree looked as below:
[racpatel@magna100 ~]$ sudo ceph osd tree
ID WEIGHT  TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 2.69998 root default                                        
-4 0.89999     host magna106                                   
 1 0.89999         osd.1          up  1.00000          1.00000 
-2 0.89999     host magna101                                   
 0 0.89999         osd.0          up  1.00000          1.00000 
-3 0.89999     host magna117                                   


The cluster then achieved the 'active+clean' state.
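
The hierarchy shown above can be built with commands along these lines. Step 5 mentions 'ceph osd crush move', which relocates buckets; placing the OSD devices themselves with a non-zero weight can be done with 'ceph osd crush set'. Bucket names are taken from the tree above, the weights are approximate, and the exact invocations used were not recorded, so treat this as a sketch:

# create a bucket for each host and hang it under the default root
sudo ceph osd crush add-bucket magna101 host
sudo ceph osd crush move magna101 root=default
# place each OSD under its host bucket with a non-zero weight
sudo ceph osd crush set osd.0 0.9 host=magna101
# repeat for magna106/osd.1 and magna117/osd.2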




Actual results:
==============
With 'osd_crush_update_on_start = false' set during installation, the cluster does not achieve the 'active+clean' state: the OSDs come up with a CRUSH weight of 0 and are never placed under host buckets (see the 'ceph osd tree' output above), so CRUSH cannot map any placement groups to them and all PGs remain stuck in 'creating'.



Additional info:
=================
Performed the installation on another setup without setting 'osd_crush_update_on_start = false' in the configuration file; there, 'ceph osd tree' showed the proper hierarchy and the cluster achieved the 'active+clean' state.
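
That matches the intent of the option: when it is left at its default of true, each OSD updates its own CRUSH position when it starts, roughly equivalent to the startup hook issuing a command like the one below (the weight comes from the OSD's disk size and the location from its hostname; the exact hook invocation is an assumption):

ceph osd crush create-or-move -- 0 0.9 host=magna101 root=default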
Comment 2 Ken Dreyer (Red Hat) 2016-01-19 13:43:45 EST
This change was added to the docs in bug 1249045. Maybe we should revert that change? I'm blocking that bug with this one.
Comment 4 Hemanth Kumar 2016-02-03 03:40:36 EST
The section describing "osd_crush_update_on_start" has now been removed from the doc.
Verified the doc; moving to the verified state.
