Bug 1374985 - Director reports "Overcloud Deployed" while the Ceph OSD fails to start
Summary: Director reports "Overcloud Deployed" while the Ceph OSD fails to start
Keywords:
Status: CLOSED DUPLICATE of bug 1372804
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: John Fulton
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-09-11 11:54 UTC by Asaf Hirshberg
Modified: 2016-09-15 13:38 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-15 13:38:57 UTC
Target Upstream Version:
Embargoed:



Description Asaf Hirshberg 2016-09-11 11:54:38 UTC
Description of problem:
OSPD10 director reports that the overcloud deployed successfully while the Ceph OSD is down:

Stack overcloud CREATE_COMPLETE
...
...
Overcloud Endpoint: http://10.35.180.17:5000/v2.0
Overcloud Deployed

Tried to upload an image and saw that it got stuck in the "saving" status:
(automation) [stack@puma33 sts]$ glance image-list
+--------------------------------------+----------+-------------+------------------+---------+--------+
| ID                                   | Name     | Disk Format | Container Format | Size    | Status |
+--------------------------------------+----------+-------------+------------------+---------+--------+
| f3270120-7f36-4b74-b3cd-7fc73f177680 | cirros33 | qcow2       | bare             | 9761280 | saving |
+--------------------------------------+----------+-------------+------------------+---------+--------+

From ceph storage:

[root@overcloud-cephstorage-0 ~]# ceph health
HEALTH_ERR 384 pgs are stuck inactive for more than 300 seconds; 384 pgs stuck inactive
[root@overcloud-cephstorage-0 ~]# ceph osd tree
ID WEIGHT  TYPE NAME                        UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 0.42679 root default                                                       
-2 0.42679     host overcloud-cephstorage-0                                   
 0 0.42679         osd.0                       down        0          1.00000 
[root@overcloud-cephstorage-0 ~]# 
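For completeness, a few commands that are usually the next step when an OSD stays down like this (not from this report; assuming the systemd-managed ceph-osd@0 unit that Jewel-era OSP10 deployments use):

[root@overcloud-cephstorage-0 ~]# ceph -s                        # overall cluster and PG status
[root@overcloud-cephstorage-0 ~]# systemctl status ceph-osd@0    # did the OSD service start at all?
[root@overcloud-cephstorage-0 ~]# journalctl -u ceph-osd@0 --no-pager | tail -n 50
[root@overcloud-cephstorage-0 ~]# tail -n 50 /var/log/ceph/ceph-osd.0.log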




Version-Release number of selected component (if applicable):
OpenStack-10.0-RHEL-7 Puddle: 2016-09-07.6

Environment: bare metal, 3 controllers, 2 computes, 1 Ceph storage node

Steps to Reproduce:
1. Use the storage-environment template without any customization (a sketch of the corresponding deploy command follows this list).
2. Deploy the overcloud with 3 controllers, 2 computes, and 1 Ceph storage node.
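For reference, the deploy command for this topology would look roughly like the following; the exact flags and template path are assumptions based on a standard OSP10 setup, not taken from this report:

[stack@puma33 ~]$ openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
    --control-scale 3 --compute-scale 2 --ceph-storage-scale 1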

Actual results:
Deployment succeeded but Ceph is not operational.

Expected results:
The Ceph OSD is up and operational.

Additional info:

Comment 2 John Fulton 2016-09-14 15:12:43 UTC
Asaf,

What is your storage backend? If it's file-based, which is the default when you don't specify a dedicated block device to be an OSD [1], then it could just be another instance of BZ 1372804, which will happen with our current puddle images. Please upload a tarball of the heat templates you used for the deploy so we can verify.

  John

[1] Here's an example of specifying a dedicated block device to be an OSD (a fuller environment-file sketch follows the snippet). Did you do this?

CephStorageExtraConfig:
    ceph::profile::params::osds:
      '/dev/sdb':
        journal: '/dev/sdm'
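
For context, that hieradata normally sits under parameter_defaults in a custom environment file that is passed to the deploy command with an extra -e; a minimal sketch, where the file name ceph-osd-disks.yaml and the device names are only examples:

# ~/ceph-osd-disks.yaml
parameter_defaults:
  CephStorageExtraConfig:
    ceph::profile::params::osds:
      '/dev/sdb':
        journal: '/dev/sdm'

openstack overcloud deploy --templates \
    -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
    -e ~/ceph-osd-disks.yaml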

Comment 3 Asaf Hirshberg 2016-09-15 08:41:10 UTC
John,

I used the storage-environment template and the default, as I did in OSPd 9:
ceph::profile::params::osds: {/srv/data: {}}
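
(One way to confirm what that key actually resolved to on the storage node is to query hiera there directly; the config path below assumes the standard TripleO hiera setup.)

[root@overcloud-cephstorage-0 ~]# hiera -c /etc/puppet/hiera.yaml ceph::profile::params::osds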

Comment 4 John Fulton 2016-09-15 13:38:57 UTC
Hi Asaf,

If you're using the OSP10 puddle and you're using the following: 

 ceph::profile::params::osds: {/srv/data: {}}

Then you'll run into BZ 1372804, so I'm closing this BZ as a duplicate. 

Have a look at BZ 1372804 to understand the root cause and planned fix, as well as a workaround.

  John

*** This bug has been marked as a duplicate of bug 1372804 ***

