Bug 1392304

Summary: Nodes which are not connected to storage management network get incorrect cluster_network set in /etc/ceph/ceph.conf
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: rhosp-director
Assignee: Giulio Fidente <gfidente>
Status: CLOSED NOTABUG
QA Contact: Omri Hochman <ohochman>
Severity: high
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: dbecker, gfidente, jomurphy, jslagle, mburns, morazi, rhel-osp-director-maint
Target Milestone: ga
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2016-11-07 19:38:40 UTC
Type: Bug

Description Marius Cornea 2016-11-07 07:20:51 UTC
Description of problem:
Nodes that are not connected to the storage management network get an incorrect cluster_network set in /etc/ceph/ceph.conf.

For instance, a compute node gets the following config:

/etc/ceph/ceph.conf 

[global]
osd_pool_default_min_size = 1
auth_service_required = cephx
mon_initial_members = overcloud-serviceapi-0,overcloud-serviceapi-1,overcloud-serviceapi-2
fsid = d825caf0-a446-11e6-91fe-525400a81fbf
cluster_network = 192.168.0.18/25
auth_supported = cephx
auth_cluster_required = cephx
mon_host = 10.0.0.154,10.0.0.153,10.0.0.157
auth_client_required = cephx
public_network = 10.0.0.144/25

Here, 192.168.0.0/25 is the ctlplane network.

A Ceph storage node, by contrast, gets the following:

[global]
osd_pool_default_min_size = 1
auth_service_required = cephx
mon_initial_members = overcloud-serviceapi-0,overcloud-serviceapi-1,overcloud-serviceapi-2
fsid = d825caf0-a446-11e6-91fe-525400a81fbf
cluster_network = 10.0.1.14/25
auth_supported = cephx
auth_cluster_required = cephx
mon_host = 10.0.0.154,10.0.0.153,10.0.0.157
auth_client_required = cephx
public_network = 10.0.0.152/25
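
To make the difference between the two files easier to see, here is a minimal Python sketch that reads the cluster_network value from trimmed copies of the two excerpts above and compares its network with the ctlplane network (192.168.0.0/25). The helper name cluster_network_of is made up for illustration; only the addresses come from this report.

```python
import configparser
import ipaddress

# ctlplane network as stated in the report.
CTLPLANE = ipaddress.ip_network("192.168.0.0/25")


def cluster_network_of(conf_text):
    """Return the network that the cluster_network setting falls into."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    value = parser["global"]["cluster_network"]
    # The files carry a host address with a prefix length, so go through
    # ip_interface to obtain the enclosing network.
    return ipaddress.ip_interface(value).network


# Trimmed copies of the two excerpts (only the network settings are kept).
compute_conf = """[global]
cluster_network = 192.168.0.18/25
public_network = 10.0.0.144/25
"""

ceph_storage_conf = """[global]
cluster_network = 10.0.1.14/25
public_network = 10.0.0.152/25
"""

compute_net = cluster_network_of(compute_conf)
storage_net = cluster_network_of(ceph_storage_conf)

print(compute_net, compute_net == CTLPLANE)   # 192.168.0.0/25 True
print(storage_net, storage_net == CTLPLANE)   # 10.0.1.0/25 False
```

On the compute node the cluster_network resolves to the ctlplane network, while on the Ceph storage node it resolves to a separate 10.0.1.0/25 network.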


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-5.0.0-1.2.el7ost.noarch

How reproducible:
100%

I'm not sure whether this could lead to any issues, but the config on the ceph node does not reflect the cluster network addressing.

Comment 1 Giulio Fidente 2016-11-07 10:00:23 UTC
When the config files reference a network which isn't deployed on a node, they fall back to 'ctlplane' by design. The behaviour we're seeing seems to match expectations, as these nodes don't have a leg on the storage management network.
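
As a rough illustration of that fallback rule (a hypothetical sketch, not the actual tripleo-heat-templates logic), the selection can be modelled as "use the requested network if the node has it, otherwise use ctlplane". The function name resolve_network and the network labels "storage" and "storage_mgmt" are assumptions; the addresses are the ones from the two ceph.conf excerpts in the description.

```python
# Hypothetical model of the fallback rule; not TripleO's real implementation.
FALLBACK_NETWORK = "ctlplane"


def resolve_network(wanted, node_networks, fallback=FALLBACK_NETWORK):
    """Return the name of the network a setting ends up using on a node."""
    return wanted if wanted in node_networks else fallback


# Addresses from the report; the network names are assumed labels.
compute_networks = {
    "ctlplane": "192.168.0.18/25",
    "storage": "10.0.0.144/25",
}
# The ctlplane address of the Ceph storage node is not given in the report,
# so it is left out here.
ceph_storage_networks = {
    "storage": "10.0.0.152/25",
    "storage_mgmt": "10.0.1.14/25",
}

compute_choice = resolve_network("storage_mgmt", compute_networks)
storage_choice = resolve_network("storage_mgmt", ceph_storage_networks)

print(compute_choice, compute_networks[compute_choice])
# ctlplane 192.168.0.18/25
print(storage_choice, ceph_storage_networks[storage_choice])
# storage_mgmt 10.0.1.14/25
```

Under these assumptions the sketch reproduces the two cluster_network values shown in the description.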

Ceph clients don't need to use the Ceph cluster_network, so this should not be an issue. Nodes running the CephOSD service, however, do use cluster_network, so for those it is necessary to configure a NIC on the storage management network.
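
A deployment could be sanity-checked against this requirement along the following lines. This is only a sketch: the role layout, the role names, the "CephClient" service label and the "storage_mgmt" network label are illustrative assumptions, and the third role is invented purely to show a failing case.

```python
def roles_missing_cluster_nic(roles, cluster_net="storage_mgmt"):
    """List roles that run CephOSD but have no leg on the cluster network."""
    return [
        name
        for name, role in roles.items()
        if "CephOSD" in role["services"] and cluster_net not in role["networks"]
    ]


# Illustrative role definitions (assumed, not taken from a real deployment).
roles = {
    "Compute": {
        "services": ["CephClient"],
        "networks": ["ctlplane", "storage"],
    },
    "CephStorage": {
        "services": ["CephOSD"],
        "networks": ["ctlplane", "storage", "storage_mgmt"],
    },
    "MisconfiguredOsd": {
        "services": ["CephOSD"],
        "networks": ["ctlplane", "storage"],
    },
}

missing = roles_missing_cluster_nic(roles)
print(missing)  # ['MisconfiguredOsd']
```

The compute role is not flagged because it only acts as a Ceph client; only an OSD role without a storage management NIC is reported.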