Description of problem:
Nodes which are not connected to the storage management network get an incorrect cluster_network set in /etc/ceph/ceph.conf. For instance, a compute node gets the following config:

/etc/ceph/ceph.conf
[global]
osd_pool_default_min_size = 1
auth_service_required = cephx
mon_initial_members = overcloud-serviceapi-0,overcloud-serviceapi-1,overcloud-serviceapi-2
fsid = d825caf0-a446-11e6-91fe-525400a81fbf
cluster_network = 192.168.0.18/25
auth_supported = cephx
auth_cluster_required = cephx
mon_host = 10.0.0.154,10.0.0.153,10.0.0.157
auth_client_required = cephx
public_network = 10.0.0.144/25

where the 192.168.0.0/25 network is the ctlplane network. A Ceph storage node, by contrast, gets the following:

[global]
osd_pool_default_min_size = 1
auth_service_required = cephx
mon_initial_members = overcloud-serviceapi-0,overcloud-serviceapi-1,overcloud-serviceapi-2
fsid = d825caf0-a446-11e6-91fe-525400a81fbf
cluster_network = 10.0.1.14/25
auth_supported = cephx
auth_cluster_required = cephx
mon_host = 10.0.0.154,10.0.0.153,10.0.0.157
auth_client_required = cephx
public_network = 10.0.0.152/25

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-5.0.0-1.2.el7ost.noarch

How reproducible:
100%

I'm not sure whether this can cause any real issue, but the config on nodes that are not connected to the storage management network does not reflect the cluster network addressing.
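Additional info: as far as I understand, cluster_network and public_network are rendered from whichever networks the Ceph services are mapped to in the ServiceNetMap. The snippet below is only a rough sketch of the default mapping I believe tripleo-heat-templates ships (key names taken from ServiceNetMapDefaults), not the exact template text from this deployment:

    ServiceNetMapDefaults:
      # network used to render cluster_network in ceph.conf
      CephClusterNetwork: storage_mgmt
      # network used to render public_network in ceph.conf
      CephPublicNetwork: storage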
When the config files reference a network which isn't deployed on a node, that setting falls back to the 'ctlplane' network by design, so the behaviour we're seeing matches expectations: these nodes don't have a leg on the storage management network. Ceph clients don't need to use the Ceph cluster_network, so this should not be an issue. For nodes running the CephOSD service instead, which does use cluster_network, it is necessary to configure a NIC on the storage management network, as in the sketch below.
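As an illustration only, an OSD role's os-net-config NIC template could carry an entry along these lines; the interface name nic4 and the flat (non-bonded, non-VLAN) layout are assumptions, and the actual templates for this deployment may look different:

    network_config:
      - type: interface
        name: nic4
        use_dhcp: false
        addresses:
          - ip_netmask:
              get_param: StorageMgmtIpSubnet

With a leg on that network, the ctlplane fallback no longer applies on those nodes and cluster_network should then be rendered from the storage management subnet.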