Bug 1308562

Summary: Ceph monitor doesn't use the correct address for binding in IPv6 deployment
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-puppet-modules
Assignee: Giulio Fidente <gfidente>
Status: CLOSED ERRATA
QA Contact: Yogev Rabl <yrabl>
Severity: high
Docs Contact:
Priority: high
Version: 7.0 (Kilo)
CC: augol, dbecker, gfidente, jcoufal, jguiditt, jschluet, jslagle, mburns, morazi, rhel-osp-director-maint, yeylon
Target Milestone: ga
Target Release: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-puppet-modules-7.0.14-1.el7ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1308910
Environment:
Last Closed: 2016-04-07 21:28:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1299613, 1302593, 1308910, 1309816, 1309822

Description Marius Cornea 2016-02-15 14:00:46 UTC
Description of problem:
An IPv6 deployment with 1 controller, 1 compute, and 3 Ceph nodes times out.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-0.8.6-120.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy IPv6 environment with 1 x ctrl, 1 x compute, 3 ceph nodes 

Actual results:
The deployment gets stuck because the client tries to reach the ceph mon on the local address while the ceph mon binds to the storage VIP address.

Expected results:
Deployment proceeds.

Comment 1 Giulio Fidente 2016-02-15 14:06:19 UTC
it looks like ceph-mon does not find the initial members list and binds to a local IP address from the public network instead of the address configured as mon_host in ceph.conf. We may need a require in the .pp that forces ::mon to run only after the config is written. Investigating.

Comment 2 Giulio Fidente 2016-02-15 16:20:45 UTC
the problem is that ceph-mon does not find itself in the list of hosts given by 'mon host', so it binds to the first local IP on the public network instead of the address we pass in 'mon host'

this shows up in the ceph-mon logs:

  0 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 6053
  0 mon.overcloud-controller-0 does not exist in monmap, will attempt to join an existing cluster
  0 using public_addr [fd00:fd00:fd00:3000::10]:0/0 -> [fd00:fd00:fd00:3000::10]:6789/0

while 'mon_host' in ceph.conf had:

mon_host = [fd00:fd00:fd00:3000::14]
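
The mismatch above can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it assumes the log and config lines have exactly the shape quoted above, and the helper names `bound_address` and `mon_host_addresses` are hypothetical. It only handles bracketed IPv6 entries (optionally with a port) or bare addresses.

```python
import ipaddress
import re


def bound_address(log_line):
    """Return the address after '->' in a 'using public_addr' log line, or None."""
    match = re.search(r"->\s*\[([0-9A-Fa-f:]+)\]:\d+", log_line)
    return ipaddress.ip_address(match.group(1)) if match else None


def mon_host_addresses(conf_line):
    """Return the addresses listed on a 'mon_host = ...' line (comma separated)."""
    _, _, value = conf_line.partition("=")
    addresses = []
    for item in value.split(","):
        item = item.strip()
        # Strip the brackets (and an optional port) from entries like [addr]:6789.
        bracketed = re.match(r"\[([0-9A-Fa-f:]+)\](?::\d+)?$", item)
        addresses.append(ipaddress.ip_address(bracketed.group(1) if bracketed else item))
    return addresses


# Lines copied from the log excerpt and ceph.conf value quoted in this comment.
log_line = "0 using public_addr [fd00:fd00:fd00:3000::10]:0/0 -> [fd00:fd00:fd00:3000::10]:6789/0"
conf_line = "mon_host = [fd00:fd00:fd00:3000::14]"

bound = bound_address(log_line)
expected = mon_host_addresses(conf_line)
mismatch = bound not in expected

print(f"ceph-mon bound to {bound}, mon_host lists {[str(a) for a in expected]}")
print("mismatch" if mismatch else "ok")
```

With the values from this report the check reports a mismatch, since the monitor bound to ::10 while mon_host lists only ::14.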

this could be a clone of https://bugzilla.redhat.com/show_bug.cgi?id=1301701

Comment 3 Giulio Fidente 2016-02-15 17:46:38 UTC
we might be able to override the standard binding mechanism using 'public addr', but this requires a change in puppet-ceph; working on it
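
As an illustration of what such an override could look like in ceph.conf, here is a hedged Python sketch that renders a per-monitor section carrying 'public addr'. The placement in a `[mon.<id>]` section, the helper name `render_mon_section`, and the choice of the mon_host value from comment 2 as the address are all assumptions made for the example; the actual puppet-ceph change is not shown in this report.

```python
import configparser
import io


def render_mon_section(mon_id, address):
    """Render a ceph.conf-style section that pins 'public addr' for one monitor.

    Hypothetical helper: the section layout is an assumption, not the
    puppet-ceph implementation.
    """
    # Use only '=' as the delimiter so the colons in IPv6 addresses are never
    # treated as key/value separators.
    conf = configparser.ConfigParser(delimiters=("=",))
    conf[f"mon.{mon_id}"] = {"public addr": address}
    buffer = io.StringIO()
    conf.write(buffer)
    return buffer.getvalue()


# Monitor name from the log in comment 2; address taken from the mon_host value.
snippet = render_mon_section("overcloud-controller-0", "fd00:fd00:fd00:3000::14")
print(snippet)
```

The intent is that the monitor binds to the address listed in 'mon host' instead of picking the first local IP on the public network.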

Comment 4 Jaromir Coufal 2016-02-15 21:20:33 UTC
Not a blocker for 7.3. Moving to 8GA.

Comment 9 Jason Guiditta 2016-03-10 16:11:20 UTC
Merged upstream, built.

Comment 12 Yogev Rabl 2016-03-29 08:34:57 UTC
Verified by automation with version openstack-puppet-modules-7.0.16-1.el7ost.noarch.

Comment 13 errata-xmlrpc 2016-04-07 21:28:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0603.html