Bug 1680155
| Summary: | ceph-ansible is configuring VIP address for MON and RGW | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | broskos |
| Component: | Ceph-Ansible | Assignee: | Dimitri Savineau <dsavinea> |
| Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.2 | CC: | alink, anharris, aschoen, broskos, ceph-eng-bugs, cpaquin, dsavinea, edonnell, gabrioux, gfidente, gmeno, jhardee, johfulto, mburns, mskalski, nalmond, nthomas, sankarshan, tchandra, tserlin, vashastr, yrabl |
| Target Milestone: | rc | Keywords: | Triaged, ZStream |
| Target Release: | 3.3 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | RHEL: ceph-ansible-3.2.16-1.el7cp Ubuntu: ceph-ansible_3.2.16-2redhat1 | Doc Type: | Bug Fix |
| Doc Text: | .Virtual IPv6 addresses are no longer configured for MON and RGW daemons. Previously, a virtual IPv6 address could be written to the Ceph configuration file for the MON and RGW daemons, because the virtual IP is the first value in the Ansible IPv6 address fact. The code now uses the last value in that fact, so the MON and RGW IPv6 configuration is set to the correct address. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-08-21 15:10:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1578730, 1726135 | | |
Description
broskos
2019-02-22 20:46:22 UTC
Update: This behavior is triggered by re-running the overcloud deploy, as one would do to change a config value or perform a scale out. The initial deployment does not have this issue, but as soon as the deployment completes, if I re-run the exact same deploy command this issue crops up. Also note that it is important to check the ceph.conf on the node that is hosting the storage VIP after the re-deploy to see the issue. I think that during the initial deploy ceph-ansible completes before the VIPs are created in PCS, so the problem does not occur.

Thanks; despite not affecting fresh deployments, I see how this can be hit on further stack updates. It looks pretty urgent indeed.

I observe the same behaviour and, just to add to what has already been said, the issue will probably only be visible if you use IPv6 addresses. It looks like when a new IPv4 address is added to an interface it lands at the end of the IP list, but in the case of IPv6 it actually becomes the first address of the interface, and that is what Ansible grabs when the ceph-ansible playbooks are rerun.

(In reply to mskalski from comment #3)

I was thinking to reprise the suggestion in the BZ report, "This may need to look for netmask info or something to ensure it's not a vip... /32 for IPv4 /128 for IPv6."; can you confirm the IPv6 address has a /128 subnet?

(In reply to Giulio Fidente from comment #6)

Yes, the VIP address has a /128 mask:

```
[root@overcloud8yi-ctrl-0 ~]# pcs status
Full list of resources:
 Docker container set: rabbitmq-bundle [192.168.213.1:8787/rhosp13/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0   (ocf::heartbeat:rabbitmq-cluster):   Started overcloud8yi-ctrl-0
   rabbitmq-bundle-1   (ocf::heartbeat:rabbitmq-cluster):   Started overcloud8yi-ctrl-1
   rabbitmq-bundle-2   (ocf::heartbeat:rabbitmq-cluster):   Started overcloud8yi-ctrl-2
 Docker container set: galera-bundle [192.168.213.1:8787/rhosp13/openstack-mariadb:pcmklatest]
   galera-bundle-0     (ocf::heartbeat:galera):   Master overcloud8yi-ctrl-0
   galera-bundle-1     (ocf::heartbeat:galera):   Master overcloud8yi-ctrl-1
   galera-bundle-2     (ocf::heartbeat:galera):   Master overcloud8yi-ctrl-2
 ip-192.168.213.60            (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-0
 ip-10.87.4.227               (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-1
 ip-172.16.0.90               (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-2
 ip-fd9e.2d4e.a32a.7777..17   (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-0

[root@overcloud8yi-ctrl-0 ~]# ip a s vlan333
6: vlan333@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:d0:c6:6a brd ff:ff:ff:ff:ff:ff
    inet6 fd9e:2d4e:a32a:7777::17/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fd9e:2d4e:a32a:7777::14/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fed0:c66a/64 scope link
       valid_lft forever preferred_lft forever
```

Yes, from pcs:

```
 ip-2605.1c00.50f2.2900.44.150.0.30   (ocf::heartbeat:IPaddr2):   Started qacloud-controller-1

[root@qacloud-controller-1 ~]# ip a | grep 2900
    inet6 2605:1c00:50f2:2900:44:150:0:30/128 scope global
    inet6 2605:1c00:50f2:2900::20/64 scope global
[root@qacloud-controller-1 ~]#
```
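Building on the /128 observation above, the same check can be expressed against the Ansible per-interface facts, which (unlike the flat ansible_all_ipv6_addresses list discussed later in this bug) do carry the prefix length. This is only an illustrative sketch, not ceph-ansible code: the host and interface names are taken from the output above, and the prefix is assumed to be reported as a string in the facts.

```yaml
# Illustrative sketch: list the IPv6 addresses on vlan333 whose prefix is not
# /128, i.e. skip Pacemaker VIPs. Each entry of ansible_vlan333.ipv6 carries
# 'address', 'prefix' and 'scope'; the prefix is assumed to be a string here.
- hosts: overcloud8yi-ctrl-0   # controller name taken from this report
  gather_facts: true
  tasks:
    - name: Non-VIP IPv6 addresses on vlan333
      debug:
        msg: >-
          {{ ansible_vlan333.ipv6
             | rejectattr('prefix', 'equalto', '128')
             | map(attribute='address')
             | list }}
```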
Thanks for helping; for IPv4 deployments we seem to be passing the right argument to ipaddr (ip/prefix). For example, in a recent CI job [1] monitor_address_block selects only /24 addresses. Can you help collect this same information from your IPv6 deployment? In OSP14 the ceph-ansible inventory and group_vars are saved in a path like "/var/lib/mistral/overcloud/ceph-ansible". In OSP13 deployments the inventory and vars are instead saved in a temporary directory (created under /tmp) if you set CephAnsiblePlaybookVerbosity to 1 in a Heat environment file before the deployment; alternatively, it should be possible to grep these params in the mistral-executor logs.

1. http://logs.openstack.org/23/638323/6/check/tripleo-ci-centos-7-scenario004-standalone/aaddc99/logs/undercloud/home/zuul/undercloud-ansible-ScgSrj/ceph-ansible/group_vars/all.yml.txt.gz
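For reference, the OSP13 option mentioned above can be enabled with a minimal Heat environment file; the file name below is hypothetical, only the CephAnsiblePlaybookVerbosity parameter comes from the comment above.

```yaml
# Hypothetical environment file, e.g. ceph-ansible-debug.yaml, passed to the
# overcloud deploy command with an extra "-e ceph-ansible-debug.yaml". With
# the verbosity set, the generated ceph-ansible inventory and vars are kept
# in a temporary directory under /tmp, as described in the comment above.
parameter_defaults:
  CephAnsiblePlaybookVerbosity: 1
```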
So this is a new deployment with this VIP for the storage network:

```
[root@overcloud8yi-ctrl-0 ~]# pcs resource show ip-fd9e.2d4e.a32a.7777..26
 Resource: ip-fd9e.2d4e.a32a.7777..26 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=128 ip=fd9e:2d4e:a32a:7777::26 lvs_ipv6_addrlabel=true lvs_ipv6_addrlabel_value=99 nic=vlan333
  Meta Attrs: resource-stickiness=INFINITY
  Operations: monitor interval=10s timeout=20s (ip-fd9e.2d4e.a32a.7777..26-monitor-interval-10s)
              start interval=0s timeout=20s (ip-fd9e.2d4e.a32a.7777..26-start-interval-0s)
              stop interval=0s timeout=20s (ip-fd9e.2d4e.a32a.7777..26-stop-interval-0s)
```
The monitor_address_block and radosgw_address_block are passed as expected, with the subnet address (from the mistral executor.log):

```
u'radosgw_address_block': u'fd9e:2d4e:a32a:7777::/64', u'user_config': True, u'radosgw_keystone': True, u'ceph_mgr_docker_extra_env': u'-e MGR_DASHBOARD=0', u'ceph_docker_image_tag': u'3-23', u'containerized_deployment': True, u'public_network': u'fd9e:2d4e:a32a:7777::/64', u'generate_fsid': False, u'monitor_address_block': u'fd9e:2d4e:a32a:7777::/64'
```
What I think is happening on the ceph-ansible side is that in this line [1] we get the local host addresses from the given address_block [2] and the first one is chosen. The list of local addresses on the hosts looks like this:

```
[root@overcloud8yi-ctrl-0 ~]# ansible -m setup localhost
    "ansible_facts": {
        "ansible_all_ipv4_addresses": [
            "192.168.213.52",
            "192.168.213.67",
            "10.87.4.235",
            "172.31.0.1",
            "172.16.0.103"
        ],
        "ansible_all_ipv6_addresses": [
            "fe80::5054:ff:fecc:7d59",
            "fe80::5054:ff:fee6:b6d1",
            "fe80::5054:ff:fe49:e0a1",
            "fe80::42:79ff:fe2c:d276",
            "fe80::5054:ff:fecc:7d59",
            "fd9e:2d4e:a32a:7777::26",
            "fd9e:2d4e:a32a:7777::19",
            "fe80::5054:ff:fecc:7d59"
        ],
```

So, as mentioned earlier, the new IPv6 address becomes the first one on the list (as opposed to IPv4) and the VIP address fd9e:2d4e:a32a:7777::26 will be chosen.

I am not sure what operations are performed in CI, but please remember that this issue is most likely IPv6 specific due to the new IP address ordering, and it is visible when the stack is updated because the VIP is then present and becomes the first IPv6 address. During the initial deployment the Ceph configuration is prepared before the VIP is set up.
[1] https://github.com/ceph/ceph-ansible/blob/v3.2.10/roles/ceph-config/templates/ceph.conf.j2#L47
[2] https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters_ipaddr.html#getting-information-about-hosts-and-networks
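To make the selection difference concrete, here is a minimal sketch (not the actual ceph.conf.j2 logic referenced in [1]); the host name and the block value are taken from the outputs above, and the ipaddr filter requires the python netaddr library.

```yaml
# Sketch only. With the fact list pasted above, the first task prints the VIP
# fd9e:2d4e:a32a:7777::26 (the first address inside the block, which is what
# ends up in ceph.conf today), while the second prints the real interface
# address fd9e:2d4e:a32a:7777::19.
- hosts: overcloud8yi-ctrl-0
  gather_facts: true
  vars:
    monitor_address_block: 'fd9e:2d4e:a32a:7777::/64'   # value from the group_vars above
  tasks:
    - name: First address inside the block (current behaviour)
      debug:
        msg: "{{ ansible_all_ipv6_addresses | ipaddr(monitor_address_block) | first }}"

    - name: Last address inside the block (the approach described in the Doc Text)
      debug:
        msg: "{{ ansible_all_ipv6_addresses | ipaddr(monitor_address_block) | last }}"
```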
Hi, thanks again for helping with this bug. It seems, in fact, that in this line [1], while ipaddr() does get the ip/prefix argument correctly and could theoretically filter out addresses outside the wanted prefix, the _addresses list generated as an Ansible fact does not include the IP prefixes :( An alternative option could be to discard the address later in the process if it has a /128 or /32 prefix.

1. https://github.com/ceph/ceph-ansible/blob/v3.2.10/roles/ceph-config/templates/ceph.conf.j2#L47

Hi, do you have any updates on this issue? I see that in the 4.0 branch things changed a bit around getting IP addresses; maybe the simplest solution is to just take the last address of the interface from the given subnet in the case of IPv6 in the 3.2 branch, and let 4.0 have a more sophisticated way to get it?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538