Bug 1680155
| Summary: | ceph-ansible is configuring VIP address for MON and RGW | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | broskos |
| Component: | Ceph-Ansible | Assignee: | Dimitri Savineau <dsavinea> |
| Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.2 | CC: | alink, anharris, aschoen, broskos, ceph-eng-bugs, cpaquin, dsavinea, edonnell, gabrioux, gfidente, gmeno, jhardee, johfulto, mburns, mskalski, nalmond, nthomas, sankarshan, tchandra, tserlin, vashastr, yrabl |
| Target Milestone: | rc | Keywords: | Triaged, ZStream |
| Target Release: | 3.3 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | RHEL: ceph-ansible-3.2.16-1.el7cp Ubuntu: ceph-ansible_3.2.16-2redhat1 | Doc Type: | Bug Fix |
| Doc Text: | .Virtual IPv6 addresses are no longer configured for MON and RGW daemons. Previously, a virtual IPv6 address could be written to the Ceph configuration file for the MON and RGW daemons, because the virtual IP is the first value in the Ansible IPv6 address fact. The code now uses the last value in that fact, so the MON and RGW IPv6 configuration is set to the correct address. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-08-21 15:10:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1578730, 1726135 | | |
Description
broskos
2019-02-22 20:46:22 UTC
Update: This behavior is triggered by re-running the overcloud deploy, as one would do to change a config value or perform a scale out. The initial deployment does not have this issue, but as soon as the deployment completes, if I re-run the exact same deploy command this issue crops up. Also note that it is important to check the ceph.conf on the node that is hosting the storage VIP after the re-deploy to see the issue. I think that during the initial deploy ceph-ansible completes before the VIPs are created in PCS, so the problem does not occur.

Thanks; despite not affecting fresh deployments, I see how this can be hit on further stack updates. It looks pretty urgent indeed.

I observe the same behaviour and, just to add to what has already been said, the issue will probably only be visible if you use IPv6 addresses. It looks like when a new IPv4 address is added to an interface it lands at the end of the IP list, but in the case of IPv6 it actually becomes the first address of the interface, and that is what Ansible grabs when the ceph-ansible playbooks are rerun.

(In reply to mskalski from comment #3)

I was thinking to reprise the suggestion in the BZ report, "This may need to look for netmask info or something to ensure it's not a vip... /32 for IPv4 /128 for IPv6."; can you confirm the IPv6 address has a /128 subnet?

(In reply to Giulio Fidente from comment #6)

Yes, the VIP address has a /128 mask:

```
[root@overcloud8yi-ctrl-0 ~]# pcs status
Full list of resources:
 Docker container set: rabbitmq-bundle [192.168.213.1:8787/rhosp13/openstack-rabbitmq:pcmklatest]
   rabbitmq-bundle-0   (ocf::heartbeat:rabbitmq-cluster):   Started overcloud8yi-ctrl-0
   rabbitmq-bundle-1   (ocf::heartbeat:rabbitmq-cluster):   Started overcloud8yi-ctrl-1
   rabbitmq-bundle-2   (ocf::heartbeat:rabbitmq-cluster):   Started overcloud8yi-ctrl-2
 Docker container set: galera-bundle [192.168.213.1:8787/rhosp13/openstack-mariadb:pcmklatest]
   galera-bundle-0     (ocf::heartbeat:galera):   Master overcloud8yi-ctrl-0
   galera-bundle-1     (ocf::heartbeat:galera):   Master overcloud8yi-ctrl-1
   galera-bundle-2     (ocf::heartbeat:galera):   Master overcloud8yi-ctrl-2
 ip-192.168.213.60            (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-0
 ip-10.87.4.227               (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-1
 ip-172.16.0.90               (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-2
 ip-fd9e.2d4e.a32a.7777..17   (ocf::heartbeat:IPaddr2):   Started overcloud8yi-ctrl-0

[root@overcloud8yi-ctrl-0 ~]# ip a s vlan333
6: vlan333@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:d0:c6:6a brd ff:ff:ff:ff:ff:ff
    inet6 fd9e:2d4e:a32a:7777::17/128 scope global
       valid_lft forever preferred_lft forever
    inet6 fd9e:2d4e:a32a:7777::14/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fed0:c66a/64 scope link
       valid_lft forever preferred_lft forever
```

Yes, from pcs:

```
 ip-2605.1c00.50f2.2900.44.150.0.30   (ocf::heartbeat:IPaddr2):   Started qacloud-controller-1

[root@qacloud-controller-1 ~]# ip a | grep 2900
    inet6 2605:1c00:50f2:2900:44:150:0:30/128 scope global
    inet6 2605:1c00:50f2:2900::20/64 scope global
[root@qacloud-controller-1 ~]#
```
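Building on the /128 observation above, the same check can be expressed against the Ansible per-interface facts, which (unlike the flat ansible_all_ipv6_addresses list discussed later in this bug) do carry the prefix length. This is only an illustrative sketch, not ceph-ansible code: the host and interface names are taken from the output above, and the prefix is assumed to be reported as a string in the facts.

```yaml
# Illustrative sketch: list the IPv6 addresses on vlan333 whose prefix is not
# /128, i.e. skip Pacemaker VIPs. Each entry of ansible_vlan333.ipv6 carries
# 'address', 'prefix' and 'scope'; the prefix is assumed to be a string here.
- hosts: overcloud8yi-ctrl-0   # controller name taken from this report
  gather_facts: true
  tasks:
    - name: Non-VIP IPv6 addresses on vlan333
      debug:
        msg: >-
          {{ ansible_vlan333.ipv6
             | rejectattr('prefix', 'equalto', '128')
             | map(attribute='address')
             | list }}
```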
Thanks for helping; for IPv4 deployments we seem to be passing the right argument to ipaddr (ip/prefix). For example, in a recent CI job [1] monitor_address_block selects only /24 addresses. Can you help collect this same information from your IPv6 deployment? In OSP14 the ceph-ansible inventory and group_vars are saved in a path like "/var/lib/mistral/overcloud/ceph-ansible". In OSP13 deployments the inventory and vars are instead saved in a temporary directory (created under /tmp) if you set CephAnsiblePlaybookVerbosity to 1 in a Heat environment file before the deployment; alternatively, it should be possible to grep these params in the mistral-executor logs.

1. http://logs.openstack.org/23/638323/6/check/tripleo-ci-centos-7-scenario004-standalone/aaddc99/logs/undercloud/home/zuul/undercloud-ansible-ScgSrj/ceph-ansible/group_vars/all.yml.txt.gz
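For reference, the OSP13 option mentioned above can be enabled with a minimal Heat environment file; the file name below is hypothetical, only the CephAnsiblePlaybookVerbosity parameter comes from the comment above.

```yaml
# Hypothetical environment file, e.g. ceph-ansible-debug.yaml, passed to the
# overcloud deploy command with an extra "-e ceph-ansible-debug.yaml". With
# the verbosity set, the generated ceph-ansible inventory and vars are kept
# in a temporary directory under /tmp, as described in the comment above.
parameter_defaults:
  CephAnsiblePlaybookVerbosity: 1
```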
So this is a new deployment with this VIP for the storage network:

```
[root@overcloud8yi-ctrl-0 ~]# pcs resource show ip-fd9e.2d4e.a32a.7777..26
 Resource: ip-fd9e.2d4e.a32a.7777..26 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=128 ip=fd9e:2d4e:a32a:7777::26 lvs_ipv6_addrlabel=true lvs_ipv6_addrlabel_value=99 nic=vlan333
  Meta Attrs: resource-stickiness=INFINITY
  Operations: monitor interval=10s timeout=20s (ip-fd9e.2d4e.a32a.7777..26-monitor-interval-10s)
              start interval=0s timeout=20s (ip-fd9e.2d4e.a32a.7777..26-start-interval-0s)
              stop interval=0s timeout=20s (ip-fd9e.2d4e.a32a.7777..26-stop-interval-0s)
```
The monitor_address_block and radosgw_address_block are passed as expected, with the subnet address (from the mistral executor.log):

```
u'radosgw_address_block': u'fd9e:2d4e:a32a:7777::/64', u'user_config': True, u'radosgw_keystone': True, u'ceph_mgr_docker_extra_env': u'-e MGR_DASHBOARD=0', u'ceph_docker_image_tag': u'3-23', u'containerized_deployment': True, u'public_network': u'fd9e:2d4e:a32a:7777::/64', u'generate_fsid': False, u'monitor_address_block': u'fd9e:2d4e:a32a:7777::/64'
```
What I think is happening on the ceph-ansible side is that in this line [1] we get the local host addresses from the given address_block [2] and the first one is chosen. The list of local addresses on the hosts looks like this:

```
[root@overcloud8yi-ctrl-0 ~]# ansible -m setup localhost
    "ansible_facts": {
        "ansible_all_ipv4_addresses": [
            "192.168.213.52",
            "192.168.213.67",
            "10.87.4.235",
            "172.31.0.1",
            "172.16.0.103"
        ],
        "ansible_all_ipv6_addresses": [
            "fe80::5054:ff:fecc:7d59",
            "fe80::5054:ff:fee6:b6d1",
            "fe80::5054:ff:fe49:e0a1",
            "fe80::42:79ff:fe2c:d276",
            "fe80::5054:ff:fecc:7d59",
            "fd9e:2d4e:a32a:7777::26",
            "fd9e:2d4e:a32a:7777::19",
            "fe80::5054:ff:fecc:7d59"
        ],
```

So, as mentioned earlier, the new IPv6 address becomes the first one on the list (as opposed to IPv4) and the VIP address fd9e:2d4e:a32a:7777::26 will be chosen.

I am not sure what operations are performed in CI, but please remember that this issue is most likely IPv6 specific due to the new IP address ordering, and it is visible when the stack is updated because the VIP is then present and becomes the first IPv6 address. During the initial deployment the Ceph configuration is prepared before the VIP is set up.
[1] https://github.com/ceph/ceph-ansible/blob/v3.2.10/roles/ceph-config/templates/ceph.conf.j2#L47
[2] https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters_ipaddr.html#getting-information-about-hosts-and-networks
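To make the selection difference concrete, here is a minimal sketch (not the actual ceph.conf.j2 logic referenced in [1]); the host name and the block value are taken from the outputs above, and the ipaddr filter requires the python netaddr library.

```yaml
# Sketch only. With the fact list pasted above, the first task prints the VIP
# fd9e:2d4e:a32a:7777::26 (the first address inside the block, which is what
# ends up in ceph.conf today), while the second prints the real interface
# address fd9e:2d4e:a32a:7777::19.
- hosts: overcloud8yi-ctrl-0
  gather_facts: true
  vars:
    monitor_address_block: 'fd9e:2d4e:a32a:7777::/64'   # value from the group_vars above
  tasks:
    - name: First address inside the block (current behaviour)
      debug:
        msg: "{{ ansible_all_ipv6_addresses | ipaddr(monitor_address_block) | first }}"

    - name: Last address inside the block (the approach described in the Doc Text)
      debug:
        msg: "{{ ansible_all_ipv6_addresses | ipaddr(monitor_address_block) | last }}"
```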
Hi, thanks again for helping with this bug. It seems, in fact, that in this line [1], while ipaddr() does get the ip/prefix argument correctly and could theoretically filter out addresses outside the wanted prefix, the _addresses list generated as an Ansible fact does not include the IP prefixes :( An alternative option could be to discard the address later in the process if it has a /128 or /32 prefix.

1. https://github.com/ceph/ceph-ansible/blob/v3.2.10/roles/ceph-config/templates/ceph.conf.j2#L47

Hi, do you have any updates on this issue? I see that in the 4.0 branch things changed a bit around getting IP addresses; maybe the simplest solution is to just take the last address of the interface from the given subnet in the case of IPv6 in the 3.2 branch, and let 4.0 have a more sophisticated way to get it?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538