Discussed at the program meeting: this is a blocker if it is an actual bug. Will review today and determine a timeline for the fix.
Fix pending upstream: https://github.com/ceph/ceph-ansible/pull/1548
Backport: https://github.com/ceph/ceph-ansible/pull/1549
Just released a new tag: https://github.com/ceph/ceph-ansible/releases/tag/v2.2.6. Can we get a new build?
It looks like Ansible did not find any IPv6 address on that machine. Is the IPv6 stack enabled? Can you run 'ansible mons -m setup' and search for any ipv6 fields? Could I also get access to the machine to take a look? Thanks!
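If it helps, the setup module's filter parameter can limit the output to just the IPv6 facts (shell-style wildcards are supported):

  ansible mons -m setup -a 'filter=*ipv6*'

  # or just the fact tied to the default IPv6 route
  ansible mons -m setup -a 'filter=ansible_default_ipv6'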
Discussed at the program meeting: QE needs to bring the machine up to re-verify and will have it ready on Friday, India time. Development believes IPv6 is not properly set up. Development can re-run the test and share the machine with QE.
vidushi, can you run 'ansible mons -m setup' and search for any ipv6 fields?
Andrew, would you please read the logs provided in c14 and see if we can determine the cause?
Upstream PR: https://github.com/ceph/ceph-ansible/pull/1587
Backport PR: https://github.com/ceph/ceph-ansible/pull/1588
This is included in the v2.2.9 upstream tag.
The fix proposed in https://github.com/ceph/ceph-ansible/pull/1587 is not working because the Ansible fact 'ansible_default_ipv6' is filled with the IP address carried by the network interface used by the default gateway; if there is no default IPv6 route present on the nodes, this fact will be empty. Incidentally, this means the patch sets the MON_IP environment variable to the IP carried by the network interface used for the default gateway, which is not always what is expected. For instance:

[vagrant@ceph-mon0 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:2e:67:cc brd ff:ff:ff:ff:ff:ff
    inet 192.168.121.112/24 brd 192.168.121.255 scope global dynamic eth0
       valid_lft 1653sec preferred_lft 1653sec
    inet6 fe80::5054:ff:fe2e:67cc/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:d9:06:21 brd ff:ff:ff:ff:ff:ff
    inet 192.168.77.10/24 brd 192.168.77.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fed9:621/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:02:cf:e1:fe brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

[vagrant@ceph-mon0 ~]$ ip r
default via 192.168.121.1 dev eth0 proto static metric 101
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.77.0/24 dev eth1 proto kernel scope link src 192.168.77.10
192.168.121.0/24 dev eth0 proto kernel scope link src 192.168.121.112 metric 100

192.168.77.10 is the IP address we want to set in MON_IP, but:

[root@ceph-mon0 ~]# cat /etc/systemd/system/ceph-mon@.service | grep MON_IP
  -e MON_IP=192.168.121.112 \

I think we should make a new patch so that the templates ceph-mon.service.j2 and ceph.conf.j2 use the facts in hostvars['ansible_' + interface_name]['ipv4'|'ipv6'], even if it adds some complexity to the templates.
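As a rough sketch of what that lookup could look like in ceph-mon.service.j2, assuming the existing monitor_interface and ip_version variables and Ansible's per-interface network facts (ipv4 is a dict, ipv6 is a list of dicts), something along these lines; this is an illustration, not the final patch:

  {# Take the mon IP from the configured interface's facts instead of the default-gateway facts #}
  {% set iface_facts = hostvars[inventory_hostname]['ansible_' + (monitor_interface | replace('-', '_'))] %}
  {% if ip_version == 'ipv4' %}
  -e MON_IP={{ iface_facts['ipv4']['address'] }} \
  {% else %}
  -e MON_IP={{ iface_facts['ipv6'][0]['address'] }} \
  {% endif %}

The same lookup would let ceph.conf.j2 build the mon host list from each mon's own interface facts rather than from ansible_default_ipv4/ansible_default_ipv6.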
Upstream PR: https://github.com/ceph/ceph-ansible/pull/1594. It's still WIP, but here are some test results:

non-containerized deployment, IPv4:

[vagrant@ceph-mon0 ~]$ sudo ceph -s
    cluster ec922ac9-fdd8-45d5-b4b1-ac83b2590c48
     health HEALTH_WARN
            64 pgs degraded
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e2: 3 mons at {ceph-mon0=192.168.77.10:6789/0,ceph-mon1=192.168.77.11:6789/0,ceph-mon2=192.168.77.12:6789/0}
            election epoch 6, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e13: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v19: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101992 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered

non-containerized deployment, IPv6:

[vagrant@ceph-mon0 ~]$ sudo ceph -s
    cluster b928cef7-512d-4cd2-bd0f-58061e32bfe1
     health HEALTH_WARN
            64 pgs degraded
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e2: 3 mons at {ceph-mon0=[2001:db8:ca2:6::10]:6789/0,ceph-mon1=[2001:db8:ca2:6::11]:6789/0,ceph-mon2=[2001:db8:ca2:6::12]:6789/0}
            election epoch 6, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e13: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v22: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101460 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered

containerized deployment, IPv4:

[root@ceph-mon0 ~]# docker exec ceph-mon-ceph-mon0 ceph -s
    cluster 66c5d2b8-b0e6-4888-bfea-439ed24ce7f4
     health HEALTH_WARN
            64 pgs degraded
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e3: 3 mons at {ceph-mon0=192.168.77.10:6789/0,ceph-mon1=192.168.77.11:6789/0,ceph-mon2=192.168.77.12:6789/0}
            election epoch 8, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e8: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v13: 64 pgs, 1 pools, 0 bytes data, 0 objects
            102056 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered

containerized deployment, IPv6:

[root@ceph-mon0 ~]# docker exec ceph-mon-ceph-mon0 ceph -s
    cluster 87d81971-e58a-454f-bedb-b8cc9df56ae2
     health HEALTH_ERR
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs degraded
            64 pgs stuck inactive
            64 pgs stuck unclean
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e3: 3 mons at {ceph-mon0=[2001:db8:ca2:6::10]:6789/0,ceph-mon1=[2001:db8:ca2:6::11]:6789/0,ceph-mon2=[2001:db8:ca2:6::12]:6789/0}
            election epoch 8, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e8: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v14: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101764 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered
Waiting for the PR to be merged upstream.
Merged upstream: https://github.com/ceph/ceph-ansible/commit/88df105d0b13f2a245666409cafd752d0176ecc3
[gmeno@localhost ceph-ansible]$ git tag --contains 88df105d0b13f2a245666409cafd752d0176ecc3 | grep v3
v3.0.0rc1
v3.0.0rc2
v3.0.0rc3
v3.0.0rc4

Looks like this should be in ON_QA state based on the tags that contain the fix commit.
monitor_address is set in all.yml while it should be set in the inventory host file. all.yml applies to all nodes, so it will treat "2620:52:0:880:225:90ff:fefc:1a8a" as the IP address for all mons. ceph_mon_docker_interface and ceph_mon_docker_subnet are not needed.
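For illustration, the per-host form in the inventory would look something like this (only the first address comes from this report; the hostnames and the other addresses are placeholders):

  [mons]
  # address taken from this report
  ceph-mon0 monitor_address=2620:52:0:880:225:90ff:fefc:1a8a
  # each remaining mon needs its own address (placeholders)
  ceph-mon1 monitor_address=2620:52:0:880::11
  ceph-mon2 monitor_address=2620:52:0:880::12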