Discussed at the program meeting: this is a blocker if it is an actual bug. Will review today and determine a timeline for the fix.
Fix pending upstream: https://github.com/ceph/ceph-ansible/pull/1548
Backport: https://github.com/ceph/ceph-ansible/pull/1549
Just released a new tag: https://github.com/ceph/ceph-ansible/releases/tag/v2.2.6. Can we get a new build?
It looks like Ansible did not find any IPv6 address on that machine. Is the IPv6 stack enabled? Can you run 'ansible mons -m setup' and search for any ipv6 fields? Could I also get access to the machine to take a look? Thanks!
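If it helps, the setup module's filter parameter can limit the output to just the IPv6 facts (shell-style wildcards are supported):

  ansible mons -m setup -a 'filter=*ipv6*'

  # or just the fact tied to the default IPv6 route
  ansible mons -m setup -a 'filter=ansible_default_ipv6'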
Discussed at the program meeting: QE needs to bring the machine up to re-verify and will have it ready on Friday, India time. Development believes IPv6 is not properly set up. Development can re-run the test and share the machine with QE.
vidushi, can you run 'ansible mons -m setup' and search for any ipv6 fields?
Andrew, would you please read the logs provided in c14 and see if we can determine the cause?
Upstream PR: https://github.com/ceph/ceph-ansible/pull/1587
Backport PR: https://github.com/ceph/ceph-ansible/pull/1588
This is included in the v2.2.9 upstream tag.
The fix proposed in https://github.com/ceph/ceph-ansible/pull/1587 is not working because the Ansible fact 'ansible_default_ipv6' is filled with the IP address carried by the network interface used by the default gateway; if there is no default IPv6 route present on the nodes, this fact will be empty. Incidentally, this means the patch sets the MON_IP environment variable to the IP carried by the network interface used for the default gateway, which is not always what is expected. For instance:

[vagrant@ceph-mon0 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:2e:67:cc brd ff:ff:ff:ff:ff:ff
    inet 192.168.121.112/24 brd 192.168.121.255 scope global dynamic eth0
       valid_lft 1653sec preferred_lft 1653sec
    inet6 fe80::5054:ff:fe2e:67cc/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:d9:06:21 brd ff:ff:ff:ff:ff:ff
    inet 192.168.77.10/24 brd 192.168.77.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fed9:621/64 scope link
       valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:02:cf:e1:fe brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

[vagrant@ceph-mon0 ~]$ ip r
default via 192.168.121.1 dev eth0 proto static metric 101
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.77.0/24 dev eth1 proto kernel scope link src 192.168.77.10
192.168.121.0/24 dev eth0 proto kernel scope link src 192.168.121.112 metric 100

192.168.77.10 is the IP address we want to set in MON_IP, but:

[root@ceph-mon0 ~]# cat /etc/systemd/system/ceph-mon@.service | grep MON_IP
  -e MON_IP=192.168.121.112 \

I think we should make a new patch so that the templates ceph-mon.service.j2 and ceph.conf.j2 use the facts in hostvars['ansible_' + interface_name]['ipv4'|'ipv6'], even if it adds some complexity to the templates.
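As a rough sketch of what that lookup could look like in ceph-mon.service.j2, assuming the existing monitor_interface and ip_version variables and Ansible's per-interface network facts (ipv4 is a dict, ipv6 is a list of dicts), something along these lines; this is an illustration, not the final patch:

  {# Take the mon IP from the configured interface's facts instead of the default-gateway facts #}
  {% set iface_facts = hostvars[inventory_hostname]['ansible_' + (monitor_interface | replace('-', '_'))] %}
  {% if ip_version == 'ipv4' %}
  -e MON_IP={{ iface_facts['ipv4']['address'] }} \
  {% else %}
  -e MON_IP={{ iface_facts['ipv6'][0]['address'] }} \
  {% endif %}

The same lookup would let ceph.conf.j2 build the mon host list from each mon's own interface facts rather than from ansible_default_ipv4/ansible_default_ipv6.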
Upstream PR: https://github.com/ceph/ceph-ansible/pull/1594. It's still WIP, but here are some test results:

non-containerized deployment, IPv4:

[vagrant@ceph-mon0 ~]$ sudo ceph -s
    cluster ec922ac9-fdd8-45d5-b4b1-ac83b2590c48
     health HEALTH_WARN
            64 pgs degraded
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e2: 3 mons at {ceph-mon0=192.168.77.10:6789/0,ceph-mon1=192.168.77.11:6789/0,ceph-mon2=192.168.77.12:6789/0}
            election epoch 6, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e13: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v19: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101992 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered

non-containerized deployment, IPv6:

[vagrant@ceph-mon0 ~]$ sudo ceph -s
    cluster b928cef7-512d-4cd2-bd0f-58061e32bfe1
     health HEALTH_WARN
            64 pgs degraded
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e2: 3 mons at {ceph-mon0=[2001:db8:ca2:6::10]:6789/0,ceph-mon1=[2001:db8:ca2:6::11]:6789/0,ceph-mon2=[2001:db8:ca2:6::12]:6789/0}
            election epoch 6, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e13: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v22: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101460 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered

containerized deployment, IPv4:

[root@ceph-mon0 ~]# docker exec ceph-mon-ceph-mon0 ceph -s
    cluster 66c5d2b8-b0e6-4888-bfea-439ed24ce7f4
     health HEALTH_WARN
            64 pgs degraded
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e3: 3 mons at {ceph-mon0=192.168.77.10:6789/0,ceph-mon1=192.168.77.11:6789/0,ceph-mon2=192.168.77.12:6789/0}
            election epoch 8, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e8: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v13: 64 pgs, 1 pools, 0 bytes data, 0 objects
            102056 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered

containerized deployment, IPv6:

[root@ceph-mon0 ~]# docker exec ceph-mon-ceph-mon0 ceph -s
    cluster 87d81971-e58a-454f-bedb-b8cc9df56ae2
     health HEALTH_ERR
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs degraded
            64 pgs stuck inactive
            64 pgs stuck unclean
            64 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e3: 3 mons at {ceph-mon0=[2001:db8:ca2:6::10]:6789/0,ceph-mon1=[2001:db8:ca2:6::11]:6789/0,ceph-mon2=[2001:db8:ca2:6::12]:6789/0}
            election epoch 8, quorum 0,1,2 ceph-mon0,ceph-mon1,ceph-mon2
        mgr no daemons active
     osdmap e8: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v14: 64 pgs, 1 pools, 0 bytes data, 0 objects
            101764 kB used, 36431 MB / 36530 MB avail
                  64 undersized+degraded+peered
Waiting for the PR to be merged upstream.
Merged upstream: https://github.com/ceph/ceph-ansible/commit/88df105d0b13f2a245666409cafd752d0176ecc3
[gmeno@localhost ceph-ansible]$ git tag --contains 88df105d0b13f2a245666409cafd752d0176ecc3 | grep v3
v3.0.0rc1
v3.0.0rc2
v3.0.0rc3
v3.0.0rc4

Looks like this should be in ON_QA state based on the tags that contain the fix commit.
monitor_address is set in all.yml while it should be set in the inventory host file. all.yml applies to all nodes, so it will treat "2620:52:0:880:225:90ff:fefc:1a8a" as the IP address for all mons. ceph_mon_docker_interface and ceph_mon_docker_subnet are not needed.
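For illustration, the per-host form in the inventory would look something like this (only the first address comes from this report; the hostnames and the other addresses are placeholders):

  [mons]
  # address taken from this report
  ceph-mon0 monitor_address=2620:52:0:880:225:90ff:fefc:1a8a
  # each remaining mon needs its own address (placeholders)
  ceph-mon1 monitor_address=2620:52:0:880::11
  ceph-mon2 monitor_address=2620:52:0:880::12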