Description of problem:
OSP17 IPv6 ceph job is failing on task "Run cephadm bootstrap" with stderr:
Error EINVAL: Failed to connect to controller-0

Version-Release number of selected component (if applicable):
17

How reproducible:
Every time in the 17 Integration pipeline

Steps to Reproduce:
1. Deploy environment with ceph + IPv6

Actual results:
Deployment fails on task "Run cephadm bootstrap" with stderr:
Error EINVAL: Failed to connect to controller-0

Expected results:
Deployment should complete successfully

Additional info:
Logs attached in private comments.
more readable:

➜ foo jq -r '.[]' j
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/bin/podman) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 3ba21fbf-2232-44b2-a8be-e37b61273af5
Verifying IP [fd00:fd00:fd00:3000::269] port 3300 ...
Verifying IP [fd00:fd00:fd00:3000::269] port 6789 ...
Mon IP [fd00:fd00:fd00:3000::269] is in CIDR network fd00:fd00:fd00:3000::/64
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph:5-12...
Ceph version: ceph version 16.2.0-72.el8cp (1e802193e0b4084ffcdb2338dd09f08bbea54a1a) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to fd00:fd00:fd00:3000::/64
Enabling IPv6 (ms_bind_ipv6) binding
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Using provided ssh keys...
Adding host controller-0...
Non-zero exit code 22 from /bin/podman run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph:5-12 -e NODE_NAME=controller-0 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/3ba21fbf-2232-44b2-a8be-e37b61273af5:/var/log/ceph:z -v /tmp/ceph-tmp71lj62nz:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp70z5o_6n:/etc/ceph/ceph.conf:z undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph:5-12 orch host add controller-0 [fd00:fd00:fd00:3000::269]
/usr/bin/ceph: stderr Error EINVAL: Failed to connect to controller-0 ([fd00:fd00:fd00:3000::269]).
/usr/bin/ceph: stderr Please make sure that the host is reachable and accepts connections using the cephadm SSH key
/usr/bin/ceph: stderr
/usr/bin/ceph: stderr To add the cephadm SSH key to the host:
/usr/bin/ceph: stderr > ceph cephadm get-pub-key > ~/ceph.pub
/usr/bin/ceph: stderr > ssh-copy-id -f -i ~/ceph.pub ceph-admin@[fd00:fd00:fd00:3000::269]
/usr/bin/ceph: stderr
/usr/bin/ceph: stderr To check that the host is reachable:
/usr/bin/ceph: stderr > ceph cephadm get-ssh-config > ssh_config
/usr/bin/ceph: stderr > ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
/usr/bin/ceph: stderr > chmod 0600 ~/cephadm_private_key
/usr/bin/ceph: stderr > ssh -F ssh_config -i ~/cephadm_private_key ceph-admin@[fd00:fd00:fd00:3000::269]
ERROR: Failed to add host <controller-0>: Failed command: /bin/podman run --rm --ipc=host --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph:5-12 -e NODE_NAME=controller-0 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/3ba21fbf-2232-44b2-a8be-e37b61273af5:/var/log/ceph:z -v /tmp/ceph-tmp71lj62nz:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmp70z5o_6n:/etc/ceph/ceph.conf:z undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhceph:5-12 orch host add controller-0 [fd00:fd00:fd00:3000::269]

This is going to get fixed by https://github.com/ceph/ceph/pull/43029
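For anyone triaging a similar failure by hand: the log above shows the mon IP being handed to "orch host add" in bracketed form, [fd00:fd00:fd00:3000::269]. Below is a minimal diagnostic sketch, assuming Python 3 is available on the node that runs the bootstrap. It checks that the address parses once the brackets are removed, that it sits inside the public network from the log, and that something answers on the SSH port. The helper names are hypothetical and are not part of cephadm, port 22 is an assumption, and this does not reproduce or describe the change in the linked pull request.

```python
import ipaddress
import socket


def unwrap_ipv6(addr: str) -> str:
    """Strip one pair of surrounding square brackets, if present.

    Hypothetical helper for illustration; hostnames and bare
    addresses are returned unchanged.
    """
    if addr.startswith("[") and addr.endswith("]"):
        return addr[1:-1]
    return addr


def in_network(addr: str, cidr: str) -> bool:
    """Return True if addr (bracketed or bare) lies inside cidr."""
    return ipaddress.ip_address(unwrap_ipv6(addr)) in ipaddress.ip_network(cidr)


def ssh_port_open(addr: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Best-effort TCP connect to the SSH port (22 is an assumption).

    Returns False on any socket error, including an unreachable network.
    """
    try:
        with socket.create_connection((unwrap_ipv6(addr), port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Values copied from the bootstrap log in this report.
    mon_ip = "[fd00:fd00:fd00:3000::269]"
    public_network = "fd00:fd00:fd00:3000::/64"

    print("bare address:", unwrap_ipv6(mon_ip))
    print("in public network:", in_network(mon_ip, public_network))
    print("ssh port reachable:", ssh_port_open(mon_ip))
```

If the port check passes with the bare address, the manual ssh steps printed in the error output are the next thing to try. A passing check only shows TCP reachability, not that the cephadm SSH key is accepted.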
I tested the fix for this bug and it let me get my mon running over IPv6, but unfortunately I hit a new IPv6 bug for the OSD. I reported it upstream here: https://tracker.ceph.com/issues/52867
cc Daniel
Just set requires_doc_text to - as this was caused by an internal CI issue
We have a workaround on the tripleo/director side [1] but we'd rather not merge a workaround into TripleO. We request that Ceph have a fix for the upstream issue [2] either in pick_address.cc or cephadm or whatever you like. Should I open a downstream BZ to track this other issue?

[1] https://review.opendev.org/c/openstack/tripleo-ansible/+/814064
[2] https://tracker.ceph.com/issues/52867
(In reply to John Fulton from comment #12)
> We have a workaround on the tripleo/director side [1] but we'd rather not
> merge a workaround into TripleO. We request that Ceph have a fix for the
> upstream issue [2] either in pick_address.cc or cephadm or whatever you
> like. Should I open a downstream BZ to track this other issue?
>
> [1] https://review.opendev.org/c/openstack/tripleo-ansible/+/814064
> [2] https://tracker.ceph.com/issues/52867

https://bugzilla.redhat.com/show_bug.cgi?id=2016496
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 5.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4105