Attempting to deploy OSP 14 with Ceph and receiving the following error:

TASK [ceph-config : generate ceph.conf configuration file] *********************
Thursday 07 February 2019 12:34:20 -0600 (0:00:00.310)  0:01:17.616 *****
fatal: [overcloud-controller-2]: FAILED! => {"msg": "No first item, sequence was empty."}

NO MORE HOSTS LEFT *************************************************************
Created attachment 1527884 [details] ansible.log
Created attachment 1527885 [details] Deploy script
Created attachment 1527886 [details] containers-prepare-parameter.yaml
Created attachment 1527887 [details] overcloud-config.yaml
Created attachment 1527888 [details] ceph-config.yaml
Can you attach a sosreport from your undercloud and a tarball of /var/lib/mistral/?
Tarball of /var/lib/mistral is at http://www.penurio.us/pub/var_lib_mistral.tar
What version of ceph-ansible are you using? That would have been in the sosreport, which would be better, but I need to know the ceph-ansible version.
Created attachment 1527901 [details] /var/lib/mistral
(In reply to John Fulton from comment #8)
> What version of ceph-ansible are you using?
>
> That would have been in the sosreport, which would be better, but I need to
> know the ceph-ansible version.

sosreport is coming.
sosreport is at http://www.penurio.us/pub/sosreport-undercloud-2019-02-07-jlllimi.tar.xz
Created attachment 1534426 [details] ceph-ansible run directory
Guillaume, This bug happened with ceph-ansible-3.2.5-1.el7cp.noarch you can see the ceph-ansible run directory by downloading it from: https://bugzilla.redhat.com/attachment.cgi?id=1534426 It contains the following after you untar it: [fultonj@skagra ceph-ansible{master}]$ ls | sort ceph_ansible_command.log extra_vars.yml fetch_dir group_vars host_vars inventory.yml nodes_uuid_command.log nodes_uuid_data.json nodes_uuid_playbook.yml [fultonj@skagra ceph-ansible{master}]$
Guillaume,

This bug happened with ceph-ansible-3.2.5-1.el7cp.noarch. You can see the ceph-ansible run directory by downloading it from:

https://bugzilla.redhat.com/attachment.cgi?id=1534426

It contains the following after you untar it:

[fultonj@skagra ceph-ansible{master}]$ ls | sort
ceph_ansible_command.log
extra_vars.yml
fetch_dir
group_vars
host_vars
inventory.yml
nodes_uuid_command.log
nodes_uuid_data.json
nodes_uuid_playbook.yml
[fultonj@skagra ceph-ansible{master}]$
The issue comes from a TripleO misconfiguration.

The Ansible error refers to:
https://github.com/ceph/ceph-ansible/blob/stable-3.2/roles/ceph-config/templates/ceph.conf.j2#L83

Because TripleO uses the monitor_address_block variable to determine the mon IP address to bind, ceph-ansible tries to find an IP address in the Ansible IP address list fact (ansible_all_ipv4_addresses) that is part of the network defined in monitor_address_block. If there is no match, the ipaddr filter returns an empty list and the subsequent 'first' filter fails with: 'The error was: No first item, sequence was empty'.

----
$ grep monitor_address_block group_vars/all.yml
monitor_address_block: 192.168.24.0/24
----

cluster_network and public_network use that network too. But all overcloud nodes are configured on the 192.168.19.0/24 CIDR (which is the default ctlplane network). Every network alias of every node (storage, storagemgmt, internalapi, tenant, external, management, ctlplane) resolves to the same 192.168.19.x address:

----
overcloud-cephstorage-0: 192.168.19.104
overcloud-cephstorage-1: 192.168.19.107
overcloud-cephstorage-2: 192.168.19.105
overcloud-compute-0:     192.168.19.113
overcloud-compute-1:     192.168.19.110
overcloud-controller-0:  192.168.19.130
overcloud-controller-1:  192.168.19.109
overcloud-controller-2:  192.168.19.125
----

Only the overcloud-controller-2 node fails because ceph-ansible deploys containerized mons sequentially:
https://github.com/ceph/ceph-ansible/blob/stable-3.2/site-docker.yml.sample#L101

We probably need to modify the ceph-validate role to add a test for that.
(In reply to Dimitri Savineau from comment #16)
> The issue comes from a TripleO misconfiguration.

Do you mean that there's an error in the templates (very possible) or a bug in TripleO?
> Do you mean that there's an error in the templates (very possible) or a bug in TripleO?

Probably a bug in TripleO. Your ctlplane network was configured with 192.168.19.0/24 (I assume you can find this value in your undercloud.conf file), and the overcloud networks are reusing that network according to the ansible log. But the network CIDR values generated by TripleO (via mistral, I guess) as input for ceph-ansible are wrong: the generated value is still the default ctlplane network (192.168.24.0/24) for public_network, cluster_network, and monitor_address_block, and does not reflect the value actually configured.
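For comparison, this is roughly what the ceph-ansible input would need to contain for this environment. These values are illustrative only (assuming the real ctlplane CIDR is 192.168.19.0/24, as the node addresses above suggest); they are not taken from the attached files:

----
# group_vars/all.yml (expected values, illustrative)
monitor_address_block: 192.168.19.0/24
public_network: 192.168.19.0/24
cluster_network: 192.168.19.0/24
----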
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:0911