Description of problem:
A hyper-converged Overcloud upgrade failed while running the command:

$ openstack overcloud upgrade converge \
    --templates \
    -e /home/stack/virt/internal.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /home/stack/virt/network/network-environment.yaml \
    -e /home/stack/virt/hostnames.yml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    -e /home/stack/virt/debug.yaml \
    -e /home/stack/virt/nodes_data.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml \
    -e /home/stack/virt/ceph-min-osds.yaml \
    -e /home/stack/virt/ceph-single-host-mode.yaml \
    -e /home/stack/virt/docker-images.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/lifecycle/upgrade-converge.yaml \
    -r /usr/share/openstack-tripleo-heat-templates/roles_data.yaml

The upgrade failed during the Ceph-Ansible run, with the following errors (from /var/log/mistral/ceph-install-workflow.log):

2018-05-19 21:17:52,305 p=12968 u=mistral | failed: [192.168.24.16] (item=192.168.24.19) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-0", "ceph", "--cluster", "ceph", "auth", "get-or-create", "mgr.controller-1", "mon", "allow profile mgr", "osd", "allow *", "mds", "allow *", "-o", "/etc/ceph/ceph.mgr.controller-1.keyring"], "delta": "0:00:00.444762", "end": "2018-05-20 01:17:50.280241", "item": "192.168.24.19", "msg": "non-zero return code", "rc": 22, "start": "2018-05-20 01:17:49.835479", "stderr": "Error EINVAL: bad entity name", "stderr_lines": ["Error EINVAL: bad entity name"], "stdout": "", "stdout_lines": []}
2018-05-19 21:17:53,094 p=12968 u=mistral | failed: [192.168.24.16] (item=192.168.24.18) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-0", "ceph", "--cluster", "ceph", "auth", "get-or-create", "mgr.controller-2", "mon", "allow profile mgr", "osd", "allow *", "mds", "allow *", "-o", "/etc/ceph/ceph.mgr.controller-2.keyring"], "delta": "0:00:00.430806", "end": "2018-05-20 01:17:51.074694", "item": "192.168.24.18", "msg": "non-zero return code", "rc": 22, "start": "2018-05-20 01:17:50.643888", "stderr": "Error EINVAL: bad entity name", "stderr_lines": ["Error EINVAL: bad entity name"], "stdout": "", "stdout_lines": []}
2018-05-19 21:17:53,824 p=12968 u=mistral | failed: [192.168.24.16] (item=192.168.24.16) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller-0", "ceph", "--cluster", "ceph", "auth", "get-or-create", "mgr.controller-0", "mon", "allow profile mgr", "osd", "allow *", "mds", "allow *", "-o", "/etc/ceph/ceph.mgr.controller-0.keyring"], "delta": "0:00:00.389069", "end": "2018-05-20 01:17:51.804882", "item": "192.168.24.16", "msg": "non-zero return code", "rc": 22, "start": "2018-05-20 01:17:51.415813", "stderr": "Error EINVAL: bad entity name", "stderr_lines": ["Error EINVAL: bad entity name"], "stdout": "", "stdout_lines": []}

Version-Release number of selected component (if applicable):
ceph-ansible-3.1.0-0.1.rc3.el7cp.noarch
puppet-tripleo-8.3.2-6.el7ost.noarch
ansible-tripleo-ipsec-8.1.1-0.20180308133440.8f5369a.el7ost.noarch
openstack-tripleo-common-8.6.1-12.el7ost.noarch
openstack-tripleo-validations-8.4.1-5.el7ost.noarch
openstack-tripleo-common-containers-8.6.1-12.el7ost.noarch
python-tripleoclient-9.2.1-9.el7ost.noarch
openstack-tripleo-image-elements-8.0.1-1.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-22.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-2.el7ost.noarch

How reproducible:
Unknown

Steps to Reproduce:
1. Deploy an HCI overcloud in latest OSP 12
2. Run an upgrade process (including updating the roles_data.yaml)

Actual results:
The upgrade failed during the update of Ceph

Expected results:
The upgrade finishes successfully

Additional info:
My mistake: comment 1 says this is an HCI deployment, but it is not. It is a monolithic deployment with 3 controller nodes, 2 compute nodes, and 1 Ceph storage node with 5 OSDs.
I thought this has been verified here:
https://bugzilla.redhat.com/show_bug.cgi?id=1574995

Yogev, what's the difference?
Thanks.
(In reply to leseb from comment #2)
> I thought this has been verified here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1574995
>
> Yogev, what's the difference?
> Thanks.

I thought the same thing. I am reopening that bug; I just wanted to have a record of it in the RH OpenStack domain and make this bug depend on that one.
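
For anyone triaging similar failures, here is a minimal sketch for pulling the failing host, item, return code, and Ceph entity name out of "failed:" lines like the ones quoted in the description. It assumes each log entry sits on a single line in /var/log/mistral/ceph-install-workflow.log and that the payload after "=>" is valid JSON, as in the excerpt. The names parse_failures and SAMPLE_LINE are hypothetical helpers written for this illustration; they are not part of ceph-ansible or TripleO.

```python
import json
import re

# Matches Ansible "failed:" lines in the shape seen in the log excerpt.
FAILED_RE = re.compile(
    r"failed: \[(?P<host>[^\]]+)\] \(item=(?P<item>[^)]+)\) => (?P<payload>\{.*\})"
)

# One entry copied from the description, used here as sample input.
SAMPLE_LINE = (
    '2018-05-19 21:17:53,824 p=12968 u=mistral | failed: [192.168.24.16] '
    '(item=192.168.24.16) => {"changed": false, "cmd": ["docker", "exec", '
    '"ceph-mon-controller-0", "ceph", "--cluster", "ceph", "auth", '
    '"get-or-create", "mgr.controller-0", "mon", "allow profile mgr", '
    '"osd", "allow *", "mds", "allow *", "-o", '
    '"/etc/ceph/ceph.mgr.controller-0.keyring"], "delta": "0:00:00.389069", '
    '"end": "2018-05-20 01:17:51.804882", "item": "192.168.24.16", '
    '"msg": "non-zero return code", "rc": 22, '
    '"start": "2018-05-20 01:17:51.415813", '
    '"stderr": "Error EINVAL: bad entity name", '
    '"stderr_lines": ["Error EINVAL: bad entity name"], '
    '"stdout": "", "stdout_lines": []}'
)


def parse_failures(text):
    """Return one dict per parseable "failed:" line in text."""
    failures = []
    for line in text.splitlines():
        match = FAILED_RE.search(line)
        if not match:
            continue
        try:
            result = json.loads(match.group("payload"))
        except json.JSONDecodeError:
            # Skip entries whose payload was wrapped or truncated in the log.
            continue
        cmd = result.get("cmd")
        entity = None
        # The entity name is the argument following "get-or-create".
        if isinstance(cmd, list) and "get-or-create" in cmd:
            idx = cmd.index("get-or-create")
            if idx + 1 < len(cmd):
                entity = cmd[idx + 1]
        failures.append({
            "host": match.group("host"),
            "item": match.group("item"),
            "rc": result.get("rc"),
            "stderr": result.get("stderr"),
            "entity": entity,
        })
    return failures


if __name__ == "__main__":
    for failure in parse_failures(SAMPLE_LINE):
        print(failure)
```

To run it against the real log, read the file contents and pass them to parse_failures instead of SAMPLE_LINE.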