Created attachment 1505856: container upgrade failure

Description of problem:
During execution of the rolling update playbook to upgrade a containerized jewel installation to luminous (3.2), the playbook fails with the following message:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/fetch//0d5194c8-20d1-410e-be3b-ba05d14e25d8//etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring'
failed: [ceph-clacroix-1542220323880-node1-mon] (item={u'dest': u'/var/lib/ceph/mgr/ceph-ceph-clacroix-1542220323880-node1-mon/keyring', u'name': u'/etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring', u'copy_key': True}) => {"changed": false, "failed": true, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/ceph-ceph-clacroix-1542220323880-node1-mon/keyring", "name": "/etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring"}, "msg": "Could not find or access '~/fetch//0d5194c8-20d1-410e-be3b-ba05d14e25d8//etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring'"}

Before the upgrade, the cluster is configured to collocate the mgrs with the mons. The fetch directory is configured to be ~/fetch.

Version-Release number of selected component (if applicable):
ceph-ansible-3.2.0-0.1.rc1.el7cp.noarch

How reproducible:
Every attempt to upgrade a containerized jewel installation to luminous 3.2.

Steps to Reproduce:
1. Install jewel containerized
2. Configure inventory to collocate mgr on existing mons
3. Run rolling update playbook

Actual results:
Failure (see above) during execution. Full logs attached.

Expected results:
Successful playbook execution and upgraded cluster.

Additional info:
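For anyone reproducing this, a minimal sketch of a check that the mgr keyrings exist in the fetch directory on the Ansible controller before the copy task runs. The path layout (fetch directory, then cluster fsid, then etc/ceph/ceph.mgr.<hostname>.keyring) is inferred only from the error message above, and the function names are hypothetical, not part of ceph-ansible.

```python
import os


def expected_mgr_keyring(fetch_dir, fsid, hostname):
    """Build the controller-side path of a mgr keyring.

    Assumed layout, inferred from the error message in this report:
    <fetch_dir>/<fsid>/etc/ceph/ceph.mgr.<hostname>.keyring
    """
    return os.path.join(
        os.path.expanduser(fetch_dir),  # resolves a leading "~"
        fsid,
        "etc",
        "ceph",
        "ceph.mgr.{}.keyring".format(hostname),
    )


def missing_mgr_keyrings(fetch_dir, fsid, hostnames):
    """Return the expected keyring paths that are not present on disk."""
    paths = (expected_mgr_keyring(fetch_dir, fsid, h) for h in hostnames)
    return [p for p in paths if not os.path.isfile(p)]
```

Calling missing_mgr_keyrings("~/fetch", "0d5194c8-20d1-410e-be3b-ba05d14e25d8", ["ceph-clacroix-1542220323880-node1-mon"]) on the controller would list the keyring from the failure above if it was never written there.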
Can I see your inventory file? Do you have a [mgrs] section? Thanks!

For context, we need to determine why this task was skipped:
https://github.com/ceph/ceph-ansible/blob/d5409109fbec7a318fae09ad469f10ac0aae3866/infrastructure-playbooks/rolling_update.yml#L257-L258
Yes, there is a mgrs section in the inventory file. The inventory file is updated after the jewel installation but before the upgrade to luminous: the entries from the mons section are copied to the mgrs section. Full contents are below. Note that this is from a separate run in which I reproduced the issue, so the hostnames won't match those in the original log; the structure should be identical.

[mons]
ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node3-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node2-mon monitor_interface=eth0

[osds]
ceph-clacroix-1542743011152-node5-osd monitor_interface=eth0 devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]'
ceph-clacroix-1542743011152-node4-osd monitor_interface=eth0 devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]'
ceph-clacroix-1542743011152-node6-osd monitor_interface=eth0 devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]'

[rgws]
ceph-clacroix-1542743011152-node9-rgw radosgw_interface=eth0

[clients]
ceph-clacroix-1542743011152-node10-client client_interface=eth0

[mgrs]
ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node3-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node2-mon monitor_interface=eth0
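To confirm that an inventory like the one above really collocates every mgr on a mon host, a small sketch along these lines could be used. It assumes a simple INI-style inventory with plain [group] headers only (no :children or :vars sections), and the helper names are hypothetical.

```python
def parse_groups(text):
    """Map each [group] header to the list of hostnames under it."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, [])
        elif current is not None:
            # The hostname is the first token; host variables follow it.
            groups[current].append(line.split()[0])
    return groups


def mgrs_collocated_on_mons(groups):
    """True if a non-empty mgrs group contains only hosts listed in mons."""
    mgrs = set(groups.get("mgrs", []))
    return bool(mgrs) and mgrs <= set(groups.get("mons", []))
```

For the inventory above, parse_groups yields the same three hostnames under both mons and mgrs, so the collocation check passes and the inventory itself is not what is missing.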
*** Bug 1653667 has been marked as a duplicate of this bug. ***
Fixed in https://github.com/ceph/ceph-ansible/releases/tag/v3.2.0rc5
Verified the issue has been resolved.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0020