Bug 1649957
| Summary: | jewel to luminous containerized upgrade fails when mgr is collocated with mons | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Coady LaCroix <clacroix> | ||||
| Component: | Ceph-Ansible | Assignee: | Guillaume Abrioux <gabrioux> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Coady LaCroix <clacroix> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 3.2 | CC: | aschoen, ceph-eng-bugs, clacroix, gabrioux, gmeno, hgurav, hnallurv, nthomas, rperiyas, sankarshan, seb, tserlin, vakulkar, vpoliset | ||||
| Target Milestone: | rc | Keywords: | Automation, AutomationBlocker | ||||
| Target Release: | 3.2 | Flags: | vakulkar: automate_bug+ | ||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | RHEL: ceph-ansible-3.2.0-0.1.rc5.el7cp Ubuntu: ceph-ansible_3.2.0~rc5-2redhat1 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2019-01-03 19:02:22 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Can I see your inventory file? Do you have a [mgrs] section? Thanks!

To give you more info, we have to determine why this task got skipped: https://github.com/ceph/ceph-ansible/blob/d5409109fbec7a318fae09ad469f10ac0aae3866/infrastructure-playbooks/rolling_update.yml#L257-L258

Yes, there is a mgrs section in the inventory file. The inventory file is updated after the jewel installation but before the upgrade to luminous. The entries for the mons section are copied to the mgrs section. Full contents are below. Note that this was a separate run where I reproduced the issue, so the hostnames won't match what is in the original log; however, the structure should be identical.

    [mons]
    ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
    ceph-clacroix-1542743011152-node3-mon monitor_interface=eth0
    ceph-clacroix-1542743011152-node2-mon monitor_interface=eth0

    [osds]
    ceph-clacroix-1542743011152-node5-osd monitor_interface=eth0 devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]'
    ceph-clacroix-1542743011152-node4-osd monitor_interface=eth0 devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]'
    ceph-clacroix-1542743011152-node6-osd monitor_interface=eth0 devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]'

    [rgws]
    ceph-clacroix-1542743011152-node9-rgw radosgw_interface=eth0

    [clients]
    ceph-clacroix-1542743011152-node10-client client_interface=eth0

    [mgrs]
    ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
    ceph-clacroix-1542743011152-node3-mon monitor_interface=eth0
    ceph-clacroix-1542743011152-node2-mon monitor_interface=eth0

*** Bug 1653667 has been marked as a duplicate of this bug. ***

Verified the issue has been resolved.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0020
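The inventory change described in the comments above (copying the [mons] entries into a [mgrs] section between the jewel install and the luminous upgrade) can be sketched roughly as follows. This is only an illustration, not part of ceph-ansible: the function name `collocate_mgrs_on_mons` is hypothetical, and it assumes a simple INI-style inventory with one host per line and no existing [mgrs] section.

```python
def collocate_mgrs_on_mons(inventory_text):
    """Return inventory text with a [mgrs] section that mirrors [mons].

    Hypothetical helper for illustration; not part of ceph-ansible.
    """
    mon_lines = []
    current = None
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            if current == "mgrs":
                # Assumption: an existing [mgrs] section is left untouched.
                return inventory_text
        elif line and not line.startswith(("#", ";")) and current == "mons":
            # Keep the full host line, including per-host variables.
            mon_lines.append(line)
    return (
        inventory_text.rstrip("\n")
        + "\n\n[mgrs]\n"
        + "\n".join(mon_lines)
        + "\n"
    )
```

Calling the function a second time on its own output returns the text unchanged, so the edit is safe to repeat.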
Created attachment 1505856
container upgrade failure

Description of problem:
During execution of the rolling update playbook to upgrade a containerized jewel installation to luminous (3.2), the playbook fails with the following message:

    An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/fetch//0d5194c8-20d1-410e-be3b-ba05d14e25d8//etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring'
    failed: [ceph-clacroix-1542220323880-node1-mon] (item={u'dest': u'/var/lib/ceph/mgr/ceph-ceph-clacroix-1542220323880-node1-mon/keyring', u'name': u'/etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring', u'copy_key': True}) => {"changed": false, "failed": true, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/ceph-ceph-clacroix-1542220323880-node1-mon/keyring", "name": "/etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring"}, "msg": "Could not find or access '~/fetch//0d5194c8-20d1-410e-be3b-ba05d14e25d8//etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring'"}

The cluster is configured prior to upgrade to collocate the mgr and mons. The fetch directory is also configured to be ~/fetch.

Version-Release number of selected component (if applicable):
ceph-ansible-3.2.0-0.1.rc1.el7cp.noarch

How reproducible:
Every attempt to upgrade a containerized jewel installation to luminous 3.2.

Steps to Reproduce:
1. Install jewel containerized
2. Configure inventory to collocate mgr on existing mons
3. Run rolling update playbook

Actual results:
Failure (see above) during execution. Full logs attached.

Expected results:
Successful playbook execution and upgraded cluster.

Additional info:
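When triaging a failure like the one above, it can help to check which mgr keyrings are actually present in the fetch directory before the copy task runs. The sketch below is not part of ceph-ansible; the function names are hypothetical, and the path layout (`<fetch_dir>/<fsid>/etc/ceph/ceph.mgr.<hostname>.keyring`) is an assumption inferred from the error message in this report.

```python
import os


def expected_mgr_keyring(fetch_dir, fsid, hostname):
    """Build the keyring path the playbook tried to read.

    Assumption: layout inferred from the AnsibleFileNotFound message,
    i.e. <fetch_dir>/<fsid>/etc/ceph/ceph.mgr.<hostname>.keyring.
    """
    return os.path.join(
        os.path.expanduser(fetch_dir),  # resolves a leading "~"
        fsid,
        "etc",
        "ceph",
        "ceph.mgr.%s.keyring" % hostname,
    )


def missing_mgr_keyrings(fetch_dir, fsid, mgr_hosts):
    """Return the mgr hosts whose keyring file is absent from fetch_dir."""
    return [
        host
        for host in mgr_hosts
        if not os.path.isfile(expected_mgr_keyring(fetch_dir, fsid, host))
    ]
```

For the run in this report, one would call `missing_mgr_keyrings("~/fetch", "0d5194c8-20d1-410e-be3b-ba05d14e25d8", [...])` with the hostnames from the [mgrs] section. A non-empty result would indicate that the mgr keys were never created or fetched.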