1649957 – jewel to luminous containerized upgrade fails when mgr is collocated with mons

Summary: jewel to luminous containerized upgrade fails when mgr is collocated with mons

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	Ceph-Ansible
Sub Component:
Version:	3.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	urgent
Target Milestone:	rc
Target Release:	3.2
Assignee:	Guillaume Abrioux
QA Contact:	Coady LaCroix
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1653667
Depends On:
Blocks:

Reported:	2018-11-14 23:40 UTC by Coady LaCroix
Modified:	2019-01-03 19:02 UTC
CC List:	14 users
Fixed In Version:	RHEL: ceph-ansible-3.2.0-0.1.rc5.el7cp Ubuntu: ceph-ansible_3.2.0~rc5-2redhat1
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-01-03 19:02:22 UTC
Embargoed:
Dependent Products:
Flags:	vakulkar: automate_bug+

Attachments
container upgrade failure (541.56 KB, application/zip) 2018-11-14 23:40 UTC, Coady LaCroix

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	ceph ceph-ansible pull 3372	0	None	None	None	2018-11-27 13:36:32 UTC
Red Hat Product Errata	RHBA-2019:0020	0	None	None	None	2019-01-03 19:02:28 UTC

Description Coady LaCroix 2018-11-14 23:40:14 UTC

Created attachment 1505856
container upgrade failure

Description of problem: 

When running the rolling update playbook to upgrade a containerized jewel installation to luminous (3.2), the playbook fails with the following message:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AnsibleFileNotFound: Could not find or access '~/fetch//0d5194c8-20d1-410e-be3b-ba05d14e25d8//etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring'
failed: [ceph-clacroix-1542220323880-node1-mon] (item={u'dest': u'/var/lib/ceph/mgr/ceph-ceph-clacroix-1542220323880-node1-mon/keyring', u'name': u'/etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring', u'copy_key': True}) => {"changed": false, "failed": true, "item": {"copy_key": true, "dest": "/var/lib/ceph/mgr/ceph-ceph-clacroix-1542220323880-node1-mon/keyring", "name": "/etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring"}, "msg": "Could not find or access '~/fetch//0d5194c8-20d1-410e-be3b-ba05d14e25d8//etc/ceph/ceph.mgr.ceph-clacroix-1542220323880-node1-mon.keyring'"}

Before the upgrade, the cluster is configured to collocate the mgrs with the mons. The fetch directory is set to ~/fetch.
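To check by hand whether the mgr keyrings are present in the fetch directory before rerunning the playbook, something like the following sketch may help. It assumes the lookup path follows the pattern visible in the error message above (fetch directory, then the cluster fsid, then etc/ceph/ceph.mgr.<hostname>.keyring). The function names are hypothetical and are not part of ceph-ansible.

```python
import os


def expected_mgr_keyring(fetch_dir, fsid, hostname):
    """Build the keyring path using the pattern seen in the error message.

    Assumed layout: <fetch_dir>/<fsid>/etc/ceph/ceph.mgr.<hostname>.keyring
    """
    relative = "etc/ceph/ceph.mgr.{}.keyring".format(hostname)
    # expanduser resolves a leading "~", as in the "~/fetch" setting.
    return os.path.join(os.path.expanduser(fetch_dir), fsid, relative)


def missing_mgr_keyrings(fetch_dir, fsid, hostnames):
    """Return the hostnames whose mgr keyring file is absent."""
    return [
        host
        for host in hostnames
        if not os.path.isfile(expected_mgr_keyring(fetch_dir, fsid, host))
    ]


if __name__ == "__main__":
    # Values taken from the error message in this report.
    fsid = "0d5194c8-20d1-410e-be3b-ba05d14e25d8"
    hosts = ["ceph-clacroix-1542220323880-node1-mon"]
    for host in missing_mgr_keyrings("~/fetch", fsid, hosts):
        print("missing:", expected_mgr_keyring("~/fetch", fsid, host))
```

A non-empty result only confirms the symptom reported here; it does not say why the keyring was never created.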


Version-Release number of selected component (if applicable):
ceph-ansible-3.2.0-0.1.rc1.el7cp.noarch

How reproducible:
Every attempt to upgrade a containerized jewel installation to luminous 3.2.

Steps to Reproduce:
1. Install a containerized jewel cluster
2. Configure the inventory to collocate the mgrs on the existing mons
3. Run the rolling update playbook

Actual results:
Failure (see above) during execution. Full logs attached.

Expected results:
Successful playbook execution and upgraded cluster.

Additional info:

Comment 3 Sébastien Han 2018-11-20 17:34:54 UTC

Can I see your inventory file? Do you have a [mgrs] section?
Thanks!

For context, we need to determine why this task was skipped: https://github.com/ceph/ceph-ansible/blob/d5409109fbec7a318fae09ad469f10ac0aae3866/infrastructure-playbooks/rolling_update.yml#L257-L258

Comment 4 Coady LaCroix 2018-11-20 21:09:09 UTC

Yes, there is a [mgrs] section in the inventory file. The inventory file is updated after the jewel installation but before the upgrade to luminous: the entries from the [mons] section are copied to the [mgrs] section. The full contents are below. Note that this is from a separate run in which I reproduced the issue, so the hostnames won't match those in the original log, but the structure should be identical.

[mons]
ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node3-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node2-mon monitor_interface=eth0
[osds]
ceph-clacroix-1542743011152-node5-osd monitor_interface=eth0  devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]' 
ceph-clacroix-1542743011152-node4-osd monitor_interface=eth0  devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]' 
ceph-clacroix-1542743011152-node6-osd monitor_interface=eth0  devices='["/dev/vdb", "/dev/vdc", "/dev/vdd"]' 
[rgws]
ceph-clacroix-1542743011152-node9-rgw radosgw_interface=eth0
[clients]
ceph-clacroix-1542743011152-node10-client client_interface=eth0
[mgrs]
ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node3-mon monitor_interface=eth0
ceph-clacroix-1542743011152-node2-mon monitor_interface=eth0
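To confirm that an inventory matches the collocated layout that triggers this bug, a rough check along these lines could be used. This is only a sketch: it reads simple INI-style sections like the ones above, takes the first token of each line as the hostname, and does not handle Ansible features such as [group:children] or [group:vars]. The function names are hypothetical.

```python
def parse_groups(inventory_text):
    """Map each [group] header to the list of hostnames listed under it."""
    groups = {}
    current = None
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, [])
        elif current is not None:
            # The hostname is the first token; host variables follow it.
            groups[current].append(line.split()[0])
    return groups


def collocated_mgrs(inventory_text):
    """Return the hosts that appear in both [mgrs] and [mons]."""
    groups = parse_groups(inventory_text)
    mons = set(groups.get("mons", []))
    return [host for host in groups.get("mgrs", []) if host in mons]


if __name__ == "__main__":
    sample = """\
[mons]
ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
[mgrs]
ceph-clacroix-1542743011152-node1-mon monitor_interface=eth0
"""
    print(collocated_mgrs(sample))
```

For the inventory above, all three mon hosts would be reported as collocated mgrs.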

Comment 5 seb 2018-11-27 13:25:48 UTC

*** Bug 1653667 has been marked as a duplicate of this bug. ***

Comment 6 seb 2018-11-29 09:01:03 UTC

Fixed in https://github.com/ceph/ceph-ansible/releases/tag/v3.2.0rc5

Comment 10 Coady LaCroix 2018-12-04 22:23:04 UTC

Verified the issue has been resolved.

Comment 12 errata-xmlrpc 2019-01-03 19:02:22 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020
