Bug 1548357

Summary: [ceph-ansible] [ceph-container] : during rolling update playbook failing trying to restart mgr without copying restart script to mgr
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: Ceph-Ansible
Assignee: Sébastien Han <shan>
Status: CLOSED ERRATA
QA Contact: Vasishta <vashastr>
Severity: high
Priority: unspecified
Version: 3.0
CC: adeza, agunn, aschoen, ceph-eng-bugs, gabrioux, gmeno, hnallurv, kdreyer, nthomas, sankarshan, vashastr
Target Milestone: z1
Target Release: 3.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: RHEL: ceph-ansible-3.0.27-1.el7cp Ubuntu: ceph-ansible_3.0.27-2redhat1
Doc Type: Bug Fix
Doc Text:
.A required script is copied when doing a rolling upgrade with Ansible
Previously, when the active Ceph Manager node was not the first node to be upgraded, the `ceph-ansible` rolling update playbook did not copy a required restart script to the Ceph Manager node, causing the rolling update to fail. With this release, the required script is copied to the Ceph Manager node and the rolling update completes as expected.
Story Points: ---
Last Closed: 2018-03-08 15:54:03 UTC
Type: Bug
Attachments: File contains contents of inventory file, ansible-playbook log

Description Vasishta 2018-02-23 10:03:36 UTC
Created attachment 1399797 [details]
File contains contents of inventory file, ansible-playbook log

Description of problem:
During a rolling update, the playbook fails in the restart handler: it tries to restart the mgr daemon without first copying the restart script to the mgr node.

Based on observation, the issue occurs when the first node the playbook processes is not the active mgr.

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.26-1.el7cp.noarch

How reproducible:
Always (2/2)

Steps to Reproduce:
1. Configure containerized cluster of (latest-1) version 
2. Upgrade to latest version running rolling_update

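The reproduction steps above can be sketched as the following commands. This is a hedged sketch only: the install path, inventory location, and the `ireallymeanit` confirmation variable follow common ceph-ansible conventions and are not taken from the attached log, so adjust them to the local environment.

```shell
# Sketch of the upgrade run (paths assumed; adjust to your environment).
cd /usr/share/ceph-ansible

# After pointing ceph_docker_image_tag in group_vars at the new container
# image, run the rolling update playbook against the cluster inventory.
ansible-playbook -i /etc/ansible/hosts \
    infrastructure-playbooks/rolling_update.yml \
    -e ireallymeanit=yes
```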

Actual results:
RUNNING HANDLER [ceph-defaults : restart ceph mgr daemon(s) - container] **
failed: [magna035 -> magna043] (item=magna043) => {"changed": false, "cmd": "/tmp/restart_mgr_daemon.sh", "item": "magna043", "msg": "[Errno 2] No such file or directory", "rc": 2}
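The failure shows the handler executing `/tmp/restart_mgr_daemon.sh` on a node where the script was never deployed. A hedged sketch of the pattern the fix requires, in Ansible YAML (task and template names here are illustrative, inferred from the error output, not copied from the actual ceph-ansible source):

```yaml
# Illustrative only: the restart script must be deployed to every mgr
# node before any handler can invoke it there.
- name: copy mgr restart script
  template:
    src: restart_mgr_daemon.sh.j2   # name inferred from the error output
    dest: /tmp/restart_mgr_daemon.sh
    mode: "0750"
  delegate_to: "{{ item }}"
  with_items: "{{ groups['mgrs'] }}"

- name: restart ceph mgr daemon(s) - container
  command: /tmp/restart_mgr_daemon.sh
  delegate_to: "{{ item }}"
  with_items: "{{ groups['mgrs'] }}"
```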


Expected results:
The playbook must copy the restart script to the node before attempting to restart the daemon.

Additional info:

Comment 3 Guillaume Abrioux 2018-02-23 10:19:21 UTC
will be included in v3.0.27

Comment 4 Vasishta 2018-02-23 13:40:35 UTC
Working fine with a workaround -

List the active mgr's hostname at the top of the mon group (when monitors and mgrs are collocated) in the inventory file and run rolling_update again.
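For example, if `magna043` is the active mgr (hostnames taken from the log above; group names follow the usual ceph-ansible inventory layout), the inventory would list it first in the collocated groups. A sketch:

```ini
# Inventory sketch: active mgr (magna043 here) listed first so the
# rolling_update playbook processes it before the standby.
[mons]
magna043
magna035

[mgrs]
magna043
magna035
```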

Comment 5 Ken Dreyer (Red Hat) 2018-02-23 21:52:00 UTC
Based on the workaround, retargeting to z2

Comment 12 Vasishta 2018-03-01 18:10:41 UTC
1) Used - ceph-ansible-3.0.27-1.el7cp.noarch

2) Ensured that the active mgr is not the first host listed in either the mon group or the mgr group.

3) The rolling update worked fine, upgrading the cluster from 3.0 live to ceph-3.0-rhel-7-docker-candidate-99411-20180228192608.


Moving to VERIFIED state.

Regards,
Vasishta Shastry
AQE, Ceph

Comment 16 errata-xmlrpc 2018-03-08 15:54:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0474