Bug 1543284 - [ceph-ansible] [ceph-container] : playbook failed trying to restart dmcrypt OSDs
Summary: [ceph-ansible] [ceph-container] : playbook failed trying to restart dmcrypt OSDs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: z1
Target Release: 3.0
Assignee: Guillaume Abrioux
QA Contact: Vasishta
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-02-08 07:17 UTC by Vasishta
Modified: 2018-03-08 15:54 UTC
CC List: 8 users

Fixed In Version: RHEL: ceph-ansible-3.0.25-1.el7cp Ubuntu: ceph-ansible_3.0.25-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-08 15:54:03 UTC
Embargoed:


Attachments
File contains contents of inventory file, ansible-playbook log (9.59 MB, text/plain)
2018-02-08 07:17 UTC, Vasishta
no flags


Links:
Github: ceph/ceph-ansible pull 2380 (closed): "osd: fix osd restart when dmcrypt", last updated 2019-12-06 16:32:03 UTC
Red Hat Product Errata: RHBA-2018:0474 (SHIPPED_LIVE): Red Hat Ceph Storage 3.0 bug fix update, last updated 2018-03-08 20:51:53 UTC

Description Vasishta 2018-02-08 07:17:10 UTC
Created attachment 1393032 [details]
File contains contents of inventory file, ansible-playbook log

Description of problem:
The playbook fails while trying to run the handler 'ceph-defaults : restart ceph osds daemon(s) - container' on collocated+dmcrypt OSDs.
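
For context, that handler appears to run a check that waits for the OSD admin socket and then restarts the matching container (see the stdout/stderr in the log excerpt further down). The sketch below is hypothetical, not the actual ceph-ansible handler template; it only illustrates the two failure points visible in that log: an OSD id/socket lookup whose error text appears to get substituted into the socket path, and a docker restart against a container id that no longer exists. All paths, names and commands in it are assumptions.

    #!/usr/bin/env bash
    # Hypothetical sketch of a containerized-OSD restart check; NOT the
    # actual ceph-ansible handler template. Paths and names are assumptions.
    set -u

    container_id=${1:?usage: $0 <osd-container-id>}   # e.g. b2a7aacd1819, captured earlier in the play
    socket_dir=/var/run/ceph
    host=$(hostname -s)

    # Ask the container which OSD it owns; if this lookup fails or times out,
    # whatever it prints ends up inside the socket path (compare the garbled
    # "/var/run/ceph/mno34-osd.Timed out ... .asok" in the stdout below).
    osd_id=$(docker exec "$container_id" ls /var/lib/ceph/osd 2>/dev/null | sed 's/.*-//')

    socket="${socket_dir}/${host}-osd.${osd_id}.asok"
    if [ ! -S "$socket" ]; then
        echo "Socket file ${socket} could not be found, which means the osd daemon is not running."
    fi

    # Restarting by a container id that the rolling update has already replaced
    # produces exactly: "Error response from daemon: No such container: <id>"
    docker restart "$container_id"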

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.23-1.el7cp.noarch
ceph-3.0-rhel-7-docker-candidate-28895-20180204092708 

How reproducible:
Always (3/3)

Steps to Reproduce:
1. Configure containerized cluster with collocated+dmcrypt OSDs
2. Run rolling update.
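
For reference, a minimal sketch of the two steps above, under stated assumptions: the variable names (containerized_deployment, osd_scenario, dmcrypt), the site-docker.yml playbook, the rolling_update.yml location and its ireallymeanit prompt variable are the ones I recall from ceph-ansible 3.x and should be verified against the installed version; the inventory file name and device list are arbitrary.

    # Run from the ceph-ansible directory; names and paths are assumptions,
    # verify them against your ceph-ansible 3.x installation.
    #
    # group_vars/all.yml:
    #   containerized_deployment: true
    #
    # group_vars/osds.yml:
    #   osd_scenario: collocated
    #   dmcrypt: true
    #   devices:
    #     - /dev/sdb
    #     - /dev/sdc

    # 1. Deploy the containerized cluster with collocated+dmcrypt OSDs
    ansible-playbook -i hosts site-docker.yml

    # 2. Run the rolling update (ireallymeanit=yes answers the confirmation prompt)
    cp infrastructure-playbooks/rolling_update.yml .
    ansible-playbook -i hosts rolling_update.yml -e ireallymeanit=yes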

Actual results:
ansible-playbook fails while trying to restart OSDs with dmcrypt enabled

"msg": "non-zero return code", 
    "rc": 1, 
    "start": "2018-02-08 06:27:08.037939", 
    "stderr": "Error response from daemon: No such container: b2a7aacd1819", 
    "stderr_lines": [
        "Error response from daemon: No such container: b2a7aacd1819"
    ], 
    "stdout": "Socket file /var/run/ceph/mno34-osd.Timed out while trying to look for a Ceph OSD socket.\nAbort mission!.asok could not be found, which means the osd daemon is not running.", 
    "stdout_lines": [
        "Socket file /var/run/ceph/mno34-osd.Timed out while trying to look for a Ceph OSD socket.", 
        "Abort mission!.asok could not be found, which means the osd daemon is not running."

Expected results:
The ansible-playbook run should complete successfully.

Additional info:
When the playbook is re-run, the same issue does not recur.

Comment 6 Guillaume Abrioux 2018-02-08 12:48:44 UTC
fixed by https://github.com/ceph/ceph-ansible/pull/2380/commits/7d179e2abe33e8363aab48db8a392b230dcfc47a

the fix will be included in v3.0.25
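
For illustration only, and not a claim about what PR 2380 actually changes: one generic way to harden such a restart step against the symptom above is to resolve the OSD container by name immediately before restarting it, instead of reusing a container id captured earlier in the play. The "ceph-osd" name filter below is an assumption about the container naming convention.

    #!/usr/bin/env bash
    # Illustrative hardening only; not the actual fix from the PR above.
    # Look up the OSD container by name right before restarting it.
    restart_osd_container() {
        local name_filter=$1
        local id
        id=$(docker ps -q --filter "name=${name_filter}" | head -n 1)
        if [ -z "$id" ]; then
            echo "No running container matches ${name_filter}" >&2
            return 1
        fi
        docker restart "$id"
    }

    restart_osd_container "ceph-osd"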

Comment 7 Harish NV Rao 2018-02-12 07:05:08 UTC
@Guillaume, when will the fix be available for testing?

Comment 9 Guillaume Abrioux 2018-02-14 01:26:11 UTC
Hi Harish, the fix is available in v3.0.25

Comment 14 Vasishta 2018-02-23 13:42:59 UTC
Hi, 

Upgrading from rhceph-3-rhel7 to ceph-3.0-rhel-7-docker-candidate-38019-20180222163657 using ceph-ansible-3.0.26-1.el7cp.noarch works fine.

While upgrading a cluster that has OSDs with collocated journals, I hit the issue reported in Bug 1548357 and followed the workaround mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1548357#c4 (put the mgr's name at the top of the mons group in the inventory file).

Moving to VERIFIED state.

Regards,
Vasishta Shatsry
AQE, Ceph

Comment 17 errata-xmlrpc 2018-03-08 15:54:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0474

