Bug 1754432

Summary: [cee/sd][ceph-ansible] when running playbook to push new ceph.conf: ansible-playbook site.yml --tags='ceph_update_config' playbook fails on "The conditional check 'osd_socket_stat.rc == 0' failed" (for mon_socket_stat too)
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Tomas Petr <tpetr>
Component: Ceph-Ansible Assignee: Dimitri Savineau <dsavinea>
Status: CLOSED ERRATA QA Contact: Vasishta <vashastr>
Severity: medium Docs Contact: Erin Donnelly <edonnell>
Priority: medium    
Version: 3.3 CC: aschoen, ceph-eng-bugs, ceph-qe-bugs, dsavinea, gabrioux, gmeno, nthomas, tchandra, tserlin, ykaul
Target Milestone: z2   
Target Release: 3.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.2.36-1.el7cp Ubuntu: ceph-ansible_3.2.36-2redhat1 Doc Type: Bug Fix
Doc Text:
.The `ceph-ansible` playbooks are no longer missing certain tags
Previously, the `ceph-ansible` playbooks were missing some tags, so running `ceph-ansible` with those specific tags was failing. With this update, the Ceph roles are tagged correctly in the `ceph-ansible` playbooks, and running `ceph-ansible` with those specific tags works as expected.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-12-19 17:58:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1726135    

Description Tomas Petr 2019-09-23 08:52:32 UTC
Description of problem:
When running the playbook to push a new ceph.conf:
ansible-playbook site.yml --tags='ceph_update_config'
the playbook fails with "The conditional check 'osd_socket_stat.rc == 0' failed" (and likewise for mon_socket_stat):
 - The first run failed on [ceph-handler : restart ceph mon daemon(s) - non container].
 - A re-run skipped this handler, because the generate conf step of the previous run had already changed ceph.conf on the mon, so the file matched.
 - The re-run then failed on the OSD handler [ceph-handler : restart ceph osds daemon(s) - non container].
 - A further re-run would skip this handler as well, because ceph.conf on the OSD would already match for the same reason.

first run
-------
RUNNING HANDLER [ceph-handler : restart ceph mon daemon(s) - non container] ****************************************************************************************************************
Friday 20 September 2019  11:42:39 +0200 (0:00:01.517)       0:04:37.255 ****** 
fatal: [mon-node-0]: FAILED! => {
    "msg": "The conditional check 'mon_socket_stat.rc == 0' failed. The error was: error while evaluating conditional (mon_socket_stat.rc == 0): 'mon_socket_stat' is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/
roles/ceph-handler/handlers/main.yml': line 27, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: restart ceph mon daemon(s) - non container\n  ^ here\n"
-------

second run
-------
RUNNING HANDLER [ceph-handler : restart ceph osds daemon(s) - non container] ****************************************************************************************************************
Friday 20 September 2019  11:42:39 +0200 (0:00:01.517)       0:04:37.255 ****** 
fatal: [osd-node-0]: FAILED! => {
    "msg": "The conditional check 'osd_socket_stat.rc == 0' failed. The error was: error while evaluating conditional (osd_socket_stat.rc == 0): 'osd_socket_stat' is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/
roles/ceph-handler/handlers/main.yml': line 82, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: restart ceph osds daemon(s) - non container\n  ^ here\n"
-------
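
The failure mode in the two runs above can be sketched with a small standalone playbook. This is a hypothetical illustration, not code from ceph-ansible: the file name demo.yml, the task and handler names, and the /tmp paths are all assumptions. It assumes that a task filtered out by --tags never registers its variable, while a handler notified by a tagged task still runs, which is what the logs above suggest.

```yaml
# demo.yml: hypothetical minimal playbook, NOT copied from ceph-ansible.
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    # Untagged: filtered out when the play runs with
    # --tags=ceph_update_config, so mon_socket_stat is never registered.
    - name: check for a demo socket
      command: stat /tmp/demo.asok
      register: mon_socket_stat
      changed_when: false
      failed_when: false

    # Tagged: still runs under --tags=ceph_update_config and notifies
    # the handler whenever the file content changes.
    - name: write demo config
      copy:
        content: "[global]\n"
        dest: /tmp/demo-ceph.conf
      tags: ceph_update_config
      notify: restart demo daemon

  handlers:
    # The condition reads a variable that only the untagged task sets.
    - name: restart demo daemon
      debug:
        msg: "would restart the daemon here"
      when: mon_socket_stat.rc == 0
```

Running `ansible-playbook demo.yml --tags='ceph_update_config'` against a host where /tmp/demo-ceph.conf does not yet exist would be expected to fail in the handler with a similar "is undefined" error. A second run would not notify the handler, because the file already matches. That mirrors the skip-on-re-run behaviour described above. According to the Doc Text of this bug, the shipped fix was to tag the Ceph roles correctly in the playbooks.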

Version-Release number of selected component (if applicable):
ceph-mon-12.2.8-89.el7cp
ansible-2.6.14-1.el7ae.noarch
ceph-ansible-3.2.8-1.el7cp.noarch

Also reproduced on the latest versions:
ceph-ansible-3.2.24-1.el7cp.noarch
ansible-2.6.19-1.el7ae.noarch
ceph-mon-12.2.12-45.el7cp

How reproducible:
always

Steps to Reproduce:
1. Push a new ceph.conf change with ansible-playbook site.yml --tags='ceph_update_config'
2. Observe the playbook fail

Actual results:
playbook fails

Expected results:
playbook succeeds

Additional info:

Comment 13 Erin Donnelly 2019-11-26 20:30:36 UTC
Thanks Dimitri, could you take a look at my updated doc text and let me know if it looks ok?

Comment 15 Vasishta 2019-11-28 12:36:31 UTC
The playbook completes without any failures and the config overrides are pushed to the nodes.
However, all configs are pushed to all nodes.
For example, mon/osd configs are pushed even to nfs/mds/rgw nodes.

I'm moving this BZ to the VERIFIED state and have opened a new BZ for the issue mentioned here: BZ 1777840

VERIFIED using ceph-ansible-3.2.36-1.el7cp.noarch.rpm

Regards,
Vasishta Shastry
QE, Ceph

Comment 17 errata-xmlrpc 2019-12-19 17:58:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:4353