Bug 1925503

Summary: [cee/sd][containerized][ceph-ansible] when running site-container.yml with '--limit' on co-located nodes, the playbook fails on RUNNING HANDLER [ceph-handler : unset noup flag] triggering non-containerized version of command
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tomas Petr <tpetr>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: Vasishta <vashastr>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.2
CC: aschoen, ceph-eng-bugs, gabrioux, gmeno, gsitlani, hfukumot, kdreyer, nthomas, tserlin, vashastr, vereddy, ykaul
Target Milestone: ---
Keywords: TestOnly
Target Release: 4.2z2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-ansible-4.0.49.el7cp ceph-ansible-4.0.49.el8cp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-06-15 17:13:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Tomas Petr 2021-02-05 11:44:42 UTC
Description of problem:
When running "ansible-playbook site-container.yml --limit 10.74.176.189", where node '10.74.176.189' runs co-located MON and OSD services, the playbook fails at the following step because it executes the non-containerized version of the command:
-----
RUNNING HANDLER [ceph-handler : unset noup flag] *********************************************************************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-handler/tasks/handler_osds.yml:6
Friday 05 February 2021  05:02:07 -0500 (0:00:00.141)       0:07:54.605 ******* 
fatal: [10.74.176.189 -> 10.74.176.44]: FAILED! => changed=false 
  cmd: ceph --cluster ceph osd unset noup
  msg: '[Errno 2] No such file or directory: b''ceph'': b''ceph'''
  rc: 2
-----

Other commands are executed in their containerized form; see the attached ansible -vv log.
The failure does not occur when the playbook is run without '--limit'.
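
For illustration only: the command in the log above is the bare `ceph` CLI, whereas on a containerized deployment it would be expected to run through the container runtime on the delegated monitor node. The Python sketch below contrasts the two forms. It does not reproduce ceph-ansible's logic or identify the root cause. The function name, the `mon0` hostname, and the `ceph-mon-<hostname>` container name are assumptions made for the example and are not taken from the attached log.

```python
from typing import List, Optional


def build_ceph_command(
    args: List[str],
    cluster: str = "ceph",
    containerized: bool = False,
    container_binary: str = "podman",
    mon_hostname: Optional[str] = None,
) -> List[str]:
    """Return the argv for a ceph CLI call, optionally wrapped in a container exec.

    Hypothetical helper for illustration; this is not ceph-ansible code.
    """
    base = ["ceph", "--cluster", cluster] + list(args)
    if not containerized:
        # Bare CLI: only works if the ceph binary is installed on the host.
        return base
    if not mon_hostname:
        # The container name cannot be built without the monitor's hostname.
        raise ValueError("mon_hostname is required for a containerized command")
    # Assumed naming convention for the monitor container: ceph-mon-<hostname>.
    return [container_binary, "exec", f"ceph-mon-{mon_hostname}"] + base


# Form seen in the failing handler (no container prefix).
failing = build_ceph_command(["osd", "unset", "noup"])

# Form that would be expected on a containerized node ("mon0" is a placeholder).
expected = build_ceph_command(
    ["osd", "unset", "noup"], containerized=True, mon_hostname="mon0"
)

print(" ".join(failing))
print(" ".join(expected))
```

The first form matches the `cmd:` line in the failure, and on a host without a locally installed `ceph` binary it would fail with the "[Errno 2] No such file or directory" error shown above.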

We see the same issue when running:
# ansible-playbook site-container.yml --limit nfss -i hosts
 - where the nfss hosts are OSD hosts as well


Version-Release number of selected component (if applicable):
ceph version 14.2.11-95.el8cp
ansible-2.9.17-1.el8ae.noarch
ceph-ansible-4.0.41-1.el8cp.noarch
ceph_docker_image_tag: 4-41
podman-2.0.5-5.module+el8.3.0+8221+97165c3f.x86_64

How reproducible:
always

Steps to Reproduce:
1. Deploy a containerized Ceph cluster with co-located MON+OSD services.
2. Run "ansible-playbook site-container.yml --limit <node>" for one of the co-located nodes.
3.

Actual results:


Expected results:


Additional info:

Comment 1 RHEL Program Management 2021-02-05 11:44:47 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 30 errata-xmlrpc 2021-06-15 17:13:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2445