Bug 1670625
| Summary: | timeout makes podman pull hang or fail ceph deployment on rhel8 beta | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | John Fulton <johfulto> |
| Component: | Ceph-Ansible | Assignee: | John Fulton <johfulto> |
| Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.0 | CC: | anharris, aschoen, ceph-eng-bugs, gfidente, gmeno, johfulto, nthomas, tserlin, yrabl |
| Target Milestone: | rc | | |
| Target Release: | 4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ceph-ansible-4.0.0-0.1.rc6.el8cp | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-01-31 12:45:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1594251 | | |
| Attachments: | | | |
Created attachment 1524802 [details]
Applying this workaround to ceph-ansible lets me work around this issue
Created attachment 1524804 [details]
Tarball of my modified ceph-ansible where the error was reproduced, including the ceph-ansible run log
John, I'm not sure I understand the issue here. I see a timeout set to 300s, i.e. 5 minutes, and then a delta of '0:05:00.004171', which makes me believe the code behaves as it should. Am I missing something?

(In reply to leseb from comment #4)
> John, I'm not sure I understand the issue here.
> I see a timeout set to 300s, i.e. 5 minutes, and then a delta of
> '0:05:00.004171', which makes me believe the code behaves as it should.
>
> Am I missing something?

Sorry for not including that. When retries=5, the first time around the timeout works as it should. However, once we get to retries=4, the timeout no longer works: the 'podman pull' command hangs for longer than 300s. The following copy/paste was taken on Jan 28th and shows that it had been hung for 3 days.

[root@rhel8 ~]# ps axu | grep podman
root 19893 0.0 0.0 219328 992 pts/2 S+ 15:07 0:00 grep --color=auto podman
root 45442 0.0 0.3 2795056 54452 ? Ssl Jan25 0:07 /usr/bin/podman start -a memcached
root 45519 0.0 0.0 85972 1756 ? Ssl Jan25 0:00 /usr/libexec/podman/conmon -s -c 7b3ad8a268388fa25916f30dadbc13ba394e76f1f9bcddc49c17b75c80904aca -u 7b3ad8a268388fa25916f30dadbc13ba394e76f1f9bcddc49c17b75c80904aca -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/7b3ad8a268388fa25916f30dadbc13ba394e76f1f9bcddc49c17b75c80904aca/userdata -p /var/run/containers/storage/overlay-containers/7b3ad8a268388fa25916f30dadbc13ba394e76f1f9bcddc49c17b75c80904aca/userdata/pidfile -l /var/lib/containers/storage/overlay-containers/7b3ad8a268388fa25916f30dadbc13ba394e76f1f9bcddc49c17b75c80904aca/userdata/ctr.log --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 7b3ad8a268388fa25916f30dadbc13ba394e76f1f9bcddc49c17b75c80904aca --socket-dir-path /var/run/libpod/socket --log-level error
root 46026 0.0 0.3 2795056 54480 ? Ssl Jan25 0:06 /usr/bin/podman start -a rabbitmq
root 46102 0.0 0.0 85972 1808 ? Ssl Jan25 0:00 /usr/libexec/podman/conmon -s -c 365dbea80b7df33a533373c248af81a2d99b0b217c73145e641f2adfa56b3d6e -u 365dbea80b7df33a533373c248af81a2d99b0b217c73145e641f2adfa56b3d6e -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/365dbea80b7df33a533373c248af81a2d99b0b217c73145e641f2adfa56b3d6e/userdata -p /var/run/containers/storage/overlay-containers/365dbea80b7df33a533373c248af81a2d99b0b217c73145e641f2adfa56b3d6e/userdata/pidfile -l /var/lib/containers/storage/overlay-containers/365dbea80b7df33a533373c248af81a2d99b0b217c73145e641f2adfa56b3d6e/userdata/ctr.log --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 365dbea80b7df33a533373c248af81a2d99b0b217c73145e641f2adfa56b3d6e --socket-dir-path /var/run/libpod/socket --log-level error
root 47371 0.0 0.0 218752 960 pts/1 S Jan25 0:00 timeout 300s podman pull docker.io/ceph/daemon:latest-master
root 47372 0.0 0.3 2132236 59524 pts/1 Tl Jan25 0:00 podman pull docker.io/ceph/daemon:latest-master
[root@rhel8 ~]# # hung for 3 days
[root@rhel8 ~]# # a little more than 300s

Why is timeout being used for the container image pull?
https://github.com/ceph/ceph-ansible/commit/ab587642885f1f518fe14ee7f1c7fc8cbbbf29f0

The root cause of this might be https://bugzilla.redhat.com/show_bug.cgi?id=1671023, but I'd prefer to have a workaround: e.g. if the user chooses not to use timeout by setting "docker_pull_timeout: 0", then it would be nice if ceph-ansible didn't use timeout. I will be happy to send a PR for this.

Adding "-s 9" to the timeout command [1] made the deployment fail instead of hanging indefinitely, so the timeout worked with that change, which is an improvement [2]. However, I don't see why the deployment should fail just because Ansible ran the command. If I run it directly, with or without timeout, the image downloads with no problem [3]; thus I prefer the option of being able to not use timeout.
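One possible shape for that opt-out, sketched in plain shell rather than Ansible (the `pull_with_optional_timeout` helper and its argument convention are hypothetical, not ceph-ansible code): wrap the command in `timeout` only when the configured duration is non-zero, so a `docker_pull_timeout: 0` setting would skip the wrapper entirely.

```shell
# Hypothetical sketch: wrap a command in `timeout` only when the
# configured duration (e.g. "300s") is non-zero; "0" means no wrapper.
pull_with_optional_timeout() {
  t="$1"; shift
  if [ -n "${t%s}" ] && [ "${t%s}" != "0" ]; then
    timeout -s 9 "$t" "$@"   # enforce the configured timeout
  else
    "$@"                     # docker_pull_timeout: 0 -> run unwrapped
  fi
}

pull_with_optional_timeout 0 echo "pull ran without timeout"
pull_with_optional_timeout 300s echo "pull ran under a 300s timeout"
```

The same branch could be expressed in the Ansible task with a conditional Jinja2 prefix on the `command` line.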
[1]
[stack@rhel8 ceph-ansible]$ git diff
diff --git a/roles/ceph-container-common/tasks/fetch_image.yml b/roles/ceph-container-common/tasks/fetch_image.yml
index 30c69289..ca34d65b 100644
--- a/roles/ceph-container-common/tasks/fetch_image.yml
+++ b/roles/ceph-container-common/tasks/fetch_image.yml
@@ -177,7 +177,7 @@
- ceph_nfs_container_inspect_before_pull.get('rc') == 0
- name: "pulling {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }} image"
- command: "timeout {{ docker_pull_timeout }} {{ container_binary }} pull {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
+ command: "timeout -s 9 {{ docker_pull_timeout }} {{ container_binary }} pull {{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
changed_when: false
register: docker_image
until: docker_image.rc == 0
[stack@rhel8 ceph-ansible]$
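As a sanity check on the exit codes in the logs (this demo uses `sleep` as a stand-in, not the actual pull): GNU coreutils `timeout` reports status 124 when the command times out under the default SIGTERM, but 137 (128+9) when `-s 9` forces a SIGKILL, which Ansible surfaces as `rc: -9` (Python's convention for "terminated by signal 9").

```shell
# Default signal (TERM): GNU timeout reports exit status 124 on expiry.
rc=0; timeout 1 sleep 5 || rc=$?
echo "default signal: rc=$rc"   # 124

# With -s 9 the command is SIGKILLed; the shell observes 137 (128+9),
# and Ansible logs the same condition as rc: -9.
rc9=0; timeout -s 9 1 sleep 5 || rc9=$?
echo "with -s 9: rc=$rc9"       # 137
```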
[2]
2019-01-30 17:18:40,221 p=44210 u=root | TASK [ceph-container-common : set_fact ceph_nfs_image_repodigest_before_pulling] ***
2019-01-30 17:18:40,221 p=44210 u=root | task path: /home/stack/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml:172
2019-01-30 17:18:40,221 p=44210 u=root | Wednesday 30 January 2019 17:18:40 +0000 (0:00:00.042) 0:00:17.822 *****
2019-01-30 17:18:40,235 p=44210 u=root | skipping: [rhel8] => changed=false
skip_reason: Conditional result was False
2019-01-30 17:18:40,265 p=44210 u=root | TASK [ceph-container-common : pulling docker.io/ceph/daemon:latest-master image] ***
2019-01-30 17:18:40,265 p=44210 u=root | task path: /home/stack/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml:179
2019-01-30 17:18:40,265 p=44210 u=root | Wednesday 30 January 2019 17:18:40 +0000 (0:00:00.043) 0:00:17.866 *****
2019-01-30 17:18:40,286 p=44210 u=root | Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
2019-01-30 17:23:40,415 p=44210 u=root | FAILED - RETRYING: pulling docker.io/ceph/daemon:latest-master image (3 retries left).Result was: changed=false
attempts: 1
cmd:
- timeout
- -s
- '9'
- 300s
- podman
- pull
- docker.io/ceph/daemon:latest-master
delta: '0:05:00.003505'
end: '2019-01-30 17:23:40.397354'
invocation:
module_args:
_raw_params: timeout -s 9 300s podman pull docker.io/ceph/daemon:latest-master
_uses_shell: false
argv: null
chdir: null
creates: null
executable: null
removes: null
stdin: null
warn: true
msg: non-zero return code
rc: -9
retries: 4
start: '2019-01-30 17:18:40.393849'
stderr: Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stderr_lines:
- Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stdout: ''
stdout_lines: <omitted>
2019-01-30 17:23:50,422 p=44210 u=root | Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
2019-01-30 17:28:50,554 p=44210 u=root | FAILED - RETRYING: pulling docker.io/ceph/daemon:latest-master image (2 retries left).Result was: changed=false
attempts: 2
cmd:
- timeout
- -s
- '9'
- 300s
- podman
- pull
- docker.io/ceph/daemon:latest-master
delta: '0:05:00.004152'
end: '2019-01-30 17:28:50.536582'
invocation:
module_args:
_raw_params: timeout -s 9 300s podman pull docker.io/ceph/daemon:latest-master
_uses_shell: false
argv: null
chdir: null
creates: null
executable: null
removes: null
stdin: null
warn: true
msg: non-zero return code
rc: -9
retries: 4
start: '2019-01-30 17:23:50.532430'
stderr: Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stderr_lines:
- Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stdout: ''
stdout_lines: <omitted>
2019-01-30 17:29:00,562 p=44210 u=root | Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
2019-01-30 17:34:00,703 p=44210 u=root | FAILED - RETRYING: pulling docker.io/ceph/daemon:latest-master image (1 retries left).Result was: changed=false
attempts: 3
cmd:
- timeout
- -s
- '9'
- 300s
- podman
- pull
- docker.io/ceph/daemon:latest-master
delta: '0:05:00.003979'
end: '2019-01-30 17:34:00.679542'
invocation:
module_args:
_raw_params: timeout -s 9 300s podman pull docker.io/ceph/daemon:latest-master
_uses_shell: false
argv: null
chdir: null
creates: null
executable: null
removes: null
stdin: null
warn: true
msg: non-zero return code
rc: -9
retries: 4
start: '2019-01-30 17:29:00.675563'
stderr: Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stderr_lines:
- Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stdout: ''
stdout_lines: <omitted>
2019-01-30 17:34:10,709 p=44210 u=root | Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
2019-01-30 17:39:10,843 p=44210 u=root | fatal: [rhel8]: FAILED! => changed=false
attempts: 3
cmd:
- timeout
- -s
- '9'
- 300s
- podman
- pull
- docker.io/ceph/daemon:latest-master
delta: '0:05:00.004553'
end: '2019-01-30 17:39:10.822775'
invocation:
module_args:
_raw_params: timeout -s 9 300s podman pull docker.io/ceph/daemon:latest-master
_uses_shell: false
argv: null
chdir: null
creates: null
executable: null
removes: null
stdin: null
warn: true
msg: non-zero return code
rc: -9
start: '2019-01-30 17:34:10.818222'
stderr: Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stderr_lines:
- Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stdout: ''
stdout_lines: <omitted>
2019-01-30 17:39:10,844 p=44210 u=root | NO MORE HOSTS LEFT *************************************************************
2019-01-30 17:39:10,844 p=44210 u=root | PLAY RECAP *********************************************************************
2019-01-30 17:39:10,844 p=44210 u=root | rhel8 : ok=45 changed=3 unreachable=0 failed=1
2019-01-30 17:39:10,845 p=44210 u=root | INSTALLER STATUS ***************************************************************
2019-01-30 17:39:10,849 p=44210 u=root | Install Ceph Monitor : In Progress (0:20:36)
2019-01-30 17:39:10,849 p=44210 u=root | This phase can be restarted by running: roles/ceph-mon/tasks/main.yml
2019-01-30 17:39:10,849 p=44210 u=root | Wednesday 30 January 2019 17:39:10 +0000 (0:20:30.583) 0:20:48.450 *****
2019-01-30 17:39:10,849 p=44210 u=root | ===============================================================================
2019-01-30 17:39:10,850 p=44210 u=root | ceph-container-common : pulling docker.io/ceph/daemon:latest-master image 1230.58s
/home/stack/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml:179
2019-01-30 17:39:10,850 p=44210 u=root | install python for fedora ----------------------------------------------- 4.50s
/usr/share/ceph-ansible/raw_install_python.yml:14 -----------------------------
2019-01-30 17:39:10,850 p=44210 u=root | gather and delegate facts ----------------------------------------------- 2.02s
/usr/share/ceph-ansible/site-docker.yml.sample:34 -----------------------------
2019-01-30 17:39:10,850 p=44210 u=root | ceph-validate : validate provided configuration ------------------------- 1.37s
/home/stack/ceph-ansible/roles/ceph-validate/tasks/main.yml:2 -----------------
2019-01-30 17:39:10,850 p=44210 u=root | ceph-handler : check for a mon container -------------------------------- 0.40s
/home/stack/ceph-ansible/roles/ceph-handler/tasks/check_running_containers.yml:2
2019-01-30 17:39:10,850 p=44210 u=root | ceph-facts : create a local fetch directory if it does not exist -------- 0.32s
/home/stack/ceph-ansible/roles/ceph-facts/tasks/facts.yml:74 ------------------
2019-01-30 17:39:10,851 p=44210 u=root | ceph-container-common : remove ceph udev rules -------------------------- 0.30s
/home/stack/ceph-ansible/roles/ceph-container-common/tasks/pre_requisites/remove_ceph_udev_rules.yml:2
2019-01-30 17:39:10,851 p=44210 u=root | check for python -------------------------------------------------------- 0.26s
/usr/share/ceph-ansible/raw_install_python.yml:2 ------------------------------
2019-01-30 17:39:10,851 p=44210 u=root | ceph-handler : check for an osd container ------------------------------- 0.26s
/home/stack/ceph-ansible/roles/ceph-handler/tasks/check_running_containers.yml:11
2019-01-30 17:39:10,851 p=44210 u=root | ceph-handler : check for a mgr container -------------------------------- 0.26s
/home/stack/ceph-ansible/roles/ceph-handler/tasks/check_running_containers.yml:38
2019-01-30 17:39:10,851 p=44210 u=root | ceph-handler : check for a mon container -------------------------------- 0.25s
/home/stack/ceph-ansible/roles/ceph-handler/tasks/check_running_containers.yml:2
2019-01-30 17:39:10,851 p=44210 u=root | ceph-handler : check for a mgr container -------------------------------- 0.25s
/home/stack/ceph-ansible/roles/ceph-handler/tasks/check_running_containers.yml:38
2019-01-30 17:39:10,851 p=44210 u=root | ceph-handler : check for an osd container ------------------------------- 0.25s
/home/stack/ceph-ansible/roles/ceph-handler/tasks/check_running_containers.yml:11
2019-01-30 17:39:10,851 p=44210 u=root | ceph-facts : is ceph running already? ----------------------------------- 0.21s
/home/stack/ceph-ansible/roles/ceph-facts/tasks/facts.yml:53 ------------------
2019-01-30 17:39:10,852 p=44210 u=root | ceph-facts : check if podman binary is present -------------------------- 0.20s
/home/stack/ceph-ansible/roles/ceph-facts/tasks/facts.yml:11 ------------------
2019-01-30 17:39:10,852 p=44210 u=root | check if podman binary is present --------------------------------------- 0.20s
/usr/share/ceph-ansible/site-docker.yml.sample:56 -----------------------------
2019-01-30 17:39:10,852 p=44210 u=root | check if it is atomic host ---------------------------------------------- 0.17s
/usr/share/ceph-ansible/site-docker.yml.sample:43 -----------------------------
2019-01-30 17:39:10,852 p=44210 u=root | ceph-facts : set_fact _monitor_address to monitor_address_block --------- 0.17s
/home/stack/ceph-ansible/roles/ceph-facts/tasks/set_monitor_address.yml:2 -----
2019-01-30 17:39:10,852 p=44210 u=root | ceph-facts : check if it is atomic host --------------------------------- 0.16s
/home/stack/ceph-ansible/roles/ceph-facts/tasks/facts.yml:2 -------------------
2019-01-30 17:39:10,852 p=44210 u=root | ceph-validate : include check_system.yml -------------------------------- 0.11s
/home/stack/ceph-ansible/roles/ceph-validate/tasks/main.yml:47 ----------------
[3]
[stack@rhel8 ceph-ansible]$ time podman pull docker.io/ceph/daemon:latest-master
Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
Copying blob a02a4930cb5d: 71.68 MiB / 71.68 MiB [=========================] 19s
Copying blob a9df2003811e: 149.99 MiB / 149.99 MiB [=======================] 19s
Copying blob 9f8cc4bfe3e6: 36.83 MiB / 36.83 MiB [=========================] 19s
Copying blob d9b683c59cc1: 763 B / 763 B [=================================] 19s
Copying blob 945ffa6dc462: 423 B / 423 B [=================================] 19s
Copying blob 5079a1e5ea24: 292 B / 292 B [=================================] 19s
Copying blob 40e4854b4ccc: 30.96 KiB / 30.96 KiB [=========================] 19s
Copying blob 9ffb0b7f6cdf: 412 B / 412 B [=================================] 19s
Copying blob 91fb7adff82f: 483.29 KiB / 483.29 KiB [=======================] 19s
Copying blob 7668d903fd4d: 1.45 KiB / 1.45 KiB [===========================] 19s
Copying config f23e24277d1a: 15.38 KiB / 15.38 KiB [========================] 0s
Writing manifest to image destination
Storing signatures
f23e24277d1a3d35999b36de8878638bab088293f227b42e7e9b33064387f350
real 0m27.068s
user 0m21.099s
sys 0m3.221s
[stack@rhel8 ceph-ansible]$
[stack@rhel8 ceph-ansible]$ timeout -s 9 60 time podman pull docker.io/ceph/daemon:latest-master
timeout: failed to run command ‘time’: No such file or directory
[stack@rhel8 ceph-ansible]$ time timeout -s 9 60 podman pull docker.io/ceph/daemon:latest-master
Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
Copying blob a02a4930cb5d: 71.68 MiB / 71.68 MiB [=========================] 25s
Copying blob a9df2003811e: 149.99 MiB / 149.99 MiB [=======================] 25s
Copying blob 9f8cc4bfe3e6: 36.83 MiB / 36.83 MiB [=========================] 25s
Copying blob d9b683c59cc1: 763 B / 763 B [=================================] 25s
Copying blob 945ffa6dc462: 423 B / 423 B [=================================] 25s
Copying blob 5079a1e5ea24: 292 B / 292 B [=================================] 25s
Copying blob 40e4854b4ccc: 30.96 KiB / 30.96 KiB [=========================] 25s
Copying blob 9ffb0b7f6cdf: 412 B / 412 B [=================================] 25s
Copying blob 91fb7adff82f: 483.29 KiB / 483.29 KiB [=======================] 25s
Copying blob 7668d903fd4d: 1.45 KiB / 1.45 KiB [===========================] 25s
Copying config f23e24277d1a: 15.38 KiB / 15.38 KiB [========================] 0s
Writing manifest to image destination
Storing signatures
f23e24277d1a3d35999b36de8878638bab088293f227b42e7e9b33064387f350
real 0m33.605s
user 0m21.198s
sys 0m3.193s
[stack@rhel8 ceph-ansible]$
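The `timeout: failed to run command 'time'` error above deserves a note: in bash, `time` is a shell keyword, not necessarily an executable on PATH, so `timeout` cannot exec it as a command. Putting `time` on the outside works. A minimal sketch with `sleep` standing in for the pull:

```shell
# Works: the shell's `time` keyword measures the whole timeout+command run.
time timeout -s 9 60 sleep 1

# Typically fails unless an external binary such as /usr/bin/time is
# installed, because `timeout` exec()s its argument and the `time`
# keyword is not a file on PATH.
timeout -s 9 60 time sleep 1 || echo "no external 'time' binary available"
```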
Assigning to John since he did the fix.

Part of https://github.com/ceph/ceph-ansible/releases/tag/v4.0.0beta1

Verified on ceph-ansible-4.0.0-0.1.rc6.el8cp.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0312
timeout makes 'podman pull' hang for more than the timeout value on rhel8 beta [1]. Though I first discovered this when using TripleO to trigger ceph-ansible on RHEL8 beta, I found I could reproduce it without using TripleO [2].

[1]
2019-01-25 14:45:33,583 p=93318 u=root | TASK [ceph-container-common : pulling docker.io/ceph/daemon:latest-master image] ***
2019-01-25 14:45:33,583 p=93318 u=root | task path: /home/stack/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml:179
2019-01-25 14:45:33,583 p=93318 u=root | Friday 25 January 2019 14:45:33 +0000 (0:00:00.129) 0:00:14.050 ********
2019-01-25 14:45:33,604 p=93318 u=root | Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
2019-01-25 14:50:33,732 p=93318 u=root | FAILED - RETRYING: pulling docker.io/ceph/daemon:latest-master image (3 retries left).Result was: changed=false
attempts: 1
cmd:
- timeout
- 300s
- podman
- pull
- docker.io/ceph/daemon:latest-master
delta: '0:05:00.004171'
end: '2019-01-25 14:50:33.714004'
invocation:
module_args:
_raw_params: timeout 300s podman pull docker.io/ceph/daemon:latest-master
_uses_shell: false
argv: null
chdir: null
creates: null
executable: null
removes: null
stdin: null
warn: true
msg: non-zero return code
rc: 124
retries: 4
start: '2019-01-25 14:45:33.709833'
stderr: Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stderr_lines:
- Trying to pull docker.io/ceph/daemon:latest-master...Getting image source signatures
stdout: ''
stdout_lines: <omitted>
2019-01-25 14:50:43,739 p=93318 u=root | Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py

[2]
[root@rhel8 ~]# cd /root/undercloud-ansible-su_6px97/ceph-ansible
[root@rhel8 ceph-ansible]# ANSIBLE_ACTION_PLUGINS=/usr/share/ceph-ansible/plugins/actions/ ANSIBLE_CALLBACK_PLUGINS=/usr/share/ceph-ansible/plugins/callback/ ANSIBLE_ROLES_PATH=/usr/share/ceph-ansible/roles/ ANSIBLE_LOG_PATH="/root/undercloud-ansible-su_6px97/ceph-ansible/ceph_ansible_command.log" ANSIBLE_LIBRARY=/usr/share/ceph-ansible/library/ ANSIBLE_CONFIG=/usr/share/ceph-ansible/ansible.cfg ANSIBLE_REMOTE_TEMP=/tmp/ceph_ansible_tmp ANSIBLE_FORKS=25 ansible-playbook -e ansible_python_interpreter=/usr/bin/python3 -vvvv --skip-tags package-install,with_pkg -i /root/undercloud-ansible-su_6px97/ceph-ansible/inventory.yml --extra-vars @/root/undercloud-ansible-su_6px97/ceph-ansible/extra_vars.yml /usr/share/ceph-ansible/site-docker.yml.sample
...
TASK [ceph-container-common : pulling docker.io/ceph/daemon:latest-master image] **********************************************************************************************************************************
task path: /root/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml:179
Tuesday 29 January 2019 22:38:36 +0000 (0:00:00.045) 0:00:15.088 *******
Using module file /usr/lib/python3.6/site-packages/ansible/modules/commands/command.py
<192.168.24.2> ESTABLISH LOCAL CONNECTION FOR USER: root
<192.168.24.2> EXEC /bin/sh -c '/usr/bin/python3 && sleep 0'
...
[root@rhel8 ~]# ps axu | head -1
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
[root@rhel8 ~]# ps axu | grep 'podman pull'
root 116280 0.0 0.0 221284 952 pts/2 S 22:38 0:00 timeout 300s podman pull docker.io/ceph/daemon:latest-master
root 116281 0.0 0.3 2205484 59096 pts/2 Tl 22:38 0:00 podman pull docker.io/ceph/daemon:latest-master
root 117093 0.0 0.0 221860 968 pts/1 S+ 22:42 0:00 grep --color=auto podman pull
[root@rhel8 ~]#

Note the relevant process state codes:
S    interruptible sleep (waiting for an event to complete)
T    stopped, either by a job control signal or because it is being traced
l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)

[root@rhel8 ceph-ansible]# podman --version
podman version 1.0.0
[root@rhel8 ceph-ansible]# uname -a
Linux rhel8.example.com 4.18.0-60.el8.x86_64 #1 SMP Fri Jan 11 19:08:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@rhel8 ceph-ansible]# git log --pretty=short | head -5
commit c1d4ab69b54071f765d3279d354d756748e806b2
Author: Guillaume Abrioux <gabrioux>

    podman: support podman installation on rhel8
[root@rhel8 ceph-ansible]#

FWIW I didn't have this issue with podman version 0.0.12.