Description of problem:
Randomly hit "Timeout (12s) waiting for privilege escalation prompt" when installing with a non-root user.

Version-Release number of selected component (if applicable):
ansible-2.2.1.0-2.el7.noarch
openshift-ansible-3.5.15-1.git.0.8d2a456.el7.noarch

Ansible Host:
# uname -r
3.10.0-514.2.2.el7.x86_64
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 (Maipo)

The hosts to be installed:
# atomic host status
State: idle
Deployments:
● rhel-atomic-host:rhel-atomic-host/7/x86_64/standard
    Version: 7.3.3 (2017-02-23 22:16:59)
    Commit: fbeed59bb47b14e32a6b28e13aaa1cad96e88188930a5bf880f949728b7f36ea
    OSName: rhel-atomic-host

How reproducible:
sometimes

Steps to Reproduce:
1. Trigger an installation.
# cat inventory_hosts
<--snip-->
[OSEv3:vars]
ansible_ssh_user=cloud-user
ansible_become=yes
<--snip-->

Actual results:
The installer may fail at different tasks:

1)##########################
TASK [os_firewall : Remove firewalld allow rules] ******************************

TASK [os_firewall : Ensure firewalld service is not enabled] *******************
ok: [openshift-136.lab.sjc.redhat.com] => {
    "changed": false,
    "failed": false,
    "failed_when_result": false
}

MSG:
Could not find the requested service firewalld: cannot mask

ok: [openshift-147.lab.sjc.redhat.com] => {
    "changed": false,
    "failed": false,
    "failed_when_result": false
}

MSG:
Could not find the requested service firewalld: cannot mask

fatal: [openshift-137.lab.sjc.redhat.com]: FAILED! => {
    "failed": true
}

MSG:
Timeout (12s) waiting for privilege escalation prompt:

NO MORE HOSTS LEFT *************************************************************
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/config.retry

PLAY RECAP *********************************************************************

2)##########################
TASK [etcd : Install etcd container service file] ******************************
task path: /usr/share/ansible/openshift-ansible/roles/etcd/tasks/main.yml:49
<--snip-->
<openshift-147.lab.sjc.redhat.com> ESTABLISH SSH CONNECTION FOR USER: cloud-user
<openshift-147.lab.sjc.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/home/slave4/workspace/Launch-Environment-Flexy/private/config/keys/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=cloud-user -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r openshift-147.lab.sjc.redhat.com '/bin/sh -c '"'"'chmod u+x /home/cloud-user/.ansible/tmp/ansible-tmp-1488164379.53-269179720745828/ /home/cloud-user/.ansible/tmp/ansible-tmp-1488164379.53-269179720745828/stat.py && sleep 0'"'"''
<openshift-147.lab.sjc.redhat.com> ESTABLISH SSH CONNECTION FOR USER: cloud-user
<openshift-147.lab.sjc.redhat.com> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o 'IdentityFile="/home/slave4/workspace/Launch-Environment-Flexy/private/config/keys/libra.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=cloud-user -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r -tt openshift-147.lab.sjc.redhat.com '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-fpiujtgkgmslktrpyfgryucyxyhvubnw; /usr/bin/python /home/cloud-user/.ansible/tmp/ansible-tmp-1488164379.53-269179720745828/stat.py'"'"'"'"'"'"'"'"' && sleep 0'"'"''
fatal: [openshift-147.lab.sjc.redhat.com]: FAILED! => {
    "changed": false,
    "failed": true,
    "invocation": {
        "module_args": {
            "dest": "/etc/systemd/system/etcd_container.service",
            "src": "etcd.docker.service"
        },
        "module_name": "template"
    }
}

MSG:
Timeout (12s) waiting for privilege escalation prompt:

NO MORE HOSTS LEFT *************************************************************
        to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/config.retry

3)#########################
TASK [openshift_excluder : Determine if docker packages are installed] *********
ok: [openshift-147.lab.sjc.redhat.com]
ok: [openshift-136.lab.sjc.redhat.com]
ok: [openshift-106.lab.sjc.redhat.com]
ok: [openshift-127.lab.sjc.redhat.com]
ok: [openshift-138.lab.sjc.redhat.com]
fatal: [openshift-137.lab.sjc.redhat.com]: FAILED! => {
    "failed": true
}

MSG:
Timeout (12s) waiting for privilege escalation prompt:

Expected results:
No errors.

Additional info:
Similar issue: https://github.com/ansible/ansible/issues/14426
Can you try the suggested workaround from that issue? Put this in /etc/ansible/ansible.cfg:

[defaults]
timeout = 30
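For anyone applying the workaround by script rather than by hand, here is a minimal sketch in Python. It assumes the config is a plain INI file with a [defaults] section, as in the snippet above. The helper names effective_timeout and set_timeout are hypothetical and are not part of Ansible. The fallback of 10 seconds is Ansible's documented default for timeout. The "12s" in the error message appears consistent with that default plus a small margin added by the SSH connection plugin, but treat that as an assumption and not a confirmed diagnosis.

```python
import configparser
import io

ANSIBLE_DEFAULT_TIMEOUT = 10  # Ansible's documented default, in seconds


def _parse(cfg_text):
    # interpolation=None so literal '%' characters (common in
    # ssh control_path settings) are not treated as placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser


def effective_timeout(cfg_text):
    """Return the [defaults] timeout from ansible.cfg text, or the default."""
    parser = _parse(cfg_text)
    return parser.getint("defaults", "timeout",
                         fallback=ANSIBLE_DEFAULT_TIMEOUT)


def set_timeout(cfg_text, seconds):
    """Return new ansible.cfg text with [defaults] timeout set to `seconds`.

    Note: configparser drops comments when it rewrites the file.
    """
    parser = _parse(cfg_text)
    if not parser.has_section("defaults"):
        parser.add_section("defaults")
    parser.set("defaults", "timeout", str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

To use it, read /etc/ansible/ansible.cfg, pass the text to set_timeout(text, 30), and write the result back. The same setting can also be supplied for a single run through the ANSIBLE_TIMEOUT environment variable or the --timeout option of ansible-playbook, which avoids editing the file.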
My OCP env was recreated, and I can no longer reproduce this issue with the default Ansible config. I'll try to recreate the env from comment 0 in the next few days, and will close this bug if it still can't be reproduced. Lowering the severity and priority.
Cool, thanks for the feedback.
I can only reproduce the issue in the same env as comment 0. After applying the suggestion in comment 1, the issue was gone. It seems to be a network issue. Closing this for now, as I don't have enough data.