Red Hat Bugzilla – Bug 1274201
oo-install gets stuck when running on the host to be deployed
Last modified: 2016-07-03 20:45:55 EDT
Description of problem:
oo-install gets stuck on the "Start and enable iptables service" task.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. sh <(curl -s https://install.openshift.com/ose)
TASK: [os_firewall | Ensure firewalld service is not enabled] *****************
TASK: [os_firewall | Reload systemd units] ************************************
TASK: [os_firewall | Start and enable iptables service] ***********************
There is no further output, even after waiting a long time.
Starting and enabling the iptables service manually on the host both succeed.
Also tried oo-install from https://lachesis-smunilla.rhcloud.com/, which works fine.
Is this an issue that you are hitting reliably, or is it more intermittent?
Are you specifically running all services on a single host? If not, I would expect that any bug that would cause you to hit that error on the master would also affect any nodes being deployed as well.
Does the same error occur when running an advanced install, or is it limited to the installer wrapper?
The problem could also be with the base image you are using for deploying. Do you run into the same issue using a different RHEL7 base image?
Related to this is the following github issue: https://github.com/openshift/openshift-ansible/issues/747
The workaround I suggested applies in this case as well.
I tried the first workaround in https://github.com/openshift/openshift-ansible/issues/747 today; running oo-install from a separate host works!
Changed the title to be more accurate.
I think what we need to do here is add a check to see if the host where the installer is running is one of the hosts that will become part of the environment. If that is the case we need to add ansible_connection=local to the inventory.
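A minimal sketch of that check, in shell for illustration (the hostnames and the inventory path are made up; the real installer generates its inventory in Python):

```shell
#!/bin/sh
# Sketch: when generating the Ansible inventory, mark the host the
# installer is running on with ansible_connection=local so Ansible
# skips SSH for it. Hostnames below are examples only.
INVENTORY=./inventory.example
INSTALLER_HOST="master.example.com"   # assume this is where oo-install runs

: > "$INVENTORY"
for host in master.example.com node1.example.com; do
    if [ "$host" = "$INSTALLER_HOST" ]; then
        # This host is part of the deployment AND runs the installer:
        # connect locally instead of over SSH.
        echo "$host ansible_connection=local" >> "$INVENTORY"
    else
        echo "$host" >> "$INVENTORY"
    fi
done

cat "$INVENTORY"
```

With the `ansible_connection=local` variable set, Ansible executes tasks for that host directly instead of opening an SSH connection to itself, which is the situation that hung here.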
I'm a little confused about ansible_sudo=no. I understand if the installer is being run as root that sudo shouldn't be needed. However, I would expect ansible to do the right thing with a local connection and correctly sudo if needed. Jason, do you have any more detail on this?
It is a bit more complicated than that and it really comes down to if the user is really running ansible remotely or not... I believe we explicitly set sudo to false in all of the places that we execute against localhost for those reasons, but I'm not completely sure.
Either way, our goal is to use sudo for operations on a deployment host whenever the ansible user is not root, while avoiding sudo for the local actions used for setting groups and variables, transferring files, etc., so that root is not required on the host running ansible when it is not itself part of the deployment.
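That policy can be sketched as a small predicate (illustrative only, not the installer's actual code):

```shell
#!/bin/sh
# Sketch of the sudo policy described above: sudo is needed only for
# remote operations run as a non-root user; local actions and root
# connections never need it. Arguments: <ansible user> <connection type>.
needs_sudo() {
    user=$1
    connection=$2
    [ "$user" != "root" ] && [ "$connection" != "local" ]
}

needs_sudo root  ssh   && echo yes || echo no   # root over ssh: no
needs_sudo cloud ssh   && echo yes || echo no   # non-root over ssh: yes
needs_sudo cloud local && echo yes || echo no   # local action: no
```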
Tested this bug with atomic-openshift-utils-3.0.7-1.git.48.75d357c.el7aos.noarch, running the installer on the master host. After entering the hosts, the installer quit as below:
Gathering information from hosts...
sudo: invalid option -- '-'
usage: sudo [-D level] -h | -K | -k | -V
usage: sudo -v [-AknS] [-D level] [-g groupname|#gid] [-p prompt] [-u user
usage: sudo -l[l] [-AknS] [-D level] [-g groupname|#gid] [-p prompt] [-U user
name] [-u user name|#uid] [-g groupname|#gid] [command]
usage: sudo [-AbEHknPS] [-r role] [-t type] [-C fd] [-D level] [-g
groupname|#gid] [-p prompt] [-u user name|#uid] [-g groupname|#gid]
[VAR=value] [-i|-s] [<command>]
usage: sudo -e [-AknS] [-r role] [-t type] [-C fd] [-D level] [-g
groupname|#gid] [-p prompt] [-u user name|#uid] file ...
The atomic-openshift-installer requires sudo access without a password.
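For reference, passwordless sudo is typically granted with a one-line sudoers entry. The username "cloud-user" below is a placeholder; on a real host the line belongs in a file under /etc/sudoers.d/ and should be edited with visudo, not written directly as done in this sketch:

```shell
#!/bin/sh
# Sketch: the kind of sudoers entry that satisfies the installer's
# "sudo access without a password" requirement. Written to a local
# example file here rather than /etc/sudoers.d/ to stay side-effect free.
SUDOERS_LINE='cloud-user ALL=(ALL) NOPASSWD: ALL'
echo "$SUDOERS_LINE" > ./openshift-sudoers-example
cat ./openshift-sudoers-example
```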
A lot of bugs were fixed today. This should be fixed in the latest puddle. I tried several multi-host installs as non-root where the installer system was one of the to-be-deployed hosts.
Verified this bug with atomic-openshift-utils-3.0.7-1.git.76.c73ec7b.el7aos.noarch.
Running 'atomic-openshift-installer install' on the master host, the environment installed successfully.