Description of problem:
When migrating from a previously containerized etcd to the system container etcd, the installer fails while trying to mask the etcd_container service.

For a containerized etcd installation, the etcd container service file is created as /etc/systemd/system/etcd_container.service:
https://github.com/openshift/openshift-ansible/blob/openshift-ansible-3.6.112-1/roles/etcd/tasks/main.yml#L21

In the etcd system_container.yml, the installer then tries to mask the etcd_container service:
https://github.com/openshift/openshift-ansible/blob/openshift-ansible-3.6.112-1/roles/etcd/tasks/system_container.yml#L39

[root@ip-172-18-4-95 ~]# systemctl status etcd_container
● etcd_container.service - The Etcd Server container
   Loaded: loaded (/etc/systemd/system/etcd_container.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 23:24:10 EDT; 3h 12min ago
 Main PID: 14689 (docker-current)
   Memory: 5.9M
   CGroup: /system.slice/etcd_container.service
           └─14689 /usr/bin/docker-current run --name etcd_container --rm -v /var/lib/etcd/:/var/lib/etcd/:z -v /etc/etcd:/etc/etcd:ro --env-file=/etc/etcd/etcd.conf --ne...

[root@ip-172-18-4-95 ~]# ls -al /etc/systemd/system/etcd_container.service
-rw-r--r--. 1 root root 576 Jun 15 22:58 /etc/systemd/system/etcd_container.service
[root@ip-172-18-4-95 ~]# ls -al /usr/lib/systemd/system/etcd_container.service
ls: cannot access /usr/lib/systemd/system/etcd_container.service: No such file or directory
[root@ip-172-18-4-95 ~]# systemctl stop etcd_container
[root@ip-172-18-4-95 ~]# systemctl disable etcd_container
Removed symlink /etc/systemd/system/docker.service.wants/etcd_container.service.
[root@ip-172-18-4-95 ~]# systemctl mask etcd_container
Failed to execute operation: Invalid argument

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.112-1.git.0.1ce58b5.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set up a containerized ocp-3.6 cluster so that the etcd docker container and the etcd_container service are running.
2. Add use_etcd_system_container=true to the ansible inventory file and run the installation playbook again.

Actual results:
TASK [etcd : Disable etcd_container] *******************************************
fatal: [ec2-52-206-163-36.compute-1.amazonaws.com]: FAILED! => {
    "changed": false,
    "failed": true,
    "failed_when_result": true
}

MSG:

Unable to mask service etcd_container: Failed to execute operation: Invalid argument

Expected results:

Additional info:
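A likely explanation for the "Invalid argument" error, consistent with the output above: systemctl mask works by creating a symlink to /dev/null at /etc/systemd/system/<unit>, and here a regular unit file already occupies that path. The following is a minimal sketch of a pre-check for that condition. The function name mask_would_conflict and its unit_dir parameter are hypothetical and are not part of openshift-ansible.

```python
import os


def mask_would_conflict(unit_name, unit_dir="/etc/systemd/system"):
    """Return True if a real (non-symlink) file sits where the mask symlink would go.

    Hypothetical helper for illustration only; unit_dir defaults to the
    directory where the etcd_container.service file was found in this report.
    """
    path = os.path.join(unit_dir, unit_name)
    # lexists() is also True for dangling symlinks; islink() filters out the
    # case where the unit is already masked (a symlink to /dev/null).
    return os.path.lexists(path) and not os.path.islink(path)


if __name__ == "__main__":
    print(mask_would_conflict("etcd_container.service"))
```

On the host in this report the check would be expected to report a conflict, because etcd_container.service exists only as a regular file under /etc/systemd/system.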
I've created a PR here: https://github.com/openshift/openshift-ansible/pull/4503
Installing the etcd system container failed with the same error as in https://bugzilla.redhat.com/show_bug.cgi?id=1461662#c6

TASK [etcd : Install or Update Etcd system container package] ******************
fatal: [qe-gpei-etcd-sc-etcd-1.0626-35y.qe.rhcloud.com]: FAILED! => {
    "changed": false,
    "failed": true,
    "module_stderr": "Shared connection to qe-gpei-etcd-sc-etcd-1.0626-35y.qe.rhcloud.com closed.\r\n",
    "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_NykF4B/ansible_module_oc_atomic_container.py\", line 214, in <module>\r\n main()\r\n File \"/tmp/ansible_NykF4B/ansible_module_oc_atomic_container.py\", line 202, in main\r\n if atomic_version < StrictVersion('1.17.2'):\r\n File \"/usr/lib64/python2.7/distutils/version.py\", line 140, in __cmp__\r\n compare = cmp(self.version, other.version)\r\nAttributeError: StrictVersion instance has no attribute 'version'\r\n"
}

MSG:

MODULE FAILURE
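The traceback does not show how atomic_version was built. One plausible reading, stated here as an assumption, is that it was constructed from an empty string: in Python 2.7's distutils, a StrictVersion created from an empty or missing string never gets a version attribute, so the comparison fails with this AttributeError. The sketch below shows a defensive comparison that rejects such input up front. It does not use distutils, and the names parse_version and older_than are hypothetical.

```python
def parse_version(text):
    """Parse a dotted version such as '1.17.2' into a tuple of ints.

    Raises ValueError for empty or malformed input instead of failing
    later during comparison.
    """
    cleaned = (text or "").strip()
    parts = cleaned.split(".")
    if not cleaned or not all(part.isdigit() for part in parts):
        raise ValueError("unparseable version string: %r" % (text,))
    return tuple(int(part) for part in parts)


def older_than(version_text, minimum="1.17.2"):
    """Return True if version_text is strictly older than minimum."""
    return parse_version(version_text) < parse_version(minimum)
```

With this shape, an empty version string surfaces as a clear ValueError at parse time rather than as an attribute error inside the comparison.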
Verified this bug with openshift-ansible-3.6.126.1-1.git.0.41d2313.el7.noarch

The installer now removes the etcd_container service file directly instead of trying to mask the etcd_container service.

TASK [etcd : Check etcd system container package] ******************************
changed: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]

TASK [etcd : Unmask etcd service] **********************************************
ok: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]

TASK [etcd : Disable etcd_container] *******************************************
changed: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]

TASK [etcd : Remove etcd_container.service] ************************************
changed: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]
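The remove-instead-of-mask approach can be sketched roughly as follows. This is an illustration only, not the task from the openshift-ansible fix: the function name remove_unit_file and its unit_dir parameter are hypothetical, and the return value loosely mirrors Ansible's changed/ok reporting.

```python
import os


def remove_unit_file(unit_name, unit_dir="/etc/systemd/system"):
    """Delete the unit file if present; return True when something was removed.

    Hypothetical helper for illustration. It is idempotent: a second call
    finds nothing to remove and returns False.
    """
    path = os.path.join(unit_dir, unit_name)
    if not os.path.lexists(path):
        return False
    os.remove(path)
    return True
```

After removing a unit file by hand, systemd normally needs a `systemctl daemon-reload` to stop referring to the old unit definition.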
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716