Bug 1462087 - Unable to mask service etcd_container
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified Unspecified
Priority: medium
Severity: medium
Assigned To: Giuseppe Scrivano
QA Contact: Gaoyun Pei
Depends On: 1461662
Reported: 2017-06-16 03:28 EDT by Gaoyun Pei
Modified: 2017-08-16 15 EDT

Last Closed: 2017-08-10 01:28:09 EDT
Type: Bug
Description Gaoyun Pei 2017-06-16 03:28:39 EDT
Description of problem:
When migrating from a containerized etcd to the system container etcd, the installer fails while trying to mask the etcd_container service.

For a containerized etcd installation, the etcd container service file is created as /etc/systemd/system/etcd_container.service:
https://github.com/openshift/openshift-ansible/blob/openshift-ansible-3.6.112-1/roles/etcd/tasks/main.yml#L21
The tasks in the etcd role's system_container.yml then try to mask the etcd_container service:
https://github.com/openshift/openshift-ansible/blob/openshift-ansible-3.6.112-1/roles/etcd/tasks/system_container.yml#L39

[root@ip-172-18-4-95 ~]# systemctl status etcd_container
● etcd_container.service - The Etcd Server container
   Loaded: loaded (/etc/systemd/system/etcd_container.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 23:24:10 EDT; 3h 12min ago
 Main PID: 14689 (docker-current)
   Memory: 5.9M
   CGroup: /system.slice/etcd_container.service
           └─14689 /usr/bin/docker-current run --name etcd_container --rm -v /var/lib/etcd/:/var/lib/etcd/:z -v /etc/etcd:/etc/etcd:ro --env-file=/etc/etcd/etcd.conf --ne...

[root@ip-172-18-4-95 ~]# ls -al /etc/systemd/system/etcd_container.service
-rw-r--r--. 1 root root 576 Jun 15 22:58 /etc/systemd/system/etcd_container.service

[root@ip-172-18-4-95 ~]# ls -al /usr/lib/systemd/system/etcd_container.service
ls: cannot access /usr/lib/systemd/system/etcd_container.service: No such file or directory

[root@ip-172-18-4-95 ~]# systemctl stop etcd_container

[root@ip-172-18-4-95 ~]# systemctl disable etcd_container
Removed symlink /etc/systemd/system/docker.service.wants/etcd_container.service.

[root@ip-172-18-4-95 ~]# systemctl mask etcd_container
Failed to execute operation: Invalid argument
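
A likely explanation for the failure above: `systemctl mask` works by creating a symlink to /dev/null at /etc/systemd/system/<unit>, which cannot be done while a regular unit file already occupies that path. The Python sketch below checks for that condition. The function name `mask_would_conflict` and the `system_dir` parameter are hypothetical and exist only for illustration.

```python
import os


def mask_would_conflict(unit, system_dir="/etc/systemd/system"):
    """Return True if a regular file at system_dir/unit would block masking."""
    path = os.path.join(system_dir, unit)
    # An existing symlink (for example a previous mask) is not a conflict;
    # a regular unit file in the same location is.
    return os.path.isfile(path) and not os.path.islink(path)
```

On the host in this report, `mask_would_conflict("etcd_container.service")` would be expected to return True, because the unit file lives directly in /etc/systemd/system.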


Version-Release number of selected component (if applicable):
openshift-ansible-3.6.112-1.git.0.1ce58b5.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set up a containerized OCP 3.6 cluster in which the etcd Docker container and the etcd_container service are running.

2. Add use_etcd_system_container=true to the Ansible inventory file and run the installation playbook again.

Actual results:
TASK [etcd : Disable etcd_container] *******************************************
fatal: [ec2-52-206-163-36.compute-1.amazonaws.com]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "failed_when_result": true
}

MSG:

Unable to mask service etcd_container: Failed to execute operation: Invalid argument


Expected results:


Additional info:
Comment 1 Giuseppe Scrivano 2017-06-20 04:50:23 EDT
I've created a PR here:

https://github.com/openshift/openshift-ansible/pull/4503
Comment 3 Gaoyun Pei 2017-06-26 00:17:04 EDT
Installing the etcd system container failed with the same error as in https://bugzilla.redhat.com/show_bug.cgi?id=1461662#c6

TASK [etcd : Install or Update Etcd system container package] ******************
fatal: [qe-gpei-etcd-sc-etcd-1.0626-35y.qe.rhcloud.com]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "module_stderr": "Shared connection to qe-gpei-etcd-sc-etcd-1.0626-35y.qe.rhcloud.com closed.\r\n", 
    "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_NykF4B/ansible_module_oc_atomic_container.py\", line 214, in <module>\r\n    main()\r\n  File \"/tmp/ansible_NykF4B/ansible_module_oc_atomic_container.py\", line 202, in main\r\n    if atomic_version < StrictVersion('1.17.2'):\r\n  File \"/usr/lib64/python2.7/distutils/version.py\", line 140, in __cmp__\r\n    compare = cmp(self.version, other.version)\r\nAttributeError: StrictVersion instance has no attribute 'version'\r\n"
}

MSG:

MODULE FAILURE
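
The traceback shows `StrictVersion.__cmp__` failing because the left-hand object has no `version` attribute. In Python 2.7's distutils, that attribute is set only when the constructor receives a non-empty version string, so one plausible cause (an assumption, not confirmed in this report) is that the module obtained an empty version string for atomic. A minimal defensive sketch follows. The helper names `parse_version` and `older_than` are hypothetical, the code does not depend on distutils, and `minimum` is assumed to be a well-formed version.

```python
def parse_version(text):
    """Parse 'X.Y.Z' into a tuple of ints; return None for empty or malformed input."""
    parts = (text or "").strip().split(".")
    if not all(p.isdigit() for p in parts):
        return None
    return tuple(int(p) for p in parts)


def older_than(text, minimum):
    """True/False if text parses as a version; None if it cannot be parsed."""
    current = parse_version(text)
    if current is None:
        # Let the caller decide what to do instead of raising mid-comparison.
        return None
    return current < parse_version(minimum)
```

With this shape, an empty version string yields None rather than an AttributeError, so the caller can report a clear error message.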
Comment 4 Gaoyun Pei 2017-06-29 03:58:22 EDT
Verified this bug with openshift-ansible-3.6.126.1-1.git.0.41d2313.el7.noarch

The installer now removes the etcd_container service file directly instead of trying to mask the etcd_container service.

TASK [etcd : Check etcd system container package] ******************************
changed: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]

TASK [etcd : Unmask etcd service] **********************************************
ok: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]

TASK [etcd : Disable etcd_container] *******************************************
changed: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]

TASK [etcd : Remove etcd_container.service] ************************************
changed: [qe-gpei-36-con-rhel-2-etcd-1.0629-xhf.qe.rhcloud.com]
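
The verified behavior above (disable the service, then delete its unit file rather than mask it) can be sketched in Python as follows. This only illustrates the sequence and is not the playbook's implementation: the function `retire_unit`, its `system_dir` and `run` parameters, and the explicit stop step are assumptions.

```python
import os
import subprocess


def retire_unit(unit, system_dir="/etc/systemd/system", run=subprocess.call):
    """Stop and disable a unit, then delete its unit file instead of masking it."""
    unit_name = unit + ".service"
    # Return codes are ignored: the unit may already be stopped or disabled.
    run(["systemctl", "stop", unit_name])
    run(["systemctl", "disable", unit_name])
    path = os.path.join(system_dir, unit_name)
    if os.path.isfile(path):
        os.remove(path)
        return True   # the unit file was removed
    return False      # there was nothing to remove
```

Passing a custom `run` callable makes it possible to exercise the function without touching systemd, for example in a test against a temporary directory.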
Comment 6 errata-xmlrpc 2017-08-10 01:28:09 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
