Bug 1461662

Summary: Installer should check atomic version before installing etcd system container
Product: OpenShift Container Platform Reporter: Gaoyun Pei <gpei>
Component: Installer Assignee: Steve Milner <smilner>
Status: CLOSED ERRATA QA Contact: Gan Huang <ghuang>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.6.0 CC: aos-bugs, ghuang, gpei, jialiu, jokerman, mmccomas, smilner
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-10 05:28:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1462087    

Description Gaoyun Pei 2017-06-15 06:15:15 UTC
Description of problem:
For atomic versions prior to atomic-1.17.2-8, such as atomic-1.15.4-2 and atomic-1.16.5-1, the 'atomic install' command has no "--system-package" argument. However, "--system-package=no" is hardcoded in the oc_atomic_container module, so installing the etcd system container package fails.


Version-Release number of selected component (if applicable):
openshift-ansible-3.6.109-1.git.0.256e658.el7.noarch.rpm

[cloud-user@gpei-test ~]$ atomic host status
State: idle
Deployments:
● rhel-atomic-host:rhel-atomic-host/7/x86_64/standard
       Version: 7.3.3 (2017-02-27 16:31:38)
        Commit: bfc591ba1a4395c6b8e54d34964b05df4a61e0d82d20cc1a2fd817855c7e2da5
        OSName: rhel-atomic-host
atomic-1.15.4-2.el7.x86_64
docker-1.12.6-11.el7.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Enable etcd to run as a system container
use_etcd_system_container=true

2. Run the installation playbook
ansible-playbook -i host /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

Actual results:
TASK [etcd : Install or Update Etcd system container package] ******************
Thursday 15 June 2017  02:12:58 +0000 (0:00:00.047)       0:04:11.344 ********* 
fatal: [qe-gpei-etcd-sc-2-etcd-1.0615-2cb.qe.rhcloud.com]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "rc": 2
}

MSG:

atomic: unrecognized arguments: --system-package=no
Try 'atomic --help' for more information.



Expected results:
The installer could check the atomic version when system container installation is enabled, so that it can give a friendly message saying that the current atomic version is not supported.
Alternatively, make the "--system-package" argument configurable for different atomic versions.
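
A minimal sketch of the second option, for illustration only. It assumes the version is available as a dotted string such as "1.15.4" and takes the 1.17.2 threshold from the description above. The names parse_version, install_args and MIN_SYSTEM_PACKAGE_VERSION are hypothetical, and the rest of the 'atomic install' command line ("--system", "--name") is an assumption about how the module calls atomic, not a copy of its code.

```python
import re

# Assumption: threshold taken from the bug description (atomic-1.17.2).
MIN_SYSTEM_PACKAGE_VERSION = (1, 17, 2)


def parse_version(text):
    """Turn a dotted version such as '1.15.4' into a tuple of ints."""
    match = re.match(r"^\s*(\d+(?:\.\d+)*)\s*$", text)
    if match is None:
        raise ValueError("unrecognized atomic version: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def install_args(image, name, version_text):
    """Build a hypothetical 'atomic install' argument list.

    '--system-package=no' is only added when the given atomic version
    is at least MIN_SYSTEM_PACKAGE_VERSION.
    """
    args = ["atomic", "install", "--system", "--name=%s" % name]
    if parse_version(version_text) >= MIN_SYSTEM_PACKAGE_VERSION:
        args.append("--system-package=no")
    args.append(image)
    return args
```

With this shape, an unparseable version raises ValueError, which the caller could turn into the friendly "unsupported atomic version" message suggested in the first option.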

Additional info:

Comment 1 Scott Dodson 2017-06-19 14:29:07 UTC
Make sure that minimum versions are noted in the 3.6 release notes https://github.com/openshift/openshift-docs/issues/4021

Comment 2 Steve Milner 2017-06-19 15:46:17 UTC
Added info to release notes.

https://github.com/openshift/openshift-docs/issues/4021#issuecomment-309480997

Comment 4 Steve Milner 2017-06-19 19:12:08 UTC
PR: https://github.com/openshift/openshift-ansible/pull/4497

Comment 6 Gan Huang 2017-06-26 02:24:32 UTC
Code merged into openshift-ansible-3.6.124-1.git.0.507a059.el7.noarch.rpm

Failed at:

TASK [docker : Install Container Engine System Container] **********************
Monday 26 June 2017  00:42:47 +0000 (0:00:02.192)       0:04:11.953 *********** 

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AttributeError: StrictVersion instance has no attribute 'version'
fatal: [openshift-152.lab.sjc.redhat.com]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_IwdLCV/ansible_module_oc_atomic_container.py\", line 214, in <module>\n    main()\n  File \"/tmp/ansible_IwdLCV/ansible_module_oc_atomic_container.py\", line 202, in main\n    if atomic_version < StrictVersion('1.17.2'):\n  File \"/usr/lib64/python2.7/distutils/version.py\", line 140, in __cmp__\n    compare = cmp(self.version, other.version)\nAttributeError: StrictVersion instance has no attribute 'version'\n", 
    "module_stdout": ""
}

MSG:

MODULE FAILURE


[root@qe-ghuang-master-etcd-1 ~]# atomic -v
1.17.1

Comment 7 Gan Huang 2017-06-26 02:57:51 UTC
[root@qe-ghuang-master-etcd-1 ~]# atomic host status
State: idle
Deployments:
● rhel-atomic-host:rhel-atomic-host/7/x86_64/standard
             Version: 7.3.6 (2017-06-13 20:38:25)
              Commit: a71d6dd215e857eca6576500905a3f9533c9e8cbf142679edaa4996c688c7c74

[root@qe-ghuang-master-etcd-1 ~]# atomic -v
1.17.1

[root@qe-ghuang-master-etcd-1 ~]# rpm -q atomic
atomic-1.17.2-8.git2760e30.el7.x86_64


Hi, Steve, so openshift-ansible won't support system containers on Atomic Host 7.3.6?

Comment 8 Gan Huang 2017-06-26 03:02:14 UTC
Based on Comment 6, moving back to "assigned".

Comment 9 Gan Huang 2017-06-26 07:19:07 UTC
This is blocking all testing of system containers.

Comment 10 Steve Milner 2017-06-26 13:19:31 UTC
(In reply to Gan Huang from comment #7)
> [root@qe-ghuang-master-etcd-1 ~]# atomic host status
> State: idle
> Deployments:
> ● rhel-atomic-host:rhel-atomic-host/7/x86_64/standard
>              Version: 7.3.6 (2017-06-13 20:38:25)
>               Commit:
> a71d6dd215e857eca6576500905a3f9533c9e8cbf142679edaa4996c688c7c74
> 
> [root@qe-ghuang-master-etcd-1 ~]# atomic -v
> 1.17.1
> 
> [root@qe-ghuang-master-etcd-1 ~]# rpm -q atomic
> atomic-1.17.2-8.git2760e30.el7.x86_64
> 
> 
> Hi, Steve, so openshift-ansible won't support system containers on Atomic
> Host 7.3.6?

No, it absolutely should. This is a good catch, though: the atomic command is reporting the wrong version. The installed version is actually 1.17.2 (as the rpm query shows), yet 'atomic -v' returns 1.17.1, so the version check now fails. I'll make an update to check the rpm version instead of the atomic command's output until that bug is fixed.
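
As a rough sketch of the workaround described above (not the code from the eventual PR), the packaged version can be read from rpm rather than from 'atomic -v'. The helper name rpm_version and its injectable run parameter are hypothetical. The sketch assumes the package's VERSION field is purely numeric and dotted, as in the 1.17.2 shown by 'rpm -q atomic'.

```python
import subprocess


def rpm_version(package, run=subprocess.check_output):
    """Return the packaged version of `package` as a tuple of ints.

    `run` is injectable so the parsing can be exercised without rpm
    being installed; by default it shells out to the real rpm command.
    """
    out = run(["rpm", "-q", "--queryformat", "%{VERSION}", package])
    if isinstance(out, bytes):
        out = out.decode("utf-8")
    # Assumption: VERSION is dotted and numeric, e.g. "1.17.2".
    return tuple(int(part) for part in out.strip().split("."))
```

Comparing the result against (1, 17, 2) then gives the expected answer on the host from comment 7, where a comparison based on the 1.17.1 reported by 'atomic -v' would not.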

Comment 11 Steve Milner 2017-06-26 13:58:19 UTC
PR by Giuseppe https://github.com/openshift/openshift-ansible/pull/4583

Comment 13 Gan Huang 2017-06-27 07:36:17 UTC
The commit is not in openshift-ansible-3.6.126-1

PS: Tested with openshift-ansible-3.6.123.1001-1 and passed, but this package is not valid.

Moving to `Modified` and removing `testblocker`.

Comment 14 Scott Dodson 2017-06-27 21:34:37 UTC
in openshift-ansible-3.6.123.1002-1.git.0.506cfa7.el7

Comment 15 Gan Huang 2017-06-28 07:17:38 UTC
Verified with openshift-ansible-3.6.123.1003-1.git.0.002ceeb.el7.noarch.rpm

Comment 17 errata-xmlrpc 2017-08-10 05:28:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716