Bug 1495135
Summary: | Upgrade failed due to can not find atomic-openshift-master-api service in non-ha containerized env | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | liujia <jiajliu>
Component: | Cluster Version Operator | Assignee: | Jan Chaloupka <jchaloup>
Status: | CLOSED ERRATA | QA Contact: | liujia <jiajliu>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 3.7.0 | CC: | aos-bugs, ccoleman, ghuang, jiajliu, jokerman, mmccomas, vlaad, wmeng
Target Milestone: | --- | |
Target Release: | 3.7.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-11-28 22:12:28 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
liujia
2017-09-25 09:24:39 UTC
I am not able to reproduce it. My inventory:

```ini
[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user = root
deployment_type = openshift-enterprise
openshift_deployment_type = openshift-enterprise
osm_use_cockpit = false
openshift_release = v3.7
openshift_docker_insecure_registries=brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
openshift_docker_additional_registries="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888,registry.ops.openshift.com"
containerized=True
openshift_pkg_version=-3.6.173.0.37
openshift_version=3.6.173.0.39

[masters]
10.8.174.18 ansible_ssh_host=10.8.174.18

[nodes]
10.8.174.183 ansible_ssh_host=10.8.174.183

[etcd]
10.8.174.18 ansible_ssh_host=10.8.174.18
```

Can you share your inventory file?

Both openshift_pkg_version and openshift_version are supposed to be commented out; they were only used to deploy the v3.6 cluster. Please see https://github.com/openshift/openshift-ansible/pull/4832.

Citing Clayton:

"Native clustering is the default configuration mode, even when only one master is configured" [1]

"We don't support upgrade from non-HA to HA" [2]

All the changes are for OCP 3.7+, so the error message is expected. The only item left to complete is to document this case.

[1] https://github.com/openshift/openshift-ansible/pull/4832#issue-244862534
[2] https://github.com/openshift/openshift-ansible/pull/4832#discussion_r130642101

Reproduced always with v3.7.0-0.127.0.

```
# rpm -qa|grep openshift
openshift-ansible-filter-plugins-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
openshift-ansible-playbooks-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
atomic-openshift-clients-3.7.0-0.127.0.git.0.459b70b.el7.x86_64
openshift-ansible-docs-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
openshift-ansible-lookup-plugins-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
openshift-ansible-roles-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
atomic-openshift-utils-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
openshift-ansible-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
openshift-ansible-callback-plugins-3.7.0-0.127.0.git.0.b9941e4.el7.noarch
```

Inventory and upgrade.log are in the attachment.

So far there has been no explicit statement that the installer will not support upgrading a non-HA containerized OCP from v3.6 to v3.7. What QE was told is only that the single master service will be split into master-api.service and master-controllers.service in 3.7, so the upgrade process may need not only detection but also a migration to complete this split, as in point 2 of [1]. Documenting this case seems to be only a compromise for this issue rather than the best solution; it should still be tracked as a bug until a final conclusion is reached.

[1] https://github.com/openshift/openshift-ansible/issues/4979

Clayton, can you elaborate more on the issue and comment #7?

I would expect the upgrade to re-run the openshift-master systemd_units.yml task on each master node, which would convert the monolithic master process into api and controller units. Once the control plane check passes, the non-HA master is upgraded to HA without any problems. So only the "Ensure HA Master is running" tasks need to be modified so that they check the non-HA service if it is available (see the illustrative sketch below, after the verification steps).

Version: openshift-ansible-docs-3.7.0-0.178.0.git.0.27a1039.el7.noarch

Steps:
1. Container install OCP v3.6 (one master_etcd + one node)
2. Upgrade OCP to the latest v3.7

Upgrade succeeded with the atomic-openshift-master-api and atomic-openshift-master-controllers services running.
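A minimal sketch of the fallback check proposed above, assuming the split master units are regular systemd services on the master host. The unit names are taken from this bug; the script itself is illustrative, not the actual openshift-ansible task logic:

```sh
# Sketch only: prefer the split HA units when they exist, otherwise fall back
# to the monolithic unit on a non-HA master (unit names taken from this bug).
if systemctl list-unit-files | grep -q '^atomic-openshift-master-api\.service'; then
    systemctl is-active atomic-openshift-master-api atomic-openshift-master-controllers
else
    systemctl is-active atomic-openshift-master
fi
```

In the playbooks themselves, this decision would belong in the "Ensure HA Master is running" tasks mentioned above.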
```
# docker ps|grep master
01116881e2bf   openshift3/ose:v3.7.0          "/usr/bin/openshift s"   9 minutes ago    Up 9 minutes    atomic-openshift-master-controllers
897c7f49f878   openshift3/ose:v3.7.0          "/usr/bin/openshift s"   10 minutes ago   Up 10 minutes   atomic-openshift-master-api
77d439489612   openshift3/ose:v3.6.173.0.59   "/usr/bin/openshift s"   12 minutes ago   Up 12 minutes   atomic-openshift-master
```

It is strange that the original atomic-openshift-master service is kept alongside the api and controllers services after the upgrade; this will be tracked in a new bug if it causes any new problems. As for this bug, the upgrade works well against a non-HA containerized OCP.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188