Bug 1523814
| Summary: | etcd3 migrate playbook fails with single master topology | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Timothy Rees <trees> |
| Component: | Cluster Version Operator | Assignee: | Scott Dodson <sdodson> |
| Status: | CLOSED ERRATA | QA Contact: | liujia <jiajliu> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.7.0 | CC: | aos-bugs, jokerman, mmccomas, sdodson, wmeng |
| Target Milestone: | --- | | |
| Target Release: | 3.7.z | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | When running the etcd v2-to-v3 migration playbooks included in the 3.7 release, the playbooks incorrectly assumed that all masters ran the HA services (i.e. atomic-openshift-master-api and atomic-openshift-master-controllers rather than atomic-openshift-master), which is the norm on 3.7. However, the migration playbooks are executed before upgrading to 3.7, so this assumption was wrong. The migration playbooks have been updated to start and stop the correct services, ensuring proper migration. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-01-23 17:59:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
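The root cause described in the Doc Text can be sketched in a few lines. This is a hypothetical Python illustration only (the actual fix lives in the openshift-ansible migration playbooks, not in Python); the function name and the `ha_topology` flag are assumptions, while the service unit names come from the bug report itself.

```python
# Hypothetical sketch of the corrected service selection; the real fix is
# implemented in the openshift-ansible migration playbooks, not in Python.

def master_services(ha_topology: bool) -> list[str]:
    """Return the master service units to stop/start during etcd migration.

    3.7-style HA masters run separate API and controllers units, while a
    pre-3.7 single master runs one combined unit. The buggy playbook always
    assumed the HA unit names, which do not exist on a 3.6 single master.
    """
    if ha_topology:
        return ["atomic-openshift-master-api",
                "atomic-openshift-master-controllers"]
    return ["atomic-openshift-master"]

print(master_services(False))  # → ['atomic-openshift-master']
print(master_services(True))   # → ['atomic-openshift-master-api', 'atomic-openshift-master-controllers']
```

Since the migration runs before the 3.7 upgrade, the single-master branch is exactly the case the original playbook missed.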
Description
Timothy Rees, 2017-12-08 19:30:15 UTC
This technically only happens when using the 3.7 playbooks, so I've updated the release to be 3.7.0.

Proposed fix: https://github.com/openshift/openshift-ansible/pull/6428

Comment 2
liujia

@Scott
It should not be a bug. It seems the user used the wrong playbook. AFAIK, the etcd v2-to-v3 migration should be done on v3.6 (the previous release) before upgrading to v3.7.

So the right steps should be:

If the current cluster is v3.6 and was upgraded from v3.5:
1) Migrate v2 to v3 with the v3.6 playbook.
2) Update atomic-openshift-utils for the v3.7 upgrade.

If the current cluster is v3.6 and was newly installed:
1) No migration is needed.
2) Update atomic-openshift-utils for the v3.7 upgrade.

Comment 3

(In reply to liujia from comment #2)
> @Scott
>
> It should not be a bug. It seems user used a wrong playbook. AFAIK, etcd v2
> migrate to v3 should be done on v3.6 (a previous release) before upgrade to
> v3.7.
>
> So the right steps should be:
> If current cluster is v3.6 and it is upgraded from v3.5.
> 1) Migrate v2 to v3 with v3.6 playbook.
> 2) Update atomic-openshift-utils for v3.7 upgrade.

If this is actually the case, then the v3.7 docs need amending to reflect this procedure. It is not what is currently outlined.

Comment 4
Scott Dodson

Jia,

We didn't deliver embedded-to-external migration playbooks until the 3.7 release, so I think it's highly likely that when an admin attempts to upgrade their environment from 3.6 to 3.7, they're going to first be blocked by needing to migrate from v2 to v3. Then when they attempt to run that, they'll be informed that they need to migrate from embedded to external, which is currently only possible using the 3.7 playbooks. Therefore I think we should support both migration playbooks in the 3.7 playbooks.

Comment 5

(In reply to Scott Dodson from comment #4)
> Jia,
>
> We didn't deliver embedded to external migration playbooks until the 3.7
> release so I think it's highly likely that when an admin attempts to upgrade
> their environment from 3.6 to 3.7 they're going to first be blocked by
> needing to migrate from v2 to v3.
> Then when they attempt to run that they'll
> be informed that they need to migrate from embedded to external which is
> currently only possible using the 3.7 playbooks. Therefore I think we should
> support both migration playbooks in the 3.7 playbooks.

Scott,

Yes, we delivered embedded-to-external on v3.7, but we delivered etcd v2-to-v3 on v3.6. When an admin attempts to upgrade a cluster from v3.6 to v3.7, they should first migrate v2 to v3 with the v3.6 playbook, so here the playbook should still be v3.6. Then they can do further upgrades through the v3.7 playbooks, for example migrating embedded to external and then upgrading to v3.7 with the v3.7 playbook.

> "For existing clusters that upgraded to OpenShift Container Platform 3.6, however, the etcd data must be migrated from v2 to v3 as a post-upgrade step."

We had a strong suggestion in the doc about migrating v2 to v3 as a post-upgrade step for an upgraded v3.6 cluster, so I think the user should use the v3.6 playbook. But we also have many misleading descriptions and steps about the migration in the doc, which can lead users to update their playbooks to v3.7 first. Anyway, considering compatibility, I agree with you that we can support it in v3.7 and later versions too.

Reproduced on openshift-ansible-3.7.14-1.git.0.4b35b2d.el7.noarch:

1. Install OCP v3.5.
2. Upgrade OCP v3.5 to v3.6 without the etcd migration.
3. Do the etcd migration with v3.7 openshift-ansible.
```
TASK [Stop masters] *************************************************************
failed: [x.x.x.x] (item=atomic-openshift-master-controllers) => {"changed": false, "item": "atomic-openshift-master-controllers", "msg": "Could not find the requested service atomic-openshift-master-controllers: host"}
failed: [x.x.x.x] (item=atomic-openshift-master-api) => {"changed": false, "item": "atomic-openshift-master-api", "msg": "Could not find the requested service atomic-openshift-master-api: host"}
```

Version: openshift-ansible-3.7.18-1.git.0.a01e769.el7.noarch

1. Install OCP v3.5.
2. Upgrade OCP v3.5 to v3.6 without the etcd migration.
3. Do the etcd migration with v3.7 openshift-ansible.

The migration succeeded, and the master config now selects the v3 backend:

```yaml
# master-config.yaml
storage-backend:
- etcd3
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0113
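The verification step above boils down to confirming that `storage-backend` in master-config.yaml lists `etcd3`. A minimal sketch of such a check, assuming the flattened fragment shown in the verification comment (the `uses_etcd3` function name and the string-based parsing are illustrative assumptions, not part of any OpenShift tooling):

```python
# Illustrative check that a master config fragment selects the etcd3 storage
# backend. The expected YAML layout is an assumption based on the fragment
# quoted in the verification comment.

def uses_etcd3(config_text: str) -> bool:
    """Return True if the fragment sets storage-backend to etcd3."""
    lines = [ln.strip() for ln in config_text.splitlines()]
    for i, ln in enumerate(lines):
        if ln.startswith("storage-backend:"):
            value = ln.split(":", 1)[1].strip()
            if value == "etcd3":          # inline form: storage-backend: etcd3
                return True
            if i + 1 < len(lines) and lines[i + 1].lstrip("- ") == "etcd3":
                return True               # list form: "- etcd3" on next line
    return False

fragment = "storage-backend:\n- etcd3\n"
print(uses_etcd3(fragment))  # → True
```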