Bug 1544737
| Summary: | first containerized etcd not upgraded to latest image when migrating etcd v2-> v3 | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | daniel <dmoessne> |
| Component: | Installer | Assignee: | Vadim Rutkovsky <vrutkovs> |
| Status: | CLOSED ERRATA | QA Contact: | liujia <jiajliu> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.6.1 | CC: | aos-bugs, jokerman, mmccomas, wmeng |
| Target Milestone: | --- | ||
| Target Release: | 3.6.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-04-12 06:03:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
daniel
2018-02-13 12:11:16 UTC
Daniel, we're going to strip out all the extraneous steps that affect the etcd installation. This is an interesting side effect: we do the migration on the first host and then currently run the scaleup playbooks on the other two, which is pretty heavy-handed. That replaces certificates and effectively reinstalls etcd. Once we update the playbooks to just re-add those hosts and start them back up, all etcd hosts should remain unchanged aside from the data migration.

Vadim, I'm not sure whether we should mark these all as dupes; they're all different symptoms of the same root cause. For now let's leave them all open so that QE can verify that each symptom is resolved by our work.

I'll check whether this is still reproducible with https://github.com/openshift/openshift-ansible/pull/7226. I guess an etcd upgrade during migrate is actually unwanted.

Fix is available in openshift-ansible-3.6.173.0.104-1-4-g76aa5371e: the etcd migrate playbook no longer includes scaleup, so the container version won't change.

Version: openshift-ansible-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch

Steps:

1. HA containerized install of OCP v3.5 (v3.5.5.31.48), with container etcd version v3.2.7.

        # docker run -it --entrypoint rpm registry.access.redhat.com/rhel7/etcd:3.2.7 -qa etcd
        etcd-3.2.7-1.el7.x86_64

2. Upgrade v3.5 to the latest OCP v3.6.173.0.104.

| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
|---|---|---|---|---|---|---|
| https://aos-138.lab.sjc.redhat.com:2379 | b0c1d3f602268dc8 | 3.2.7 | 25 kB | true | 9 | 75544 |
| https://aos-152.lab.sjc.redhat.com:2379 | 53c604bc13ce5de3 | 3.2.7 | 25 kB | false | 9 | 75545 |
| https://aos-155.lab.sjc.redhat.com:2379 | 122ce3db037f9bf3 | 3.2.7 | 25 kB | false | 9 | 75546 |

3. Ran the etcd migration; the migration succeeded. Checked the etcd versions in the cluster.

| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
|---|---|---|---|---|---|---|
| https://aos-138.lab.sjc.redhat.com:2379 | b0c1d3f602268dc8 | 3.2.7 | 7.7 MB | true | 132 | 81829 |
| https://aos-152.lab.sjc.redhat.com:2379 | 658edf5257cc6250 | 3.2.11 | 7.7 MB | false | 132 | 81829 |
| https://aos-155.lab.sjc.redhat.com:2379 | 18822de21a345140 | 3.2.11 | 7.7 MB | false | 132 | 81829 |

Then checked and found that the PR was not merged into the latest v3.6 build.

    # rpm -qa|grep openshift-ansible
    openshift-ansible-playbooks-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch
    openshift-ansible-docs-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch
    openshift-ansible-roles-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch
    openshift-ansible-lookup-plugins-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch
    openshift-ansible-callback-plugins-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch
    openshift-ansible-filter-plugins-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch
    openshift-ansible-3.6.173.0.104-1.git.0.ee43cc5.el7.noarch
    # grep -r "scaleup" /usr/share/ansible/openshift-ansible/playbooks/common/openshift-etcd/migrate.yml
    - include: ./scaleup.yml

Changing the status to MODIFIED to wait for the PR to be merged.

Right, the PR is merged, but the release is not yet prepared.

openshift-ansible-3.6.173.0.105-1 no longer calls scaleup.

Version: openshift-ansible-3.6.173.0.110-1.git.0.ca81843.el7.noarch

Steps:

1. HA containerized install of OCP v3.5, with container etcd version v3.2.7.

        # docker run -it --entrypoint rpm registry.access.redhat.com/rhel7/etcd:3.2.7 -qa etcd
        etcd-3.2.7-1.el7.x86_64

2. Upgrade v3.5 to the latest OCP v3.6.173.0.110.

| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
|---|---|---|---|---|---|---|
| https://ip-172-18-2-113.ec2.internal:2379 | 9459974f264be826 | 3.2.7 | 25 kB | true | 13 | 48877 |
| https://ip-172-18-2-175.ec2.internal:2379 | c1b23e750866c037 | 3.2.7 | 25 kB | false | 13 | 48878 |
| https://ip-172-18-10-56.ec2.internal:2379 | be6ae0df781edce | 3.2.7 | 25 kB | false | 13 | 48878 |

3. Ran the etcd migration; the migration succeeded. Checked the etcd versions in the cluster.

| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
|---|---|---|---|---|---|---|
| https://ip-172-18-2-113.ec2.internal:2379 | 9459974f264be826 | 3.2.7 | 8.3 MB | true | 18 | 63557 |
| https://ip-172-18-2-175.ec2.internal:2379 | b7256a130b6be421 | 3.2.7 | 8.3 MB | false | 18 | 63557 |
| https://ip-172-18-10-56.ec2.internal:2379 | 289a4e3dbe9b8ac7 | 3.2.7 | 8.3 MB | false | 18 | 63557 |

I think this is the expected behavior now. The etcd migration should not perform an etcd upgrade or scaleup, so all etcd members keep the version they had before the migration and stay consistent with each other across the cluster.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1106
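The before/after version comparison done by hand in the verification comments above could be scripted. The following is a minimal Python sketch, not part of openshift-ansible or etcd: it assumes the endpoint status output has been saved as text in a pipe-delimited table like the ones quoted in this report. The helper names (`parse_endpoint_table`, `is_uniform`, `changed_versions`) are hypothetical, and the sample rows are copied from the first verification run, trimmed to three columns.

```python
from typing import Dict, List, Tuple

# Sample data copied from the first verification run in this report
# (only the ENDPOINT, ID and VERSION columns are kept for brevity).
BEFORE = """
| ENDPOINT | ID | VERSION |
| https://aos-138.lab.sjc.redhat.com:2379 | b0c1d3f602268dc8 | 3.2.7 |
| https://aos-152.lab.sjc.redhat.com:2379 | 53c604bc13ce5de3 | 3.2.7 |
| https://aos-155.lab.sjc.redhat.com:2379 | 122ce3db037f9bf3 | 3.2.7 |
"""

AFTER = """
| ENDPOINT | ID | VERSION |
| https://aos-138.lab.sjc.redhat.com:2379 | b0c1d3f602268dc8 | 3.2.7 |
| https://aos-152.lab.sjc.redhat.com:2379 | 658edf5257cc6250 | 3.2.11 |
| https://aos-155.lab.sjc.redhat.com:2379 | 18822de21a345140 | 3.2.11 |
"""


def parse_endpoint_table(text: str) -> List[Dict[str, str]]:
    """Turn a pipe-delimited table into a list of {column: value} dicts."""
    rows: List[Dict[str, str]] = []
    header: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # skip a |---|---| delimiter row
        if not header:
            header = cells  # first data-bearing row is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def is_uniform(rows: List[Dict[str, str]]) -> bool:
    """True when every member reports the same etcd version."""
    return len({row["VERSION"] for row in rows}) <= 1


def changed_versions(
    before: List[Dict[str, str]], after: List[Dict[str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Map each endpoint whose version changed to (old_version, new_version)."""
    old = {row["ENDPOINT"]: row["VERSION"] for row in before}
    new = {row["ENDPOINT"]: row["VERSION"] for row in after}
    return {
        endpoint: (old[endpoint], new[endpoint])
        for endpoint in old
        if endpoint in new and old[endpoint] != new[endpoint]
    }


if __name__ == "__main__":
    before_rows = parse_endpoint_table(BEFORE)
    after_rows = parse_endpoint_table(AFTER)
    print("uniform after migration:", is_uniform(after_rows))
    for endpoint, (old_v, new_v) in changed_versions(before_rows, after_rows).items():
        print(f"{endpoint}: {old_v} -> {new_v}")
```

With the tables from the second verification run (openshift-ansible-3.6.173.0.110), `changed_versions` would return an empty mapping, since all three members stay at 3.2.7 before and after the migration. Endpoints are matched by URL rather than by member ID because the IDs of the re-added members change in both runs.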