Bug 1556936

Summary:	After etcd v2 to v3 migration, masters are restarted before persisting config changes to use storage-backend etcd3
Product:	OpenShift Container Platform	Reporter:	bmorriso
Component:	Installer	Assignee:	Vadim Rutkovsky <vrutkovs>
Status:	CLOSED ERRATA	QA Contact:	liujia <jiajliu>
Severity:	urgent	Docs Contact:
Priority:	unspecified
Version:	3.6.1	CC:	aos-bugs, bleanhar, erich, fcami, jchaloup, jliggitt, jokerman, mgugino, mmccomas, pdwyer, sdodson, wmeng
Target Milestone:	---
Target Release:	3.6.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	openshift-ansible-3.7.38-1.git.0.77e88ab.el7	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	1557499		Environment:
Last Closed:	2018-04-12 06:05:33 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1557499

Description bmorriso 2018-03-15 14:58:31 UTC

Description of problem:

We have now seen two instances where, after a cluster was migrated from etcd v2 to v3, a master reverted to using v2 data after a restart of the atomic-openshift-master-api and atomic-openshift-master-controllers services.

In both cases, the clusters had been upgraded hours or days earlier, and they reverted to the old data only after these services were restarted.
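
One way to catch this state before restarting the master services is to check that the master configuration already names etcd3 as its storage backend. The following is a minimal sketch, not what openshift-ansible does. It assumes the setting appears in master-config.yaml as a `storage-backend` entry under `kubernetesMasterConfig.apiServerArguments`; the function names and the sample text are hypothetical and for illustration only.

```python
import re


def storage_backend(config_text):
    """Return the first value listed under `storage-backend`, or None if absent.

    Assumption: the key appears once, either inline (`storage-backend: etcd3`,
    `storage-backend: [etcd3]`) or as a block list on the following lines.
    """
    lines = config_text.splitlines()
    for i, line in enumerate(lines):
        match = re.match(r"\s*storage-backend:\s*(.*)$", line)
        if not match:
            continue
        inline = match.group(1).strip()
        if inline:
            # Inline scalar or flow-style list.
            return inline.strip("[]\"' ") or None
        # Block list: the value is on the next `- item` line.
        for following in lines[i + 1:]:
            item = re.match(r"\s*-\s*(\S+)", following)
            if item:
                return item.group(1).strip("\"'")
            if following.strip():
                break  # reached another key, so no value was listed
        return None
    return None


def safe_to_restart(config_text):
    """True only if the config already selects the etcd3 storage backend."""
    return storage_backend(config_text) == "etcd3"


# Hypothetical fragment of a migrated master configuration.
SAMPLE_MIGRATED = """\
kubernetesMasterConfig:
  apiServerArguments:
    storage-backend:
    - etcd3
    storage-media-type:
    - application/vnd.kubernetes.protobuf
"""

if __name__ == "__main__":
    print(safe_to_restart(SAMPLE_MIGRATED))
```

A check like this only inspects the text it is given. It says nothing about whether a running master process has picked up the change, which is why it would have to run against the on-disk file before the restart.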


Version-Release number of selected component (if applicable):
oc v3.6.173.0.96
kubernetes v1.6.1+5115d708d7


How reproducible:
We have seen this twice so far on two different clusters. 


Comment 1 Jordan Liggitt 2018-03-15 16:23:47 UTC

etcd is at v3.1.3

Comment 3 Jan Chaloupka 2018-03-16 16:25:55 UTC

Upstream PR that fixes it: https://github.com/openshift/openshift-ansible/pull/7551

Comment 4 Michael Gugino 2018-03-16 17:04:51 UTC

This is already fixed in 3.7 and 3.6. The fix for master has been picked: https://github.com/openshift/openshift-ansible/pull/7556

Comment 5 Michael Gugino 2018-03-16 17:33:58 UTC

Fix for 3.9: https://github.com/openshift/openshift-ansible/pull/7559

Comment 6 Michael Gugino 2018-03-16 17:34:34 UTC

3.6 and 3.7 merged 8 days ago:

3.7: https://github.com/openshift/openshift-ansible/pull/7313

3.6: https://github.com/openshift/openshift-ansible/pull/7226

Comment 7 Michael Gugino 2018-03-16 17:35:32 UTC

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1544399

Comment 16 liujia 2018-03-20 10:08:22 UTC

Tried both rpm and containerized etcd migration. Both work well on openshift-ansible-3.6.173.0.110-1.git.0.ca81843.el7.noarch. After migration, newly created data was stored in etcd v3 only.

Based on comment 11, comment 13, and comment 15 combined, changing the bug status.

Comment 19 errata-xmlrpc 2018-04-12 06:05:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1106