Bug 1557499 - After etcd v2 to v3 migration, masters are restarted before persisting config changes to use storage-backend etcd3
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.z
Assigned To: Vadim Rutkovsky
QA Contact: Weihua Meng
Depends On: 1556936
Blocks:
Reported: 2018-03-16 13:36 EDT by Brenton Leanhardt
Modified: 2018-04-05 05:41 EDT
CC List: 11 users

Clone Of: 1556936
Last Closed: 2018-04-05 05:40:50 EDT
Type: Bug


External Trackers:
Red Hat Product Errata RHBA-2018:0636 (Last Updated: 2018-04-05 05:41 EDT)

Comment 1 Brenton Leanhardt 2018-03-16 13:42:17 EDT
https://github.com/openshift/openshift-ansible/pull/7313
Comment 4 Weihua Meng 2018-03-20 12:12:46 EDT
Fixed.

openshift-ansible-3.7.39-1.git.0.75ad335.el7.noarch
etcd-3.2.15-1.el7.x86_64

Steps:
1. fresh install HA OCP v3.5.5.31.63
2. upgrade to 3.6 with openshift-ansible-3.6.173.0.110-1.git.0.ca81843.el7.noarch
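For reference, steps 1 and 2 roughly correspond to the standard packaged openshift-ansible playbooks. The playbook paths below are assumptions based on the usual RPM layout and are not quoted anywhere in this bug; the inventory file name is reused from the later checks:
# fresh HA install (step 1); path is an assumption
ansible-playbook -i rpmetcdgce35.inv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
# upgrade to 3.6 (step 2); path is an assumption
ansible-playbook -i rpmetcdgce35.inv /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_6/upgrade.yml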
3. create sa sa123 in project wmeng1 (see the command sketch after the etcd checks below)
Data is in etcd2 and not in etcd3 on all etcd hosts:
[root@wmengetcdv2-master-etcd-2 ~]# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng1/sa123 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
[root@wmengetcdv2-master-etcd-2 ~]#  etcdctl2 get /kubernetes.io/serviceaccounts/wmeng1/sa123
{"kind":"ServiceAccount","apiVersion":"v1","metadata":{"name":"sa123","namespace":"wmeng1","selfLink":"/api/v1/namespaces/wmeng1/serviceaccounts/sa123","uid":"10dc656a-2c3d-11e8-b8e4-42010af00037","creationTimestamp":"2018-03-20T12:49:02Z"},"secrets":[{"name":"sa123-token-b7kf5"},{"name":"sa123-dockercfg-pf1zd"}],"imagePullSecrets":[{"name":"sa123-dockercfg-pf1zd"}]}

4. etcd migration v2 to v3
with /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-etcd/migrate.yml in rpm openshift-ansible-3.7.39-1.git.0.75ad335.el7.noarch
The migration finished successfully.
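The migration run in step 4 corresponds to an invocation along these lines (playbook path and inventory file name taken from this report; any extra variables are omitted here):
# run the v2 -> v3 migration playbook shipped in openshift-ansible-3.7.39
ansible-playbook -i rpmetcdgce35.inv /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-etcd/migrate.yml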

5. check storage-backend in master-config.yaml
# ansible -i rpmetcdgce35.inv masters -m shell -a "cat /etc/origin/master/master-config.yaml |grep -A 1 backend"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

6. master api is restarted and running
# ansible -i rpmetcdgce35.inv masters -m shell -a "systemctl status atomic-openshift-master-api | grep Active"
wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
   Active: active (running) since Tue 2018-03-20 09:29:34 EDT; 1min 58s ago

wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
   Active: active (running) since Tue 2018-03-20 09:29:34 EDT; 1min 58s ago

wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
   Active: active (running) since Tue 2018-03-20 09:29:34 EDT; 1min 59s ago

7. data is in both etcd2 and etcd3 on all etcd hosts
[root@wmengetcdv2-master-etcd-3 ~]# etcdctl2 get /kubernetes.io/serviceaccounts/wmeng1/sa123
{"kind":"ServiceAccount","apiVersion":"v1","metadata":{"name":"sa123","namespace":"wmeng1","selfLink":"/api/v1/namespaces/wmeng1/serviceaccounts/sa123","uid":"10dc656a-2c3d-11e8-b8e4-42010af00037","creationTimestamp":"2018-03-20T12:49:02Z"},"secrets":[{"name":"sa123-token-b7kf5"},{"name":"sa123-dockercfg-pf1zd"}],"imagePullSecrets":[{"name":"sa123-dockercfg-pf1zd"}]}

[root@wmengetcdv2-master-etcd-3 ~]# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng1/sa123 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
/kubernetes.io/serviceaccounts/wmeng1/sa123

8. restart all master api and controllers
# ansible -i rpmetcdgce35.inv masters -m service -a 'name=atomic-openshift-master-api state=restarted'
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS => {
# ansible -i rpmetcdgce35.inv masters -m service -a 'name=atomic-openshift-master-controllers state=restarted'
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS => {

9. master config is still etcd3
# ansible -i rpmetcdgce35.inv masters -m shell -a "cat /etc/origin/master/master-config.yaml | grep -A 1 backend"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

10. all master api and controllers are restarted and running (see the check below)
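This can be confirmed with the same kind of status check used in step 6, for example:
# check that the API and controllers services are active on every master
ansible -i rpmetcdgce35.inv masters -m shell -a "systemctl status atomic-openshift-master-api | grep Active"
ansible -i rpmetcdgce35.inv masters -m shell -a "systemctl status atomic-openshift-master-controllers | grep Active"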

11. create sa sa789 in project wmeng3; data is in etcd3 and not in etcd2 (command sketch below)
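A minimal sketch of step 11, using the same etcdctl2/etcdctl3 wrappers as the earlier checks (project and name from the step above; endpoint reused from step 3):
# create a new service account after the migration
oc create serviceaccount sa789 -n wmeng3
# present in the v3 store
etcdctl3 get /kubernetes.io/serviceaccounts/wmeng3/sa789 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
# absent from the v2 store (expected to return "Key not found")
etcdctl2 get /kubernetes.io/serviceaccounts/wmeng3/sa789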

12. upgrade cluster to OCP 3.7 with openshift-ansible-3.7.39-1.git.0.75ad335.el7.noarch
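The 3.7 upgrade in step 12 corresponds to an invocation along these lines; the playbook path is an assumption based on the packaged openshift-ansible layout and is not quoted in this bug:
# cluster upgrade to 3.7; path is an assumption
ansible-playbook -i rpmetcdgce35.inv /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/v3_7/upgrade.yml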

13. check storage-backend in master-config.yaml after the upgrade
# ansible -i rpmetcdgce35.inv masters -m shell -a "cat /etc/origin/master/master-config.yaml | grep -A 1 backend"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
    storage-backend:
    - etcd3

14. check all master api services are restarted
# ansible -i rpmetcdgce35.inv masters -m shell -a "systemctl status atomic-openshift-master-api | grep Active"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
   Active: active (running) since Tue 2018-03-20 10:24:13 EDT; 1h 9min ago

wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
   Active: active (running) since Tue 2018-03-20 10:24:12 EDT; 1h 9min ago

wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
   Active: active (running) since Tue 2018-03-20 10:24:13 EDT; 1h 9min ago

15. check sa123 in etcd2 and etcd3
[root@wmengetcdv2-master-etcd-1 ~]# etcdctl2 get /kubernetes.io/serviceaccounts/wmeng1/sa123
{"kind":"ServiceAccount","apiVersion":"v1","metadata":{"name":"sa123","namespace":"wmeng1","selfLink":"/api/v1/namespaces/wmeng1/serviceaccounts/sa123","uid":"10dc656a-2c3d-11e8-b8e4-42010af00037","creationTimestamp":"2018-03-20T12:49:02Z"},"secrets":[{"name":"sa123-token-b7kf5"},{"name":"sa123-dockercfg-pf1zd"}],"imagePullSecrets":[{"name":"sa123-dockercfg-pf1zd"}]}

[root@wmengetcdv2-master-etcd-1 ~]# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng1/sa123 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
/kubernetes.io/serviceaccounts/wmeng1/sa123

16. check sa456
# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng2/sa456 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
/kubernetes.io/serviceaccounts/wmeng2/sa456

# etcdctl2 get /kubernetes.io/serviceaccounts/wmeng2/sa456
Error:  100: Key not found (/kubernetes.io/serviceaccounts/wmeng2) [31130]

17. deploy an s2i application and check (deployment command sketched after the pod listing)
# oc get pods
NAME                            READY     STATUS      RESTARTS   AGE
cakephp-mysql-example-1-build   0/1       Completed   0          8m
cakephp-mysql-example-1-gqz6p   1/1       Running     0          5m
mysql-1-t5xkg                   1/1       Running     0          8m
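For reference, the s2i deployment in step 17 corresponds to something like the following; the template name is inferred from the pod names above and is an assumption:
# instantiate the CakePHP + MySQL example from the installed templates (assumed template name)
oc new-app cakephp-mysql-example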

The fix looks good to me.
No regression issues found.
Comment 8 errata-xmlrpc 2018-04-05 05:40:50 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0636
