Bug 1557499
| Summary: | After etcd v2 to v3 migration, masters are restarted before persisting config changes to use storage-backend etcd3 | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Brenton Leanhardt <bleanhar> |
| Component: | Installer | Assignee: | Vadim Rutkovsky <vrutkovs> |
| Status: | CLOSED ERRATA | QA Contact: | Weihua Meng <wmeng> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 3.7.0 | CC: | aos-bugs, bleanhar, bmorriso, jchaloup, jialiu, jliggitt, jokerman, mgugino, mmccomas, sdodson, wmeng |
| Target Milestone: | --- | ||
| Target Release: | 3.7.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1556936 | Environment: | |
| Last Closed: | 2018-04-05 09:40:50 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1556936 | ||
| Bug Blocks: | |||
Comment 1
Brenton Leanhardt
2018-03-16 17:42:17 UTC
Fixed.
openshift-ansible-3.7.39-1.git.0.75ad335.el7.noarch
etcd-3.2.15-1.el7.x86_64
Steps:
1. fresh install HA OCP v3.5.5.31.63
2. upgrade to 3.6 with openshift-ansible-3.6.173.0.110-1.git.0.ca81843.el7.noarch
3. create sa sa123 in project wmeng1
data is in etcd2 and there is no etcd3 data on any etcd host
[root@wmengetcdv2-master-etcd-2 ~]# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng1/sa123 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
[root@wmengetcdv2-master-etcd-2 ~]# etcdctl2 get /kubernetes.io/serviceaccounts/wmeng1/sa123
{"kind":"ServiceAccount","apiVersion":"v1","metadata":{"name":"sa123","namespace":"wmeng1","selfLink":"/api/v1/namespaces/wmeng1/serviceaccounts/sa123","uid":"10dc656a-2c3d-11e8-b8e4-42010af00037","creationTimestamp":"2018-03-20T12:49:02Z"},"secrets":[{"name":"sa123-token-b7kf5"},{"name":"sa123-dockercfg-pf1zd"}],"imagePullSecrets":[{"name":"sa123-dockercfg-pf1zd"}]}
4. etcd migration v2 to v3
with /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-etcd/migrate.yml in rpm openshift-ansible-3.7.39-1.git.0.75ad335.el7.noarch
finished successfully.
5. etcd check
# ansible -i rpmetcdgce35.inv masters -m shell -a "cat /etc/origin/master/master-config.yaml |grep -A 1 backend"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
6. master api is restarted and running
# ansible -i rpmetcdgce35.inv masters -m shell -a "systemctl status atomic-openshift-master-api | grep Active"
wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
Active: active (running) since Tue 2018-03-20 09:29:34 EDT; 1min 58s ago
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
Active: active (running) since Tue 2018-03-20 09:29:34 EDT; 1min 58s ago
wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
Active: active (running) since Tue 2018-03-20 09:29:34 EDT; 1min 59s ago
7. data is in both etcd2 and etcd3 on all etcd hosts
[root@wmengetcdv2-master-etcd-3 ~]# etcdctl2 get /kubernetes.io/serviceaccounts/wmeng1/sa123
{"kind":"ServiceAccount","apiVersion":"v1","metadata":{"name":"sa123","namespace":"wmeng1","selfLink":"/api/v1/namespaces/wmeng1/serviceaccounts/sa123","uid":"10dc656a-2c3d-11e8-b8e4-42010af00037","creationTimestamp":"2018-03-20T12:49:02Z"},"secrets":[{"name":"sa123-token-b7kf5"},{"name":"sa123-dockercfg-pf1zd"}],"imagePullSecrets":[{"name":"sa123-dockercfg-pf1zd"}]}
[root@wmengetcdv2-master-etcd-3 ~]# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng1/sa123 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
/kubernetes.io/serviceaccounts/wmeng1/sa123
8. restart all master api and controllers
# ansible -i rpmetcdgce35.inv masters -m service -a 'name=atomic-openshift-master-api state=restarted'
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS => {
# ansible -i rpmetcdgce35.inv masters -m service -a 'name=atomic-openshift-master-controllers state=restarted'
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS => {
9. master config is still etcd3
# ansible -i rpmetcdgce35.inv masters -m shell -a "cat /etc/origin/master/master-config.yaml | grep -A 1 backend"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
10. all master api and controllers are restarted and running
11. create sa sa789 in project wmeng3, data is in etcd3 and not in etcd2
12. upgrade cluster to OCP 3.7 with openshift-ansible-3.7.39-1.git.0.75ad335.el7.noarch
13. check storage backend is still etcd3
# ansible -i rpmetcdgce35.inv masters -m shell -a "cat /etc/origin/master/master-config.yaml | grep -A 1 backend"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
storage-backend:
- etcd3
14. check all master api are restarted.
# ansible -i rpmetcdgce35.inv masters -m shell -a "systemctl status atomic-openshift-master-api | grep Active"
wmengetcdv2-master-etcd-3.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
Active: active (running) since Tue 2018-03-20 10:24:13 EDT; 1h 9min ago
wmengetcdv2-master-etcd-2.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
Active: active (running) since Tue 2018-03-20 10:24:12 EDT; 1h 9min ago
wmengetcdv2-master-etcd-1.0320-ybh.qe.rhcloud.com | SUCCESS | rc=0 >>
Active: active (running) since Tue 2018-03-20 10:24:13 EDT; 1h 9min ago
15. check sa123 in etcd2 and etcd3
[root@wmengetcdv2-master-etcd-1 ~]# etcdctl2 get /kubernetes.io/serviceaccounts/wmeng1/sa123
{"kind":"ServiceAccount","apiVersion":"v1","metadata":{"name":"sa123","namespace":"wmeng1","selfLink":"/api/v1/namespaces/wmeng1/serviceaccounts/sa123","uid":"10dc656a-2c3d-11e8-b8e4-42010af00037","creationTimestamp":"2018-03-20T12:49:02Z"},"secrets":[{"name":"sa123-token-b7kf5"},{"name":"sa123-dockercfg-pf1zd"}],"imagePullSecrets":[{"name":"sa123-dockercfg-pf1zd"}]}
[root@wmengetcdv2-master-etcd-1 ~]# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng1/sa123 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
/kubernetes.io/serviceaccounts/wmeng1/sa123
16. check sa456
# etcdctl3 get /kubernetes.io/serviceaccounts/wmeng2/sa456 --prefix --keys-only --endpoints=wmengetcdv2-master-etcd-1:2379
/kubernetes.io/serviceaccounts/wmeng2/sa456
# etcdctl2 get /kubernetes.io/serviceaccounts/wmeng2/sa456
Error: 100: Key not found (/kubernetes.io/serviceaccounts/wmeng2) [31130]
17. deploy s2i and check
# oc get pods
NAME READY STATUS RESTARTS AGE
cakephp-mysql-example-1-build 0/1 Completed 0 8m
cakephp-mysql-example-1-gqz6p 1/1 Running 0 5m
mysql-1-t5xkg 1/1 Running 0 8m
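The etcd2/etcd3 key checks in steps 3, 7, 15 and 16 can be wrapped in a small helper. This is only a sketch: `key_location` is a hypothetical name, and it assumes the `etcdctl2` and `etcdctl3` wrappers used above are available in the shell, that `etcdctl2 get` exits non-zero for a missing key (as the "Key not found" error in step 16 suggests), and that `etcdctl3 get` prints nothing when no key matches (as in step 3).

```shell
# key_location KEY ENDPOINT
# Prints "both", "v2", "v3" or "none" depending on which etcd keyspace holds KEY.
# Hypothetical helper; relies on the etcdctl2/etcdctl3 wrappers being callable.
key_location() {
    key=$1
    endpoint=$2
    in_v2=no
    in_v3=no

    # Assumption: etcdctl2 fails (non-zero exit) when the key is absent.
    if etcdctl2 get "$key" >/dev/null 2>&1; then
        in_v2=yes
    fi

    # Assumption: etcdctl3 succeeds with empty output when nothing matches,
    # so test for non-empty output instead of the exit status.
    if [ -n "$(etcdctl3 get "$key" --prefix --keys-only --endpoints="$endpoint" 2>/dev/null)" ]; then
        in_v3=yes
    fi

    case "$in_v2$in_v3" in
        yesyes) echo both ;;
        yesno)  echo v2 ;;
        noyes)  echo v3 ;;
        *)      echo none ;;
    esac
}
```

Under these assumptions, `key_location /kubernetes.io/serviceaccounts/wmeng1/sa123 wmengetcdv2-master-etcd-1:2379` would be expected to print `both` after the migration, while the sa456 key from step 16 would print `v3`.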
The fix looks good to me.
No regression issue found.
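The storage-backend checks in steps 5, 9 and 13 were read by eye. A minimal sketch of automating them follows; `check_backend` is a hypothetical helper, and it assumes the ansible output keeps the `host | SUCCESS | rc=0 >>` header followed by `storage-backend:` and a `- etcd3` list item, as shown above.

```shell
# check_backend: read ansible ad-hoc output on stdin and succeed only if
# every reported host lists "etcd3" under "storage-backend:".
# Hypothetical helper; hosts without a "| STATUS | rc=" header are not seen.
#
# Possible usage (same inventory and command as in the steps above):
#   ansible -i rpmetcdgce35.inv masters -m shell \
#     -a "cat /etc/origin/master/master-config.yaml | grep -A 1 backend" | check_backend
check_backend() {
    awk '
        # Host header line, e.g. "host | SUCCESS | rc=0 >>"
        / \| [A-Z]+ \| rc=/ { host = $1; seen[host] = ""; n++; want = 0; next }
        # The value is the list item on the line after "storage-backend:"
        /storage-backend:/  { want = 1; next }
        want && /^[ \t]*-/  {
            v = $0
            sub(/^[ \t]*-[ \t]*/, "", v)
            sub(/[ \t\r]+$/, "", v)
            seen[host] = v
            want = 0
        }
        END {
            bad = (n == 0)   # no hosts at all counts as a failure
            for (h in seen) {
                if (seen[h] != "etcd3") { print "NOT etcd3: " h; bad = 1 }
            }
            exit bad
        }
    '
}
```

The function prints one line per host whose backend is missing or not `etcd3` and returns a non-zero status in that case, so it can gate a later step in a script.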
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0636