1475351 – [3.6] API server results inconsistent after migration to etcdv3

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Cluster Version Operator
Sub Component:
Version:	3.6.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	3.6.z
Assignee:	Scott Dodson
QA Contact:	Anping Li
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-07-26 13:33 UTC by Justin Pierce
Modified:	2018-08-28 00:38 UTC
CC List:	14 users
Fixed In Version:
Doc Type:	Enhancement
Doc Text:	An etcd v3 migration playbook has been added allowing users to migrate to etcd v3 storage after they've upgraded to OCP 3.6. Please see the upgrade documentation for more details.
Clone Of:
Environment:
Last Closed:	2017-09-05 17:42:58 UTC
Target Upstream Version:
Embargoed:

Attachments
api server log (281.28 KB, text/x-vhdl) 2017-07-26 13:57 UTC, Scott Dodson	no flags
Part of ansible logs (53.14 KB, text/plain) 2017-08-25 12:09 UTC, Anping Li	no flags
migrate log and inventory file (170.00 KB, application/x-tar) 2017-08-25 15:15 UTC, Anping Li	no flags
The inventory file, migrate log, and etcd journal file (203.78 KB, application/x-gzip) 2017-08-28 11:04 UTC, Anping Li	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1622336	0	high	CLOSED	etcd migrate playbook fail if controllerLeaseTTL has 0s in master-config.yaml	2023-09-15 00:11:44 UTC
Red Hat Product Errata	RHBA-2017:2639	0	normal	SHIPPED_LIVE	OpenShift Container Platform atomic-openshift-utils bug fix and enhancement	2017-09-05 21:42:36 UTC

Internal Links: 1622336

Description Justin Pierce 2017-07-26 13:33:43 UTC

Description of problem:
After a migration to etcdv3, one OpenShift API server in the HA configuration seems to return old data (the other two API servers return current data when queried). The results of this can vary from a user being unable to access their project sporadically to builds randomly being unable to push to the docker registry.

Version-Release number of selected component (if applicable):
OCP v3.6.170
etcd 3.1.9


How reproducible:
100% (4 attempts to date)

Steps to Reproduce:
1. Migrate etcdv2 to v3 on an HA cluster (3 masters/etcd). We used the openshift-ansible playbook: playbooks/byo/openshift-etcd/migrate.yml
2. Create a new project X as a non-admin user.
3. Run oc new-app to create a new application in the project (if you can).

Actual results:
Usually, after creating the project, the non-admin user will get permission errors if they try to list resources within the newly created project. Run 'oc get all' several times as the non-admin user:
$ oc get all -n X

Depending on which master is out of date, "get all" will return "Forbidden" for at least one of the invocations. The root cause appears to be one server in the API server cluster returning "old" data. For example, it may not have a record of the permissions created for the user in the project.

Expected results:
Consistent results from all API servers.

Additional info:
It has been observed that restarting atomic-openshift-master-api temporarily corrects this condition. However, even after doing so, inconsistent results eventually began to crop up again.


Examples of inconsistency captured in current state of free-int:
[root@free-int-master-3c664 ~]# oc project
Using project "scd-1" on server "https://internal.api.free-int.openshift.com:443".
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM          STATUS                               STARTED        DURATION
test15-3   Source    Git@855ab2d   Failed (PushImageToRegistryFailed)   17 hours ago   34s
test15-4   Source    Git@855ab2d   Complete                             16 hours ago   1m15s
test15-5   Source    Git@855ab2d   Complete                             16 hours ago   52s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   14 hours ago   13h42m11s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   14 hours ago   13h42m11s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM          STATUS                               STARTED        DURATION
test15-3   Source    Git@855ab2d   Failed (PushImageToRegistryFailed)   17 hours ago   34s
test15-4   Source    Git@855ab2d   Complete                             16 hours ago   1m15s
test15-5   Source    Git@855ab2d   Complete                             16 hours ago   52s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   14 hours ago   13h42m11s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   14 hours ago   13h42m11s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   14 hours ago   13h42m11s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   14 hours ago   13h42m11s
[root@free-int-master-3c664 ~]# oc get builds
NAME       TYPE      FROM          STATUS                               STARTED        DURATION
test15-3   Source    Git@855ab2d   Failed (PushImageToRegistryFailed)   17 hours ago   34s
test15-4   Source    Git@855ab2d   Complete                             16 hours ago   1m15s
test15-5   Source    Git@855ab2d   Complete                             16 hours ago   52s

Comment 1 Scott Dodson 2017-07-26 13:46:37 UTC

In this particular case yesterday, we ran `oc get pods` against each individual API server. Two returned the expected results, while the third returned the same list of pods but with every pod in a 'pending' state rather than running / error as the other two API servers reported. At this point we restarted the etcd service on the leader, and the API server started returning valid results.

Comment 2 Scott Dodson 2017-07-26 13:47:35 UTC

https://github.com/coreos/etcd/issues/8305 upstream issue clayton reported

Comment 3 Scott Dodson 2017-07-26 13:57:17 UTC

Created attachment 1304820
api server log

Log from affected api server, etcd on the leader was restarted at Tue 2017-07-25 21:31:45 UTC

Comment 4 Jordan Liggitt 2017-07-26 16:07:11 UTC

when we are in this state, what do we see running this from each API server?

etcdctl -w table endpoint --cluster status

Comment 5 Justin Pierce 2017-07-26 16:26:51 UTC

--cluster does not appear to be a valid option. Running without it on each master:

[root@free-int-master-3c664 ~]# etcdctl3 -w table endpoint status
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
|                  ENDPOINT                  |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://ip-172-31-50-177.ec2.internal:2379 | f8647d77edbb333b | 3.1.9   | 105 MB  | false     |       715 | 1043370179 |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+


[root@free-int-master-5470f ~]# etcdctl3 -w table endpoint status
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
|                  ENDPOINT                  |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://ip-172-31-56-130.ec2.internal:2379 | 46c194b7a9bde0fd | 3.1.9   | 742 MB  | false     |       715 | 1043370500 |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+


[root@free-int-master-de987 ~]# etcdctl3 -w table endpoint status
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
|                  ENDPOINT                  |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://ip-172-31-60-182.ec2.internal:2379 | 6bd52a956766015a | 3.1.9   | 105 MB  | true      |       715 | 1043373103 |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+

Comment 6 Justin Pierce 2017-07-26 16:35:08 UTC

[root@free-int-master-5470f ~]# etcdctl3 -w table endpoint status --endpoints=ip-172-31-50-177.ec2.internal:2379,ip-172-31-56-130.ec2.internal:2379,ip-172-31-60-182.ec2.internal:2379
2017-07-26 16:33:51.690146 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
|                  ENDPOINT                  |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://ip-172-31-56-130.ec2.internal:2379 | 46c194b7a9bde0fd | 3.1.9   | 754 MB  | false     |       715 | 1043389842 |
| ip-172-31-50-177.ec2.internal:2379         | f8647d77edbb333b | 3.1.9   | 105 MB  | false     |       715 | 1043389842 |
| ip-172-31-56-130.ec2.internal:2379         | 46c194b7a9bde0fd | 3.1.9   | 754 MB  | false     |       715 | 1043389842 |
| ip-172-31-60-182.ec2.internal:2379         | 6bd52a956766015a | 3.1.9   | 105 MB  | true      |       715 | 1043389842 |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+

Comment 7 Justin Pierce 2017-07-26 19:37:13 UTC

It looks like the etcd node returning inconsistent data (https://172.31.50.177:2380 in the data below) is returning 'Error:  grpc: the client connection is closing' when directly queried with etcdctl3


[root@free-int-master-3c664 ~]# etcdctl3 member list
2017-07-26 19:21:33.618440 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
46c194b7a9bde0fd, started, ip-172-31-56-130.ec2.internal, https://172.31.56.130:2380, https://172.31.56.130:2379
6bd52a956766015a, started, ip-172-31-60-182.ec2.internal, https://172.31.60.182:2380, https://172.31.60.182:2379
f8647d77edbb333b, started, ip-172-31-50-177.ec2.internal, https://172.31.50.177:2380, https://172.31.50.177:2379


[root@free-int-master-3c664 ~]# etcdctl3 get /openshift.io/builds/scd-1 --prefix --keys-only --endpoints=https://172.31.56.130:2380
2017-07-26 19:21:55.719966 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
/openshift.io/builds/scd-1/test15-3
/openshift.io/builds/scd-1/test15-4
/openshift.io/builds/scd-1/test15-5


[root@free-int-master-3c664 ~]# etcdctl3 get /openshift.io/builds/scd-1 --prefix --keys-only --endpoints=https://172.31.60.182:2380
2017-07-26 19:22:13.598223 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
/openshift.io/builds/scd-1/test15-3
/openshift.io/builds/scd-1/test15-4
/openshift.io/builds/scd-1/test15-5


[root@free-int-master-3c664 ~]# etcdctl3 get /openshift.io/builds/scd-1 --prefix --keys-only --endpoints=https://172.31.50.177:2380
2017-07-26 19:22:21.980899 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Error:  grpc: the client connection is closing



[root@free-int-master-3c664 ~]# oc get builds --server=https://172.31.60.182
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   20 hours ago   19h42m11s

[root@free-int-master-3c664 ~]# oc get builds --server=https://172.31.56.130
NAME       TYPE      FROM         STATUS                        STARTED        DURATION
test15-5   Source    Git@master   Failed (GenericBuildFailed)   20 hours ago   19h42m11s

[root@free-int-master-3c664 ~]# oc get builds --server=https://172.31.50.177
NAME       TYPE      FROM          STATUS                               STARTED        DURATION
test15-3   Source    Git@855ab2d   Failed (PushImageToRegistryFailed)   23 hours ago   34s
test15-4   Source    Git@855ab2d   Complete                             22 hours ago   1m15s
test15-5   Source    Git@855ab2d   Complete                             22 hours ago   52s
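The per-server comparison above can be automated once the names returned by each API server have been collected. The following is a minimal sketch, not part of any OpenShift tooling: the helper names `group_servers_by_result` and `find_outliers` are hypothetical, and the input is assumed to have been gathered by hand from `oc get builds --server=...`. It only compares object names, so it would miss differences in status, and the majority answer is not necessarily the correct one.

```python
from collections import defaultdict


def group_servers_by_result(results):
    """Group server URLs by the (order-insensitive) set of names each returned."""
    groups = defaultdict(list)
    for server, names in results.items():
        groups[frozenset(names)].append(server)
    return dict(groups)


def find_outliers(results):
    """Return the servers whose answer differs from the most common answer."""
    groups = group_servers_by_result(results)
    if len(groups) <= 1:
        return []  # every server agrees
    majority = max(groups.values(), key=len)
    return sorted(
        server
        for servers in groups.values()
        if servers is not majority
        for server in servers
    )


# Build names as returned by `oc get builds --server=...` in this comment.
observed = {
    "https://172.31.60.182": ["test15-5"],
    "https://172.31.56.130": ["test15-5"],
    "https://172.31.50.177": ["test15-3", "test15-4", "test15-5"],
}
print(find_outliers(observed))
```

With the data from this comment, the server at 172.31.50.177 is the one reported as disagreeing with the other two.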

Comment 8 Justin Pierce 2017-07-26 20:15:45 UTC

Observations:
- Database size increasing on *only* one server over time
- Leader staying consistent
- All nodes reporting healthy
- Raft index increasing in sync

[root@free-int-master-de987 ~]# etcdctl3 endpoint status -w table --endpoints=https://ip-172-31-60-182.ec2.internal:2379,https://ip-172-31-56-130.ec2.internal:2379,https://ip-172-31-50-177.ec2.internal:2379
2017-07-26 20:13:59.841880 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
|                  ENDPOINT                  |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://ip-172-31-60-182.ec2.internal:2379 | 6bd52a956766015a | 3.1.9   | 105 MB  | true      |       715 | 1043474524 |
| https://ip-172-31-56-130.ec2.internal:2379 | 46c194b7a9bde0fd | 3.1.9   | 935 MB  | false     |       715 | 1043474524 |
| https://ip-172-31-50-177.ec2.internal:2379 | f8647d77edbb333b | 3.1.9   | 105 MB  | false     |       715 | 1043474524 |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
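The observations above (raft index in sync, DB size diverging on one member) can be checked mechanically from the table output. This is a rough sketch under stated assumptions: the function names are hypothetical, the size units table is an assumption about how etcdctl prints sizes, and the threshold of twice the median is arbitrary.

```python
import statistics

# Assumption: sizes are printed with decimal (SI) units such as "105 MB".
UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}

SAMPLE = """
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
|                  ENDPOINT                  |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://ip-172-31-60-182.ec2.internal:2379 | 6bd52a956766015a | 3.1.9   | 105 MB  | true      |       715 | 1043474524 |
| https://ip-172-31-56-130.ec2.internal:2379 | 46c194b7a9bde0fd | 3.1.9   | 935 MB  | false     |       715 | 1043474524 |
| https://ip-172-31-50-177.ec2.internal:2379 | f8647d77edbb333b | 3.1.9   | 105 MB  | false     |       715 | 1043474524 |
+--------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
"""


def parse_endpoint_status_table(text):
    """Parse `endpoint status -w table` output into a list of row dicts."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, warnings and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def size_in_bytes(text):
    """Convert a string such as '105 MB' to a number of bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def db_size_outliers(rows, factor=2.0):
    """Return endpoints whose DB size exceeds `factor` times the median size."""
    sizes = {row["ENDPOINT"]: size_in_bytes(row["DB SIZE"]) for row in rows}
    median = statistics.median(sizes.values())
    return [endpoint for endpoint, size in sizes.items() if size > factor * median]


rows = parse_endpoint_status_table(SAMPLE)
raft_in_sync = len({row["RAFT INDEX"] for row in rows}) == 1
print(raft_in_sync, db_size_outliers(rows))
```

On the table from this comment, the raft index matches on all three members while only ip-172-31-56-130 is flagged for DB size.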

Comment 9 Justin Pierce 2017-07-26 20:20:51 UTC

Another correlation: The instance with the growing DB is the instance that returns the different results:

[root@free-int-master-5470f ~]# etcdctl3 get /openshift.io/builds/scd-1 --prefix --keys-only --endpoints=https://ip-172-31-56-130.ec2.internal:2379
2017-07-26 20:18:42.292502 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
/openshift.io/builds/scd-1/test15-5


[root@free-int-master-de987 ~]# etcdctl3 get /openshift.io/builds/scd-1 --prefix --keys-only --endpoints=https://ip-172-31-60-182.ec2.internal:2379
2017-07-26 20:18:49.607159 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
/openshift.io/builds/scd-1/test15-3
/openshift.io/builds/scd-1/test15-4
/openshift.io/builds/scd-1/test15-5


[root@free-int-master-3c664 ~]# etcdctl3 get /openshift.io/builds/scd-1 --prefix --keys-only --endpoints=https://ip-172-31-50-177.ec2.internal:2379
2017-07-26 20:18:35.608056 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
/openshift.io/builds/scd-1/test15-3
/openshift.io/builds/scd-1/test15-4
/openshift.io/builds/scd-1/test15-5

Using --consistency=s does not change this output.

Comment 10 Justin Pierce 2017-07-26 20:59:29 UTC

Experiments: 

- Shutdown all openshift components (api, controllers, node) for all openshift masters
Observed: still bad results / still inconsistent database sizes


- Restarted etcd on ip-172-31-56-130
Observed: still bad results / still inconsistent database sizes


- Stopped etcd on all masters and then started them up again
Observed: leader changed to ip-172-31-56-130 / still bad results / still inconsistent database sizes


- Created a key from each etcd node specifying a particular endpoint and a unique key name. ip-172-31-56-130 was the leader at the time.
Observed: gets using each respective endpoint returned all expected keys.


- Restarted etcd on ip-172-31-56-130 to force a leader change. ip-172-31-50-177 is now the leader.
Observed: gets still returned expected keys created in previous steps. 

- Created three new keys with ip-172-31-50-177 as the leader, using each respective endpoint for the put operation.
Observed: puts succeeded and gets from all endpoints returned all expected keys.
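The put/get experiments above follow a simple pattern: write one uniquely named key through each endpoint, then read every key back through every endpoint. A sketch of that pattern is below. It does not talk to etcd: `put` and `get` are caller-supplied callables, `FakeCluster` is an in-memory stand-in, and the `/consistency-test/` key prefix is hypothetical. Note that this check passed in the experiments above even though the members were already inconsistent, so a clean result does not prove the stores match.

```python
def cross_endpoint_check(endpoints, put, get):
    """Write one key via each endpoint, then read all keys via all endpoints.

    Returns a list of (endpoint, key) pairs that could not be read back.
    """
    written = []
    for endpoint in endpoints:
        key = f"/consistency-test/{endpoint}"  # hypothetical key name
        put(endpoint, key, "x")
        written.append(key)

    missing = []
    for endpoint in endpoints:
        for key in written:
            if get(endpoint, key) is None:
                missing.append((endpoint, key))
    return missing


class FakeCluster:
    """In-memory stand-in for a healthy cluster: all endpoints share one store."""

    def __init__(self):
        self.store = {}

    def put(self, endpoint, key, value):
        self.store[key] = value

    def get(self, endpoint, key):
        return self.store.get(key)


endpoints = [
    "ip-172-31-50-177.ec2.internal:2379",
    "ip-172-31-56-130.ec2.internal:2379",
    "ip-172-31-60-182.ec2.internal:2379",
]
cluster = FakeCluster()
print(cross_endpoint_check(endpoints, cluster.put, cluster.get))
```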

Comment 11 Justin Pierce 2017-07-26 21:13:12 UTC

related? https://github.com/coreos/etcd/issues/8214

Comment 13 Jordan Liggitt 2017-07-27 20:50:07 UTC

in the free-int environment, we restored v2 data from backup, and re-ran the migration, capturing the following:
* v2 keys from all members prior to migrate
* v3 keys from all members post migrate
* hashes of all v3 data from all members post migrate

all members returned identical keys to each other prior to migrate
all members returned identical keys to each other post migrate
the hash of all data from each member matched post migrate

next step is to see if using the free-int environment post-migration triggers the same condition.

Comment 14 Justin Pierce 2017-07-27 21:36:10 UTC

Presently unable to build in free-int environment. Builds stay in "Pending" due to "Error syncing pod".

Comment 16 Jordan Liggitt 2017-07-28 20:12:40 UTC

our process for migrating an HA etcd cluster did not work when there was actively expiring TTL data in the store.

Despite stopping all writers (controllers/apiservers), and ensuring all etcd members were at the same raft index, TTL data (events, tokens, leases, etc) continues expiring in the etcd v2 stores. That means when we shut down the etcd members, their data stores are likely to be inconsistent.

Our migration process ran migrate on each etcd member's data store, moving that inconsistent data into the mvcc store.

We then started up the etcd members, and ran a tool to re-establish TTL leases on the TTL keys. That would query one of the etcd members for keys, and run a transaction to assign each one a TTL lease:

txnResp, err := client.KV.Txn(ctx).If(
	clientv3.Compare(clientv3.ModRevision(string(kv.Key)), "=", kv.ModRevision),
).Then(
	clientv3.OpPut(string(kv.Key), string(kv.Value), clientv3.WithLease(lease.ID)),
).Commit()


If this transaction was accepted by the leader, it had the potential to be applied to the mvcc store on some members and not others. If the key was present on one member, it would be applied, increasing the mvcc revision. If it was missing on another member, it would not be applied.

In this way, the mvcc revision could get drastically out of sync for the same raft index among the cluster members. On a cluster with thousands of events, we saw mvcc revisions off by 1000 or more.

Because the mvcc revision is used as the resourceVersion by kubernetes, mismatches between cluster members break things like watch... you could list from one etcd member and get a resourceVersion of 1000, then ask to watch from another etcd member who thought the store was still at 500.
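The divergence described above can be reproduced with a toy model. The sketch below is not etcd code: `Member` is a hypothetical stand-in that tracks only a store-wide revision and a per-key mod revision, and it assumes both members start at the same revision even though one is missing a key. The transaction mirrors the compare-then-put above: it writes only when the key's mod revision equals the expected value, and a missing key is treated as mod revision 0.

```python
class Member:
    """Toy model of one member's mvcc store (illustration only)."""

    def __init__(self, revision, mod_revisions):
        self.revision = revision
        self.mod_revisions = dict(mod_revisions)

    def apply_txn(self, key, expected_mod_revision):
        """Apply 'if ModRevision(key) == expected then put(key)'."""
        if self.mod_revisions.get(key, 0) != expected_mod_revision:
            return False  # compare failed: nothing written, revision unchanged
        self.revision += 1
        self.mod_revisions[key] = self.revision
        return True


def reattach_leases(source, members):
    """List keys from one member, then apply one txn per key on every member."""
    for key, mod_revision in list(source.mod_revisions.items()):
        for member in members:
            member.apply_txn(key, mod_revision)


# Hypothetical migrated TTL keys; "/events/b" expired on one member before shutdown.
migrated = {"/events/a": 5, "/events/b": 6, "/events/c": 7}
full = Member(7, migrated)
stale = Member(7, {k: v for k, v in migrated.items() if k != "/events/b"})

reattach_leases(full, [full, stale])
print(full.revision, stale.revision)
```

Every key that is present on one member but missing on another costs one revision of drift, which is how a store with thousands of expiring events ends up off by 1000 or more.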

Comment 17 Scott Dodson 2017-07-29 02:20:10 UTC

We intend to amend our migration process so that we 

1. Stop api/controllers
2. backup etcd1, etcd2, etcd3
3. stop etcd1, etcd2, etcd3
4. migrate etcd1
5. purge /var/lib/etcd/member on etcd2 etcd3
6. start etcd1 as a new cluster
7. member add etcd2
8. start etcd2
9. verify cluster is healthy
10. member add etcd3
11. start etcd3
12. verify cluster is healthy
13. perform TTL migration
14. Remove v2 keys
15. reconfig masters
16. start api/controllers
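The sequence above generalizes to any member count: migrate only the first member, then rejoin the rest one at a time with a health check after each. The following sketch just generates that ordered list as text. The function name `migration_plan` is hypothetical, and this is not how the openshift-ansible playbook is implemented.

```python
def migration_plan(members):
    """Return the ordered migration steps for the given etcd member names."""
    first, rest = members[0], members[1:]
    steps = [
        "stop api/controllers",
        "backup " + ", ".join(members),
        "stop " + ", ".join(members),
        f"migrate {first}",
    ]
    if rest:
        steps.append("purge /var/lib/etcd/member on " + " ".join(rest))
    steps.append(f"start {first} as a new cluster")
    # Rejoin the remaining members one at a time, checking health after each.
    for member in rest:
        steps += [f"member add {member}", f"start {member}", "verify cluster is healthy"]
    steps += [
        "perform TTL migration",
        "remove v2 keys",
        "reconfig masters",
        "start api/controllers",
    ]
    return steps


plan = migration_plan(["etcd1", "etcd2", "etcd3"])
for number, step in enumerate(plan, start=1):
    print(f"{number}. {step}")
```

For three members this yields the same 16 steps as the list above. The ordering that matters is that the TTL migration runs only after every member has rejoined and the cluster has been verified healthy.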

Comment 18 Jordan Liggitt 2017-07-31 20:42:24 UTC

Prior to performing the TTL migration, we should ensure the v3 stores are consistent by running this on each member (the hash, revision, and totalKey fields should all match):

ETCDCTL_API=3 etcdctl snapshot status -w json /path/to/snap/db

And this on each endpoint (the Status.raftIndex and Status.header.revision from each endpoint should match):

ETCDCTL_API=3 etcdctl endpoint status -w json
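Comparing the JSON by eye is error prone, so the two checks above could be scripted along these lines. This is a sketch under assumptions: the function names are hypothetical, the field names are the ones given above (hash, revision, totalKey, Status.raftIndex, Status.header.revision), and the sample values are made up for illustration.

```python
import json


def snapshot_mismatches(snapshots):
    """snapshots maps member name -> JSON text from `snapshot status -w json`.

    Returns the names of the fields that differ between members.
    """
    parsed = [json.loads(text) for text in snapshots.values()]
    fields = ("hash", "revision", "totalKey")
    return [f for f in fields if len({p[f] for p in parsed}) > 1]


def endpoint_mismatches(status_outputs):
    """status_outputs is a list of JSON texts from `endpoint status -w json`.

    Each text is a JSON list of entries with a "Status" object.
    """
    entries = [entry for text in status_outputs for entry in json.loads(text)]
    raft = {e["Status"]["raftIndex"] for e in entries}
    revision = {e["Status"]["header"]["revision"] for e in entries}
    problems = []
    if len(raft) > 1:
        problems.append("raftIndex")
    if len(revision) > 1:
        problems.append("revision")
    return problems


# Made-up sample values, for illustration only.
snapshots = {
    "etcd1": '{"hash": 111, "revision": 500, "totalKey": 42}',
    "etcd2": '{"hash": 111, "revision": 500, "totalKey": 42}',
    "etcd3": '{"hash": 222, "revision": 1500, "totalKey": 42}',
}
statuses = [
    '[{"Endpoint": "etcd1", "Status": {"header": {"revision": 500}, "raftIndex": 9}}]',
    '[{"Endpoint": "etcd3", "Status": {"header": {"revision": 1500}, "raftIndex": 9}}]',
]
print(snapshot_mismatches(snapshots), endpoint_mismatches(statuses))
```

The second sample shows the failure mode from this bug: the raft index agrees while the revision does not.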

Comment 19 Jordan Liggitt 2017-07-31 20:59:38 UTC

the etcd migrate is also not re-establishing TTLs on all keys that had TTLs in v2

Additional prefixes that use leases:

/openshift.io/leases/controllers
set to controllerLeaseTTL

/openshift.io/oauth/accesstokens
set to oauthConfig.tokenConfig.accessTokenMaxAgeSeconds

/openshift.io/oauth/authorizetokens
set to oauthConfig.tokenConfig.authorizeTokenMaxAgeSeconds


/kubernetes.io/masterleases should not be set to a 1 hour ttl... it is fixed to a 10 second TTL currently.
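One way to express the prefix-to-TTL mapping above is a small lookup against the parsed master config. This is only a sketch: `ttl_for_key` is a hypothetical helper, the config is assumed to be already loaded into nested dicts, and no defaults are built in, so the function returns None when a value is unset and leaves the default to the caller. It covers only the prefixes named in this comment.

```python
# Fixed TTL stated in this comment, in seconds.
FIXED_TTL_SECONDS = {"/kubernetes.io/masterleases": 10}

# Key prefix -> path of the setting inside the parsed master config.
CONFIG_TTL_PATHS = {
    "/openshift.io/leases/controllers": ("controllerLeaseTTL",),
    "/openshift.io/oauth/accesstokens": (
        "oauthConfig", "tokenConfig", "accessTokenMaxAgeSeconds"),
    "/openshift.io/oauth/authorizetokens": (
        "oauthConfig", "tokenConfig", "authorizeTokenMaxAgeSeconds"),
}


def ttl_for_key(key, master_config):
    """Return the TTL in seconds for a key, or None if no value is known."""
    for prefix, ttl in FIXED_TTL_SECONDS.items():
        if key.startswith(prefix):
            return ttl
    for prefix, path in CONFIG_TTL_PATHS.items():
        if key.startswith(prefix):
            value = master_config
            for part in path:
                if not isinstance(value, dict) or part not in value:
                    return None  # unset: the caller decides on a default
                value = value[part]
            return value
    return None  # not one of the lease prefixes listed here


# Hypothetical config values, for illustration only.
config = {
    "controllerLeaseTTL": 30,
    "oauthConfig": {"tokenConfig": {"accessTokenMaxAgeSeconds": 86400}},
}
print(ttl_for_key("/openshift.io/oauth/accesstokens/abc", config))
```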

Comment 20 Seth Jennings 2017-08-08 17:57:25 UTC

Reassigning to Jordan since he is the active dev on this issue. Trying to take pressure off Derek's ever growing list.  Feel free to adjust the severity if this situation has evolved.

Comment 21 Jordan Liggitt 2017-08-08 19:46:25 UTC

cause is known, remaining work is in the ansible migration task

Comment 22 Anping Li 2017-08-09 02:56:18 UTC

What are the correct steps to migrate a clustered etcd? Should we re-establish TTL leases on only one etcd member and wait until the data is synced across the cluster?

Comment 23 Jordan Liggitt 2017-08-09 03:30:58 UTC

The process outlined in https://bugzilla.redhat.com/show_bug.cgi?id=1475351#c17 will work

The important change is to only migrate one member, then rejoin the other two as if they were new members and let them obtain the current data from the one migrated member

Comment 24 Anping Li 2017-08-09 14:51:50 UTC

Jordan, as you mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1475351#c19, three TTLs are absent and /kubernetes.io/masterleases should not be set to 1 hour. What is the default TTL for the different keys? Is there a dictionary?

Comment 25 Scott Dodson 2017-08-09 15:18:00 UTC

Anping,

Here's my pull request that implements comment 17. I'd like to clean this up but this works at least for 3 node clusters. Please feel free to test it.

https://github.com/openshift/openshift-ansible/pull/4980

It should be attaching leases per comment 19 now. It will pull the values from the config files if they're set; otherwise it will use the defaults.

Comment 26 Scott Dodson 2017-08-18 20:46:57 UTC

PR updated with the latest implementation, just waiting for review.

Comment 28 Anping Li 2017-08-25 12:01:04 UTC

1. Migrate works on a clustered RPM etcd environment.
2. Migrate failed on a clustered containerized etcd environment:

TASK [nickhammond.logrotate : nickhammond.logrotate | Setup logrotate.d scripts] ***

RUNNING HANDLER [etcd : restart etcd] ******************************************
skipping: [qe-anlioloh-master-etcd-zone2-1.fixed-001.qe.rhcloud.com] => {
    "changed": false, 
    "skip_reason": "Conditional check failed", 
    "skipped": true
}

TASK [Verify cluster is stable] ************************************************
fatal: [qe-anlioloh-master-etcd-zone2-1.fixed-001.qe.rhcloud.com]: FAILED! => {
    "changed": true, 
    "cmd": [
        "/usr/bin/etcdctl", 
        "--cert-file", 
        "/etc/etcd/peer.crt", 
        "--key-file", 
        "/etc/etcd/peer.key", 
        "--ca-file", 
        "/etc/etcd/ca.crt", 
        "-C", 
        "https://qe-anlioloh-master-etcd-zone1-1:2379", 
        "cluster-health"
    ], 
    "delta": "0:00:00.051522", 
    "end": "2017-08-25 07:39:54.817599", 
    "failed": true, 
    "rc": 5, 
    "start": "2017-08-25 07:39:54.766077", 
    "warnings": []
}

STDOUT:

member 939e401bfb987941 is unhealthy: got unhealthy result from https://10.240.0.24:2379
member de5cd42de39d68f4 is unreachable: no available published client urls
cluster is unhealthy


STDERR:

2017-08-25 07:39:54.785019 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
2017-08-25 07:39:54.785697 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
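When triaging failures like the one above, it can help to turn the cluster-health STDOUT into structured data. The sketch below parses the three line shapes seen in this output. The function name is hypothetical, and the "is healthy" member line is assumed to follow the same pattern as the unhealthy one shown here.

```python
import re

MEMBER_LINE = re.compile(
    r"^member (?P<id>[0-9a-f]+) is (?P<state>healthy|unhealthy|unreachable): (?P<detail>.*)$"
)


def parse_cluster_health(stdout):
    """Return (cluster_healthy, {member_id: (state, detail)})."""
    members = {}
    cluster_healthy = None  # stays None if no summary line was seen
    for line in stdout.splitlines():
        line = line.strip()
        match = MEMBER_LINE.match(line)
        if match:
            members[match["id"]] = (match["state"], match["detail"])
        elif line.startswith("cluster is "):
            cluster_healthy = line == "cluster is healthy"
    return cluster_healthy, members


# STDOUT copied from the failed "Verify cluster is stable" task above.
stdout = """
member 939e401bfb987941 is unhealthy: got unhealthy result from https://10.240.0.24:2379
member de5cd42de39d68f4 is unreachable: no available published client urls
cluster is unhealthy
"""
healthy, members = parse_cluster_health(stdout)
bad = sorted(mid for mid, (state, _) in members.items() if state != "healthy")
print(healthy, bad)
```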



Journal log on the failing member:

Aug 25 05:37:49 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:37:49.142335 I | raft: raft.node: 6a51b64f562ca241 elected leader 16db1390acd1368b at term 2
Aug 25 05:37:49 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:37:49.151729 I | etcdserver: published {Name:qe-anlioloh-master-etcd-zone2-1 ClientURLs:[https:
Aug 25 05:37:49 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:37:49.151791 I | embed: ready to serve client requests
Aug 25 05:37:49 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:37:49.152138 I | embed: serving client requests on 10.240.0.25:2379
Aug 25 05:37:49 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:37:49.168541 N | etcdserver/membership: set the initial cluster version to 3.1
Aug 25 05:37:49 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:37:49.168599 I | etcdserver/api: enabled capabilities for version 3.1
Aug 25 05:46:44 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:44.401059 W | rafthttp: lost the TCP streaming connection with peer 939e401bfb987941 (stream
Aug 25 05:46:44 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:44.401128 W | rafthttp: lost the TCP streaming connection with peer 939e401bfb987941 (stream
Aug 25 05:46:44 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:44.514002 E | rafthttp: failed to dial 939e401bfb987941 on stream Message (dial tcp 10.240.0
Aug 25 05:46:44 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:44.514021 I | rafthttp: peer 939e401bfb987941 became inactive
Aug 25 05:46:47 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:47.332197 I | rafthttp: peer 939e401bfb987941 became active
Aug 25 05:46:47 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:47.332237 W | rafthttp: closed an existing TCP streaming connection with peer 939e401bfb9879
Aug 25 05:46:47 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:47.332247 I | rafthttp: established a TCP streaming connection with peer 939e401bfb987941 (s
Aug 25 05:46:47 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:47.341355 W | rafthttp: closed an existing TCP streaming connection with peer 939e401bfb9879
Aug 25 05:46:47 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:47.341379 I | rafthttp: established a TCP streaming connection with peer 939e401bfb987941 (s
Aug 25 05:46:47 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:47.360488 I | rafthttp: established a TCP streaming connection with peer 939e401bfb987941 (s
Aug 25 05:46:47 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:46:47.361072 I | rafthttp: established a TCP streaming connection with peer 939e401bfb987941 (s
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal systemd[1]: Stopping The Etcd Server container...
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.161750 N | pkg/osutil: received terminated signal, shutting down...
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.162283 I | etcdserver: skipped leadership transfer for stopping non-leader member
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.162379 I | rafthttp: stopping peer 16db1390acd1368b...
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.162711 I | rafthttp: closed the TCP streaming connection with peer 16db1390acd1368b (stre
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.162720 I | rafthttp: stopped streaming with peer 16db1390acd1368b (writer)
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164008 I | rafthttp: closed the TCP streaming connection with peer 16db1390acd1368b (stre
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164019 I | rafthttp: stopped streaming with peer 16db1390acd1368b (writer)
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164067 I | rafthttp: stopped HTTP pipelining with peer 16db1390acd1368b
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164163 W | rafthttp: lost the TCP streaming connection with peer 16db1390acd1368b (stream
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164179 E | rafthttp: failed to read 16db1390acd1368b on stream MsgApp v2 (net/http: reque
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164184 I | rafthttp: peer 16db1390acd1368b became inactive
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164191 I | rafthttp: stopped streaming with peer 16db1390acd1368b (stream MsgApp v2 reade
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164296 W | rafthttp: lost the TCP streaming connection with peer 16db1390acd1368b (stream
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164344 I | rafthttp: stopped streaming with peer 16db1390acd1368b (stream Message reader)
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164361 I | rafthttp: stopped peer 16db1390acd1368b
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.164373 I | rafthttp: stopping peer 939e401bfb987941...
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165024 I | rafthttp: closed the TCP streaming connection with peer 939e401bfb987941 (stre
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165046 I | rafthttp: stopped streaming with peer 939e401bfb987941 (writer)
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165380 I | rafthttp: closed the TCP streaming connection with peer 939e401bfb987941 (stre
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165396 I | rafthttp: stopped streaming with peer 939e401bfb987941 (writer)
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165417 I | rafthttp: stopped HTTP pipelining with peer 939e401bfb987941
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165496 W | rafthttp: lost the TCP streaming connection with peer 939e401bfb987941 (stream
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165508 E | rafthttp: failed to read 939e401bfb987941 on stream MsgApp v2 (net/http: reque
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165512 I | rafthttp: peer 939e401bfb987941 became inactive
Aug 25 05:49:02 qe-anlioloh-master-etcd-zone2-1.c.openshift-gce-devel.internal etcd_container[14941]: 2017-08-25 09:49:02.165520 I | rafthttp: stopped streaming with peer 939e401bfb987941 (stream MsgApp v2 reade

Comment 29 Anping Li 2017-08-25 12:09:49 UTC

Created attachment 1318151 [details]
Part of ansible logs

Will provide more detailed logs when I reproduce it.

Comment 30 Anping Li 2017-08-25 15:15:28 UTC

Created attachment 1318263 [details]
migrate log and inventory file

Comment 31 Scott Dodson 2017-08-25 20:36:58 UTC

Proposed fix: https://github.com/openshift/openshift-ansible/pull/5229. I need to test this some more, and then I'll backport it to release-3.6 and produce new builds tonight.

Comment 32 Scott Dodson 2017-08-28 02:00:08 UTC

openshift-ansible-3.6.173.0.19-2.git.0.eb719a4.el7 should fix that

Comment 33 Anping Li 2017-08-28 11:03:42 UTC

The etcd scaleup still failed on the second etcd member. I think we shouldn't add “quotation marks” for ETCD_INITIAL_CLUSTER and ETCD_DEBUG.

etcdctl3  member list
2017-08-28 07:01:48.242808 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
5e28099eb201, started, qe-anlimotr-master-etcd-zone1-1, https://10.240.0.7:2380, https://10.240.0.7:2379
aaa0eb2ebbef33de, unstarted, , https://10.240.0.8:2380, 
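
For triage, the member list above can be checked mechanically. The following is only a sketch: it assumes the comma-separated layout shown in this comment (ID, status, name, peer URLs, client URLs), and the helper name parse_member_list is made up for illustration.

```python
def parse_member_list(output):
    """Parse member-list text, assuming five comma-separated fields per member."""
    members = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 5:
            continue  # skips warnings and anything that is not a member row
        member_id, status, name, peer_urls, client_urls = fields
        members.append({
            "id": member_id,
            "status": status,
            "name": name,
            "peer_urls": peer_urls,
            "client_urls": client_urls,
        })
    return members


# Output copied from this comment.
OUTPUT = """\
2017-08-28 07:01:48.242808 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
5e28099eb201, started, qe-anlimotr-master-etcd-zone1-1, https://10.240.0.7:2380, https://10.240.0.7:2379
aaa0eb2ebbef33de, unstarted, , https://10.240.0.8:2380, 
"""

members = parse_member_list(OUTPUT)
unstarted = [m["peer_urls"] for m in members if m["status"] == "unstarted"]
print(unstarted)
```

The unstarted entry has an empty name and no client URL, so the member was added to the cluster but its etcd process never joined.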

ETCD_NAME=qe-anlimotr-master-etcd-zone2-1
ETCD_LISTEN_PEER_URLS=https://10.240.0.8:2380
ETCD_DATA_DIR=/var/lib/etcd/
ETCD_HEARTBEAT_INTERVAL=500
ETCD_ELECTION_TIMEOUT=2500
ETCD_LISTEN_CLIENT_URLS=https://10.240.0.8:2379


ETCD_INITIAL_ADVERTISE_PEER_URLS=https://10.240.0.8:2380
ETCD_INITIAL_CLUSTER="qe-anlimotr-master-etcd-zone1-1=https://10.240.0.7:2380,qe-anlimotr-master-etcd-zone2-1=https://10.240.0.8:2380"
ETCD_INITIAL_CLUSTER_STATE=existing
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
ETCD_ADVERTISE_CLIENT_URLS=https://10.240.0.8:2379


ETCD_CA_FILE=/etc/etcd/ca.crt
ETCD_CERT_FILE=/etc/etcd/server.crt
ETCD_KEY_FILE=/etc/etcd/server.key
ETCD_PEER_CA_FILE=/etc/etcd/ca.crt
ETCD_PEER_CERT_FILE=/etc/etcd/peer.crt
ETCD_PEER_KEY_FILE=/etc/etcd/peer.key
ETCD_DEBUG="False"
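
To illustrate why the quotation marks could matter, here is a minimal sketch. It assumes the file is consumed by a loader that passes values through verbatim, without shell-style quote removal. That is an assumption: this comment does not say which loader reads the file. The function names load_env_verbatim and split_initial_cluster are hypothetical.

```python
def load_env_verbatim(text):
    """Parse KEY=VALUE lines, keeping each value exactly as written."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key] = value  # hypothetical loader: quotes are NOT stripped
    return env


def split_initial_cluster(value):
    """Split a comma-separated list of name=peer-url pairs."""
    return [tuple(pair.split("=", 1)) for pair in value.split(",")]


# Lines copied from the config above.
CONF = """\
ETCD_NAME=qe-anlimotr-master-etcd-zone2-1
ETCD_INITIAL_CLUSTER="qe-anlimotr-master-etcd-zone1-1=https://10.240.0.7:2380,qe-anlimotr-master-etcd-zone2-1=https://10.240.0.8:2380"
ETCD_DEBUG="False"
"""

env = load_env_verbatim(CONF)
members = split_initial_cluster(env["ETCD_INITIAL_CLUSTER"])
first_name, _ = members[0]
_, last_url = members[-1]

print(repr(first_name))         # first member name keeps a leading quote
print(repr(last_url))           # last peer URL keeps a trailing quote
print(repr(env["ETCD_DEBUG"]))  # value includes the quotes
```

Under that assumption the first member name and the last peer URL each carry a stray quote character, and ETCD_DEBUG is not a bare boolean string.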

Comment 34 Anping Li 2017-08-28 11:04:29 UTC

Created attachment 1319057 [details]
The inventory file, migrate log, and etcd journal file

Comment 35 Scott Dodson 2017-08-28 12:25:13 UTC

(In reply to Anping Li from comment #33)
> The etcd scaleup still failed on the second etcd member. I think we
> shouldn't add “quotation marks” for ETCD_INITIAL_CLUSTER and ETCD_DEBUG.

Yeah, that's what my PR fixed. Can you confirm which version of openshift-ansible you used?

Comment 36 Anping Li 2017-08-29 08:43:56 UTC

Test results with openshift3/ose-ansible:v3.6.173.0.21:
1) Single RPM master/etcd                                      Pass
2) Single containerized master/etcd                            Pass
3) Clustered RPM master/etcd                                   Pass
4) Clustered containerized master/etcd                         Pass
5) Clustered Atomic master/etcd                                Pass
6) Single master with external clustered containerized etcd    Fail
7) Single master with external clustered RPM etcd              Ongoing

Note: master/etcd = master and etcd are located on the same host.

Comment 37 Anping Li 2017-08-29 08:47:06 UTC

6) Single master with external clustered containerized etcd    Fail

### Inventory file ###
[masters]
qe-anliayjy-master-1.0829-brg.qe.rhcloud.com ansible_user=root ansible_ssh_user=root openshift_public_hostname=qe-anliayjy-master-1.0829-brg.qe.rhcloud.com openshift_hostname=qe-anliayjy-master-1
[nodes]
qe-anliayjy-master-1.0829-brg.qe.rhcloud.com ansible_user=root ansible_ssh_user=root openshift_public_hostname=qe-anliayjy-master-1.0829-brg.qe.rhcloud.com openshift_hostname=qe-anliayjy-master-1 openshift_node_labels="{'role': 'node'}" openshift_schedulable=true
qe-anliayjy-node-registry-router-1.0829-brg.qe.rhcloud.com ansible_user=root ansible_ssh_user=root openshift_public_hostname=qe-anliayjy-node-registry-router-1.0829-brg.qe.rhcloud.com openshift_hostname=qe-anliayjy-node-registry-router-1 openshift_node_labels="{'role': 'node','registry': 'enabled','router': 'enabled'}"
[etcd]
qe-anliayjy-etcd-1.0829-brg.qe.rhcloud.com ansible_user=root ansible_ssh_user=root openshift_public_hostname=qe-anliayjy-etcd-1.0829-brg.qe.rhcloud.com openshift_hostname=qe-anliayjy-etcd-1
qe-anliayjy-etcd-2.0829-brg.qe.rhcloud.com ansible_user=root ansible_ssh_user=root openshift_public_hostname=qe-anliayjy-etcd-2.0829-brg.qe.rhcloud.com openshift_hostname=qe-anliayjy-etcd-2
qe-anliayjy-etcd-3.0829-brg.qe.rhcloud.com ansible_user=root ansible_ssh_user=root openshift_public_hostname=qe-anliayjy-etcd-3.0829-brg.qe.rhcloud.com openshift_hostname=qe-anliayjy-etcd-3



########## migrate log ##########
TASK [etcd_migrate : set_fact] *************************************************
ok: [qe-anliayjy-master-1.0829-brg.qe.rhcloud.com] => {
    "ansible_facts": {
        "accessTokenMaxAgeSeconds": "86400", 
        "authroizeTokenMaxAgeSeconds": "500", 
        "controllerLeaseTTL": "30"
    }, 
    "changed": false
}

TASK [etcd_migrate : Re-introduce leases (as a replacement for key TTLs)] ******
failed: [qe-anliayjy-master-1.0829-brg.qe.rhcloud.com] (item={u'keys': u'/kubernetes.io/events', u'ttl': u'1h'}) => {
    "changed": true, 
    "cmd": [
        "oadm", 
        "migrate", 
        "etcd-ttl", 
        "--cert", 
        "/etc/origin/master/master.etcd-client.crt", 
        "--key", 
        "/etc/origin/master/master.etcd-client.key", 
        "--cacert", 
        "/etc/origin/master/master.etcd-ca.crt", 
        "--etcd-address", 
        "https://10.240.0.41:2379", 
        "--ttl-keys-prefix", 
        "<built-in", 
        "method", 
        "keys", 
        "of", 
        "dict", 
        "object", 
        "at", 
        "0x3cab9d0>", 
        "--lease-duration", 
        "1h"
    ], 
    "delta": "0:00:00.177166", 
    "end": "2017-08-29 04:25:45.610543", 
    "failed": true, 
    "item": {
        "keys": "/kubernetes.io/events", 
        "ttl": "1h"
    }, 
    "rc": 1, 
    "start": "2017-08-29 04:25:45.433377", 
    "warnings": []
}

STDERR:

Error: unknown flag: --cert


Usage:
  oadm migrate [options]

Available Commands:
  image-references Update embedded Docker image references
  storage          Update the stored version of API objects

Use "oadm <command> --help" for more information about a given command.
Use "oadm options" for a list of global command-line options (applies to all commands).

failed: [qe-anliayjy-master-1.0829-brg.qe.rhcloud.com] (item={u'keys': u'/kubernetes.io/masterleases', u'ttl': u'10s'}) => {
    "changed": true, 
    "cmd": [
        "oadm", 
        "migrate", 
        "etcd-ttl", 
        "--cert", 
        "/etc/origin/master/master.etcd-client.crt", 
        "--key", 
        "/etc/origin/master/master.etcd-client.key", 
        "--cacert", 
        "/etc/origin/master/master.etcd-ca.crt", 
        "--etcd-address", 
        "https://10.240.0.41:2379", 
        "--ttl-keys-prefix", 
        "<built-in", 
        "method", 
        "keys", 
        "of", 
        "dict", 
        "object", 
        "at", 
        "0x3cb5fa0>", 
        "--lease-duration", 
        "10s"
    ], 
    "delta": "0:00:00.189811", 
    "end": "2017-08-29 04:25:46.859481", 
    "failed": true, 
    "item": {
        "keys": "/kubernetes.io/masterleases", 
        "ttl": "10s"
    }, 
    "rc": 1, 
    "start": "2017-08-29 04:25:46.669670", 
    "warnings": []
}

STDERR:

Error: unknown flag: --cert


Usage:
  oadm migrate [options]

Available Commands:
  image-references Update embedded Docker image references
  storage          Update the stored version of API objects

Use "oadm <command> --help" for more information about a given command.
Use "oadm options" for a list of global command-line options (applies to all commands).

failed: [qe-anliayjy-master-1.0829-brg.qe.rhcloud.com] (item={u'keys': u'/openshift.io/oauth/accesstokens', u'ttl': u'86400s'}) => {
    "changed": true, 
    "cmd": [
        "oadm", 
        "migrate", 
        "etcd-ttl", 
        "--cert", 
        "/etc/origin/master/master.etcd-client.crt", 
        "--key", 
        "/etc/origin/master/master.etcd-client.key", 
        "--cacert", 
        "/etc/origin/master/master.etcd-ca.crt", 
        "--etcd-address", 
        "https://10.240.0.41:2379", 
        "--ttl-keys-prefix", 
        "<built-in", 
        "method", 
        "keys", 
        "of", 
        "dict", 
        "object", 
        "at", 
        "0x391fd60>", 
        "--lease-duration", 
        "86400s"
    ], 
    "delta": "0:00:00.191488", 
    "end": "2017-08-29 04:25:48.115620", 
    "failed": true, 
    "item": {
        "keys": "/openshift.io/oauth/accesstokens", 
        "ttl": "86400s"
    }, 
    "rc": 1, 
    "start": "2017-08-29 04:25:47.924132", 
    "warnings": []
}

STDERR:

Error: unknown flag: --cert


Usage:
  oadm migrate [options]

Available Commands:
  image-references Update embedded Docker image references
  storage          Update the stored version of API objects

Use "oadm <command> --help" for more information about a given command.
Use "oadm options" for a list of global command-line options (applies to all commands).

failed: [qe-anliayjy-master-1.0829-brg.qe.rhcloud.com] (item={u'keys': u'/openshift.io/oauth/authorizetokens', u'ttl': u'500s'}) => {
    "changed": true, 
    "cmd": [
        "oadm", 
        "migrate", 
        "etcd-ttl", 
        "--cert", 
        "/etc/origin/master/master.etcd-client.crt", 
        "--key", 
        "/etc/origin/master/master.etcd-client.key", 
        "--cacert", 
        "/etc/origin/master/master.etcd-ca.crt", 
        "--etcd-address", 
        "https://10.240.0.41:2379", 
        "--ttl-keys-prefix", 
        "<built-in", 
        "method", 
        "keys", 
        "of", 
        "dict", 
        "object", 
        "at", 
        "0x3ee3af0>", 
        "--lease-duration", 
        "500s"
    ], 
    "delta": "0:00:00.166911", 
    "end": "2017-08-29 04:25:49.356639", 
    "failed": true, 
    "item": {
        "keys": "/openshift.io/oauth/authorizetokens", 
        "ttl": "500s"
    }, 
    "rc": 1, 
    "start": "2017-08-29 04:25:49.189728", 
    "warnings": []
}

STDERR:

Error: unknown flag: --cert


Usage:
  oadm migrate [options]

Available Commands:
  image-references Update embedded Docker image references
  storage          Update the stored version of API objects

Use "oadm <command> --help" for more information about a given command.
Use "oadm options" for a list of global command-line options (applies to all commands).

failed: [qe-anliayjy-master-1.0829-brg.qe.rhcloud.com] (item={u'keys': u'/openshift.io/leases/controllers', u'ttl': u'30s'}) => {
    "changed": true, 
    "cmd": [
        "oadm", 
        "migrate", 
        "etcd-ttl", 
        "--cert", 
        "/etc/origin/master/master.etcd-client.crt", 
        "--key", 
        "/etc/origin/master/master.etcd-client.key", 
        "--cacert", 
        "/etc/origin/master/master.etcd-ca.crt", 
        "--etcd-address", 
        "https://10.240.0.41:2379", 
        "--ttl-keys-prefix", 
        "<built-in", 
        "method", 
        "keys", 
        "of", 
        "dict", 
        "object", 
        "at", 
        "0x1df1c10>", 
        "--lease-duration", 
        "30s"
    ], 
    "delta": "0:00:00.177770", 
    "end": "2017-08-29 04:25:50.593262", 
    "failed": true, 
    "item": {
        "keys": "/openshift.io/leases/controllers", 
        "ttl": "30s"
    }, 
    "rc": 1, 
    "start": "2017-08-29 04:25:50.415492", 
    "warnings": []
}

STDERR:

Error: unknown flag: --cert


Usage:
  oadm migrate [options]

Available Commands:
  image-references Update embedded Docker image references
  storage          Update the stored version of API objects

Use "oadm <command> --help" for more information about a given command.
Use "oadm options" for a list of global command-line options (applies to all commands).
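
Separately from the "unknown flag: --cert" error, the --ttl-keys-prefix argument in each cmd list above is the text "<built-in method keys of dict object at 0x...>" rather than the item's "keys" value. That is consistent with a template referencing the item as item.keys: Jinja2 tries attribute lookup before item lookup, so the dict method is found first. The task source is not shown in this bug, so the exact expression is an assumption. The effect can be reproduced in plain Python:

```python
# Item copied from the first failed loop iteration in the log above.
item = {"keys": "/kubernetes.io/events", "ttl": "1h"}

# Attribute lookup finds the dict.keys method, not the stored value.
by_attribute = getattr(item, "keys")

# Subscript lookup returns the value stored under the "keys" key.
by_subscript = item["keys"]

# Rendering the method as text and splitting on whitespace yields the
# same eight tokens that appear after --ttl-keys-prefix in the log.
argv = str(by_attribute).split()

print(argv[:7])
print(by_subscript)
```

Under that assumption, subscript-style access (item['keys']) in the template would avoid the collision with the dict method.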

Comment 38 Scott Dodson 2017-08-29 14:22:59 UTC

Anping,

That looks like the oadm command wasn't updated prior to running the migration. Can you verify that openshift was upgraded to 3.6 prior to running the migration?
Can you gather the output of these commands?

which oadm oc openshift
oadm version
oc version
openshift version

Comment 39 Anping Li 2017-08-29 14:30:06 UTC

Scott, I will retest 6) and 7) and update this bug once finished.

6) Single master with external clustered containerized etcd
7) Single master with external clustered RPM etcd

Comment 40 Anping Li 2017-08-30 22:46:23 UTC

The migration works for 6) and 7).
6) Single master with external clustered containerized etcd
7) Single master with external clustered RPM etcd

Comment 42 errata-xmlrpc 2017-09-05 17:42:58 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2639
