| Summary: | Unexpected object conversion after migration to etcd3 storage | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> | ||||
| Component: | Node | Assignee: | Timothy St. Clair <tstclair> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | Mike Fiedler <mifiedle> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 3.4.0 | CC: | aos-bugs, jokerman, mifiedle, mmccomas, tstclair | ||||
| Target Milestone: | --- | Keywords: | UpcomingRelease | ||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-10-31 21:08:00 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
Q1: Before 'etcdctl migrate', were you seeing a large number of conversion traces?

C1: The "failed to handle multiple devices for container. Skipping Filesystem stats" message is orthogonal to the api-conversion data.

C2: There are Storage errors on leases, which was expected because we used the default migrator.

So, in chatting with Clayton, there is indeed an implicit 'v2mode:json -> v3mode:protobuf' conversion. We will simply need to document that the cluster should be brought back online slowly to allow for the data conversion. We will also need to be certain that TTL keys are wiped on the migration. xref: https://github.com/coreos/etcd/issues/6767
Created attachment 1215015
master log - search for "About to convert" and "failed to handle"

Description of problem:

After migrating cluster storage from etcd v2 to etcd v3, there are a large number of object conversion and conversion failure messages in the logs when the masters are restarted. Opening this BZ for confirmation that conversion is expected when the only change is to etcd storage, i.e. no format/schema/version change in the underlying OCP data.

Example (see attached log):

    Oct 27 20:23:02 192 atomic-openshift-node: I1027 20:23:02.339957 2922 conversion.go:133] failed to handle multiple devices for container. Skipping Filesystem stats
    Oct 27 20:23:02 192 atomic-openshift-node: I1027 20:23:02.340009 2922 conversion.go:133] failed to handle multiple devices for container. Skipping Filesystem stats
    Oct 27 20:23:02 192 atomic-openshift-master-api: I1027 20:23:02.485103 120864 trace.go:61] Trace "Update /api/v1/nodes/192.1.18.20/status" (started 2016-10-27 20:22:55.868264451 -0400 EDT):
    Oct 27 20:23:02 192 atomic-openshift-master-api: [57.021µs] [57.021µs] About to convert to expected version
    Oct 27 20:23:02 192 atomic-openshift-master-api: [174.003µs] [116.982µs] Conversion done
    Oct 27 20:23:02 192 atomic-openshift-master-api: [180.27µs] [6.267µs] About to store object in database
    Oct 27 20:23:02 192 atomic-openshift-master-api: [6.616439956s] [6.616259686s] Object stored in database
    Oct 27 20:23:02 192 atomic-openshift-master-api: [6.616456216s] [16.26µs] Self-link added
    Oct 27 20:23:02 192 atomic-openshift-master-api: [6.616754191s] [297.975µs] END
    Oct 27 20:23:02 192 atomic-openshift-master-api: I1027 20:23:02.485155 120864 trace.go:61] Trace "Update /api/v1/nodes/192.1.18.218/status" (started 2016-10-27 20:22:55.813897248 -0400 EDT):
    Oct 27 20:23:02 192 atomic-openshift-master-api: [113.948µs] [113.948µs] About to convert to expected version
    Oct 27 20:23:02 192 atomic-openshift-master-api: [306.242µs] [192.294µs] Conversion done
    Oct 27 20:23:02 192 atomic-openshift-master-api: [315.259µs] [9.017µs] About to store object in database
    Oct 27 20:23:02 192 atomic-openshift-master-api: [6.670909865s] [6.670594606s] Object stored in database
    Oct 27 20:23:02 192 atomic-openshift-master-api: [6.670922331s] [12.466µs] Self-link added
    Oct 27 20:23:02 192 atomic-openshift-master-api: [6.671170995s] [248.664µs] END

Version-Release number of selected component (if applicable):

3.4.0.16 and etcd 3.0.12-3

How reproducible:

Always, on the first restart after etcd data migration to v3.

Steps to Reproduce:

1. Install an HA cluster (3 masters, 3 etcd) with OCP 3.4.0.16 + etcd 2.3.7
2. Create projects with running deployments
3. Shut down masters and etcd. Leave OpenShift nodes running.
4. On each etcd: yum swap etcd3 etcd to install etcd3 3.0.12-3
5. On each etcd: etcdctl migrate --data-dir /var/lib/etcd
6. Start etcd on each
7. Start OpenShift masters

Actual results:

OpenShift master logs have many messages (see above) for object conversion and conversion failures.

Expected results:

Unknown. It is possibly unexpected that conversions are taking place when the objects are unchanged in terms of version or schema. Looking for confirmation on correct behavior.
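When reading the attached master log, it may help to see which step of each trace dominates. In the two traces quoted in the example, the conversion step takes roughly 117µs and 192µs, while "Object stored in database" takes about 6.6s. The following is a minimal Python sketch for pulling that out of a log. The function name `slowest_steps` and the regular expressions are illustrative assumptions based only on the line format shown above, not part of any OpenShift tooling. It assumes traces are not interleaved and that each step duration is a single number followed by ns, µs, ms, or s.

```python
import re

# Header of a trace, e.g.:  Trace "Update /api/v1/nodes/<ip>/status" (started ...):
TRACE_HEADER = re.compile(r'Trace "(?P<name>[^"]+)" \(started')

# One trace step, e.g.:  [174.003µs] [116.982µs] Conversion done
# The first bracket is elapsed time since the trace started, the second is the
# time spent in this step. Both U+00B5 and U+03BC are accepted for "micro".
_DURATION = r'[\d.]+(?:ns|[\u00b5\u03bc]s|ms|s)'
TRACE_STEP = re.compile(
    r'\[' + _DURATION + r'\] '
    r'\[(?P<value>[\d.]+)(?P<unit>ns|[\u00b5\u03bc]s|ms|s)\] '
    r'(?P<msg>.+?)\s*$'
)

UNIT_SECONDS = {"ns": 1e-9, "\u00b5s": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(value, unit):
    """Convert a simple duration such as ('116.982', 'µs') to seconds."""
    unit = unit.replace("\u03bc", "\u00b5")  # normalise Greek mu to micro sign
    return float(value) * UNIT_SECONDS[unit]


def slowest_steps(lines):
    """Map each trace name to (step message, step seconds) for its slowest step.

    Assumes the steps of a trace directly follow its header line.
    """
    result = {}
    current = None
    for line in lines:
        header = TRACE_HEADER.search(line)
        if header:
            current = header.group("name")
            continue
        step = TRACE_STEP.search(line)
        if step and current is not None:
            seconds = to_seconds(step.group("value"), step.group("unit"))
            if current not in result or seconds > result[current][1]:
                result[current] = (step.group("msg"), seconds)
    return result
```

`slowest_steps` accepts any iterable of lines, so an open file object for the saved master log can be passed directly. If the store step dominates across most traces, as it does in the excerpt, that would be consistent with the cost being in the write to etcd rather than in the API version conversion itself.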