Bug 1402771 - The database backed by etcd-3.x can't be used by etcd-2.x
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Upgrade
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assigned To: Scott Dodson
QA Contact: Anping Li
Depends On:
Blocks:
 
Reported: 2016-12-08 05:15 EST by Anping Li
Modified: 2017-08-16 15 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
A backup produced by etcd 3.x can only be loaded by etcd 3.x. A mismatch occurs in a containerized install when the version of the etcd RPM installed on the host differs from the version running inside the container. The backup playbooks now use the etcd from within the container, which ensures that the backup is made with a matching version of etcd.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-10 01:17:28 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker: Red Hat Product Errata
ID: RHEA-2017:1716
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Container Platform 3.6 RPM Release Advisory
Last Updated: 2017-08-10 05:02:50 EDT

Description Anping Li 2016-12-08 05:15:54 EST
Description of problem:
If the etcd data was backed up by etcd-3.x, as is done by upgrade_etcd.yml, rolling back to etcd-2.x using the backup data fails.

The error is similar to the one in the etcd backup issue:

"Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: panic: runtime error: makeslice: len out of range"

Version-Release number of selected component (if applicable):
openshift-ansible-playbooks-3.4.35-1.git.0.2e13650.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. install ocp v3.3 with standalone etcd 
2. upgrade etcd
ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/upgrade_etcd.yml
3. systemctl stop etcd_container
4. Copy the backup etcd data files
# ETCD_DIR=/var/lib/etcd/
# mv $ETCD_DIR /var/lib/etcd.orig
# cp -Rp /var/lib/origin/etcd-backup-<timestamp>/ $ETCD_DIR
# chcon -R --reference /var/lib/etcd.orig/ $ETCD_DIR
# chown -R etcd:etcd $ETCD_DIR

5. Modify /etc/systemd/system/etcd_container.service:
  5.1)  set the correct image version
  5.2)  add --force-new-cluster
  For example, change the first ExecStart line below to the second:
   ExecStart=/usr/bin/docker run --name etcd_container --rm -v /var/lib/etcd:/var/lib/etcd:z -v /etc/etcd:/etc/etcd:z --env-file=/etc/etcd/etcd.conf --net=host --entrypoint=/usr/bin/etcd registry.access.redhat.com/rhel7/etcd3:3.0.14
   ExecStart=/usr/bin/docker run --name etcd_container --rm -v /var/lib/etcd:/var/lib/etcd:z -v /etc/etcd:/etc/etcd:z --env-file=/etc/etcd/etcd.conf --net=host --entrypoint=/usr/bin/etcd registry.access.redhat.com/rhel7/etcd --force-new-cluster
6. systemctl daemon-reload ; systemctl start etcd_container
7. Check the etcd_container service status
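
The steps above end up loading data written by etcd 3.x with an etcd 2.x binary. Below is a minimal sketch of a check that could be run before the restore. It assumes that a data directory cannot be loaded by an etcd whose major version is lower than the one that wrote it; the only case this report confirms is 3.x data under 2.x. The function names etcd_major and can_restore are hypothetical, and the version strings are the ones that appear in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the major component of a version string such as "3.0.14".
etcd_major() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical guard: succeed only if an etcd of version $2 may load data
# written by an etcd of version $1.
# Assumption: data written by a newer major version cannot be read by an older one.
can_restore() {
    backup_major=$(etcd_major "$1")
    target_major=$(etcd_major "$2")
    [ "$target_major" -ge "$backup_major" ]
}

# The combination from this report: data from the 3.0.14 image, 2.3.7 binary.
if can_restore 3.0.14 2.3.7; then
    echo "restore allowed"
else
    echo "restore refused: backup was written by a newer etcd major version"
fi
```

A check like this only compares version strings. It does not inspect the WAL files themselves, so the versions still have to be supplied by whoever made the backup.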

Actual results:
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.148588 I | etcdmain: listening for client requests on https://192.168.1.172:2379
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150030 I | etcdserver: name = default
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150041 I | etcdserver: force new cluster
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150047 I | etcdserver: data dir = /var/lib/etcd/
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150056 I | etcdserver: member dir = /var/lib/etcd/member
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150063 I | etcdserver: heartbeat = 500ms
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150068 I | etcdserver: election = 2500ms
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150074 I | etcdserver: snapshot count = 10000
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150087 I | etcdserver: advertise client URLs = https://192.168.1.172:2379
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: panic: runtime error: makeslice: len out of range
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: goroutine 1 [running]:
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: panic(0xdbd840, 0xc8201dfa90)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /usr/lib/golang/src/runtime/panic.go:481 +0x3e6
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/wal.(*decoder).decode(0xc82021a990, 0xc820187e58, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/wal/decoder.go:55 +0x142
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/wal.(*WAL).ReadAll(0xc8201e2410, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/wal/wal.go:237 +0x214
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdserver.readWAL(0xc8202228c0, 0x18, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc8201e2410, 0xc820010200, 0x0, ...)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdserver/storage.go:87 +0x228
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdserver.restartAsStandaloneNode(0xc820075e00, 0x0, 0x7f654e095028, 0xc82015e1c0, 0x0, 0x0, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdserver/raft.go:371 +0x11a
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdserver.NewServer(0xc820075e00, 0xc820075eb8, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdserver/server.go:335 +0x3ca6
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdmain.startEtcd(0xc820077400, 0x0, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdmain/etcd.go:302 +0x1b40
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdmain.Main()
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdmain/etcd.go:118 +0x2142
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: main.main()
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/main.go:37 +0xe3
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: etcd_container.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25679]: Failed to stop container (etcd_container): Error response from daemon: No such container: etcd_container
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: etcd_container.service: control process exited, code=exited status=1
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: Unit etcd_container.service entered failed state.
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: etcd_container.service failed.
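
The source paths in the stack trace above show which etcd build was running when the WAL decoder panicked. Below is a small sketch for pulling that information out of the journal. It assumes the trace keeps the /builddir/build/BUILD/etcd-<version>/ path layout seen here, and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read journal lines on stdin and print each distinct
# etcd build version found in the stack-trace source paths.
etcd_build_from_trace() {
    grep -o 'BUILD/etcd-[0-9][0-9.]*' | sed 's|^BUILD/etcd-||' | sort -u
}

# Hypothetical helper: succeed if stdin contains the panic message from this report.
has_wal_panic() {
    grep -q 'makeslice: len out of range'
}

# Possible usage on the affected host (not run here):
#   journalctl -u etcd_container --no-pager | etcd_build_from_trace
#   journalctl -u etcd_container --no-pager | has_wal_panic && echo "WAL panic found"
```

Seeing a 2.x build version next to this panic, on data taken after an etcd upgrade, points at the version mismatch described in this bug rather than at disk corruption, though it does not prove it.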

Expected results:


Additional info:
Comment 8 Scott Dodson 2017-06-09 00:04:40 EDT
This should be fixed in the current playbooks because we now perform the backup from within the container. Please confirm.
Comment 9 Anping Li 2017-06-09 07:26:39 EDT
The database is now backed up by etcdctl inside the container. The issue does not occur in v3.6.
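
Taking the backup with the etcdctl inside the container, as the two comments above describe, can be sketched roughly as follows. This is not the playbook's actual task. The container name and data directory are taken from this report, the backup directory is a hypothetical path that would have to sit on a volume mounted into the container, and the sketch assumes the etcdctl v2 `backup` subcommand is available in the image.

```shell
#!/bin/sh
# Values from this report.
CONTAINER=etcd_container
DATA_DIR=/var/lib/etcd
# Hypothetical path; it must be on a volume that is mounted into the container.
BACKUP_DIR=/var/lib/etcd/backup-example

# Print the backup command instead of running it, so it can be reviewed first.
backup_cmd() {
    printf 'docker exec %s etcdctl backup --data-dir %s --backup-dir %s\n' \
        "$CONTAINER" "$DATA_DIR" "$BACKUP_DIR"
}

backup_cmd
# To execute it on a host where the container is running:
#   backup_cmd | sh
```

Because the command runs through docker exec, the etcdctl that writes the backup is the one shipped in the same image as the running etcd, which is the property the fix relies on.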
Comment 11 errata-xmlrpc 2017-08-10 01:17:28 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1716
