Description of problem:
If the etcd data was backed up by etcd 3.x (as done by upgrade_etcd.yml), rolling back to etcd 2.x using the backup data fails. The error is similar to the one seen in the etcd backup issue:
"Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: panic: runtime error: makeslice: len out of range"

Version-Release number of selected component (if applicable):
openshift-ansible-playbooks-3.4.35-1.git.0.2e13650.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install OCP v3.3 with standalone etcd.
2. Upgrade etcd:
   ansible-playbook -i hosts /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/upgrades/upgrade_etcd.yml
3. Stop the service:
   systemctl stop etcd_container
4. Copy the backup etcd data files:
   # ETCD_DIR=/var/lib/etcd/
   # mv $ETCD_DIR /var/lib/etcd.orig
   # cp -Rp /var/lib/origin/etcd-backup-<timestamp>/ $ETCD_DIR
   # chcon -R --reference /var/lib/etcd.orig/ $ETCD_DIR
   # chown -R etcd:etcd $ETCD_DIR
5. Modify /etc/systemd/system/etcd_container.service:
   5.1) set the correct image version
   5.2) add --force-new-cluster
   For example:
   ExecStart=/usr/bin/docker run --name etcd_container --rm -v /var/lib/etcd:/var/lib/etcd:z -v /etc/etcd:/etc/etcd:z --env-file=/etc/etcd/etcd.conf --net=host --entrypoint=/usr/bin/etcd registry.access.redhat.com/rhel7/etcd3:3.0.14
   ExecStart=/usr/bin/docker run --name etcd_container --rm -v /var/lib/etcd:/var/lib/etcd:z -v /etc/etcd:/etc/etcd:z --env-file=/etc/etcd/etcd.conf --net=host --entrypoint=/usr/bin/etcd registry.access.redhat.com/rhel7/etcd --force-new-cluster
6. Reload systemd and start the service:
   systemctl daemon-reload ; systemctl start etcd_container
7. Check the etcd_container service status.

Actual results:
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.148588 I | etcdmain: listening for client requests on https://192.168.1.172:2379
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150030 I | etcdserver: name = default
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150041 I | etcdserver: force new cluster
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150047 I | etcdserver: data dir = /var/lib/etcd/
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150056 I | etcdserver: member dir = /var/lib/etcd/member
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150063 I | etcdserver: heartbeat = 500ms
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150068 I | etcdserver: election = 2500ms
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150074 I | etcdserver: snapshot count = 10000
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: 2016-12-08 09:27:39.150087 I | etcdserver: advertise client URLs = https://192.168.1.172:2379
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: panic: runtime error: makeslice: len out of range
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: goroutine 1 [running]:
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: panic(0xdbd840, 0xc8201dfa90)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /usr/lib/golang/src/runtime/panic.go:481 +0x3e6
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/wal.(*decoder).decode(0xc82021a990, 0xc820187e58, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/wal/decoder.go:55 +0x142
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/wal.(*WAL).ReadAll(0xc8201e2410, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/wal/wal.go:237 +0x214
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdserver.readWAL(0xc8202228c0, 0x18, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc8201e2410, 0xc820010200, 0x0, ...)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdserver/storage.go:87 +0x228
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdserver.restartAsStandaloneNode(0xc820075e00, 0x0, 0x7f654e095028, 0xc82015e1c0, 0x0, 0x0, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdserver/raft.go:371 +0x11a
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdserver.NewServer(0xc820075e00, 0xc820075eb8, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdserver/server.go:335 +0x3ca6
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdmain.startEtcd(0xc820077400, 0x0, 0x0, 0x0)
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdmain/etcd.go:302 +0x1b40
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: github.com/coreos/etcd/etcdmain.Main()
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/etcdmain/etcd.go:118 +0x2142
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: main.main()
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25602]: /builddir/build/BUILD/etcd-2.3.7/src/github.com/coreos/etcd/main.go:37 +0xe3
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: etcd_container.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 08 04:27:39 ha1-ose-1-4.novalocal etcd_container[25679]: Failed to stop container (etcd_container): Error response from daemon: No such container: etcd_container
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: etcd_container.service: control process exited, code=exited status=1
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: Unit etcd_container.service entered failed state.
Dec 08 04:27:39 ha1-ose-1-4.novalocal systemd[1]: etcd_container.service failed.

Expected results:

Additional info:
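For step 7, a minimal sketch of how the service log could be checked for the WAL decoder panic quoted in this report. The function name has_wal_panic is hypothetical, and the usage line assumes the journal for the etcd_container unit is readable on the host.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the log text on stdin contains
# the panic message quoted in this report, and fails (exit 1) otherwise.
has_wal_panic() {
    grep -q 'panic: runtime error: makeslice: len out of range'
}

# Usage (assumes systemd journal access on the etcd host):
#   journalctl -u etcd_container --no-pager | has_wal_panic \
#       && echo "WAL decode panic found in etcd_container log"
```

The check only matches the exact message from this report; other startup failures would need their own patterns.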
This should be fixed in the current playbooks because the backup is now performed from within the container. Please confirm.
The database was backed up by etcdctl inside the container. The issue does not occur in v3.6.
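As an illustration of a backup taken from within the container, here is a rough sketch that only builds the command line, so it can be reviewed before anything is run. It is not the playbook's actual task. The helper name etcd_backup_cmd and the backup directory are assumptions, as is etcdctl being on the PATH inside the image; the container name and data directory are taken from the unit file in this report.

```shell
#!/bin/sh
# Sketch only: print the command for an etcd v2 backup run inside the container.
# Assumptions (not from the report): the helper name, the backup directory,
# and that etcdctl is available on the PATH inside the image.
etcd_backup_cmd() {
    container="$1"   # e.g. etcd_container (the --name used in the unit file)
    data_dir="$2"    # e.g. /var/lib/etcd
    backup_dir="$3"  # hypothetical; must be on a volume mounted into the container
    printf 'docker exec %s etcdctl backup --data-dir %s --backup-dir %s\n' \
        "$container" "$data_dir" "$backup_dir"
}

# Print the command for review; pipe the output to sh to actually run it.
etcd_backup_cmd etcd_container /var/lib/etcd /var/lib/etcd/backup-example
```

Running etcdctl from the same image as the etcd server means the backup is written by the same etcd version that later has to read it, which is the mismatch behind the panic above.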
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716