1997376 – Etcd goes to degraded when downgrading to 4.8 from 4.9 because etcd downgrade to 3.4.14 from 3.5 is invalid


Keywords:
Status:	CLOSED DUPLICATE of bug 1997347
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Etcd
Sub Component:
Version:	4.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Sam Batschelet
QA Contact:	ge liu
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-08-25 05:55 UTC by Yang Yang
Modified:	2022-06-01 07:53 UTC
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-09-21 19:57:59 UTC
Target Upstream Version:
Embargoed:


Description Yang Yang 2021-08-25 05:55:42 UTC

Description of problem:

When updating a cluster along the path 4.8.6 -> 4.9.0-0.nightly-2021-08-20-115908 -> 4.8.6, etcd fails to downgrade.
# oc get co | grep etcd
etcd                       4.9.0-0.nightly-2021-08-20-115908   True        True          True       25h     EtcdMembersDegraded: 2 of 3 members are available, ip-10-0-214-109.us-east-2.compute.internal is unhealthy
StaticPodsDegraded: pod/etcd-ip-10-0-214-109.us-east-2.compute.internal container "etcd" is terminated: Error: 8-24T07:56:45.418Z","caller":"etcdserver/server.go:469","msg":"recovered v3 backend from snapshot","backend-size-bytes":94380032,"backend-size":"94 MB","backend-size-in-use-bytes":83808256,"backend-size-in-use":"84 MB"}
StaticPodsDegraded: {"level":"info","ts":"2021-08-24T07:56:46.375Z","caller":"etcdserver/raft.go:536","msg":"restarting local member","cluster-id":"36742a9a7bf6628e","local-member-id":"5c920c9b2c47ac62","commit-index":695531}
StaticPodsDegraded: {"level":"fatal","ts":"2021-08-24T07:56:46.379Z","caller":"membership/cluster.go:790","msg":"invalid downgrade; server version is lower than determined cluster version","current-server-version":"3.4.14","determined-cluster-version":"3.5","stacktrace":"go.etcd.io/etcd/etcdserver/api/membership.mustDetectDowngrade\n\t/go/src/go.etcd.io/etcd/etcdserver/api/membership/cluster.go:790\ngo.etcd.io/etcd/etcdserver/api/membership.(*RaftCluster).Recover\n\t/go/src/go.etcd.io/etcd/etcdserver/api/membership/cluster.go:251\ngo.etcd.io/etcd/etcdserver.NewServer\n\t/go/src/go.etcd.io/etcd/etcdserver/server.go:487\ngo.etcd.io/etcd/embed.StartEtcd\n\t/go/src/go.etcd.io/etcd/embed/etcd.go:223\ngo.etcd.io/etcd/etcdmain.startEtcd\n\t/go/src/go.etcd.io/etcd/etcdmai
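
For triage, the two versions involved can be pulled out of the fatal entry above. The following is a minimal sketch, not part of any OpenShift or etcd tooling: the function name `find_invalid_downgrade` and the `SAMPLE` constant are hypothetical, and a regular expression is used instead of a JSON parser because the status message truncates the log line.

```python
import re

# Matches the fatal entry and captures the two version fields that follow it.
# Works on truncated lines because it does not require valid JSON.
INVALID_DOWNGRADE = re.compile(
    r'"msg":"invalid downgrade; server version is lower than determined cluster version"'
    r'.*?"current-server-version":"([^"]+)"'
    r'.*?"determined-cluster-version":"([^"]+)"'
)

# Shortened copy of the fatal line from this report (stack trace cut off).
SAMPLE = (
    'StaticPodsDegraded: {"level":"fatal","ts":"2021-08-24T07:56:46.379Z",'
    '"caller":"membership/cluster.go:790",'
    '"msg":"invalid downgrade; server version is lower than determined cluster version",'
    '"current-server-version":"3.4.14","determined-cluster-version":"3.5",'
    '"stacktrace":"go.etcd.io/etcd/etcdserver/api/membership.mustDetectDowngrade'
)


def find_invalid_downgrade(text):
    """Return (server_version, cluster_version) if the fatal entry is present, else None."""
    match = INVALID_DOWNGRADE.search(text)
    return match.groups() if match else None


if __name__ == "__main__":
    print(find_invalid_downgrade(SAMPLE))
```

Feeding it the output of `oc get co` or a saved pod log is enough to confirm that a degraded etcd operator is hitting this specific check rather than some other failure.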


Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-20-115908

How reproducible:
1/1

Steps to Reproduce:
1. Install a cluster with 4.8.6
2. Upgrade to 4.9.0-0.nightly-2021-08-20-115908
3. Downgrade to 4.8.6


Actual results:
The "etcd" container in pod/etcd-ip-10-0-214-109.us-east-2.compute.internal is terminated because the etcd downgrade from 3.5 to 3.4.14 is rejected as invalid.


Expected results:
Downgrade is successful


Additional info:

Comment 1 Sam Batschelet 2021-09-21 19:57:59 UTC

This is expected. Marking as a duplicate of bug 1997347.

*** This bug has been marked as a duplicate of bug 1997347 ***

Comment 2 Yang Yang 2021-09-24 06:15:55 UTC

Sam,

If I update a cluster from 4.8 to 4.9 and then roll back to 4.8, this issue still happens. How can I get the downgrade to work? Do we need to manually restore etcd before the cluster downgrade starts? Thanks.

Comment 3 Sam Batschelet 2021-09-24 15:33:51 UTC

You cannot roll back; it will fail. If you attempted a rollback and it failed, I would try to manually restore using a backup taken prior to the upgrade. This is covered in https://bugzilla.redhat.com/show_bug.cgi?id=1997347.
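
The reason the rollback cannot succeed is visible in the fatal log entry in the description: the 3.4.14 server refuses to start because its version is lower than the determined cluster version, 3.5. A minimal sketch of that comparison follows. It is an illustration inferred from the log message, not etcd's actual code, and the helper names `major_minor` and `is_invalid_downgrade` are hypothetical. It assumes only the major.minor part of the server version is compared against the cluster version.

```python
def major_minor(version):
    """Reduce a version string such as "3.4.14" or "3.5" to a (major, minor) tuple."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_invalid_downgrade(server_version, cluster_version):
    """True when the server's major.minor is lower than the cluster version.

    Assumption: patch levels are ignored, so only a minor (or major)
    step backwards is treated as an invalid downgrade.
    """
    return major_minor(server_version) < major_minor(cluster_version)


if __name__ == "__main__":
    # Versions taken from the fatal log entry in this report.
    print(is_invalid_downgrade("3.4.14", "3.5"))
```

Under this assumption, a member that has already joined a 3.5 cluster cannot be restarted on 3.4.14, which is why restoring from a backup taken before the upgrade is the suggested path.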
