Bug 1997376

Summary: Etcd goes to degraded when downgrading to 4.8 from 4.9 because etcd downgrade to 3.4.14 from 3.5 is invalid
Product: OpenShift Container Platform
Reporter: Yang Yang <yanyang>
Component: Etcd
Assignee: Sam Batschelet <sbatsche>
Status: CLOSED DUPLICATE
QA Contact: ge liu <geliu>
Severity: medium
Priority: unspecified
Version: 4.9
CC: sbatsche
Hardware: Unspecified
OS: Unspecified
Last Closed: 2021-09-21 19:57:59 UTC
Type: Bug

Description Yang Yang 2021-08-25 05:55:42 UTC
Description of problem:

When updating a cluster along the path 4.8.6 -> 4.9.0-0.nightly-2021-08-20-115908 -> 4.8.6, etcd fails to downgrade and the etcd cluster operator reports Degraded:
# oc get co | grep etcd
etcd                       4.9.0-0.nightly-2021-08-20-115908   True        True          True       25h     EtcdMembersDegraded: 2 of 3 members are available, ip-10-0-214-109.us-east-2.compute.internal is unhealthy
StaticPodsDegraded: pod/etcd-ip-10-0-214-109.us-east-2.compute.internal container "etcd" is terminated: Error: 8-24T07:56:45.418Z","caller":"etcdserver/server.go:469","msg":"recovered v3 backend from snapshot","backend-size-bytes":94380032,"backend-size":"94 MB","backend-size-in-use-bytes":83808256,"backend-size-in-use":"84 MB"}
StaticPodsDegraded: {"level":"info","ts":"2021-08-24T07:56:46.375Z","caller":"etcdserver/raft.go:536","msg":"restarting local member","cluster-id":"36742a9a7bf6628e","local-member-id":"5c920c9b2c47ac62","commit-index":695531}
StaticPodsDegraded: {"level":"fatal","ts":"2021-08-24T07:56:46.379Z","caller":"membership/cluster.go:790","msg":"invalid downgrade; server version is lower than determined cluster version","current-server-version":"3.4.14","determined-cluster-version":"3.5","stacktrace":"go.etcd.io/etcd/etcdserver/api/membership.mustDetectDowngrade\n\t/go/src/go.etcd.io/etcd/etcdserver/api/membership/cluster.go:790\ngo.etcd.io/etcd/etcdserver/api/membership.(*RaftCluster).Recover\n\t/go/src/go.etcd.io/etcd/etcdserver/api/membership/cluster.go:251\ngo.etcd.io/etcd/etcdserver.NewServer\n\t/go/src/go.etcd.io/etcd/etcdserver/server.go:487\ngo.etcd.io/etcd/embed.StartEtcd\n\t/go/src/go.etcd.io/etcd/embed/etcd.go:223\ngo.etcd.io/etcd/etcdmain.startEtcd\n\t/go/src/go.etcd.io/etcd/etcdmai
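
The fatal entry above says the member refused to start because its own version (3.4.14) is lower than the version the cluster had already settled on (3.5). A minimal sketch of that comparison follows, assuming it is made on major.minor only. The helper names `major_minor` and `is_invalid_downgrade` are hypothetical, and this is an illustration of the condition named in the log message, not a copy of etcd's implementation.

```python
import json
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from strings such as '3.4.14' or '3.5'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_invalid_downgrade(server_version: str, cluster_version: str) -> bool:
    """True when the server's major.minor is lower than the cluster's.

    Assumption: patch versions are ignored. Simplified illustration only.
    """
    return major_minor(server_version) < major_minor(cluster_version)


# Fields copied from the fatal log entry in this report (stacktrace omitted).
log_line = (
    '{"level":"fatal","msg":"invalid downgrade; server version is lower than '
    'determined cluster version","current-server-version":"3.4.14",'
    '"determined-cluster-version":"3.5"}'
)
entry = json.loads(log_line)
invalid = is_invalid_downgrade(
    entry["current-server-version"], entry["determined-cluster-version"]
)
print(invalid)
```

Under that assumption, a 3.4.x member joining a cluster whose determined version is 3.5 is rejected, which matches the terminated etcd container reported here.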


Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-20-115908

How reproducible:
1/1

Steps to Reproduce:
1. Install a cluster with 4.8.6
2. Upgrade to 4.9.0-0.nightly-2021-08-20-115908
3. Downgrade to 4.8.6


Actual results:
pod/etcd-ip-10-0-214-109.us-east-2.compute.internal container "etcd" is terminated. The etcd downgrade from 3.5 to 3.4.14 is rejected as invalid.


Expected results:
Downgrade is successful


Additional info:

Comment 1 Sam Batschelet 2021-09-21 19:57:59 UTC
This is expected. Marking as a duplicate of bug 1997347.

*** This bug has been marked as a duplicate of bug 1997347 ***

Comment 2 Yang Yang 2021-09-24 06:15:55 UTC
Sam,

If I update a cluster from 4.8 to 4.9 and then roll back to 4.8, this issue still happens. How can I get the downgrade to work? Do we need to manually restore etcd before the cluster downgrade starts? Thanks.

Comment 3 Sam Batschelet 2021-09-24 15:33:51 UTC
You cannot roll back; it will fail. If you attempted a rollback and it failed, I would try to manually restore using a backup taken prior to the upgrade. This is covered in https://bugzilla.redhat.com/show_bug.cgi?id=1997347.
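
The advice above depends on having an etcd backup from before the upgrade. As a rough sketch, the command for taking one could be assembled as shown below. This assumes the cluster ships the standard OpenShift backup script at /usr/local/bin/cluster-backup.sh and that it is run on a control plane node through `oc debug`. The node name `master-0`, the helper `backup_command`, and the default backup directory are illustrative assumptions and are not taken from this bug. The sketch only prints the command; it does not run it.

```python
import shlex
from typing import List


def backup_command(node: str, backup_dir: str = "/home/core/assets/backup") -> List[str]:
    """Build the argv for an etcd backup on one control plane node.

    Assumption: cluster-backup.sh is present on the node at this path.
    """
    return [
        "oc", "debug", f"node/{node}", "--",
        "chroot", "/host",
        "/usr/local/bin/cluster-backup.sh", backup_dir,
    ]


# "master-0" is a hypothetical node name used for illustration.
cmd = backup_command("master-0")
print(shlex.join(cmd))
```

The backup has to be taken while the cluster is still on 4.8 and healthy; a backup taken after the 4.9 upgrade would already contain data at cluster version 3.5.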