1815637 – etcd: mvcc/backend: Fix corruption bug in defrag

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Etcd
Sub Component:
Version:	4.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.4.z
Assignee:	Sam Batschelet
QA Contact:	ge liu
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	1815638 1822832
Depends On:	1855897
Blocks:	1815638

Reported:	2020-03-20 19:09 UTC by Sam Batschelet
Modified:	2020-08-21 16:08 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1815638 1855896
Environment:
Last Closed:	2020-07-21 10:31:05 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift etcd pull 47	0	None	closed	Bug 1815637: bump etcd v3.3.22	2020-09-10 09:05:14 UTC
Red Hat Product Errata	RHBA-2020:2913	0	None	None	None	2020-07-21 10:31:46 UTC

Description Sam Batschelet 2020-03-20 19:09:38 UTC

Description of problem:

If etcd is terminated during a defrag operation, the db.tmp file that it creates can be orphaned. When that happens, the next defragmentation opens the orphaned db.tmp instead of creating a new, empty db.tmp and starting from a clean slate, as it should.

Once the defragmentation operation opens db.tmp, it traverses all key-values in the main db file and writes them to db.tmp. Any key-values already present in the orphaned db.tmp that are not overwritten by this copy remain in it, corrupting the boltdb keyspace. When the defragmentation completes successfully, db.tmp replaces db via a file move, and the main db file is now corrupt.
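
The effect of copying into a non-empty destination can be shown with a small Go sketch. Plain maps stand in for the boltdb files here, which is a simplification assumed for this example, and copyInto is a hypothetical name rather than anything in etcd.

```go
package main

// copyInto mimics the defrag copy step: every key in src is written to dst.
// Keys already in dst that src does not contain are left untouched, which
// is how stale entries survive when dst is not empty to begin with.
func copyInto(dst, src map[string]string) {
	for k, v := range src {
		dst[k] = v
	}
}
```

If dst starts out holding a key that was since deleted from src, that key is still present after the copy, so the result is not a faithful copy of src.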

Expected results: etcd state is not corrupted by defrag.

Comment 3 Sam Batschelet 2020-04-10 03:37:36 UTC

*** Bug 1822832 has been marked as a duplicate of this bug. ***

Comment 9 Dan Mace 2020-06-18 17:41:36 UTC

Fixed upstream in v3.3.19 (commit b0a4038).

Comment 12 Michal Fojtik 2020-06-19 13:08:59 UTC

This is not a 4.5.0 blocker; moving the target to 4.6.0, and we will backport to 4.4.0.

Comment 18 ge liu 2020-07-17 08:49:32 UTC

Executed the regression tests and did not hit any issues.

Comment 20 errata-xmlrpc 2020-07-21 10:31:05 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2913

Comment 21 Sam Batschelet 2020-08-21 16:08:10 UTC

*** Bug 1815638 has been marked as a duplicate of this bug. ***