Description of problem:
If etcd is terminated during a defragmentation operation, the db.tmp file that it creates can be orphaned. The next defragmentation operation then opens the orphaned db.tmp instead of creating an empty db.tmp file and starting from a clean slate, as it should. Once the defragmentation operation has opened db.tmp, it traverses all key-values in the main db file and writes them to db.tmp. Any key-values already in db.tmp that are not overwritten by this copy remain in it, corrupting the boltdb keyspace. When the defragmentation operation completes successfully, db.tmp replaces db via a file move, and the main db file is now corrupt.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:
etcd state is not corrupted by defrag.

Additional info:
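The failure mode in the description can be illustrated with a small, self-contained Go sketch. It is not etcd or boltdb code and not the upstream patch: the `store` type (a JSON-encoded map standing in for a boltdb file) and the `load`, `save`, `defrag`, and `run` helpers are all hypothetical names invented for this illustration. It only shows why reusing an existing db.tmp leaks stale keys into the new db, and how removing the leftover file first avoids that.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// store is a toy stand-in for a boltdb file: a JSON-encoded key-value map.
// (Assumption for illustration only; real etcd uses boltdb buckets.)
type store map[string]string

// load reads a store from path; a missing file yields an empty store.
func load(path string) (store, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return store{}, nil
	}
	if err != nil {
		return nil, err
	}
	s := store{}
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return s, nil
}

// save writes the store to path, replacing any previous content.
func save(path string, s store) error {
	data, err := json.Marshal(s)
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o600)
}

// defrag copies every key from dbPath into dbPath+".tmp" and then moves the
// temp file over the db. With removeStale=false it opens whatever temp file
// already exists (the behaviour described in this bug); with removeStale=true
// it deletes any leftover temp file first.
func defrag(dbPath string, removeStale bool) error {
	tmpPath := dbPath + ".tmp"
	if removeStale {
		if err := os.Remove(tmpPath); err != nil && !errors.Is(err, fs.ErrNotExist) {
			return err
		}
	}
	tmp, err := load(tmpPath) // reuses an orphaned temp file if one is present
	if err != nil {
		return err
	}
	db, err := load(dbPath)
	if err != nil {
		return err
	}
	for k, v := range db {
		tmp[k] = v // keys that exist only in the orphaned temp file are never touched
	}
	if err := save(tmpPath, tmp); err != nil {
		return err
	}
	return os.Rename(tmpPath, dbPath)
}

// run sets up a db plus an orphaned db.tmp, defragments, and returns the
// resulting db contents.
func run(removeStale bool) (store, error) {
	dir, err := os.MkdirTemp("", "defrag-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	dbPath := filepath.Join(dir, "db")
	if err := save(dbPath, store{"a": "1"}); err != nil {
		return nil, err
	}
	// Simulate a defrag that was killed midway and left db.tmp behind.
	if err := save(dbPath+".tmp", store{"stale": "x"}); err != nil {
		return nil, err
	}
	if err := defrag(dbPath, removeStale); err != nil {
		return nil, err
	}
	return load(dbPath)
}

func main() {
	buggy, err := run(false)
	fmt.Println("reuse orphaned db.tmp:", buggy, err)
	fixed, err := run(true)
	fmt.Println("remove db.tmp first:  ", fixed, err)
}
```

In the first run the `stale` key from the orphaned temp file ends up in the final db even though the source db never contained it; in the second run only the keys from the source db survive.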
*** Bug 1822832 has been marked as a duplicate of this bug. ***
Fixed upstream in v3.3.19 (commit b0a4038).
This is not a 4.5.0 blocker; moving the target to 4.6.0, and we will backport to 4.4.0.
Executed the regression tests and did not hit any issues.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2913
*** Bug 1815638 has been marked as a duplicate of this bug. ***