Bug 1481085

Summary: LVM metadata archives (/etc/lvm/archive) expiry / retention misbehaves after index #100,000.
Product: [Community] LVM and device-mapper
Component: lvm2
lvm2 sub component: Metadata
Status: CLOSED UPSTREAM
Severity: low
Priority: unspecified
Version: unspecified
Hardware: All
OS: Unspecified
Fixed In Version: 2.02.178
Type: Bug
Last Closed: 2019-03-20 11:37:42 UTC
Reporter: Mark Mielke <mark>
Assignee: Peter Rajnoha <prajnoha>
QA Contact: cluster-qe <cluster-qe>
Docs Contact: Alasdair Kergon <agk>
CC: agk, heinzm, jan.skarvall, jbrassow, kmoradha, msnitzer, noriyuki.shiota, prajnoha, zkabelac
Flags: rule-engine: lvm-technical-solution?, rule-engine: lvm-test-coverage?
Attachments:
Patch to fix bug when inserting at the end of non-empty list (flags: none)

Description Mark Mielke 2017-08-14 03:48:05 UTC
Description of problem:

Once the metadata archive index hits 100,000, like /etc/lvm/archive/vg_sys_100000-4234212.vg, the archive sorting algorithm fails and certain misbehaviour ensues:

1) New archives may all have index 100,000. It stops incrementing. In our case, we had around 40,000 files with index "_100000-" in the file name.
2) The archive expiration / retention policy stops working correctly, and it begins to accumulate archives much faster than it prunes them. In our case, we had around 42,000 archives, some going back to 2015 (it is 2017 now).

The archive file names are generated like:

                if (dm_snprintf(archive_name, sizeof(archive_name),
                                 "%s/%s_%05u-%d.vg",
                                 dir, vg->name, ix, rnum) < 0) {

The directory scanning code that loads the archive file names into memory recognizes a problem, although it isn't explicit about what the problem is:

        /* Sort fails beyond 5-digit indexes */
        if ((count = scandir(dir, &dirent, NULL, alphasort)) < 0) {
                log_error("Couldn't scan the archive directory (%s).", dir);
                return 0;
        }

The file names encode the index zero-padded to five digits, like "00000". The sorting code uses "alphasort", which only orders the names correctly while the index stays within 5 digits. As soon as it exceeds 5 digits, "100000" sorts to the beginning and "99999" to the end. From then on, new archives seem to *all* get index "100000". We had some 40,000 files with index "100000" before we noticed.

Because the index is followed by a random number, pruning would only expire a few of the "100000" files before it hit one that was younger than the 30-day retention period set by default. When I reduced the retention period to 7 days, it expired only about 12 of the 40,000 archive files. This behaviour is probably due to the random number distribution ensuring that there are always some recent files near 0?
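
For illustration, the ordering problem can be reproduced outside of LVM. The sketch below is not lvm2 code: make_archive_name() and sorts_before() are hypothetical helpers, the directory part of the path is left out, and strcmp() stands in for alphasort (which uses strcoll, so the result is assumed to match only in the C locale).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds an archive file name with the same
 * "%s_%05u-%d.vg" pattern quoted above (directory part omitted). */
static int make_archive_name(char *buf, size_t len, const char *vg,
			     unsigned ix, int rnum)
{
	return snprintf(buf, len, "%s_%05u-%d.vg", vg, ix, rnum);
}

/* Returns 1 if a plain byte-wise comparison places the name for
 * index a before the name for index b, 0 otherwise. */
static int sorts_before(unsigned a, unsigned b)
{
	char na[64], nb[64];

	make_archive_name(na, sizeof(na), "vg_sys", a, 1);
	make_archive_name(nb, sizeof(nb), "vg_sys", b, 1);

	return strcmp(na, nb) < 0;
}
```

With these helpers, sorts_before(99998, 99999) holds as expected, but so does sorts_before(100000, 99999): the first differing byte is '1' against '9', so the six-digit name lands in front of every five-digit name that starts with a higher digit.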

This issue eventually affects everyone, although obviously people who use features like snapshots more frequently (we take snapshots every 15 minutes, across multiple volumes) will hit it sooner.


Version-Release number of selected component (if applicable):

All versions of lvm2 going back to 2002 (according to "git blame").


How reproducible:

This is probably 100% reproducible, although it requires that you have archive files with an index number approaching 100,000. Many installations may change LVM metadata infrequently and never come close to 100,000. Installations that frequently extend LVM partitions and/or make frequent use of snapshotting operations will hit this issue first.


Steps to Reproduce:
1. Either make 100,000 LVM changes or, probably, rename an existing archive file to use index "_99999-" to hit the issue sooner.
2. Do a number of LVM changes to take it past 100,000, and also past the configured archive retention policy for archive files.

Actual results:

All new archives will have index "_100000-".

The "_99999-" index will not get pruned very easily, although depending upon the random number distribution it may still happen.

Expected results:

New archives will have index "_100001-" and beyond, each unique. The archive files should be properly pruned according to the configured retention policy.

Comment 1 Noriyuki Shiota 2018-02-06 11:23:42 UTC
Created attachment 1392022
Patch to fix bug when inserting at the end of non-empty list

I've fixed this problem.

The list of archive files is sorted by index in descending order.
However, _insert_archive_file() adds the element with the smallest index to the beginning of the list. This patch fixes the sorting.

With this patch, the problem is not reproduced in my environment. Perhaps it will not be necessary to include random numbers in file names.
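
As a rough illustration of the case described above (this is not the attached patch), inserting into a list kept in descending index order could look like the sketch below. struct archive_node and insert_desc() are hypothetical; the real code works on struct dm_list.

```c
#include <stddef.h>

/* Hypothetical node type; the real code uses struct dm_list. */
struct archive_node {
	unsigned index;
	struct archive_node *next;
};

/* Inserts node into a list kept in descending index order and returns
 * the (possibly new) head. A node smaller than every existing entry is
 * linked at the tail, not at the head. */
static struct archive_node *insert_desc(struct archive_node *head,
					struct archive_node *node)
{
	struct archive_node **link = &head;

	/* Skip every entry whose index is at least as large as the new one. */
	while (*link && (*link)->index >= node->index)
		link = &(*link)->next;

	node->next = *link;
	*link = node;

	return head;
}
```

Walking a pointer to the link rather than to the node means the empty list, the head, the middle and the tail are all handled by the same two assignments, so there is no separate end-of-list branch to get wrong.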

Comment 2 Alasdair Kergon 2018-02-09 01:04:55 UTC
What about something like this?

--- a/lib/format_text/archive.c
+++ b/lib/format_text/archive.c
@@ -137,7 +137,7 @@ static struct dm_list *_scan_archive(struct dm_pool *mem,
 	dm_list_init(results);
 
 	/* Sort fails beyond 5-digit indexes */
-	if ((count = scandir(dir, &dirent, NULL, alphasort)) < 0) {
+	if ((count = scandir(dir, &dirent, NULL, versionsort)) < 0) {
 		log_error("Couldn't scan the archive directory (%s).", dir);
 		return 0;
 	}
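
To show the kind of ordering a numeric-aware sort gives, here is a simplified comparison function. It is only a sketch: natural_cmp() is a hypothetical helper, it is not glibc's strverscmp/versionsort algorithm (which treats leading zeros specially), and it assumes digit runs fit in an unsigned long.

```c
#include <ctype.h>
#include <stdlib.h>

/* Simplified natural-order comparison (hypothetical, not glibc's
 * strverscmp): runs of digits are compared by numeric value,
 * everything else is compared byte by byte. */
static int natural_cmp(const char *a, const char *b)
{
	while (*a && *b) {
		if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
			char *ea, *eb;
			unsigned long va = strtoul(a, &ea, 10);
			unsigned long vb = strtoul(b, &eb, 10);

			if (va != vb)
				return va < vb ? -1 : 1;
			/* Equal values: continue after both digit runs. */
			a = ea;
			b = eb;
		} else {
			if (*a != *b)
				return (unsigned char)*a < (unsigned char)*b ? -1 : 1;
			a++;
			b++;
		}
	}

	/* The shorter string sorts first if one is a prefix of the other. */
	return (*a != 0) - (*b != 0);
}
```

Under this comparison "vg_sys_99999-7.vg" sorts before "vg_sys_100000-3.vg", which is the order the archive scan needs once the index grows past five digits.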

Comment 4 Jan Skarvall 2018-03-29 19:03:00 UTC
I wonder if this fix will eventually purge all those archive files with index 100000, or if I will have to manually delete those, and if so, how.

I also wonder how the next index is chosen. Is it continuously increasing? If so, is there a way to start over from a low index?

Comment 5 Alasdair Kergon 2018-10-24 13:10:50 UTC
It should clean them up automatically once the fix is in, yes.  And to start from a low index, just remove all the older files and rename the ones you want to keep to low numbers.  (It just looks for the highest number and adds one to find the next number to use.)
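
The "highest number plus one" rule can be sketched as follows. This is an illustration only, not the lvm2 implementation: parse_archive_index() and next_archive_index() are hypothetical helpers, the names are assumed to follow the "<vgname>_<index>-<rnum>.vg" pattern from the description, and starting at 0 for an empty directory is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the index from
 * "<vgname>_<index>-<rnum>.vg". Returns 1 on success, 0 otherwise. */
static int parse_archive_index(const char *name, const char *vgname,
			       unsigned *ix)
{
	size_t n = strlen(vgname);
	int rnum;

	if (strncmp(name, vgname, n) != 0 || name[n] != '_')
		return 0;

	return sscanf(name + n + 1, "%u-%d.vg", ix, &rnum) == 2;
}

/* Returns the highest index found for vgname plus one, or 0 if no
 * matching archive exists (assumed starting value). */
static unsigned next_archive_index(const char *const *names, size_t count,
				   const char *vgname)
{
	unsigned highest = 0, ix;
	int found = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (!parse_archive_index(names[i], vgname, &ix))
			continue;
		if (!found || ix > highest)
			highest = ix;
		found = 1;
	}

	return found ? highest + 1 : 0;
}
```

This version compares the parsed numbers, so it does not depend on how the names are sorted. If the "highest" entry is instead taken from one end of a mis-sorted list and that entry is 99999, the result is 100000 every time, which would be consistent with the repeated "_100000-" files in the description.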