Hide Forgot
Created attachment 1594822 [details] simple C program demonstrating descriptor leak in libblkid when blkid_get_dev is called with devname=<somepartition> Description of problem: We had a customer report an issue with the nfs server's rpc.mountd where it would retain a reference to an underlying block device even after a filesystem was unexported and no one was using it, and the only way to release it was to restart rpc.mountd. The block device had one partition and on top of that was a filesystem. The NFS reproducer is fairly simple but as it turns out we can reproduce this with a simple C program calling into libblkid and show it will leak a reference. This bug appears to be in at least: RHEL7 (libblkid-2.23.2-59.el7.x86_64, nfs-utils-1.3.0-0.61.el7.x86_64 RHEL8 (nfs-utils-2.3.3-14.el8.x86_64 f30 (libblkid-2.33.2-2.fc30.x86_64, nfs-utils-2.4.1-0.fc30.x86_64) But is not in RHEL6 (libblkid-2.17.2-12.28.el6_9.2.x86_64, nfs-utils-1.2.3-75.el6_9.x86_64) Version-Release number of selected component (if applicable): libblkid-2.23.2-59.el7.x86_64 or above How reproducible: Everytime with reproducer Steps to Reproduce: util-linux / libblkid 1. Create a block device with one partition on it, then a filesystem on that (or just use /boot) 2. Run attached C program to show the file descriptor of the underlying block device remains open / is leaked Example: ./blkid-test /boot calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL) BUG - file descriptor leak - before=6 after=7 uuid = e2b3e658-cc82-4fad-9ac2-2cf384ecc49e # gdb ./blkid-test (gdb) 98 uuid = get_uuid_blkdev(argv[1]); 99 after = get_num_fds(); 100 // printf("AFTER get_uuid_blkdev num_fds = %d\n", after); 101 if (before != after) 102 printf("BUG - file descriptor leak - before=%d after=%d\n", 103 before, after); 104 if (!uuid) { 105 printf("ERROR: uuid = NULL\n"); 106 exit(1); 107 } (gdb) b 98 Breakpoint 1 at 0x401455: file blkid-test.c, line 98. (gdb) b 101 Breakpoint 2 at 0x401479: file blkid-test.c, line 101. (gdb) run /boot Starting program: /root/blkid-test /boot warning: Loadable section ".note.gnu.property" outside of ELF segments warning: Loadable section ".note.gnu.property" outside of ELF segments warning: Loadable section ".note.gnu.property" outside of ELF segments Breakpoint 1, main (argc=2, argv=0x7fffffffe478) at blkid-test.c:98 98 uuid = get_uuid_blkdev(argv[1]); (gdb) [1]+ Stopped gdb ./blkid-test # ps PID TTY TIME CMD 8870 pts/1 00:00:01 bash 13406 pts/1 00:00:01 gdb 13408 pts/1 00:00:00 blkid-test 13412 pts/1 00:00:00 ps # ls -lh /proc/13408/fd total 0 lrwx------. 1 root root 64 Jul 30 15:52 0 -> /dev/pts/1 lrwx------. 1 root root 64 Jul 30 15:52 1 -> /dev/pts/1 lrwx------. 1 root root 64 Jul 30 15:52 2 -> /dev/pts/1 # fg gdb ./blkid-test c c Continuing. calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL) Breakpoint 2, main (argc=2, argv=0x7fffffffe478) at blkid-test.c:101 101 if (before != after) (gdb) [1]+ Stopped gdb ./blkid-test # ls -lh /proc/13408/fd total 0 lrwx------. 1 root root 64 Jul 30 15:52 0 -> /dev/pts/1 lrwx------. 1 root root 64 Jul 30 15:52 1 -> /dev/pts/1 lrwx------. 1 root root 64 Jul 30 15:52 2 -> /dev/pts/1 lr-x------. 1 root root 64 Jul 30 15:52 4 -> /dev/vda nfs-server reproducer (calls into libblkid with same calls as above) 1. Run attached mountd-test.sh (for NFS server reproducer) Actual results: reference to underlying block device remains open after following call is made: blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL) Expected results: no reference to underlying block device after the following call is made: blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL) Additional info: If for some reason I've misunderstood and this is somehow misuse of the interface by nfs-utils (rpc.mountd) please let us know. NOTE: on fc30, the blkid.h file is inside libblkid-debugsource, and I did this to compile / link: ln -s /usr/lib64/libblkid.so.1 /usr/lib64/libblkid.so gcc -g -o blkid-test -I/usr/src/debug/util-linux-2.33.2-2.fc30.x86_64/libblkid/src blkid-test.c -lblkid On RHEL6/RHEL7, there is libblkid-devel and I did this: ln -s /usr/lib64/libblkid.so.1 /usr/lib64/libblkid.so gcc -g -o blkid-test -I/usr/include/blkid blkid-test.c -lblkid In all instances you should be able to repro easily with: # ./blkid-test /boot If the bug exists, we see: calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL) BUG - file descriptor leak - before=6 after=7 uuid = 3bcb7235-6b3b-494f-ad49-c4a0e18d67b6 If it does not, there is no "BUG" line above, as on RHEL6: calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL) uuid = e2107ad2-f3f5-4704-94f3-2fe7606ee947
Created attachment 1594823 [details] the nfs-server test that originally uncovered the problem
Not too familiar with this code so I'm not sure about this but I think libblkid may just be missing a call to blkid_free_probe(pr->disk_probe); inside blkid_probe_get_wholedisk_probe() or thereabouts.
Actually it's inside blkid_partitions_probe_partition since that is the caller of blkid_probe_get_wholedisk_probe. This is along the lines of I think we need commit c90afaf082f76d2d7446001353c90bf2cb662c18 Author: Dave Wysochanski <dwysocha@redhat.com> Date: Tue Jul 30 17:41:03 2019 -0400 blkid: free blkid_probe at end of partition probe to avoid leak fixes https://bugzilla.redhat.com/show_bug.cgi?id=1734545 Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> diff --git a/libblkid/src/partitions/partitions.c b/libblkid/src/partitions/partitions.c index f12638e..8e2540d 100644 --- a/libblkid/src/partitions/partitions.c +++ b/libblkid/src/partitions/partitions.c @@ -797,10 +797,14 @@ static int blkid_partitions_probe_partition(blkid_probe pr) major(disk), minor(disk)); } + if (disk_pr) + blkid_free_probe(pr->disk_probe); DBG(LOWPROBE, ul_debug("parts: end probing for partition entry [success]")); return BLKID_PROBE_OK; nothing: + if (disk_pr) + blkid_free_probe(pr->disk_probe); DBG(LOWPROBE, ul_debug("parts: end probing for partition entry [nothing]")); return BLKID_PROBE_NONE;
( > --- a/libblkid/src/partitions/partitions.c > +++ b/libblkid/src/partitions/partitions.c > @@ -797,10 +797,14 @@ static int > blkid_partitions_probe_partition(blkid_probe pr) > major(disk), minor(disk)); > } > > + if (disk_pr) > + blkid_free_probe(pr->disk_probe); It seems incorrect. See libblkid/src/probe.c:blkid_free_probe(), it deallocates the disk_probe when the main @pr struct is deallocated. It also means that your patch introduces double free as it does not set pr->disk_probe to NULL.
It seems the problem is in the application rather than in the library. The probing struct is associated with the cache. You need to call blkid_put_cache() to close and free all related stuff. And there is a small typo in the test; you wants to use st_rdev, no st_dev. Try this patch for your test: -- blkid-test-org.c 2019-07-31 12:38:26.486762301 +0200 +++ blkid-test.c 2019-07-31 12:42:13.028521484 +0200 @@ -40,12 +40,11 @@ } /* END from https://stackoverflow.com/questions/6583158/finding-open-file-descriptors-for-a-process-linux-c-code */ -static const char *get_uuid_blkdev(char *path) +static const char *get_uuid_blkdev(char *path, blkid_cache *cache) { /* We set *safe if we know that we need the * fsid from statfs too. */ - static blkid_cache cache = NULL; struct stat stb; char *devname; blkid_tag_iterate iter; @@ -53,17 +52,17 @@ const char *type; const char *val, *uuid = NULL; - if (cache == NULL) - blkid_get_cache(&cache, NULL); + if (!*cache) + blkid_get_cache(cache, NULL); if (stat(path, &stb) != 0) return NULL; - devname = blkid_devno_to_devname(stb.st_dev); + devname = blkid_devno_to_devname(stb.st_rdev); if (!devname) return NULL; printf("calling blkid_get_dev(cache, devname=%s, BLKID_DEV_NORMAL)\n", devname); - dev = blkid_get_dev(cache, devname, BLKID_DEV_NORMAL); + dev = blkid_get_dev(*cache, devname, BLKID_DEV_NORMAL); free(devname); if (!dev) return NULL; @@ -88,6 +87,7 @@ { const char *uuid; int before, after; + blkid_cache cache = NULL;; if (argc < 1) { printf("Please enter path to device"); @@ -95,7 +95,9 @@ } before = get_num_fds(); // printf("BEFORE get_uuid_blkdev num_fds = %d\n", before); - uuid = get_uuid_blkdev(argv[1]); + uuid = get_uuid_blkdev(argv[1], &cache); + printf("uuid = %s\n", uuid); + blkid_put_cache(cache); after = get_num_fds(); // printf("AFTER get_uuid_blkdev num_fds = %d\n", after); if (before != after) @@ -105,6 +107,5 @@ printf("ERROR: uuid = NULL\n"); exit(1); } - printf("uuid = %s\n", uuid); exit(0); } Note that after blkid_put_cache() the uuid string will not be available, you have to use it before blkid_put_cache() (or strdup()). I don't see any leak. Not a bug from my point of view.
(In reply to Karel Zak from comment #4) > ( > > --- a/libblkid/src/partitions/partitions.c > > +++ b/libblkid/src/partitions/partitions.c > > @@ -797,10 +797,14 @@ static int > > blkid_partitions_probe_partition(blkid_probe pr) > > major(disk), minor(disk)); > > } > > > > + if (disk_pr) > > + blkid_free_probe(pr->disk_probe); > > It seems incorrect. > > See libblkid/src/probe.c:blkid_free_probe(), it deallocates the disk_probe > when the main @pr > struct is deallocated. It also means that your patch introduces double free > as it does not > set pr->disk_probe to NULL. Yes I agree this is all wrong - you beat me to it. (In reply to Karel Zak from comment #5) > It seems the problem is in the application rather than in the library. > > The probing struct is associated with the cache. You need to call > blkid_put_cache() to close and > free all related stuff. And there is a small typo in the test; you wants to > use st_rdev, no st_dev. > > Try this patch for your test: > I wonder why RHEL6 does not have this problem. I cut/pasted that from rpc.mountd cache.c code though so if you're right it means this is an nfs-utils bug. > -- blkid-test-org.c 2019-07-31 12:38:26.486762301 +0200 > +++ blkid-test.c 2019-07-31 12:42:13.028521484 +0200 > @@ -40,12 +40,11 @@ > } > /* END from > https://stackoverflow.com/questions/6583158/finding-open-file-descriptors- > for-a-process-linux-c-code */ > > -static const char *get_uuid_blkdev(char *path) > +static const char *get_uuid_blkdev(char *path, blkid_cache *cache) > { > /* We set *safe if we know that we need the > * fsid from statfs too. > */ > - static blkid_cache cache = NULL; > struct stat stb; > char *devname; > blkid_tag_iterate iter; > @@ -53,17 +52,17 @@ > const char *type; > const char *val, *uuid = NULL; > > - if (cache == NULL) > - blkid_get_cache(&cache, NULL); > + if (!*cache) > + blkid_get_cache(cache, NULL); > > if (stat(path, &stb) != 0) > return NULL; > - devname = blkid_devno_to_devname(stb.st_dev); > + devname = blkid_devno_to_devname(stb.st_rdev); > if (!devname) > return NULL; > printf("calling blkid_get_dev(cache, devname=%s, BLKID_DEV_NORMAL)\n", > devname); > - dev = blkid_get_dev(cache, devname, BLKID_DEV_NORMAL); > + dev = blkid_get_dev(*cache, devname, BLKID_DEV_NORMAL); > free(devname); > if (!dev) > return NULL; > @@ -88,6 +87,7 @@ > { > const char *uuid; > int before, after; > + blkid_cache cache = NULL;; > > if (argc < 1) { > printf("Please enter path to device"); > @@ -95,7 +95,9 @@ > } > before = get_num_fds(); > // printf("BEFORE get_uuid_blkdev num_fds = %d\n", before); > - uuid = get_uuid_blkdev(argv[1]); > + uuid = get_uuid_blkdev(argv[1], &cache); > + printf("uuid = %s\n", uuid); > + blkid_put_cache(cache); > after = get_num_fds(); > // printf("AFTER get_uuid_blkdev num_fds = %d\n", after); > if (before != after) > @@ -105,6 +107,5 @@ > printf("ERROR: uuid = NULL\n"); > exit(1); > } > - printf("uuid = %s\n", uuid); > exit(0); > } > > Note that after blkid_put_cache() the uuid string will not be available, you > have to use it before blkid_put_cache() (or strdup()). > > I don't see any leak. Not a bug from my point of view. Can you explain why the library needs to hold the underlying block device open even after the call to blkid_get_dev()? Is it something to do with blkid_cache validity?
SteveD - it does not look like rpc.mountd calls blkid_put_cache() in get_uuid_blkdev() but Karel is saying we need to do that (see comment #5). Also we need to somehow copy the UUID string here to retain its value or it will go away apparently or we need to rework get_uuid_blkdev() to retain blkid_cache for the lifetime of its use. Left unexplained still is why libblkid keeps underlying device open (maybe for cache validity or optimization?) or why RHEL6 does not have this problem (maybe a blkid_put_cache was not required prior and/or some bug was fixed in the library?)
Ok it looks intentional get_uuid_blkdev() is not calling blkid_put_cache() because we have a static variable and we're checking for NULL before calling blkid_get_cache(). Unclear where to go from here...
Good question Dave, ... I've checked another part of the library (where it re-validate the cache) and it seems there is a leak in blkid_verify(). It does not call blkid_free_probe() because the probe handler is later re-used, unfortunately if does not care about internally used disk_prober. So, blkid_put_cache() will fix the problem for rpc.mountd, but it seems we need to improve libblkid too.
Fixed by util-linux upstream commit: https://github.com/karelzak/util-linux/commit/c4d6d1c54dcd0eff701236d396d88b1fc6251768 I'm going to fix it in RHEL7 and RHEL8 too. Thanks for help and test program, it's pretty old bug (v2.20, year 2011).
Thanks Karel! I also confirm with a test build that patch in comment #10 fixes the problem.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:1102