Bug 1734545 - RHEL-7: blkid_get_dev() leak file descriptor to underlying block device
Summary: RHEL-7: blkid_get_dev() leak file descriptor to underlying block device
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: util-linux
Version: 7.6
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: rc
: ---
Assignee: Karel Zak
QA Contact: Radka Skvarilova
URL:
Whiteboard:
Depends On:
Blocks: 1734553
TreeView+ depends on / blocked
 
Reported: 2019-07-30 20:18 UTC by Dave Wysochanski
Modified: 2019-10-07 12:54 UTC (History)
4 users (show)

Fixed In Version: util-linux-2.23.2-62.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1734553 (view as bug list)
Environment:
Last Closed:
Target Upstream Version:


Attachments (Terms of Use)
simple C program demonstrating descriptor leak in libblkid when blkid_get_dev is called with devname=<somepartition> (2.45 KB, text/x-csrc)
2019-07-30 20:18 UTC, Dave Wysochanski
no flags Details
the nfs-server test that originally uncovered the problem (536 bytes, text/plain)
2019-07-30 20:20 UTC, Dave Wysochanski
no flags Details


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1734553 'unspecified' 'CLOSED' 'RHEL-8: blkid_get_dev() leak file descriptor to underlying block device' 2019-11-14 09:25:17 UTC

Description Dave Wysochanski 2019-07-30 20:18:04 UTC
Created attachment 1594822 [details]
simple C program demonstrating descriptor leak in libblkid when blkid_get_dev is called with devname=<somepartition>

Description of problem:
We had a customer report an issue with the nfs server's rpc.mountd where it would retain a reference to an underlying block device even after a filesystem was unexported and no one was using it, and the only way to release it was to restart rpc.mountd.  The block device had one partition and on top of that was a filesystem.  The NFS reproducer is fairly simple but as it turns out we can reproduce this with a simple C program calling into libblkid and show it will leak a reference.

This bug appears to be in at least:
RHEL7 (libblkid-2.23.2-59.el7.x86_64, nfs-utils-1.3.0-0.61.el7.x86_64
RHEL8 (nfs-utils-2.3.3-14.el8.x86_64
f30 (libblkid-2.33.2-2.fc30.x86_64, nfs-utils-2.4.1-0.fc30.x86_64)

But is not in RHEL6 (libblkid-2.17.2-12.28.el6_9.2.x86_64, nfs-utils-1.2.3-75.el6_9.x86_64)


Version-Release number of selected component (if applicable):
libblkid-2.23.2-59.el7.x86_64 or above

How reproducible:
Everytime with reproducer

Steps to Reproduce:
util-linux / libblkid
1. Create a block device with one partition on it, then a filesystem on that (or just use /boot)
2. Run attached C program to show the file descriptor of the underlying block device remains open / is leaked
Example:
./blkid-test /boot
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)
BUG - file descriptor leak - before=6 after=7
uuid = e2b3e658-cc82-4fad-9ac2-2cf384ecc49e
# gdb ./blkid-test
(gdb) 
98              uuid = get_uuid_blkdev(argv[1]);
99              after = get_num_fds();
100             // printf("AFTER get_uuid_blkdev num_fds = %d\n", after);
101             if (before != after)
102                     printf("BUG - file descriptor leak - before=%d after=%d\n",
103                             before, after);
104             if (!uuid) {
105                     printf("ERROR: uuid = NULL\n");
106                     exit(1);
107             }
(gdb) b 98
Breakpoint 1 at 0x401455: file blkid-test.c, line 98.
(gdb) b 101
Breakpoint 2 at 0x401479: file blkid-test.c, line 101.
(gdb) run /boot
Starting program: /root/blkid-test /boot
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments

Breakpoint 1, main (argc=2, argv=0x7fffffffe478) at blkid-test.c:98
98              uuid = get_uuid_blkdev(argv[1]);
(gdb) 
[1]+  Stopped                 gdb ./blkid-test
# ps
  PID TTY          TIME CMD
 8870 pts/1    00:00:01 bash
13406 pts/1    00:00:01 gdb
13408 pts/1    00:00:00 blkid-test
13412 pts/1    00:00:00 ps
# ls -lh /proc/13408/fd
total 0
lrwx------. 1 root root 64 Jul 30 15:52 0 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 1 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 2 -> /dev/pts/1
# fg
gdb ./blkid-test
c
c
Continuing.
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)

Breakpoint 2, main (argc=2, argv=0x7fffffffe478) at blkid-test.c:101
101             if (before != after)
(gdb) 
[1]+  Stopped                 gdb ./blkid-test
# ls -lh /proc/13408/fd
total 0
lrwx------. 1 root root 64 Jul 30 15:52 0 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 1 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 2 -> /dev/pts/1
lr-x------. 1 root root 64 Jul 30 15:52 4 -> /dev/vda


nfs-server reproducer (calls into libblkid with same calls as above)
1. Run attached mountd-test.sh (for NFS server reproducer)


Actual results:
reference to underlying block device remains open after following call is made:
blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)


Expected results:
no reference to underlying block device after the following call is made:
blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)


Additional info:
If for some reason I've misunderstood and this is somehow misuse of the interface by nfs-utils (rpc.mountd) please let us know.

NOTE: on fc30, the blkid.h file is inside libblkid-debugsource, and I did this to compile / link:
ln -s /usr/lib64/libblkid.so.1 /usr/lib64/libblkid.so
gcc -g -o blkid-test -I/usr/src/debug/util-linux-2.33.2-2.fc30.x86_64/libblkid/src blkid-test.c -lblkid


On RHEL6/RHEL7, there is libblkid-devel and I did this:
ln -s /usr/lib64/libblkid.so.1 /usr/lib64/libblkid.so
gcc -g -o blkid-test -I/usr/include/blkid blkid-test.c -lblkid

In all instances you should be able to repro easily with:
# ./blkid-test /boot

If the bug exists, we see:
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)
BUG - file descriptor leak - before=6 after=7
uuid = 3bcb7235-6b3b-494f-ad49-c4a0e18d67b6

If it does not, there is no "BUG" line above, as on RHEL6:
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)
uuid = e2107ad2-f3f5-4704-94f3-2fe7606ee947

Comment 1 Dave Wysochanski 2019-07-30 20:20:46 UTC
Created attachment 1594823 [details]
the nfs-server test that originally uncovered the problem

Comment 2 Dave Wysochanski 2019-07-30 21:37:39 UTC
Not too familiar with this code so I'm not sure about this but I think libblkid may just be missing a call to blkid_free_probe(pr->disk_probe); inside blkid_probe_get_wholedisk_probe() or thereabouts.

Comment 3 Dave Wysochanski 2019-07-30 21:45:53 UTC
Actually it's inside blkid_partitions_probe_partition since that is the caller of blkid_probe_get_wholedisk_probe. This is along the lines of I think we need

commit c90afaf082f76d2d7446001353c90bf2cb662c18
Author: Dave Wysochanski <dwysocha@redhat.com>
Date:   Tue Jul 30 17:41:03 2019 -0400

    blkid: free blkid_probe at end of partition probe to avoid leak
    
    fixes https://bugzilla.redhat.com/show_bug.cgi?id=1734545
    
    Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>

diff --git a/libblkid/src/partitions/partitions.c b/libblkid/src/partitions/partitions.c
index f12638e..8e2540d 100644
--- a/libblkid/src/partitions/partitions.c
+++ b/libblkid/src/partitions/partitions.c
@@ -797,10 +797,14 @@ static int blkid_partitions_probe_partition(blkid_probe pr)
                                major(disk), minor(disk));
        }
 
+       if (disk_pr)
+               blkid_free_probe(pr->disk_probe);
        DBG(LOWPROBE, ul_debug("parts: end probing for partition entry [success]"));
        return BLKID_PROBE_OK;
 
 nothing:
+       if (disk_pr)
+               blkid_free_probe(pr->disk_probe);
        DBG(LOWPROBE, ul_debug("parts: end probing for partition entry [nothing]"));
        return BLKID_PROBE_NONE;

Comment 4 Karel Zak 2019-07-31 10:20:08 UTC
(
> --- a/libblkid/src/partitions/partitions.c
> +++ b/libblkid/src/partitions/partitions.c
> @@ -797,10 +797,14 @@ static int
> blkid_partitions_probe_partition(blkid_probe pr)
>                                 major(disk), minor(disk));
>         }
>  
> +       if (disk_pr)
> +               blkid_free_probe(pr->disk_probe);

It seems incorrect.

See libblkid/src/probe.c:blkid_free_probe(), it deallocates the disk_probe when the main @pr 
struct is deallocated. It also means that your patch introduces double free as it does not
set pr->disk_probe to NULL.

Comment 5 Karel Zak 2019-07-31 10:48:55 UTC
It seems the problem is in the application rather than in the library.

The probing struct is associated with the cache. You need to call blkid_put_cache() to close and 
free all related stuff. And there is a small typo in the test; you wants to use st_rdev, no st_dev.

Try this patch for your test:

-- blkid-test-org.c	2019-07-31 12:38:26.486762301 +0200
+++ blkid-test.c	2019-07-31 12:42:13.028521484 +0200
@@ -40,12 +40,11 @@
 }
 /* END from https://stackoverflow.com/questions/6583158/finding-open-file-descriptors-for-a-process-linux-c-code */
 
-static const char *get_uuid_blkdev(char *path)
+static const char *get_uuid_blkdev(char *path, blkid_cache *cache)
 {
 	/* We set *safe if we know that we need the
 	 * fsid from statfs too.
 	 */
-	static blkid_cache cache = NULL;
 	struct stat stb;
 	char *devname;
 	blkid_tag_iterate iter;
@@ -53,17 +52,17 @@
 	const char *type;
 	const char *val, *uuid = NULL;
 
-	if (cache == NULL)
-		blkid_get_cache(&cache, NULL);
+	if (!*cache)
+		blkid_get_cache(cache, NULL);
 
 	if (stat(path, &stb) != 0)
 		return NULL;
-	devname = blkid_devno_to_devname(stb.st_dev);
+	devname = blkid_devno_to_devname(stb.st_rdev);
 	if (!devname)
 		return NULL;
 	printf("calling blkid_get_dev(cache, devname=%s, BLKID_DEV_NORMAL)\n",
 		devname);
-	dev = blkid_get_dev(cache, devname, BLKID_DEV_NORMAL);
+	dev = blkid_get_dev(*cache, devname, BLKID_DEV_NORMAL);
 	free(devname);
 	if (!dev)
 		return NULL;
@@ -88,6 +87,7 @@
 {
 	const char *uuid;
 	int before, after;
+	blkid_cache cache = NULL;;
 
 	if (argc < 1) {
 		printf("Please enter path to device");
@@ -95,7 +95,9 @@
 	}
 	before = get_num_fds();
 	// printf("BEFORE get_uuid_blkdev num_fds = %d\n", before);
-	uuid = get_uuid_blkdev(argv[1]);
+	uuid = get_uuid_blkdev(argv[1], &cache);
+	printf("uuid = %s\n", uuid);
+	blkid_put_cache(cache);
 	after = get_num_fds();
 	// printf("AFTER get_uuid_blkdev num_fds = %d\n", after);
 	if (before != after)
@@ -105,6 +107,5 @@
 		printf("ERROR: uuid = NULL\n");
 		exit(1);
 	}
-	printf("uuid = %s\n", uuid);
 	exit(0);
 }

Note that after blkid_put_cache() the uuid string will not be available, you have to use it before blkid_put_cache() (or strdup()).

I don't see any leak. Not a bug from my point of view.

Comment 6 Dave Wysochanski 2019-07-31 13:18:51 UTC
(In reply to Karel Zak from comment #4)
> (
> > --- a/libblkid/src/partitions/partitions.c
> > +++ b/libblkid/src/partitions/partitions.c
> > @@ -797,10 +797,14 @@ static int
> > blkid_partitions_probe_partition(blkid_probe pr)
> >                                 major(disk), minor(disk));
> >         }
> >  
> > +       if (disk_pr)
> > +               blkid_free_probe(pr->disk_probe);
> 
> It seems incorrect.
> 
> See libblkid/src/probe.c:blkid_free_probe(), it deallocates the disk_probe
> when the main @pr 
> struct is deallocated. It also means that your patch introduces double free
> as it does not
> set pr->disk_probe to NULL.

Yes I agree this is all wrong - you beat me to it.

(In reply to Karel Zak from comment #5)
> It seems the problem is in the application rather than in the library.
> 
> The probing struct is associated with the cache. You need to call
> blkid_put_cache() to close and 
> free all related stuff. And there is a small typo in the test; you wants to
> use st_rdev, no st_dev.
> 
> Try this patch for your test:
> 

I wonder why RHEL6 does not have this problem.

I cut/pasted that from rpc.mountd cache.c code though so if you're right it means this is an nfs-utils bug.


> -- blkid-test-org.c	2019-07-31 12:38:26.486762301 +0200
> +++ blkid-test.c	2019-07-31 12:42:13.028521484 +0200
> @@ -40,12 +40,11 @@
>  }
>  /* END from
> https://stackoverflow.com/questions/6583158/finding-open-file-descriptors-
> for-a-process-linux-c-code */
>  
> -static const char *get_uuid_blkdev(char *path)
> +static const char *get_uuid_blkdev(char *path, blkid_cache *cache)
>  {
>  	/* We set *safe if we know that we need the
>  	 * fsid from statfs too.
>  	 */
> -	static blkid_cache cache = NULL;
>  	struct stat stb;
>  	char *devname;
>  	blkid_tag_iterate iter;
> @@ -53,17 +52,17 @@
>  	const char *type;
>  	const char *val, *uuid = NULL;
>  
> -	if (cache == NULL)
> -		blkid_get_cache(&cache, NULL);
> +	if (!*cache)
> +		blkid_get_cache(cache, NULL);
>  
>  	if (stat(path, &stb) != 0)
>  		return NULL;
> -	devname = blkid_devno_to_devname(stb.st_dev);
> +	devname = blkid_devno_to_devname(stb.st_rdev);
>  	if (!devname)
>  		return NULL;
>  	printf("calling blkid_get_dev(cache, devname=%s, BLKID_DEV_NORMAL)\n",
>  		devname);
> -	dev = blkid_get_dev(cache, devname, BLKID_DEV_NORMAL);
> +	dev = blkid_get_dev(*cache, devname, BLKID_DEV_NORMAL);
>  	free(devname);
>  	if (!dev)
>  		return NULL;
> @@ -88,6 +87,7 @@
>  {
>  	const char *uuid;
>  	int before, after;
> +	blkid_cache cache = NULL;;
>  
>  	if (argc < 1) {
>  		printf("Please enter path to device");
> @@ -95,7 +95,9 @@
>  	}
>  	before = get_num_fds();
>  	// printf("BEFORE get_uuid_blkdev num_fds = %d\n", before);
> -	uuid = get_uuid_blkdev(argv[1]);
> +	uuid = get_uuid_blkdev(argv[1], &cache);
> +	printf("uuid = %s\n", uuid);
> +	blkid_put_cache(cache);
>  	after = get_num_fds();
>  	// printf("AFTER get_uuid_blkdev num_fds = %d\n", after);
>  	if (before != after)
> @@ -105,6 +107,5 @@
>  		printf("ERROR: uuid = NULL\n");
>  		exit(1);
>  	}
> -	printf("uuid = %s\n", uuid);
>  	exit(0);
>  }
> 
> Note that after blkid_put_cache() the uuid string will not be available, you
> have to use it before blkid_put_cache() (or strdup()).
> 
> I don't see any leak. Not a bug from my point of view.

Can you explain why the library needs to hold the underlying block device open even after the call to blkid_get_dev()?  Is it something to do with blkid_cache validity?

Comment 7 Dave Wysochanski 2019-07-31 13:25:27 UTC
SteveD - it does not look like rpc.mountd calls blkid_put_cache() in get_uuid_blkdev() but Karel is saying we need to do that (see comment #5).  Also we need to somehow copy the UUID string here to retain its value or it will go away apparently or we need to rework get_uuid_blkdev() to retain blkid_cache for the lifetime of its use.

Left unexplained still is why libblkid keeps underlying device open (maybe for cache validity or optimization?) or why RHEL6 does not have this problem (maybe a blkid_put_cache was not required prior and/or some bug was fixed in the library?)

Comment 8 Dave Wysochanski 2019-07-31 13:52:58 UTC
Ok it looks intentional get_uuid_blkdev() is not calling blkid_put_cache() because we have a static variable and we're checking for NULL before calling blkid_get_cache().  Unclear where to go from here...

Comment 9 Karel Zak 2019-07-31 14:01:45 UTC
Good question Dave, ... I've checked another part of the library (where it re-validate the cache) and it seems there is a leak in blkid_verify(). It does not call blkid_free_probe() because the probe handler is later re-used, unfortunately if does not care about internally used disk_prober. 

So, blkid_put_cache() will fix the problem for rpc.mountd, but it seems we need to improve libblkid too.

Comment 10 Karel Zak 2019-07-31 14:38:02 UTC
Fixed by util-linux upstream commit: https://github.com/karelzak/util-linux/commit/c4d6d1c54dcd0eff701236d396d88b1fc6251768

I'm going to fix it in RHEL7 and RHEL8 too. Thanks for help and test program, it's pretty old bug (v2.20, year 2011).

Comment 11 Dave Wysochanski 2019-07-31 15:09:49 UTC
Thanks Karel!  I also confirm with a test build that patch in comment #10 fixes the problem.


Note You need to log in before you can comment on or make changes to this bug.