+++ This bug was initially created as a clone of Bug #1734545 +++

Description of problem:
A customer reported that the NFS server's rpc.mountd retains a reference to an underlying block device even after the filesystem has been unexported and nothing is using it. The only way to release the reference is to restart rpc.mountd. The block device had one partition, with a filesystem on top of that partition.

The NFS reproducer is fairly simple, but as it turns out the problem can also be reproduced with a simple C program that calls into libblkid and shows that a file descriptor is leaked.

This bug appears to be present in at least:
  RHEL7 (libblkid-2.23.2-59.el7.x86_64, nfs-utils-1.3.0-0.61.el7.x86_64)
  RHEL8 (nfs-utils-2.3.3-14.el8.x86_64)
  f30 (libblkid-2.33.2-2.fc30.x86_64, nfs-utils-2.4.1-0.fc30.x86_64)
But it is not present in:
  RHEL6 (libblkid-2.17.2-12.28.el6_9.2.x86_64, nfs-utils-1.2.3-75.el6_9.x86_64)

Version-Release number of selected component (if applicable):
libblkid-2.23.2-59.el7.x86_64 or above

How reproducible:
Every time with the reproducer

Steps to Reproduce:
util-linux / libblkid
1. Create a block device with one partition on it, then a filesystem on that partition (or just use /boot)
2. Run the attached C program to show that the file descriptor of the underlying block device remains open (is leaked)

Example:
./blkid-test /boot
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)
BUG - file descriptor leak - before=6 after=7
uuid = e2b3e658-cc82-4fad-9ac2-2cf384ecc49e

# gdb ./blkid-test
(gdb)
98              uuid = get_uuid_blkdev(argv[1]);
99              after = get_num_fds();
100     //      printf("AFTER get_uuid_blkdev num_fds = %d\n", after);
101             if (before != after)
102                     printf("BUG - file descriptor leak - before=%d after=%d\n",
103                             before, after);
104             if (!uuid) {
105                     printf("ERROR: uuid = NULL\n");
106                     exit(1);
107             }
(gdb) b 98
Breakpoint 1 at 0x401455: file blkid-test.c, line 98.
(gdb) b 101
Breakpoint 2 at 0x401479: file blkid-test.c, line 101.
(gdb) run /boot
Starting program: /root/blkid-test /boot
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments
warning: Loadable section ".note.gnu.property" outside of ELF segments

Breakpoint 1, main (argc=2, argv=0x7fffffffe478) at blkid-test.c:98
98              uuid = get_uuid_blkdev(argv[1]);
(gdb)
[1]+  Stopped                 gdb ./blkid-test
# ps
  PID TTY          TIME CMD
 8870 pts/1    00:00:01 bash
13406 pts/1    00:00:01 gdb
13408 pts/1    00:00:00 blkid-test
13412 pts/1    00:00:00 ps
# ls -lh /proc/13408/fd
total 0
lrwx------. 1 root root 64 Jul 30 15:52 0 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 1 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 2 -> /dev/pts/1
# fg
gdb ./blkid-test
c
c
Continuing.
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)

Breakpoint 2, main (argc=2, argv=0x7fffffffe478) at blkid-test.c:101
101             if (before != after)
(gdb)
[1]+  Stopped                 gdb ./blkid-test
# ls -lh /proc/13408/fd
total 0
lrwx------. 1 root root 64 Jul 30 15:52 0 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 1 -> /dev/pts/1
lrwx------. 1 root root 64 Jul 30 15:52 2 -> /dev/pts/1
lr-x------. 1 root root 64 Jul 30 15:52 4 -> /dev/vda

nfs-server reproducer (calls into libblkid with the same calls as above)
1. Run the attached mountd-test.sh (for the NFS server reproducer)

Actual results:
A reference to the underlying block device remains open after the following call is made:
blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)

Expected results:
No reference to the underlying block device remains after the following call is made:
blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)

Additional info:
If for some reason I've misunderstood and this is somehow misuse of the interface by nfs-utils (rpc.mountd), please let us know.
NOTE: on fc30, the blkid.h file is inside libblkid-debugsource, and I did this to compile / link:
ln -s /usr/lib64/libblkid.so.1 /usr/lib64/libblkid.so
gcc -g -o blkid-test -I/usr/src/debug/util-linux-2.33.2-2.fc30.x86_64/libblkid/src blkid-test.c -lblkid

On RHEL6/RHEL7, there is libblkid-devel and I did this:
ln -s /usr/lib64/libblkid.so.1 /usr/lib64/libblkid.so
gcc -g -o blkid-test -I/usr/include/blkid blkid-test.c -lblkid

In all instances you should be able to repro easily with:
# ./blkid-test /boot

If the bug exists, we see:
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)
BUG - file descriptor leak - before=6 after=7
uuid = 3bcb7235-6b3b-494f-ad49-c4a0e18d67b6

If it does not, there is no "BUG" line above, as on RHEL6:
calling blkid_get_dev(cache, devname=/dev/vda1, BLKID_DEV_NORMAL)
uuid = e2107ad2-f3f5-4704-94f3-2fe7606ee947

--- Additional comment from Dave Wysochanski on 2019-07-30 20:20 UTC ---

RHEL8 build:
debuginfo-install libblkid
ln -s /usr/lib64/libblkid.so.1 /usr/lib64/libblkid.so
gcc -g -o blkid-test -I/usr/src/debug/util-linux-2.32.1-11.el8.x86_64/libblkid/src/ blkid-test.c -lblkid
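
For readers without access to the attachment, the before/after descriptor check that the reproducer output implies can be sketched roughly as below. This is not the attached blkid-test.c: the names count_open_fds, fds_leaked_by and leak_one_fd are hypothetical, the counting method (listing /proc/self/fd) is an assumption about how such a check might be done, and leak_one_fd is only a stand-in for the libblkid call so that the sketch builds without libblkid headers. Absolute counts may differ from the before=6 after=7 shown in the report, since how the attached program counts descriptors is not shown here.

```c
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Count the descriptors currently open in this process by listing
 * /proc/self/fd (Linux-specific). Returns -1 if the directory cannot
 * be opened. */
static int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    struct dirent *ent;
    int count = 0;

    if (!dir)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        /* Skip "." and ".."; every other entry is one open descriptor. */
        if (ent->d_name[0] == '.')
            continue;
        count++;
    }
    closedir(dir);
    /* The listing included the descriptor that opendir() itself held. */
    return count - 1;
}

/* Call fn(arg) and store in *leaked how many descriptors it left open
 * (0 means no leak). Returns 0 on success, -1 if counting failed. */
static int fds_leaked_by(void (*fn)(void *), void *arg, int *leaked)
{
    int before, after;

    assert(fn != NULL && leaked != NULL);
    before = count_open_fds();
    fn(arg);
    after = count_open_fds();
    if (before < 0 || after < 0)
        return -1;
    *leaked = after - before;
    return 0;
}

/* Hypothetical stand-in for the call under test: opens the given path
 * and deliberately never closes it, so the check has a leak to find. */
static void leak_one_fd(void *path)
{
    (void)open((const char *)path, O_RDONLY);
}
```

In the setting of this report, the function passed to fds_leaked_by would instead wrap the blkid_get_dev(cache, devname, BLKID_DEV_NORMAL) call quoted above (building with -lblkid as in the compile notes), and a nonzero result would correspond to the "BUG - file descriptor leak" line.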
Created attachment 1594830 [details] simple C program demonstrating descriptor leak in libblkid when blkid_get_dev is called with devname=<somepartition>
Created attachment 1594831 [details] the nfs-server test that originally uncovered the problem
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3603