Description (Achilles Gaikwad)
2020-07-19 10:38:43 UTC
Description of problem:
When the same system acts as both nfs-server and nfs-client and the
`nobind` option is set for autofs, we reach the code path that leaves
the mount to `mount.nfs(8)`. However, because the nfs-server and the
nfs-client are the same system, we still end up calling `rpc_ping`,
which gives a negative return code. Because of this we jump to the
label `next:` and never attempt to mount the nfs share. (Please check the
debug logs in the `Actual results:` section below.)
The patch in `Additional info:` fixes this bug by not calling rpc_ping
if we're using rdma.
Environment:
nfs-server and nfs-client are the same system. We're mounting the share locally via RDMA.
Version-Release number of selected component (if applicable):
RHEL 7 :
3.10.0-1127.13.1.el7.x86_64
autofs-5.0.7-109.el7.x86_64
How reproducible:
Always
Steps to Reproduce:
1. Have an IB interface via which you'll attempt the mount of nfs-share.
~~~
# ip -4 addr show ib0
7: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
inet 192.168.1.69/24 brd 192.168.1.255 scope global ib0
valid_lft forever preferred_lft forever
~~~
2. Create an nfs-server
o Have the configuration for rdma enabled:
~~~
# cat /etc/sysconfig/nfs | grep -v "#"
RPCNFSDARGS="--rdma=20049"
RPCMOUNTDOPTS=""
STATDARG=""
SMNOTIFYARGS=""
RPCIDMAPDARGS=""
RPCGSSDARGS=""
GSS_USE_PROXY="yes"
BLKMAPDARGS=""
~~~
o Create a directory to export:
~~~
# mkdir -p /export/home
# mkdir /mnt2
# chmod a+rwx /export
~~~
o Your /etc/exports should look like this:
~~~
# cat /etc/exports
/export *(rw,insecure,no_root_squash)
~~~
o Start the nfs-server and make sure it's running:
~~~
# systemctl restart nfs-server ; cat /proc/fs/nfsd/portlist
rdma 20049
rdma 20049
udp 2049
tcp 2049
udp 2049
tcp 2049
~~~
o Disable firewall.
~~~
# systemctl disable --now firewalld
~~~
3. Attempt to manually mount the nfs-share. Once it is mounted, unmount it. This step is optional; it only verifies that the nfs-share mounts manually without any problems.
~~~
# mount -t nfs 192.168.1.69:/export /mnt -o proto=rdma,port=20049 -vvv
mount.nfs: timeout set for Sun Jul 19 15:42:27 2020
mount.nfs: trying text-based options 'proto=rdma,port=20049,vers=4.1,addr=192.168.1.69,clientaddr=192.168.1.69'
# df -hT /mnt
Filesystem Type Size Used Avail Use% Mounted on
192.168.1.69:/export nfs4 70G 20G 50G 29% /mnt
# cat /proc/mounts | grep rdma
192.168.1.69:/export /mnt nfs4 rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.69,local_lock=none,addr=192.168.1.69 0 0
~~~
o Unmount the share:
~~~
# umount /mnt
~~~
4. Add the following autofs configuration:
o Master file:
~~~
# cat /etc/auto.master
/mnt2 /etc/auto.mnt nobind
+auto.master
~~~
o Map file:
~~~
# cat /etc/auto.mnt
test -fstype=nfs,proto=rdma,port=20049 nfs-server:/export
~~~
o Start autofs in foreground with debugging enabled in one terminal:
~~~
# automount -fdv
~~~
o Attempt `ls` on /mnt2/test
~~~
# ls /mnt2/test
~~~
Actual results:
o The share isn't mounted. The following debug logs are seen:
~~~
bind mounts disabled
handle_packet: type = 3
handle_packet_missing_indirect: token 58, name test, request pid 40185
attempting to mount entry /mnt2/test
lookup_mount: lookup(file): looking up test
lookup_mount: lookup(file): test -> -fstype=nfs,proto=rdma,port=20049 nfs-server:/export
parse_mount: parse(sun): expanded entry: -fstype=nfs,proto=rdma,port=20049 nfs-server:/export
parse_mount: parse(sun): gathered options: fstype=nfs,proto=rdma,port=20049
parse_mount: parse(sun): dequote("nfs-server:/export") -> nfs-server:/export
parse_mount: parse(sun): core of entry: options=fstype=nfs,proto=rdma,port=20049, loc=nfs-server:/export
sun_mount: parse(sun): mounting root /mnt2, mountpoint test, what nfs-server:/export, fstype nfs, options proto=rdma,port=20049
mount(nfs): root=/mnt2 name=test what=nfs-server:/export, fstype=nfs, options=proto=rdma,port=20049
mount(nfs): nfs options="proto=rdma,port=20049", nobind=32, nosymlink=0, ro=0
mount_mount: mount(nfs): calling mkdir_path /mnt2/test
mount(nfs): nfs: mount failure nfs-server:/export on /mnt2/test
dev_ioctl_send_fail: token = 58
failed to mount /mnt2/test
~~~
Expected results:
The share should be mounted:
~~~
# ls -l /mnt2/test/
total 4
drwxrwxrwx. 28 root root 4096 Jul 18 14:19 home
~~~
o Autofs debug logs when the mount works (see `Additional info:`):
~~~
# automount -fdv
Starting automounter version 5.1.6, master map auto.master
using kernel protocol version 5.05
:::
bind mounts disabled
handle_packet: type = 3
handle_packet_missing_indirect: token 59, name test, request pid 40345
attempting to mount entry /mnt2/test
lookup_mount: lookup(file): looking up test
lookup_mount: lookup(file): test -> -fstype=nfs,proto=rdma,port=20049 nfs-server:/export
parse_mount: parse(sun): expanded entry: -fstype=nfs,proto=rdma,port=20049 nfs-server:/export
parse_mount: parse(sun): gathered options: fstype=nfs,proto=rdma,port=20049
parse_mount: parse(sun): dequote("nfs-server:/export") -> nfs-server:/export
parse_mount: parse(sun): core of entry: options=fstype=nfs,proto=rdma,port=20049, loc=nfs-server:/export
sun_mount: parse(sun): mounting root /mnt2, mountpoint test, what nfs-server:/export, fstype nfs, options proto=rdma,port=20049
mount(nfs): root=/mnt2 name=test what=nfs-server:/export, fstype=nfs, options=proto=rdma,port=20049
mount(nfs): nfs options="proto=rdma,port=20049", nobind=32, nosymlink=0, ro=0
mount_mount: mount(nfs): calling mkdir_path /mnt2/test
mount(nfs): calling mount -t nfs -s -o proto=rdma,port=20049 nfs-server:/export /mnt2/test
spawn_mount: mtab link detected, passing -n to mount
mount_mount: mount(nfs): mounted nfs-server:/export on /mnt2/test
dev_ioctl_send_ready: token = 59
mounted /mnt2/test
~~~
Additional info:
The debug output in `Expected results:` was captured after compiling upstream autofs on RHEL 7 with a patch to the file `modules/mount_nfs.c`.
The following patch was applied:
~~~
diff --git a/modules/mount_nfs.c b/modules/mount_nfs.c
index 4e3e703..5a8c3bf 100644
--- a/modules/mount_nfs.c
+++ b/modules/mount_nfs.c
@@ -375,9 +375,13 @@ dont_probe:
*/
if (this->proximity == PROXIMITY_LOCAL) {
char *host = this->name ? this->name : "localhost";
- int ret;
-
- ret = rpc_ping(host, port, vers, 2, 0, RPC_CLOSE_DEFAULT);
+ /* If we're using RDMA, rpc_ping will fail
+ * when nfs-server is local.
+ * Therefore, don't probe when we're using RDMA
+ */
+ int ret = 1;
+ if(!rdma)
+ ret = rpc_ping(host, port, vers, 2, 0, RPC_CLOSE_DEFAULT);
if (ret <= 0)
goto next;
}
~~~
This patch has also been sent upstream for review. Please backport it, as it fixes the issue.
Root cause analysis:
When using rdma with a local nfs-server, we reach this code section:
~~~
218 }
219 /*
220 * We can't probe protocol rdma so leave it to mount.nfs(8)
221 * and and suffer the delay if a server isn't available.
222 */
223 if (rdma)
224 goto dont_probe;
225
~~~
From `dont_probe:` we reach the `if` block at line 376, where rpc_ping() returns a negative value for some reason. I did not investigate rpc_ping further. However, because rpc_ping returns a negative value, the condition on line 381 is true and we `goto next`:
~~~
263 dont_probe:
:::
372 /* If this is a fallback from a bind mount failure
373 * check if the local NFS server is available to try
374 * and prevent lengthy mount failure waits.
375 */
376 if (this->proximity == PROXIMITY_LOCAL) {
377 char *host = this->name ? this->name : "localhost";
378 int ret;
379
380 ret = rpc_ping(host, port, vers, 2, 0, RPC_CLOSE_DEFAULT);
381 if (ret <= 0)
382 goto next;
383 }
384
:::
~~~
When we `goto next`, the following piece of code is executed. Notice that we don't return in `next:`; with no other host left to try, we fall through to `forced_fail:`, where the failure message on line 418 is printed.
~~~
408 next:
409 free(loc);
410 this = this->next;
411 }
412
413 forced_fail:
414 free_host_list(&hosts);
415
416 /* If we get here we've failed to complete the mount */
417
418 info(ap->logopt, MODPREFIX "nfs: mount failure %s on %s", what, fullpath);
419
420 if (ap->type != LKP_INDIRECT)
421 return 1;
422
423 if ((!(ap->flags & MOUNT_FLAG_GHOST) && name_len) || !existed)
424 rmdir_path(ap, fullpath, ap->dev);
425
426 return 1;
427 }
~~~
Therefore, we never attempt a mount of the nfs-share which we delegated to mount.nfs(8) earlier.
- The issue is a userspace issue
- The issue is reproducible on upstream autofs
- The patch provided above applies to upstream autofs
- The issue is not reproducible if the nfs-share is on a remote system (i.e. a non-local system).
I had to add a lot of prints to the code to make sure that we're falling through the labels and therefore not mounting the nfs-share. :)
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (autofs bug fix and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:5438