Bug 1902445 - Seeing "Input/output error" when mounting a NFS export with AMD (am-utils-6.2.0-27), the file handle given by NFS server is 56 bytes
Keywords:
Status: ASSIGNED
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: am-utils
Version: epel7
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Ian Kent
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-11-29 03:23 UTC by Madhu Thorat
Modified: 2020-11-30 12:21 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments:
Tcpdump captured when using AMD to mount a NFS export (4.98 KB, application/vnd.tcpdump.pcap)
2020-11-29 03:23 UTC, Madhu Thorat

Description Madhu Thorat 2020-11-29 03:23:55 UTC
Created attachment 1734519
Tcpdump captured when using AMD to mount a NFS export

Description of problem:
I recently tried to use AMD (from am-utils-6.2.0-27) to mount an NFS export, but the mount request failed with "Input/output error" on the NFS client.
A tcpdump capture shows that the NFS server sends a 56-byte file handle in the MNT response, but AMD does not use the complete 56-byte handle in the subsequent FSINFO request it sends to the remote NFS server. The server therefore answers the FSINFO request with a "Stale file handle" error (NFS3ERR_STALE), which causes "Input/output error" to be returned for the mount request.
 
Please see the "Steps to Reproduce" section for details. Thank you in advance for your help.

IMPORTANT: Based on the email exchange on the am-utils mailing list at https://lists.fsl.cs.sunysb.edu/mailman/private/am-utils/2020-November/thread.html, the following patch recommended by Ian Kent (ikent) fixes this issue:
> > diff --git a/conf/fh_dref/fh_dref_linux.h b/conf/fh_dref/fh_dref_linux.h
> > index 7ffa5b50..70aac711 100644
> > --- a/conf/fh_dref/fh_dref_linux.h
> > +++ b/conf/fh_dref/fh_dref_linux.h
> > @@ -1,2 +1,2 @@
> >  /* $srcdir/conf/fh_dref/fh_dref_linux.h */
> > -#define        NFS_FH_DREF(dst, src) memcpy((char *) &(dst.data), (char *) src, sizeof(struct nfs_fh))
> > +#define        NFS_FH_DREF(dst, src) memcpy(&(dst.data), src, sizeof(dst.data))

Please include the above patch in a stable package/rpm that is available for general use. Thank you.


Version-Release number of selected component (if applicable): am-utils-6.2.0-27.el7.x86_64.rpm


How reproducible: Always reproducible


Steps to Reproduce:
The NFS client is an x86_64 system running RHEL 7.8.
On the server side, NFS-Ganesha runs on an IBM Spectrum Scale node and uses a GPFS file system.

I followed these steps to perform the mount:
# rpm -Uvh am-utils-6.2.0-27.el7.x86_64.rpm
warning: am-utils-6.2.0-27.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID 352c64e5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
 
# amd
Nov 27 07:20:30 c84f1u04 amd[8595]/info:  using configuration file /etc/amd.conf
 
# mount | grep net
c84f1u04:(pid8596,port1022) on /net type nfs (rw,relatime,sync,vers=3,rsize=4096,wsize=4096,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,nolock,proto=udp,port=1022,timeo=7,retrans=3,sec=sys,local_lock=all,addr=127.0.0.1)

# cd /net/myexp1
-bash: cd: /net/myexp1: Input/output error

The following observations are from the tcpdump captured while using AMD to mount the NFS export. The capture is attached as well.
# tshark -r amd_mount.pcap00 | more
....
 27          0   9.47.91.52 -> 9.47.91.54   MOUNT 140 V3 MNT Call /ibm/gpfs0/testexp1
 28          0   9.47.91.54 -> 9.47.91.52   MOUNT 140 V3 MNT Reply (Call In 27)
....
 36          0   9.47.91.52 -> 9.47.91.54   NFS 204 V3 FSINFO Call, FH: 0x05ef7cc8
 37          0   9.47.91.54 -> 9.47.91.52   NFS 104 V3 FSINFO Reply (Call In 36) Error: NFS3ERR_STALE

# tshark -V -x -nr amd_mount.pcap00 -x -R frame.number==27
Mount Service
    [Program Version: 3]
    [V3 Procedure: MNT (1)]
    Path: /ibm/gpfs0/testexp1
        length: 19
        contents: /ibm/gpfs0/testexp1

# tshark -V -x -nr amd_mount.pcap00 -x -R frame.number==28
Mount Service
    [Program Version: 3]
    [V3 Procedure: MNT (1)]
....
        filehandle: 430000053030000a0002002800092f5b365fabc97a0200000200000000acdf04000000000000000000be84c7270300000000000000000000   <-- the bytes towards the end starting from be84c7.. are missing in the subsequent FSINFO request sent to NFS server
 
 
# tshark -V -x -nr amd_mount.pcap00 -x -R frame.number==36
Network File System, FSINFO Call DH: 0x05ef7cc8
    [Program Version: 3]
....
        filehandle: 430000053030000a0002002800092f5b365fabc97a0200000200000000acdf04000000000000000000000000000000000000000000000000
 
 
For reference, the AMD configuration is as follows:
# cat /etc/amd.conf
# GLOBAL OPTIONS SECTION
[ global ]
normalize_hostnames =   no
print_pid =             yes
pid_file =              /var/run/amd.pid
restart_mounts =        yes
auto_dir =              /.automount
log_file =             /var/log/amd
#log_file =              syslog
log_options =           all
debug_options =        all
plock =                 no
selectors_on_default =  yes
print_version =         no
# set map_type to "nis" for NIS maps, or comment it out to search for all
# types
map_type =              file
search_path =           /etc
browsable_dirs =        yes
show_statfs_entries =   no
fully_qualified_hosts = no
cache_duration =        300
# Fedora doesn't support NFSv2, use the amd NFSv3 server.
auto_nfs_version =      3
# DEFINE AN AMD MOUNT POINT
[ /net ]
map_name =              amd.net
map_type =              file
 
# cat /etc/amd.net
localhost       type:=link;fs:=/
myexp1          type:=nfs;fs=/home/myexp1;rhost:=c84f1u07;rfs:=/ibm/gpfs0/testexp1;opts:=vers=3,rw,intr,nosuid,nodev,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,local_lock=none

Actual results:
As mentioned above, "Input/output error" is returned when AMD is used to mount an NFS export.
# cd /net/myexp1
-bash: cd: /net/myexp1: Input/output error

Expected results:
Mounting an NFS export with AMD should complete successfully without errors.

Additional info:
Based on the email exchange on the am-utils mailing list at https://lists.fsl.cs.sunysb.edu/mailman/private/am-utils/2020-November/thread.html, the following patch recommended by Ian Kent (ikent) fixes this issue:
> > diff --git a/conf/fh_dref/fh_dref_linux.h b/conf/fh_dref/fh_dref_linux.h
> > index 7ffa5b50..70aac711 100644
> > --- a/conf/fh_dref/fh_dref_linux.h
> > +++ b/conf/fh_dref/fh_dref_linux.h
> > @@ -1,2 +1,2 @@
> >  /* $srcdir/conf/fh_dref/fh_dref_linux.h */
> > -#define        NFS_FH_DREF(dst, src) memcpy((char *) &(dst.data), (char *) src, sizeof(struct nfs_fh))
> > +#define        NFS_FH_DREF(dst, src) memcpy(&(dst.data), src, sizeof(dst.data))

Please include this patch in a stable package/rpm that is available for general use. Thank you.

Attaching the tcpdump captured while mounting with AMD.

Comment 1 Ian Kent 2020-11-29 09:17:56 UTC
(In reply to Madhu Thorat from comment #0)
> 
> Additional info:
> Based on the email exchange on the am-utils mailing list at
> https://lists.fsl.cs.sunysb.edu/mailman/private/am-utils/2020-November/thread.html,
> the following patch recommended by Ian Kent (ikent) fixes this issue:
> > > diff --git a/conf/fh_dref/fh_dref_linux.h b/conf/fh_dref/fh_dref_linux.h
> > > index 7ffa5b50..70aac711 100644
> > > --- a/conf/fh_dref/fh_dref_linux.h
> > > +++ b/conf/fh_dref/fh_dref_linux.h
> > > @@ -1,2 +1,2 @@
> > >  /* $srcdir/conf/fh_dref/fh_dref_linux.h */
> > > -#define        NFS_FH_DREF(dst, src) memcpy((char *) &(dst.data), (char *) src, sizeof(struct nfs_fh))
> > > +#define        NFS_FH_DREF(dst, src) memcpy(&(dst.data), src, sizeof(dst.data))
> 
> Please have this patch in a stable package/rpm which can be available for
> general use. Thank you.

I have not been able to find any further problems so it may be that my
variant of the NFS_FH_DREF patch is broken.

I will change my patch to the above patch but if that doesn't resolve
the problem we will need to look further.

> 
> Attaching tcpdump captured when using AMD and doing mount.

I'll have a look at that, thanks.

Comment 2 Ian Kent 2020-11-29 09:49:34 UTC
Could you try this package please:
https://koji.fedoraproject.org/koji/taskinfo?taskID=56394193

Comment 3 Madhu Thorat 2020-11-30 12:21:01 UTC
Ian,

Thank you. I tried the new package and it is working as expected.
# rpm -Uvh am-utils-6.2.0-28.el7.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:am-utils-5:6.2.0-28.el7          ################################# [100%]
[root@c84f1u04 home]# ls -lrt /usr/sbin/amd
-rwxr-xr-x 1 root root 275008 Nov 29 04:43 /usr/sbin/amd
# /usr/sbin/amd
Nov 30 05:08:18 c84f1u04 amd[27669]/info:  using configuration file /etc/amd.conf
# cd /net1/myexp1
# mount | grep myexp1
c84f1u07:/ibm/gpfs0/testexp1 on /home/myexp1 type nfs (rw,nosuid,nodev,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=2049,timeo=600,retrans=3,sec=sys,local_lock=none,addr=9.47.91.54)

