Bug 1418130

Summary: opts:=noatime,ro cause "protocol not supported" error
Product: [Fedora] Fedora EPEL
Reporter: nomad
Component: am-utils
Assignee: Ian Kent <ikent>
Status: ASSIGNED
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: epel7
CC: ikent, nomad
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Type: Bug
Attachments:
Description                              Flags
log output with debug fully enabled      none
log file showing successful mounts       none

Description nomad 2017-02-01 00:16:46 UTC
Description of problem:
When using am-utils entries like:

chicken-b/attic_h       type:=nfs;opts:=noatime,ro,hard,proto=tcp;rhost:=chicken-b;rfs:=/vol/attic_h


the mount fails with "ls: cannot access /n/chicken-b/attic_h: Permission denied". When noatime,ro is removed from opts, the mount works. Note: the failure occurs with either option alone, as well as with both present.


Version-Release number of selected component (if applicable):
am-utils-6.2.0-22.el7.x86_64


How reproducible:
100%

Steps to Reproduce:
1. add noatime or ro opts to an existing map entry for an NFS mount
2. try the mount

Actual results:
: [88] ; ls /n/chicken-b/attic_h
ls: cannot access /n/chicken-b/attic_h: Permission denied


Expected results:
A listing of the contents of the top directory in the attic_h filesystem.

Additional info:
Last known working release was 6.2.0-10.

debug log output:
: [89] ; cat /var/log/amd
Jan 31 16:04:36 quillpen amd[31054]/debug: dir name /vol/attic_h
Jan 31 16:04:36 quillpen amd[31054]/error: '/atm/chicken-b/vol/attic_h': mount: Invalid argument
Jan 31 16:04:36 quillpen amd[31034]/warn:  find_nfs_srvr: NFS mount failed, trying again with NFSv2/UDP
Jan 31 16:04:36 quillpen amd[31055]/error: '/atm/chicken-b/vol/attic_h': mount: Protocol not supported
Jan 31 16:04:36 quillpen amd[31034]/error: /n/chicken-b/attic_h: mount (amfs_cont): Protocol not supported

Comment 1 nomad 2017-02-01 00:18:13 UTC
Forgot to say, manual mounts using the same options do work.

nomad

Comment 2 Ian Kent 2017-02-01 04:53:36 UTC
I'll take a look, thanks for the report.

Comment 3 Ian Kent 2017-02-01 09:43:54 UTC
Can you post /etc/amd.conf please?

Comment 4 nomad 2017-02-01 16:07:04 UTC
[global]
auto_dir =			/atm
# Fedora doesn't support NFSv2, use the amd NFSv3 server.
auto_nfs_version =              3
browsable_dirs =		no
dismount_interval =		300
fully_qualified_hosts =		no
log_file =			/var/log/amd
log_options =			fatal,noinfo,nostats,debug
map_type =			file
nfs_retransmit_counter =	20
nfs_retry_interval =		16
normalize_hostnames =		no
plock =				yes
print_pid =			no
print_version =			no
restart_mounts =		yes
search_path =			/etc/amd
selectors_on_default =		yes

[/g]
map_name = amd.group
map_type = file

[/n]
map_name = amd.network
map_type = file

[/users]
map_name = amd.homes
map_type = file

Comment 5 Ian Kent 2017-02-01 23:49:01 UTC
(In reply to nomad from comment #4)
> [global]
> auto_dir =			/atm
> # Fedora doesn't support NFSv2, use the amd NFSv3 server.
> auto_nfs_version =              3

Right, I had to check that you had auto_nfs_version set.

I can't reproduce this on CentOS; I think it's 7.3.

Could you post a debug log, please?

Comment 6 nomad 2017-02-02 16:57:52 UTC
I included the entire contents of /var/log/amd after setting log_options to include debug. Are there additional settings I can/should use?

re-posted here for convenience:

: [89] ; cat /var/log/amd
Jan 31 16:04:36 quillpen amd[31054]/debug: dir name /vol/attic_h
Jan 31 16:04:36 quillpen amd[31054]/error: '/atm/chicken-b/vol/attic_h': mount: Invalid argument
Jan 31 16:04:36 quillpen amd[31034]/warn:  find_nfs_srvr: NFS mount failed, trying again with NFSv2/UDP
Jan 31 16:04:36 quillpen amd[31055]/error: '/atm/chicken-b/vol/attic_h': mount: Protocol not supported
Jan 31 16:04:36 quillpen amd[31034]/error: /n/chicken-b/attic_h: mount (amfs_cont): Protocol not supported

Comment 7 nomad 2017-02-02 16:58:56 UTC
/etc/redhat-release says 7.3.1611.

nomad

Comment 8 Ian Kent 2017-02-16 06:53:16 UTC
(In reply to nomad from comment #7)
> /etc/redhat-release says 7.3.1611.

I'm pretty sure that's what I'm testing with too.
Can you post the full debug log, please?

Comment 9 Ian Kent 2017-02-16 07:00:29 UTC
(In reply to nomad from comment #6)
> I included the entire contents of /var/log/amd after setting log_options to
> include debug. Are there additional settings I can/should use?
> 
> re-posted here for convenience:
> 
> : [89] ; cat /var/log/amd
> Jan 31 16:04:36 quillpen amd[31054]/debug: dir name /vol/attic_h
> Jan 31 16:04:36 quillpen amd[31054]/error: '/atm/chicken-b/vol/attic_h':
> mount: Invalid argument
> Jan 31 16:04:36 quillpen amd[31034]/warn:  find_nfs_srvr: NFS mount failed,
> trying again with NFSv2/UDP

Oh right, the "Protocol not supported" message is probably coming from
the fallback to NFSv2, so that's sensible and sounds correct.

Which means we need to focus on the "Invalid argument" error.
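
To make the reading of the log above concrete, here is a minimal sketch that maps the message at the end of each amd error line back to a symbolic errno name. The log text is copied from this report; the function name, the regular expression, and the assumption that the message is always the last ": "-separated field are illustrative and not part of am-utils.

```python
import errno
import os
import re

# Log lines copied from the report above.
LOG = """\
Jan 31 16:04:36 quillpen amd[31054]/debug: dir name /vol/attic_h
Jan 31 16:04:36 quillpen amd[31054]/error: '/atm/chicken-b/vol/attic_h': mount: Invalid argument
Jan 31 16:04:36 quillpen amd[31034]/warn:  find_nfs_srvr: NFS mount failed, trying again with NFSv2/UDP
Jan 31 16:04:36 quillpen amd[31055]/error: '/atm/chicken-b/vol/attic_h': mount: Protocol not supported
Jan 31 16:04:36 quillpen amd[31034]/error: /n/chicken-b/attic_h: mount (amfs_cont): Protocol not supported
"""

# Reverse table: strerror() text -> symbolic errno name on this platform.
MESSAGE_TO_NAME = {os.strerror(code): name for code, name in errno.errorcode.items()}

# Matches "amd[PID]/error: ..." and captures the PID and the rest of the line.
ERROR_LINE = re.compile(r"amd\[(?P<pid>\d+)\]/error: (?P<rest>.*)$")


def mount_errors(log_text):
    """Return (pid, errno_name) for each amd error line, in log order."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_LINE.search(line)
        if match is None:
            continue
        # Assumption: the strerror() text is the last ": "-separated field.
        message = match.group("rest").rsplit(": ", 1)[-1]
        found.append((int(match.group("pid")), MESSAGE_TO_NAME.get(message, "UNKNOWN")))
    return found


for pid, name in mount_errors(LOG):
    print(pid, name)
```

On a Linux host the first error line (the initial mount attempt) should map to EINVAL and the later ones (after the NFSv2/UDP retry) to EPROTONOSUPPORT, which matches the ordering described above.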

It's probably best to send the log output to a specific log file and
set all the debug options so we don't miss anything.

What I use (which gets very verbose output) to make sure I don't
miss anything is:

log_file = /var/log/amd
log_options = all
debug_options = all

Comment 10 nomad 2017-02-16 16:11:06 UTC
Created attachment 1250900 [details]
log output with debug fully enabled

The host was a bit busy but I did manage to capture the attempted mount of /n/chicken-b/data.

Comment 11 Ian Kent 2017-02-16 23:57:40 UTC
(In reply to nomad from comment #10)
> Created attachment 1250900 [details]
> log output with debug fully enabled
> 
> The host was a bit busy but I did manage to capture the attempted mount of
> /n/chicken-b/data.

Thanks for the log.

According to the log the mount(2) system call is returning EINVAL
with parameters that look ok.

But I see it's trying NFSv4, getting an error and falling back to
NFSv2 and, of course, failing because v2 isn't supported by the
RHEL kernel.

I think it should be falling back to v3 not v2, I'll need to have
a look and see if I can work out why it's doing that.

Should v4 mounts work for this host?
What happens if you specify NFSv3 in the mount options?

Comment 12 Ian Kent 2017-02-17 00:00:30 UTC
Oh, and what kernel revision are you using?
Output of "uname -r" will give us that.

Comment 13 Ian Kent 2017-02-17 00:03:40 UTC
Umm, sorry, could you try and get me a debug log of this without
noatime and ro options (IOW a successful mount attempt)?

Comment 14 nomad 2017-02-21 16:57:09 UTC
> ...kernel revision...
3.10.0-514.2.2.el7.x86_64

> Should v4 mounts work for this host?

I would think so. This is a standard CentOS 7 host mounting a filesystem exported by a NetApp filer. When I try to manually do the mount it works:
 
: || lvd@quillpen / [114] ; sudo mount.nfs -o ro,noatime chicken-b:/vol/data /media
: || lvd@quillpen / [115] ; df -h !$
df -h /media
Filesystem           Size  Used Avail Use% Mounted on
chicken-b:/vol/data  3.7T  2.1T  1.7T  55% /media

mount(8) shows:

chicken-b:/vol/data on /media type nfs4 (ro,noatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.232.18,local_lock=none,addr=192.168.232.31)


> What happens if you specify NFSv3 in the mount options?

Interesting. amd.conf has the following setting:

auto_nfs_version =              3

And I don't think there's anything in amd.network that would override that. However, every mount seems to be nfs4.

That said, forcing it with opts:=vers=3 on the type:=nfs line seems to work.

chicken-b/attic_h	type:=nfs;opts:=vers=3,noatime,ro,hard,proto=tcp;rhost:=chicken-b;rfs:=/vol/attic_h

mount(8) shows:

chicken-b:/vol/attic_h on /atm/chicken-b/vol/attic_h type nfs (ro,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=2049,timeo=16,retrans=20,sec=sys,local_lock=none,addr=192.168.232.31)

(I'd swear I tried that early on and it didn't work. Must have been a different combination of things.)
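
For comparison, the two mount(8) option lists above can be diffed mechanically. This is a small sketch only; the helper names are made up for illustration, and the two option strings are copied verbatim from the output in this comment.

```python
def parse_mount_options(option_string):
    """Split a mount(8) option list into a dict; bare flags map to True."""
    options = {}
    for item in option_string.split(","):
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return options


def diff_options(left, right):
    """Return {key: (left_value, right_value)} for keys whose values differ."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


# Manual mount.nfs mount (negotiated NFSv4), copied from the output above.
MANUAL_V4 = (
    "ro,noatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,"
    "port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.232.18,"
    "local_lock=none,addr=192.168.232.31"
)
# amd mount with opts:=vers=3 forced, copied from the output above.
AMD_V3 = (
    "ro,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,"
    "port=2049,timeo=16,retrans=20,sec=sys,local_lock=none,addr=192.168.232.31"
)

differences = diff_options(parse_mount_options(MANUAL_V4), parse_mount_options(AMD_V3))
for key, (manual, amd) in differences.items():
    print(f"{key}: manual={manual} amd={amd}")
```

Besides vers, the two mounts differ in port, timeo, retrans and clientaddr. The amd values timeo=16 and retrans=20 appear to match nfs_retry_interval and nfs_retransmit_counter in the amd.conf posted in comment 4, though that link is an inference from the numbers, not something confirmed here.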

> a debug log of this without noatime and ro options

I'll upload a new log which shows both the successful v3 mount and the successful noatime,ro mount.

Comment 15 nomad 2017-02-21 16:57:55 UTC
Created attachment 1256200 [details]
log file showing successful mounts

Comment 16 Ian Kent 2017-02-22 01:56:19 UTC
(In reply to nomad from comment #14)
> > ...kernel revision...
> 3.10.0-514.2.2.el7.x86_64

Thanks, I'll have a look at the NFS mount options evaluation
in that and see if I can spot anything.

> 
> > Should v4 mounts work for this host?
> 
> I would think so. This is a standard CentOS 7 host mounting a filesystem
> exported by a NetApp filer. When I try to manually do the mount it works:
>  
> : || lvd@quillpen / [114] ; sudo mount.nfs -o ro,noatime chicken-b:/vol/data
> /media
> : || lvd@quillpen / [115] ; df -h !$
> df -h /media
> Filesystem           Size  Used Avail Use% Mounted on
> chicken-b:/vol/data  3.7T  2.1T  1.7T  55% /media

Right, but I don't know what mount.nfs(8) might be doing to
the options before calling mount(2) and we know the mount(2)
call in amd is returning EINVAL. From memory I thought
mount.nfs(8) didn't do much at all to the options for NFSv4
mounts ....

I couldn't see where this happens looking at recent kernel
code either; it's not that straightforward to work out.

> 
> mount(8) shows:
> 
> chicken-b:/vol/data on /media type nfs4
> (ro,noatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,
> port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.232.18,local_lock=none,
> addr=192.168.232.31)
> 
> 
> > What happens if you specify NFSv3 in the mount options?
> 
> Interesting. amd.conf has the following setting:
> 
> auto_nfs_version =              3

LOL, that option is for the amd internal NFS server which
handles the triggering of mounts.

Historically amd used NFSv2 for this but Fedora and RHEL-7
kernels don't enable NFSv2 any more so the internal server
needed to be updated to use NFSv3.

It doesn't actually affect options used for mounts that are
made based on map entries.

> 
> And I don't think there's anything in amd.network that would override that.
> However, every mount seems to be nfs4.
> 
> That said, forcing it with opts:=vers=3 on the type:=nfs line seems to work.
> 
> chicken-b/attic_h
> type:=nfs;opts:=vers=3,noatime,ro,hard,proto=tcp;rhost:=chicken-b;rfs:=/vol/
> attic_h
> 
> mount(8) shows:
> 
> chicken-b:/vol/attic_h on /atm/chicken-b/vol/attic_h type nfs
> (ro,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,
> port=2049,timeo=16,retrans=20,sec=sys,local_lock=none,addr=192.168.232.31)

The thing is that, for NFSv4 mounts, amd checks the return and only
tries NFSv3 for two possible error returns, and neither is EINVAL.

So there's no fallback to NFSv3 made and it then tries an NFSv2
mount which will always fail because the kernel is configured to
not support v2.

I'm not sure yet if I should just add EINVAL to that, I'll need to
look at it again. There's also the question of why we are getting
EINVAL which needs to be understood.
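
The fallback behaviour described above can be modelled roughly as follows. This is a sketch of the decision only, not the am-utils code: the function name is invented, and the two errno values in PLACEHOLDER_V3_FALLBACK are hypothetical stand-ins, since the actual two error returns are not named in this report.

```python
import errno

# HYPOTHETICAL placeholders: the report says two error returns trigger the
# NFSv3 retry but does not say which ones. These values are NOT from amd.
PLACEHOLDER_V3_FALLBACK = frozenset({errno.ENOENT, errno.EPROTONOSUPPORT})


def next_version_after_v4_failure(v4_errno, v3_fallback_errnos):
    """Return the NFS version tried after a failed NFSv4 mount attempt.

    Models the described behaviour: NFSv3 is tried only when the NFSv4
    error is in the fallback set; otherwise the retry goes to NFSv2.
    """
    if v4_errno in v3_fallback_errnos:
        return 3
    return 2


# As described: EINVAL is not in the set, so the retry goes to NFSv2,
# which the kernel does not support.
print(next_version_after_v4_failure(errno.EINVAL, PLACEHOLDER_V3_FALLBACK))

# The change under consideration: adding EINVAL to the set would send
# the retry to NFSv3 instead.
with_einval = PLACEHOLDER_V3_FALLBACK | {errno.EINVAL}
print(next_version_after_v4_failure(errno.EINVAL, with_einval))
```

Adding EINVAL to the set would only change which version is retried; it would not explain why the NFSv4 mount returns EINVAL in the first place.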

Unfortunately I have some time-critical work (and have been working
on that for a while now) so I need to delay this.

But I think you have a workaround for the time being so I hope that
won't be too much of a problem.

Ian