Bug 1406398 - [RFE] Add NFS V4.2 support for ovirt-engine
Summary: [RFE] Add NFS V4.2 support for ovirt-engine
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: RFEs
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.2
Target Release: ---
Assignee: Tal Nisan
QA Contact: Elad
URL:
Whiteboard:
Depends On: 1432783
Blocks: 1488717
 
Reported: 2016-12-20 12:53 UTC by sefi litmanovich
Modified: 2019-04-28 13:36 UTC
CC: 9 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-05-23 08:14:29 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
ykaul: exception+
ylavi: planning_ack+
tnisan: devel_ack+
ratamir: testing_ack+


Attachments
FailedQA (1.35 MB, application/x-gzip)
2017-04-30 12:24 UTC, Elad
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 734120 0 high CLOSED [RFE] use virt-sparsify to reduce image size 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1414798 0 high CLOSED NFS storage domain can't be added on Fedora 24/25 (selinux related) 2023-09-14 03:52:26 UTC
Red Hat Bugzilla 1432783 0 unspecified CLOSED Selinux denying sanlock access to /rhev/data-center/mnt/server:_path/uuid/dom_md/ids mounted using nfs v4.2 2021-02-22 00:41:40 UTC
oVirt gerrit 72049 0 master MERGED ui: Added 4.2 to the list of NFS versions 2020-11-27 16:06:47 UTC
oVirt gerrit 72276 0 ovirt-engine-4.1 MERGED ui: Added 4.2 to the list of NFS versions 2020-11-27 16:06:46 UTC

Internal Links: 734120 1414798 1432783

Description sefi litmanovich 2016-12-20 12:53:29 UTC
Description of problem:

Right now one cannot choose to use an NFS server with NFS v4.2, which enables new features; for example, it would make it possible to use the new sparsify feature introduced in ovirt-engine-4.1.
The option should be added to the 'NFS Version' drop-down in the storage domain menu, or it should be possible to override the 'vers' parameter using the 'Additional mount options' field in the storage domain menu.
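For illustration, this is roughly what the requested mount would look like on the host (a sketch; the export path and mount point are made up, and older nfs-utils clients spell the option nfsvers=4,minorversion=2 instead of vers=4.2):

  mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,vers=4.2 server:/export /mnt/test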

Comment 1 Yaniv Kaul 2016-12-21 10:50:34 UTC
Isn't Autonegotiate good enough?

Comment 2 sefi litmanovich 2016-12-22 08:03:07 UTC
(In reply to Yaniv Kaul from comment #1)
> Isn't Autonegotiate good enough?

I'm not sure what logic it applies; I tried the following:

1. Put the SD on maintenance.
2. I enabled V4.2 on the nfs server by adding 'RPCNFSDARGS="-V 4.2"' to /etc/sysconfig/nfs and restarting the nfs service.
3. Verified the nfs version is enabled:
cat /proc/fs/nfsd/versions
-2 +3 +4 +4.1 +4.2
4. Edited the storage domain and set NFS Version to 'Auto Negotiate'.
5. Activated the storage domain.

Result:
The storage domain was connected, in supervdsm.log I can see that it was mounted using vers=4.0.

If the 'Auto negotiate' option is expected to provide the highest version supported by the nfs server, then I will open a separate bug.

As for this RFE, I cannot see the harm in adding the version to the drop down list, unless there's a specific reason why we don't want the user to use that version.

Comment 3 Yaniv Kaul 2016-12-22 08:07:32 UTC
(In reply to sefi litmanovich from comment #2)
> (In reply to Yaniv Kaul from comment #1)
> > Isn't Autonegotiate good enough?
> 
> I'm not sure what logic it applies, tried to do the following:
> 
> 1. Put the SD on maintenance.
> 2. I enabled V4.2 on the nfs server by adding 'RPCNFSDARGS="-V 4.2"' to
> /etc/sysconfig/nfs and restarting the nfs service.
> 3. Verified the nfs version is enabled:
> cat /proc/fs/nfsd/versions               
> -2 +3 +4 +4.1 +4.2
> 3. Edit the storage domain and set NFS Version to 'Auto Negotiate'.
> 4. Activated the storage domain.
> 
> Result:
> The storage domain was connected, in supervdsm.log I can see that it was
> mounted using vers=4.0.

When you do it manually, what do you get? v4.2?
What do you see in the mount command?

> 
> If the 'Auto negotiate' option is expected to provide with the highest
> supported version of the nfs server then I will open a separate bug.
> 
> As for this RFE, I cannot see the harm in adding the version to the drop
> down list, unless there's a specific reason why we don't want the user to
> use that version.

Agreed.

Comment 4 sefi litmanovich 2016-12-25 11:54:18 UTC
(In reply to Yaniv Kaul from comment #3)
> (In reply to sefi litmanovich from comment #2)
> > (In reply to Yaniv Kaul from comment #1)
> > > Isn't Autonegotiate good enough?
> > 
> > I'm not sure what logic it applies, tried to do the following:
> > 
> > 1. Put the SD on maintenance.
> > 2. I enabled V4.2 on the nfs server by adding 'RPCNFSDARGS="-V 4.2"' to
> > /etc/sysconfig/nfs and restarting the nfs service.
> > 3. Verified the nfs version is enabled:
> > cat /proc/fs/nfsd/versions               
> > -2 +3 +4 +4.1 +4.2
> > 3. Edit the storage domain and set NFS Version to 'Auto Negotiate'.
> > 4. Activated the storage domain.
> > 
> > Result:
> > The storage domain was connected, in supervdsm.log I can see that it was
> > mounted using vers=4.0.
> 
> When you do it manually, what do you get? v4.2?
> What do you see in the mount command?
What do you mean by manually? On the rhevm side I can only set the NFS version to 3, 4, 4.1 or auto negotiate. When I set it to auto negotiate I get:

/usr/bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=4 (snippet from supervdsm.log)

If I explicitly choose V4.1 then I get:

/usr/bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=4,minorversion=1

So I'd expect to get nfsvers=4,minorversion=2 if V4.2 is enabled on the server and I set auto negotiate.

> > 
> > If the 'Auto negotiate' option is expected to provide with the highest
> > supported version of the nfs server then I will open a separate bug.
> > 
> > As for this RFE, I cannot see the harm in adding the version to the drop
> > down list, unless there's a specific reason why we don't want the user to
> > use that version.
> 
> Agreed.
What about the 'Auto negotiate' issue? Is it a bug, or is this the expected behaviour?

Comment 5 Yaniv Kaul 2016-12-28 21:53:36 UTC
When I tried with auto I did not see any nfsvers sent.

Comment 6 sefi litmanovich 2016-12-29 13:06:32 UTC
(In reply to Yaniv Kaul from comment #5)
> When I tried with auto I did not see and nfsvers sent.

You are correct, that's my mistake; I was looking at the output from when I ran with version 4 explicitly. The only indication for me is that the sparsify feature has no effect when it should. Sorry for the misinformation.
In any case, it is running version 4.0; I can see this when using mount -v from the SPM host, where I get:
{my_server_ip}:/var/nfsexport on /rhev/data-center/mnt/{my_server_ip}:_var_nfsexport type nfs4 (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,port=0,timeo=600,retrans=6,sec=sys,clientaddr={my_server_ip},local_lock=none,addr={my_server_ip})

whereas on my server 4.1 and 4.2 are enabled as well:
%cat /proc/fs/nfsd/versions              
-2 +3 +4 +4.1 +4.2

Comment 7 Yaniv Kaul 2016-12-29 13:30:37 UTC
(In reply to sefi litmanovich from comment #6)
> (In reply to Yaniv Kaul from comment #5)
> > When I tried with auto I did not see and nfsvers sent.
> 
> You are correct, that's my mistake, I was looking at the output when I ran
> with version 4 explicitly. The only indication for me is that the sparsify
> feature has no effect, where it should have. Sorry for the misinformation.
> In any case, it is running version 4.0, this I can see when using mount -v
> from the SPM host, I get:
> {my_server_ip}:/var/nfsexport on
> /rhev/data-center/mnt/{my_server_ip}:_var_nfsexport type nfs4
> (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,
> nosharecache,proto=tcp,port=0,timeo=600,retrans=6,sec=sys,
> clientaddr={my_server_ip},local_lock=none,addr={my_server_ip})
> 
> where on my server 4.1 and 4.2 are enabled as well:
> %cat /proc/fs/nfsd/versions              
> -2 +3 +4 +4.1 +4.2

1. With 'auto' - what do you end up getting? I ended up with 4.0 for some reason - I did not see supervdsm asking for any specific version!
2. When manually trying the 'mount' command with 4.2, does it work?

Comment 8 sefi litmanovich 2017-01-01 10:26:02 UTC
(In reply to Yaniv Kaul from comment #7)
> (In reply to sefi litmanovich from comment #6)
> > (In reply to Yaniv Kaul from comment #5)
> > > When I tried with auto I did not see and nfsvers sent.
> > 
> > You are correct, that's my mistake, I was looking at the output when I ran
> > with version 4 explicitly. The only indication for me is that the sparsify
> > feature has no effect, where it should have. Sorry for the misinformation.
> > In any case, it is running version 4.0, this I can see when using mount -v
> > from the SPM host, I get:
> > {my_server_ip}:/var/nfsexport on
> > /rhev/data-center/mnt/{my_server_ip}:_var_nfsexport type nfs4
> > (rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,
> > nosharecache,proto=tcp,port=0,timeo=600,retrans=6,sec=sys,
> > clientaddr={my_server_ip},local_lock=none,addr={my_server_ip})
> > 
> > where on my server 4.1 and 4.2 are enabled as well:
> > %cat /proc/fs/nfsd/versions              
> > -2 +3 +4 +4.1 +4.2
> 
> 1. With 'auto' - what do you end up getting? I ended up with 4.0 for some
> reason - I did not see supervdsm asking for any specific version!
> 2. When manually trying the 'mount' command with 4.2, does it work?

1. Yes, the output I attached in this comment is from running with 'auto', so yes, it also ends up with 4.0. I see in supervdsm that the mount cmd doesn't pass any option for the version.
2. Checking manually, I can confirm that (a) when connecting without the vers option it automatically chooses 4.0, and (b) when manually setting vers=4.2 it works fine (see the sketch below).

So to conclude - no bug in the 'auto' option, but still no option to work with nfs 4.2, so this RFE stands on its own, I think.
Solution - add 4.2 to the drop-down so it is passed explicitly to the mount call, or make it possible to override the vers option using the 'Additional mount options' field.
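For reference, the manual check from item 2 above can be reproduced with something along these lines (a sketch; the mount point is hypothetical):

  mount -t nfs -o vers=4.2 server:/var/nfsexport /mnt/test
  mount | grep /mnt/test    # should show vers=4.2 in the mount options
  umount /mnt/test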

Comment 9 Nir Soffer 2017-02-03 12:50:21 UTC
Tal, Michal, without NFS 4.2 support, pass discard and sparsify are useless with
NFS storage.

We should schedule this to next 4.1 build.

Comment 10 Tal Nisan 2017-02-05 10:03:30 UTC
Yaniv, can we introduce it for the next 4.1.1 build?

Comment 11 Red Hat Bugzilla Rules Engine 2017-02-05 10:03:37 UTC
This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.

Comment 12 Nir Soffer 2017-02-12 15:03:29 UTC
I found a workaround for using NFS 4.2:

1. Create POSIX domain with custom options:

  path: server:/export
  vfs type: nfs
  mount options: minorversion=2,soft,timeo=600,retrans=6

The mount options are the same options used by vdsm.

2. Fix selinux context on mountpoint:

  chcon -R -t nfs_t /rhev/data-center/mnt/server:_export
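With the custom options from step 1, the mount issued on the host should end up roughly like this (a sketch; server and export path are placeholders):

  /usr/bin/mount -t nfs -o minorversion=2,soft,timeo=600,retrans=6 server:/export /rhev/data-center/mnt/server:_export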

Not sure why selinux works for an NFS domain but not for a POSIX
domain; on the vdsm side both are handled by the same code, and
we don't have any selinux-specific code in vdsm.

The selinux issue probably should have its own bug; I think we
have a related bug for nfs on Fedora 24.

Comment 13 Yaniv Lavi 2017-02-13 23:04:50 UTC
What is the question?

Comment 14 Tal Nisan 2017-02-14 09:47:39 UTC
(In reply to Tal Nisan from comment #10)
> Yaniv, can we introduce it for the next 4.1.1 build?

Comment 15 Yaniv Kaul 2017-02-16 15:05:07 UTC
Do we need a separate bug for REST API?

Comment 16 Tal Nisan 2017-02-19 16:56:31 UTC
Done - bug 1424821, also sent a patch

Comment 17 Yaniv Lavi 2017-04-04 12:27:39 UTC
Pushing to 4.1.2 due to BZ #1432783.

Comment 18 Nir Soffer 2017-04-05 09:25:17 UTC
(In reply to Yaniv Dary from comment #17)
> Pushing to 4.1.2 due to BZ #1432783.

I don't think this bug depends on bug 1432783, since we now support nfs 4.2.

With nfs 4.2, the server must have the correct selinux labels (nfs_t); this is
a documentation issue.

If we want to support nfs servers which do not support setting selinux labels, we
may need additional work, but let's open a separate bug to track this.

Comment 19 Yaniv Lavi 2017-04-19 08:27:03 UTC
(In reply to Nir Soffer from comment #18)
> 
> I don't think this bug depends on bug 1432783, since we support now nfs 4.2.
> 
> With nfs 4.2, the server must have the correct selinux labels (nfs_t), this
> is
> documentation issue.

This makes sense to me.
Meital, can you please make sure you defined the NFS server correctly?

Comment 20 Yaniv Kaul 2017-04-19 12:09:52 UTC
(In reply to Yaniv Dary from comment #19)
> (In reply to Nir Soffer from comment #18)
> > 
> > I don't think this bug depends on bug 1432783, since we support now nfs 4.2.
> > 
> > With nfs 4.2, the server must have the correct selinux labels (nfs_t), this
> > is
> > documentation issue.
> 
> This makes sense to me.
> Meital, can you please make sure you defined the NFS server correctly?

That's only if the server supports selinux labels - which are an optional part of 4.2.
So for example, NetApp may support 4.2 without selinux.

I've configured the ovirt-system-tests storage server (LIO-based) correctly, and it worked.

The commands used:
semanage fcontext -a -t nfs_t '/exports/nfs(/.*)?'
restorecon -Rv /exports/nfs
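A quick way to verify the labels took effect would be something like (a sketch, not part of the original steps):

  ls -dZ /exports/nfs         # the context should now include nfs_t
  matchpathcon /exports/nfs   # should report the nfs_t rule added above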

Comment 21 Nir Soffer 2017-04-19 13:44:06 UTC
Maybe we should use the mount *context options:

context=context, fscontext=context, defcontext=context, and rootcontext=context

The context= option is useful when mounting filesystems that do not support
extended attributes, such as a floppy or hard disk formatted with VFAT, or
systems that are not normally running under SELinux, such as an ext3 formatted
disk from a non-SELinux workstation. You can also use context= on filesystems
you do not trust, such as a floppy. It also helps in compatibility with
xattr-supporting filesystems on earlier 2.4.<x> kernel versions. Even where
xattrs are supported, you can save time not having to label every file by
assigning the entire disk one security context.

A commonly used option for removable media is
context="system_u:object_r:removable_t".

Two other options are fscontext= and defcontext=, both of which are mutually
exclusive of the context option. This means you can use fscontext and
defcontext with each other, but neither can be used with context.

The fscontext= option works for all filesystems, regardless of their xattr
support. The fscontext option sets the overarching filesystem label to a
specific security context. This filesystem label is separate from the
individual labels on the files. It represents the entire filesystem for
certain kinds of permission checks, such as during mount or file creation.
Individual file labels are still obtained from the xattrs on the files
themselves. The context option actually sets the aggregate context that
fscontext provides, in addition to supplying the same label for individual
files.

You can set the default security context for unlabeled files using defcontext=
option. This overrides the value set for unlabeled files in the policy and
requires a filesystem that supports xattr labeling.

The rootcontext= option allows you to explicitly label the root inode of a
FS being mounted before that FS or inode becomes visible to userspace. This
was found to be useful for things like stateless linux.

Note that the kernel rejects any remount request that includes the context
option, even when unchanged from the current context.

Warning: the context value might contain commas, in which case the value has
to be properly quoted, otherwise mount(8) will interpret the comma as a
separator between mount options. Don't forget that the shell strips off
quotes and thus double quoting is required. For example:

Comment 22 Nir Soffer 2017-04-19 13:45:35 UTC
(Continued, somehow posted again too early)

	 mount -t tmpfs none /mnt -o \
	   'context="system_u:object_r:tmp_t:s0:c127,c456",noexec'

For more details, see selinux(8).
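Applied to the NFS mounts discussed here, that would mean passing something along these lines (an untested sketch; the label value and paths are assumptions):

	 mount -t nfs -o 'vers=4.2,context="system_u:object_r:nfs_t:s0"' server:/export /mnt/test

Note that using context= assigns a single label to the whole mount instead of relying on the labels exported by the server.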

Comment 23 Elad 2017-04-30 12:21:10 UTC
It seems that NFS 4.2 storage domain attachment to the pool fails with a Sanlock exception [1].

I'm using:
vdsm-4.19.11-1.el7ev.x86_64
sanlock-3.4.0-1.el7.x86_64
nfs-utils-1.3.0-0.33.el7_3.x86_64
rhevm-4.1.2-0.1.el7.noarch


Storage domain creation succeeded, here is the mount on the host:

yellow-vdsb.qa.lab.tlv.redhat.com:/Storage_NFS/storage_local_ge1_nfs_4 on /rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge1__nfs__4 type nfs4 (rw,relatime,seclabel,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,port=0,timeo=600,retrans=6,sec=sys,clientaddr=10.35.82.52,local_lock=none,addr=10.35.80.5)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=388008k,mode=700)




[1]
2017-04-30 15:18:07,582+0300 ERROR (jsonrpc/6) [storage.TaskManager.Task] (Task='b13960de-0a82-4411-9164-9d17b74a7bdf') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1159, in attachStorageDomain
    pool.attachSD(sdUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 971, in attachSD
    dom.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
    self._manifest.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
    self._domainLock.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'ba5fc183-2ad8-448e-979b-49cd69c8af51', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))
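
If this is the SELinux denial tracked in bug 1432783, it should show up in the audit log on the host; a quick check would be something like (a sketch, assuming auditd is running):

  ausearch -m avc -ts recent | grep sanlock
  sanlock client status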

Comment 24 Red Hat Bugzilla Rules Engine 2017-04-30 12:21:18 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 25 Elad 2017-04-30 12:24:03 UTC
Created attachment 1275275 [details]
FailedQA

Comment 26 Yaniv Kaul 2017-04-30 12:34:20 UTC
Elad, thanks for the report. From your log, I suspect a SELinux issue.
1. Did you try with a non-Linux NFS server?
2. Did you perform the needed changes on the Linux NFS server (see the comments above)?

Comment 27 Elad 2017-04-30 12:42:36 UTC
(In reply to Yaniv Kaul from comment #26)
> Elad, thanks for the report. From your log, I suspect a SELinux issue.
> 1. Did you try with a non-Linux NFS server?
I tried with a RHEL 7.2 NFS server; in fact, it's the same server used by Sefi (the bug reporter).
> 2. Did you perform the needed changes on the Linux NFS server (see comments
> above) ?
Do you mean configuration on the NFS server? If so, we've configured this server with 4.2:

/etc/exports:

/Storage_NFS 10.0.0.0/255.0.0.0(rw,no_root_squash)
.
.
/  *(fsid=0,rw,no_root_squash)


[root@yellow-vdsb ~]# cat /proc/fs/nfsd/versions
-2 +3 +4 +4.1 +4.2
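
For completeness, enabling 4.2 on a RHEL 7 NFS server follows the recipe from comment 2 (a sketch; the service name may differ between setups):

  # in /etc/sysconfig/nfs:
  RPCNFSDARGS="-V 4.2"

  systemctl restart nfs-server
  cat /proc/fs/nfsd/versions    # should list +4.2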

Comment 28 Yaniv Kaul 2017-04-30 12:44:27 UTC
Elad, please try the following:
    semanage fcontext -a -t nfs_t '/Storage_NFS(/.*)?'
    restorecon -Rv /Storage_NFS

Comment 29 Elad 2017-04-30 13:15:18 UTC
(In reply to Yaniv Kaul from comment #28)
> Elad, please try the following:
>     semanage fcontext -a -t nfs_t '/Storage_NFS(/.*)?'
>     restorecon -Rv /Storage_NFS

Tried it; now storage domain attachment to the pool succeeds, and the domain is attached and active in the pool.

Yaniv, shouldn't we add documentation for this as part of preparation for 4.2 NFS storage domain creation? 

From my side, I can move to VERIFIED.

