Bug 1443913

Summary: mount command on supervdsm never completes
Product: [oVirt] vdsm
Reporter: gshinar
Component: General
Assignee: Nir Soffer <nsoffer>
Status: CLOSED WONTFIX
QA Contact: Raz Tamir <ratamir>
Severity: high
Docs Contact:
Priority: medium
Version: 4.20.0
CC: amureini, bugs, gshinar, nsoffer, tnisan, ykaul, ylavi
Target Milestone: ---
Keywords: Automation, Reopened
Target Release: ---
Flags: sbonazzo: ovirt-4.2-
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-09-03 14:54:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description              Flags
supervdsm log            none
engine log               none
vdsm log                 none
The right supervdsm log  none

Description gshinar 2017-04-20 08:53:54 UTC
Created attachment 1272891
supervdsm log

Description of problem:
Two hosts request an NFS mount. The mount on host1 succeeds, but the mount on host0 never completes and the request fails with a timeout.


Version-Release number of selected component (if applicable):
master branch


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
MainProcess|jsonrpc/3::DEBUG::2017-04-19 18:55:58,076::supervdsm_server::92::SuperVdsm.ServerCallback::(wrapper) call mount with (u'192.168.201.3:/exports/nfs/share1', u'/rhev/data-center/mnt/192.168.201.3:_exports_nfs_share1') {'vfstype': 'nfs', 'mntOpts': 'soft,nosharecache,timeo=600,retrans=6,nfsvers=4,minorversion=2', 'timeout': None, 'cgroup': None}
MainProcess|jsonrpc/3::DEBUG::2017-04-19 18:55:58,076::commands::69::root::(execCmd) /usr/bin/taskset --cpu-list 0-1 /usr/bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=4,minorversion=2 192.168.201.3:/exports/nfs/share1 /rhev/data-center/mnt/192.168.201.3:_exports_nfs_share1 (cwd None)


Expected results:
MainProcess|jsonrpc/5::DEBUG::2017-04-19 18:55:56,300::supervdsm_server::92::SuperVdsm.ServerCallback::(wrapper) call mount with (u'192.168.201.3:/exports/nfs/share2', u'/rhev/data-center/mnt/192.168.201.3:_exports_nfs_share2') {'vfstype': 'nfs', 'mntOpts': 'soft,nosharecache,timeo=600,retrans=6,nfsvers=4,minorversion=1', 'timeout': None, 'cgroup': None}
MainProcess|jsonrpc/5::DEBUG::2017-04-19 18:55:56,301::commands::69::root::(execCmd) /usr/bin/taskset --cpu-list 0-1 /usr/bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=4,minorversion=1 192.168.201.3:/exports/nfs/share2 /rhev/data-center/mnt/192.168.201.3:_exports_nfs_share2 (cwd None)
MainProcess|jsonrpc/5::DEBUG::2017-04-19 18:55:56,338::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) return mount with None
MainProcess|jsonrpc/5::DEBUG::2017-04-19 18:55:56,357::supervdsm_server::92::SuperVdsm.ServerCallback::(wrapper) call validateAccess with ('vdsm', ('kvm',), '/rhev/data-center/mnt/192.168.201.3:_exports_nfs_share2', 7) {}
MainProcess|jsonrpc/5::DEBUG::2017-04-19 18:55:56,362::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) return validateAccess with None
MainProcess|jsonrpc/5::DEBUG::2017-04-19 18:55:56,362::supervdsm_server::92::SuperVdsm.ServerCallback::(wrapper) call validateAccess with ('qemu', ('qemu', 'kvm'), '/rhev/data-center/mnt/192.168.201.3:_exports_nfs_share2', 5) {}
MainProcess|jsonrpc/5::DEBUG::2017-04-19 18:55:56,367::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) return validateAccess with None
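The difference between the two excerpts is that the successful mount has a matching "return mount" line, while the stuck one does not. A minimal sketch for finding such calls in a supervdsm log, assuming every relevant line follows the "(wrapper) call X with" / "(wrapper) return X with" format shown above. The names LINE_RE and scan are made up for this illustration and are not part of vdsm:

```python
import re
from datetime import datetime

# Matches only the supervdsm_server "(wrapper) call/return" lines, in the
# format seen in the excerpts of this report (an assumption about the log).
LINE_RE = re.compile(
    r"^MainProcess\|(?P<thread>[^:]+)::\w+::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"supervdsm_server::\d+::SuperVdsm\.ServerCallback::\(wrapper\) "
    r"(?P<kind>call|return) (?P<func>\w+) with"
)


def scan(lines):
    """Pair call/return lines per (thread, function).

    Returns (finished, pending): finished is a list of
    ((thread, func), seconds) and pending maps (thread, func) to the
    timestamp of a call that has no return line yet.
    """
    pending = {}
    finished = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # execCmd lines and other messages are ignored
        key = (m["thread"], m["func"])
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
        if m["kind"] == "call":
            pending[key] = ts
        elif key in pending:
            started = pending.pop(key)
            finished.append((key, (ts - started).total_seconds()))
    return finished, pending
```

Passing the lines of the attached supervdsm log to scan() should leave the stuck mount in the pending dict, keyed by its thread name, for as long as the log contains no matching return line.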


Additional info:

Comment 1 gshinar 2017-04-20 08:54:30 UTC
Created attachment 1272892
engine log

Comment 2 gshinar 2017-04-20 08:54:51 UTC
Created attachment 1272893
vdsm log

Comment 3 Yaniv Kaul 2017-04-20 12:42:57 UTC
Gil - are you sure that's the correct VDSM log, from host 0?

Comment 4 Nir Soffer 2017-04-20 12:58:42 UTC
Based on information from the mailing list thread:
http://lists.ovirt.org/pipermail/devel/2017-April/030203.html

We have concurrent mount requests with different nfs versions (3, 4.1, 4.2). A mount
request for nfs 4.2 was stuck for 3 minutes, while the same request was successful
on another host.

This looks like an issue with the nfs server, not a vdsm issue.

I would try to simulate this without OST:
1. start nfs server
2. start 2 hosts
3. Run concurrent mount requests on both hosts with different nfs versions
   we can write a simple script to do this.
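
A minimal sketch of such a script, meant to be run as root on each host at the same time. The mount options are copied from the supervdsm log in this report. The helper names, the 200 second timeout, and the way a version string is split into nfsvers/minorversion are assumptions made for this illustration, not vdsm code:

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

# Base options copied from the supervdsm log in this report.
BASE_OPTS = "soft,nosharecache,timeo=600,retrans=6"


def mount_options(version):
    """Build mount options for a version string such as "3", "4.1" or "4.2"."""
    major, _, minor = version.partition(".")
    opts = BASE_OPTS + ",nfsvers=" + major
    if minor:
        opts += ",minorversion=" + minor
    return opts


def mount_command(export, mountpoint, version):
    """Return the argv list for mounting one export with one nfs version."""
    return ["/usr/bin/mount", "-t", "nfs", "-o", mount_options(version),
            export, mountpoint]


def timed_mount(cmd, timeout=200):
    """Run one mount command and return (cmd, outcome, elapsed seconds)."""
    start = time.monotonic()
    try:
        outcome = "rc=%d" % subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        outcome = "timeout"
    return cmd, outcome, time.monotonic() - start


def run_concurrently(commands, runner=timed_mount):
    """Start all commands at once, one thread each; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(runner, commands))
```

For example, building one command per version with mount_command() for "3", "4.1" and "4.2" against the exports of the test server (with mount points that already exist on the host) and passing the list to run_concurrently() reports which mount, if any, hit the timeout.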

Comment 5 Nir Soffer 2017-04-20 13:00:58 UTC
Can we get the configuration of the nfs server?

Comment 6 gshinar 2017-04-23 10:06:49 UTC
Created attachment 1273447
The right supervdsm log

I think I attached the wrong supervdsm log earlier.
This is the right one.

Comment 7 gshinar 2017-04-23 10:09:11 UTC
(In reply to Nir Soffer from comment #5)
> Can we get the configuration of the nfs server?

It is a NetApp. We can ask engops if it is still relevant.

Comment 8 Allon Mureinik 2017-05-15 11:43:40 UTC
(In reply to gshinar from comment #7)
> (In reply to Nir Soffer from comment #5)
> > Can we get the configuration of the nfs server?
> 
> It is a NetApp. We can ask engops if it is still relevant.
Please do.
Without that, any response here is just guesswork.

Comment 9 gshinar 2017-05-15 13:02:32 UTC
Can you please elaborate on what exactly you need?
Which files or command output, so that I can ask engops for it?

Comment 10 Nir Soffer 2017-05-15 15:31:49 UTC
I don't think the underlying storage backing the nfs server matters. I want to see
the configuration of the nfs server used in ovirt system tests.

I guess these will be enough:
/etc/exports
/etc/sysconfig/nfs

Comment 11 Allon Mureinik 2017-06-13 08:54:22 UTC
Needinfo went unanswered for two weeks, closing.
Please reopen if you can provide the info requested in comment 10.

Comment 12 Yaniv Kaul 2017-07-09 12:04:23 UTC
I'll be providing the additional required data, therefore re-opening.

Comment 13 Yaniv Kaul 2017-07-09 14:21:32 UTC
(In reply to Nir Soffer from comment #10)
> I don't think the underlying storage backing the nfs server matters. I want
> to see the configuration of the nfs server used in ovirt system tests.
> 
> I guess these will be enough:
> /etc/exports
> /etc/sysconfig/nfs

[root@lago-basic-suite-master-engine ~]# cat /etc/exports
/exports/nfs/share1 *(rw,sync,no_root_squash,no_all_squash)
/exports/nfs/exported *(rw,sync,no_root_squash,no_all_squash)
/exports/nfs/iso *(rw,sync,no_root_squash,no_all_squash)
/exports/nfs/share2 *(rw,sync,no_root_squash,no_all_squash)

[root@lago-basic-suite-master-engine ~]# cat /etc/sysconfig/nfs  |grep -v "#"


RPCNFSDARGS="-V 4.2"
RPCMOUNTDOPTS="-p 892"
STATDARG="-p 662"
SMNOTIFYARGS=""
RPCIDMAPDARGS=""
RPCGSSDARGS=""
GSS_USE_PROXY="yes"
RPCSVCGSSDARGS=""
BLKMAPDARGS=""

Comment 14 Allon Mureinik 2017-09-03 14:54:50 UTC
This test is no longer used, and there is no real-world use case for this scenario. Closing.