Bug 1251827 - NFS storage cannot be mounted when the network is bonded during Hosted Engine setup
Summary: NFS storage cannot be mounted when the network is bonded during Hosted Engine setup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node-plugin-hosted-engine
Version: 3.5.4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ovirt-3.6.3
Target Release: 3.6.0
Assignee: Douglas Schilling Landgraf
QA Contact: wanghui
URL:
Whiteboard:
Depends On:
Blocks: 1250199 1257980 1273072
 
Reported: 2015-08-10 05:47 UTC by wanghui
Modified: 2016-03-09 14:34 UTC
CC: 21 users

Fixed In Version: ovirt-node-plugin-hosted-engine-0.3.0-1.el7ev
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1257980 1273072
Environment:
Last Closed: 2016-03-09 14:34:50 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments
log files in rhevh part (189.78 KB, application/x-gzip)
2015-08-10 05:47 UTC, wanghui
nfs_error (12.12 KB, application/x-gzip)
2016-02-07 14:11 UTC, Michael Burman


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0378 0 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update for RHEV 3.6 2016-03-09 19:06:36 UTC
oVirt gerrit 44758 0 master MERGED adding restart of rpc-statd on network config Never
oVirt gerrit 44773 0 ovirt-3.5 MERGED adding restart of rpc-statd on network config Never
oVirt gerrit 47161 0 ovirt-3.5 MERGED adding restart of rpc-statd on network config Never

Description wanghui 2015-08-10 05:47:43 UTC
Created attachment 1060944 [details]
log files in rhevh part

Description of problem:
NFS storage cannot be mounted when the network is bonded during Hosted Engine setup. There is no such issue when using a non-bonded interface such as em1.

Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.1-20150805.0.el7ev
ovirt-node-3.2.3-16.el7.noarch
ovirt-node-plugin-hosted-engine-0.2.0-18.0.el7ev.noarch
ovirt-hosted-engine-setup-1.2.5.2-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Clean install rhev-hypervisor7-7.1-20150805.0.el7ev
2. Create bond with mode=1 (an illustrative ifcfg sketch follows after these steps)

# cat /proc/net/bonding/bond1 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: p3p1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: p3p1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:36:79:f0
Slave queue ID: 0

Slave Interface: p3p2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:36:79:f1
Slave queue ID: 0

3. Download the ova file to start the first host setup.
4. Select nfs3 as the storage:
Please specify the storage you would like to use (iscsi, nfs3, nfs4)[nfs3]: <enter>
5. Please specify the full shared storage connection path to use (example: host:/path): 10.66.65.196:/home/huiwa/nfs1
[ ERROR ] Error while mounting specified storage path: mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. mount.nfs: an incorrect mount option was specified
[WARNING] Cannot unmount /tmp/tmpgsQcrd 
[ ERROR ] Cannot access storage connection 10.66.65.196:/home/huiwa/nfs1: mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. mount.nfs: an incorrect mount option was specified
          Please specify the full shared storage connection path to use (example: host:/path):
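
For reference, a minimal sketch of ifcfg files that would produce a mode=1 (active-backup) bond like the one in step 2. These file contents are an assumption for illustration only; the interface names are taken from the bonding status above, and the actual bond in this report was created through the RHEV-H setup flow:

/etc/sysconfig/network-scripts/ifcfg-bond1 (hypothetical):
DEVICE=bond1
TYPE=Bond
BONDING_OPTS="mode=1 miimon=100"
BOOTPROTO=dhcp
ONBOOT=yes

/etc/sysconfig/network-scripts/ifcfg-p3p1 (hypothetical; p3p2 follows the same pattern):
DEVICE=p3p1
MASTER=bond1
SLAVE=yes
ONBOOT=yes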

Actual results:
1. After step 5, setup fails with the rpc.statd errors shown in the steps above (mount.nfs: rpc.statd is not running but is required for remote locking) and returns to the storage path prompt.

Expected results:
1. The NFS storage can be added successfully.

Additional info:
1. There is no such issue when the network interface is em1 (not bonded).

Comment 2 Fabian Deutsch 2015-08-10 08:27:49 UTC
Sandro, does he-setup take care of bringing up the right daemons, or is it expected that the OS is configured properly?

Or do you have any other comments regarding the error above?

Comment 3 Fabian Deutsch 2015-08-10 12:41:14 UTC
Could this also be related to bug 1159183?

Comment 4 Sandro Bonazzola 2015-08-10 13:18:38 UTC
(In reply to Fabian Deutsch from comment #2)
> Sandro, does he-setup take care of bringing up the right daemons, or is it
> expected that the OS is configured properly?

Well, hosted-engine expects that the system is configured properly to some extent.
Specifically, it expects that if you're trying to use NFS storage, you can actually mount it.

> 
> Or do you have any other comments regarding the error above?

In this specific case, setup was run at 05:05:28 (according to the setup log),
and rpc.statd was running according to the messages log:

Aug  7 04:37:12 localhost rpc.statd[16855]: Version 1.3.0 starting

So something else is preventing the mount from working properly.
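
A quick illustrative cross-check for such a discrepancy (these commands are not taken from the original logs):

# systemctl status rpc-statd.service
# rpcinfo -p localhost | grep status
# ps aux | grep '[r]pc.statd'

The first shows what systemd believes, the second whether statd is actually registered with rpcbind, and the third whether the rpc.statd process is alive.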

Comment 5 Fabian Deutsch 2015-08-10 13:28:42 UTC
Thanks Sandro.

Hui, can you manually mount that path from the description on RHEV-H?

Comment 6 wanghui 2015-08-12 03:10:33 UTC
(In reply to Fabian Deutsch from comment #5)
> Thanks Sandro.
> 
> Hui, can you manually mount that path from the description on RHEV-H?

Yes, a manual mount succeeds. And the same NFS path can be used when the network is not bonded.

Comment 7 wanghui 2015-08-12 03:20:58 UTC
(In reply to wanghui from comment #6)
> (In reply to Fabian Deutsch from comment #5)
> > Thanks Sandro.
> > 
> > Hui, can you manually mount that path from the description on RHEV-H?
> 
> Yes, a manual mount succeeds. And the same NFS path can be used when the
> network is not bonded.

I have tried a manual mount with the default options, and confirmed that it uses NFSv4 by default.
# mount -t nfs 10.66.65.196:/home/huiwa/nfs1 /tmp  -- succeeds

But it failed when using NFSv3.
# mount -t nfs -o vers=3,retry=1 10.66.65.196:/home/huiwa/nfs1 /tmp

It failed with the following errors:
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified

And the NFS path can be mounted over NFSv3 when "nolock" is added:
# mount -t nfs -o vers=3,retry=1,nolock 10.66.65.196:/home/huiwa/nfs1 /tmp

# mount |grep nfs1
10.66.65.196:/home/huiwa/nfs1 on /tmp type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.65.196,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=10.66.65.196)
10.66.65.196:/home/huiwa/nfs1 on /var/lib/stateless/writable/tmp type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.65.196,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=10.66.65.196)

Comment 8 Fabian Deutsch 2015-08-12 11:01:29 UTC
We need to understand whether statd is installed and enabled, or if something else is missing.

Comment 9 Anatoly Litovsky 2015-08-12 12:53:31 UTC
It looks like rpc-statd needs to be restarted after bond creation.

Comment 14 Ivan Makfinsky 2015-09-11 13:37:20 UTC
Found the same issue: the rpc-statd service needs to be restarted on the 7.1 hypervisor in order to mount NFS storage domains.

systemctl reports that rpc-statd is running, but the logs indicate that the mount fails and that rpc-statd is not running.

Restarting the rpc-statd service resolves the issue, and NFSv3 storage domains then mount automatically afterwards.
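
Based on these observations, a minimal manual workaround sketch (the export path and mount point are illustrative, reusing values from the description):

# systemctl restart rpc-statd.service
# mount -t nfs -o vers=3,retry=1 10.66.65.196:/home/huiwa/nfs1 /mnt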

Comment 18 wanghui 2015-11-12 09:10:22 UTC
Test version:
rhev-hypervisor7-7.2-20151104.0.iso
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-plugin-hosted-engine-0.3.0-2.el7ev.noarch

Test steps:
1. Clean install rhev-hypervisor7-7.2-20151104.0.iso
2. Create bond with mode=1
3. Download the ova file to start the first host setup.
4. Select nfs3 as the storage.
Please specify the full shared storage connection path to use (example: host:/path): 10.66.9.243:/home/test
[ INFO  ] Installing on first host

Test result:
1. The NFS storage can be mounted successfully.

So this issue is fixed in ovirt-node-plugin-hosted-engine-0.3.0-2.el7ev.noarch.

Comment 19 Michael Burman 2016-02-07 14:05:29 UTC
Hi

This bug should be reopened because it still happens on ovirt-node-plugin-hosted-engine-0.3.0-6.el7ev.noarch when trying to run an HE deploy on RHEV-H over a VLAN-tagged bond using a rhevm-appliance.

[ ERROR ] Cannot access storage connection mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd. mount.nfs: an incorrect mount option was specified

Restarting the rpc-statd service does not help.

Tested with rhevm-appliance-20160128.1-1 on rhev-h 7.2 (20160126.0.el7ev),
ovirt-node-3.6.1-5.0.el7ev.noarch.

[root@orchid-vds2-vlan162 ~]# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
9: bond0.162@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    inet 10.35.129.15/24 brd 10.35.129.255 scope global dynamic bond0.162
       valid_lft 37847sec preferred_lft 37847sec
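
For reference, a VLAN-tagged bond interface such as bond0.162 is typically layered on the bond with one more ifcfg file; a hypothetical sketch, with values assumed from the ip output above:

/etc/sysconfig/network-scripts/ifcfg-bond0.162 (hypothetical):
DEVICE=bond0.162
VLAN=yes
BOOTPROTO=dhcp
ONBOOT=yes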

Comment 20 Michael Burman 2016-02-07 14:11:00 UTC
Created attachment 1121911 [details]
nfs_error

Comment 21 Allon Mureinik 2016-02-08 10:08:29 UTC
Didn't we add a restart of rpc-statd already? Or was that only in VDSM and not HE?

Comment 22 Douglas Schilling Landgraf 2016-02-08 21:54:04 UTC
(In reply to Allon Mureinik from comment #21)
> Didn't we add a restart of rpc-statd already? Or was that only in VDSM and
> not HE?

I believe the issue at this stage is not the restart; as the reporter shared in comment #19, it doesn't help. I tried restarting rpcbind and rpc-statd myself on Michael's machine, and it doesn't work either (I could not reproduce the issue locally).

I noticed that there is an open report against nfs-utils involving VDSM [1], on nfs-utils-1.3.0-0.21.el7, which is the same version as on this host. The last comment in that BZ from Steve suggests updating to the latest nfs-utils; I did so on Michael's host, and after a restart of rpc-statd the mount worked nicely.

# mount -o remount,rw /
# wget http://<brew>/brewroot/packages/nfs-utils/1.3.0/0.22.el7/x86_64/nfs-utils-1.3.0-0.22.el7.x86_64.rpm
# /bin/systemctl restart  rpc-statd.service
# mount -t nfs -o vers=3,retry=1 IP_ADDR:/vol/RHEV/Network/mburman/HE_Over_BOND /mnt
#

Data from Michael's host before the upgrade:

# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20160126.0.el7ev)

# rpm -qa | grep -i nfs-utils
nfs-utils-1.3.0-0.21.el7.x86_64

# rpm -qa | grep -i vdsm
vdsm-xmlrpc-4.17.18-0.el7ev.noarch
ovirt-node-plugin-vdsm-0.6.1-7.el7ev.noarch
vdsm-yajsonrpc-4.17.18-0.el7ev.noarch
vdsm-python-4.17.18-0.el7ev.noarch
vdsm-cli-4.17.18-0.el7ev.noarch
vdsm-infra-4.17.18-0.el7ev.noarch
vdsm-hook-vmfex-dev-4.17.18-0.el7ev.noarch
vdsm-jsonrpc-4.17.18-0.el7ev.noarch
vdsm-4.17.18-0.el7ev.noarch
vdsm-hook-ethtool-options-4.17.18-0.el7ev.noarch


@Michael, could you please try a deploy of HE now with the updated nfs-utils on your machine?


[1] 
[vdsm] NFS mount fails sometimes with "rpc.statd is not running but is required for remote locking"
https://bugzilla.redhat.com/show_bug.cgi?id=1275082
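
Before retrying the deploy, the upgrade can be sanity-checked with a couple of illustrative commands (not taken from the original session):

# rpm -q nfs-utils
# rpcinfo -p | grep -w status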

Comment 25 Michael Burman 2016-02-09 08:04:39 UTC
(In reply to Douglas Schilling Landgraf from comment #22)
> @Michael, could you please try a deploy of HE now with the updated nfs-utils
> on your machine?

Douglas, thanks, that helped.

Comment 26 Fabian Deutsch 2016-02-09 09:20:37 UTC
Michael, can you please check if nfs-utils-1.3.0-0.21.el7_2 also fixes the issue?

This build is the one which should get released today.

Comment 30 Steve Dickson 2016-02-10 15:47:20 UTC
Why is there a needinfo for me?

Comment 31 Michael Burman 2016-02-10 16:03:30 UTC
There was a needinfo on you (see comment 23) from Douglas that I removed by mistake.

Comment 33 wanghui 2016-02-24 08:33:36 UTC
Test version:
rhevh-7.2-20160222.0.el7ev.iso
ovirt-node-plugin-hosted-engine-0.3.0-7.el7ev.noarch

Test steps:
Scenario 1:
1. Install rhevh
2. Enable bond with mode 1
3. Select nfs3 as the storage.

Scenario 2:
1. Install rhevh
2. Enable bond+vlan with mode 1
3. Select nfs3 as the storage.

Test result:
1. In both scenarios 1 and 2, the nfs3 storage can be mounted successfully during HE setup.

So this issue is fixed in ovirt-node-plugin-hosted-engine-0.3.0-7.el7ev.noarch. Changing status to VERIFIED.

Comment 35 errata-xmlrpc 2016-03-09 14:34:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0378.html

