Created attachment 1060944 [details]
log files in rhevh part

Description of problem:
The NFS storage cannot be mounted when the network is bonded during Hosted Engine setup. No such issue when using a non-bonded network like em1.

Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.1-20150805.0.el7ev
ovirt-node-3.2.3-16.el7.noarch
ovirt-node-plugin-hosted-engine-0.2.0-18.0.el7ev.noarch
ovirt-hosted-engine-setup-1.2.5.2-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Clean install rhev-hypervisor7-7.1-20150805.0.el7ev
2. Create bond with mode=1

# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: p3p1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: p3p1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:36:79:f0
Slave queue ID: 0

Slave Interface: p3p2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:36:79:f1
Slave queue ID: 0

3. Download the ova file to start first host setup
4. Select nfs3 as the storage:
Please specify the storage you would like to use (iscsi, nfs3, nfs4)[nfs3]: <enter>
5. Specify the connection path:
Please specify the full shared storage connection path to use (example: host:/path): 10.66.65.196:/home/huiwa/nfs1
[ ERROR ] Error while mounting specified storage path: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
[WARNING] Cannot unmount /tmp/tmpgsQcrd
[ ERROR ] Cannot access storage connection 10.66.65.196:/home/huiwa/nfs1: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
Please specify the full shared storage connection path to use (example: host:/path):

Actual results:
1. After step 5, it reports the following error:
Please specify the storage you would like to use (iscsi, nfs3, nfs4)[nfs3]: <enter>
Please specify the full shared storage connection path to use (example: host:/path): 10.66.65.196:/home/huiwa/nfs1
[ ERROR ] Error while mounting specified storage path: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
[WARNING] Cannot unmount /tmp/tmpgsQcrd
[ ERROR ] Cannot access storage connection 10.66.65.196:/home/huiwa/nfs1: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified
Please specify the full shared storage connection path to use (example: host:/path):

Expected results:
1. The NFS storage can be added successfully.

Additional info:
1. No such issue when the network is em1.
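For reference, the active-backup bond from step 2 of the reproduction steps can be created with iproute2 roughly as follows. This is only a sketch: the slave names p3p1/p3p2 and the miimon interval are taken from the /proc/net/bonding/bond1 output above, and a persistent RHEV-H setup would normally use ifcfg files rather than ad-hoc ip commands.

```shell
# Create bond1 as mode 1 (active-backup) with the 100 ms MII polling
# interval shown in the report, then enslave the two NICs.
# Requires root; slaves must be down before they can be enslaved.
ip link add bond1 type bond mode active-backup miimon 100
ip link set p3p1 down
ip link set p3p2 down
ip link set p3p1 master bond1
ip link set p3p2 master bond1
ip link set bond1 up
ip link set p3p1 up
ip link set p3p2 up
```

The bonding driver reports mode 1 as "fault-tolerance (active-backup)", which matches the /proc output in the description.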
Sandro, is he-setup taking care to bring up the right daemons? Or is it expected that the OS is configured properly? Or do you have any other comments wrt the error above?
Could this also be related to bug 1159183?
(In reply to Fabian Deutsch from comment #2)
> Sandro, is he-setup taking care to bring up the right daemons? Or is it
> expected that the OS is configured properly?

Well, hosted-engine expects that the system is configured properly to some extent. Specifically, it expects that if you're trying to use NFS storage, you can actually mount it.

> Or do you have any other comments wrt the error above?

In this specific case, the setup was run at 05:05:28 (according to the setup log) and rpc.statd was running according to the messages log:

Aug  7 04:37:12 localhost rpc.statd[16855]: Version 1.3.0 starting

So something else is preventing the mount from working properly.
Thanks Sandro. Hui, can you manually mount that path from the description on RHEV-H?
(In reply to Fabian Deutsch from comment #5)
> Thanks Sandro.
>
> Hui, can you manually mount that path from the description on RHEV-H?

Yes, a manual mount succeeds. And the same NFS path can be used when the network is not bonded.
(In reply to wanghui from comment #6)
> (In reply to Fabian Deutsch from comment #5)
> > Hui, can you manually mount that path from the description on RHEV-H?
>
> Yes, a manual mount succeeds. And the same NFS path can be used when the
> network is not bonded.

I have tried a manual mount with the default settings and confirmed that it uses NFSv4 by default:

# mount -t nfs 10.66.65.196:/home/huiwa/nfs1 /tmp    -- succeeds

But it failed when using NFSv3:

# mount -t nfs -o vers=3,retry=1 10.66.65.196:/home/huiwa/nfs1 /tmp

It failed with the following errors:

mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified

And the NFS path can be mounted over NFSv3 when "nolock" is added:

# mount -t nfs -o vers=3,retry=1,nolock 10.66.65.196:/home/huiwa/nfs1 /tmp
# mount | grep nfs1
10.66.65.196:/home/huiwa/nfs1 on /tmp type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.65.196,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=10.66.65.196)
10.66.65.196:/home/huiwa/nfs1 on /var/lib/stateless/writable/tmp type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.65.196,mountvers=3,mountport=20048,mountproto=udp,local_lock=all,addr=10.66.65.196)
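Side note: the local_lock=all field in the mount output above is how the kernel reports that nolock took effect (all locking is handled locally instead of via rpc.statd/lockd). Whether any NFS mount on a host fell back to local locking can be checked from /proc/mounts with a one-liner like the following sketch:

```shell
# List NFSv3 mount points whose options show local locking (nolock).
# Field 2 of /proc/mounts is the mount point, field 4 the option string.
awk '$3 == "nfs" { print $2, $4 }' /proc/mounts | grep -w local_lock=all \
    || echo "no NFS mounts using local locking"
```

This is a quick way to distinguish "mounted, but without remote locking" from a fully working NFSv3 mount.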
We need to understand if statd is installed and enabled or not. Or if something else is missing.
It looks like rpc-statd needs to be restarted after bond creation.
Found the same issue - rpc-statd service needs to be restarted on 7.1 hypervisor in order to mount NFS storage domains. systemctl reports that rpc-statd is running but the logs indicate that the mount fails and that rpc-statd is not running. Restarting rpc-statd service resolves the issue and NFSv3 storage domains then mount automatically afterwards.
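The restart workaround described above can be scripted as a quick check-and-restart. This is only a sketch assuming a RHEL 7 host with systemd and the rpc-statd.service unit name; note that, as this bug shows, systemctl can report the unit active while statd is not actually usable, so the rpcinfo check against rpcbind is the more reliable signal:

```shell
# Restart rpc-statd, then confirm it is both active in systemd and
# actually registered with rpcbind ("status" is the RPC program name
# that statd registers under). Requires root.
systemctl restart rpc-statd.service
systemctl is-active rpc-statd.service
rpcinfo -p localhost | grep -w status || echo "statd not registered with rpcbind"
```

If the rpcinfo check fails even though systemctl reports the service active, an NFSv3 mount without nolock will still fail with the error from this report.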
Test version:
rhev-hypervisor7-7.2-20151104.0.iso
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-plugin-hosted-engine-0.3.0-2.el7ev.noarch

Test steps:
1. Clean install rhev-hypervisor7-7.2-20151104.0.iso
2. Create bond with mode=1
3. Download the ova file to start first host setup
4. Select nfs3 as the storage.
Please specify the full shared storage connection path to use (example: host:/path): 10.66.9.243:/home/test
[ INFO ] Installing on first host

Test result:
1. NFS can be mounted successfully.

So this issue is fixed in ovirt-node-plugin-hosted-engine-0.3.0-2.el7ev.noarch.
Hi,

This bug should be reopened because it still happens with ovirt-node-plugin-hosted-engine-0.3.0-6.el7ev.noarch when trying to run HE deploy on RHEV-H over a VLAN-tagged bond using a rhevm-appliance:

[ ERROR ] Cannot access storage connection
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified

Restarting the rpc-statd service does not help.

Tested with:
- rhevm-appliance-20160128.1-1 over rhev-h 7.2 20160126.0.el7ev
- ovirt-node-3.6.1-5.0.el7ev.noarch

[root@orchid-vds2-vlan162 ~]# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
9: bond0.162@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    inet 10.35.129.15/24 brd 10.35.129.255 scope global dynamic bond0.162
       valid_lft 37847sec preferred_lft 37847sec
Created attachment 1121911 [details] nfs_error
Didn't we add a restart of rpc-statd already? Or was that only in VDSM and not HE?
(In reply to Allon Mureinik from comment #21)
> Didn't we add a restart of rpc-statd already? Or was that only in VDSM and
> not HE?

I believe the issue at this stage is not the restart: as the reporter shared in comment #19, it doesn't help. I have tried restarting rpcbind and rpc-statd myself on Michael's machine and it doesn't work either (I couldn't reproduce the issue locally).

I have noticed that there is a report open against nfs-utils that involves VDSM [1], using nfs-utils 1.3.0-0.21.el7, which is the same version as on the host. The last comment in that BZ from Steve suggests updating to the latest nfs-utils. I did so on Michael's host, and after a restart of rpc-statd the mount worked nicely:

# mount -o remount,rw /
# wget http://<brew>/brewroot/packages/nfs-utils/1.3.0/0.22.el7/x86_64/nfs-utils-1.3.0-0.22.el7.x86_64.rpm
# /bin/systemctl restart rpc-statd.service
# mount -t nfs -o vers=3,retry=1 IP_ADDR:/vol/RHEV/Network/mburman/HE_Over_BOND /mnt
#

Data from Michael's host before the upgrade:

# cat /etc/redhat-release
Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20160126.0.el7ev)

# rpm -qa | grep -i nfs-utils
nfs-utils-1.3.0-0.21.el7.x86_64

# rpm -qa | grep -i vdsm
vdsm-xmlrpc-4.17.18-0.el7ev.noarch
ovirt-node-plugin-vdsm-0.6.1-7.el7ev.noarch
vdsm-yajsonrpc-4.17.18-0.el7ev.noarch
vdsm-python-4.17.18-0.el7ev.noarch
vdsm-cli-4.17.18-0.el7ev.noarch
vdsm-infra-4.17.18-0.el7ev.noarch
vdsm-hook-vmfex-dev-4.17.18-0.el7ev.noarch
vdsm-jsonrpc-4.17.18-0.el7ev.noarch
vdsm-4.17.18-0.el7ev.noarch
vdsm-hook-ethtool-options-4.17.18-0.el7ev.noarch

@Michael, could you please try a deploy of HE now with nfs-utils updated on your machine?

[1] [vdsm] NFS mount fails sometimes with "rpc.statd is not running but is required for remote locking"
https://bugzilla.redhat.com/show_bug.cgi?id=1275082
(In reply to Douglas Schilling Landgraf from comment #22)
> @Michael, could you please try a deploy of HE now with nfs-utils updated on
> your machine?

Douglas, thanks, this helped.
Michael, can you please check if nfs-utils-1.3.0-0.21.el7_2 also fixes the issue? This build is the one which should get released today.
Why is there a needinfo for me?
There was a needinfo on you (see comment 23) from Douglas that I removed by mistake.
Test version:
rhevh-7.2-20160222.0.el7ev.iso
ovirt-node-plugin-hosted-engine-0.3.0-7.el7ev.noarch

Test steps:
Scenario 1:
1. Install rhevh
2. Enable bond with mode 1
3. Select nfs3 as the storage.

Scenario 2:
1. Install rhevh
2. Enable bond+vlan with mode 1
3. Select nfs3 as the storage.

Test result:
1. In both scenarios 1 and 2, the nfs3 storage can be mounted successfully during HE setup.

So this issue is fixed in ovirt-node-plugin-hosted-engine-0.3.0-7.el7ev.noarch. Changing status to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0378.html