Bug 508032

Summary: [RFE] VDSM: NFS retransmission errors are not monitored nor reported
Product: [oVirt] ovirt-engine
Reporter: Yaniv Kaul <ykaul>
Component: RFEs
Assignee: Yaniv Lavi <ylavi>
Status: CLOSED DEFERRED
QA Contact: Elad <ebenahar>
Severity: medium
Docs Contact:
Priority: medium
Version: ---
CC: acanan, amureini, bsettle, bugs, ebenahar, lpeer, mkenneth, nsoffer, ratamir, rbalakri, Rhev-m-bugs, sradco, srevivo, ykaul, ylavi
Target Milestone: ---
Keywords: FutureFeature, Reopened
Target Release: ---
Flags: ylavi: ovirt-4.2?
       ylavi: planning_ack+
       rule-engine: devel_ack?
       rule-engine: testing_ack?
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-02 15:40:00 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Metrics
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
  engine, vdsm and messages logs (flags: none)
  logs-17.11.16 (flags: none)

Description Yaniv Kaul 2009-06-25 08:33:47 UTC
Description of problem:
I have a server that is experiencing a large number of NFS retransmissions, which leads to poor VM performance:
calls      retrans    authrefrsh
518811124   512798     0

However, nothing is propagated to the admin, as we do not even monitor these counters. Since it is NFS, no errors will be issued, but performance will be pathetic.
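
A monitor could read the same client-side RPC counters that nfsstat -c reports (the "rpc" line in /proc/net/rpc/nfs holds calls, retrans and authrefrsh) and warn when the retransmission count keeps growing. A minimal sketch of such a monitor follows; this is not vdsm code, and the poll interval, threshold and function names are illustrative only:

# Sketch only: poll the kernel's NFS client RPC counters and warn when
# retransmissions keep increasing. The "rpc" line in /proc/net/rpc/nfs is
# the same source nfsstat -c uses (calls, retrans, authrefrsh).
import time

NFS_CLIENT_STATS = "/proc/net/rpc/nfs"

def read_rpc_counters(path=NFS_CLIENT_STATS):
    """Return (calls, retrans, authrefrsh) for the NFS client."""
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == "rpc":
                return tuple(int(v) for v in fields[1:4])
    raise RuntimeError("no rpc line found in %s" % path)

def watch(interval=60, retrans_threshold=10):
    """Warn when more than retrans_threshold retransmissions happen per interval."""
    prev_calls, prev_retrans, _ = read_rpc_counters()
    while True:
        time.sleep(interval)
        calls, retrans, _ = read_rpc_counters()
        if retrans - prev_retrans > retrans_threshold:
            print("WARNING: %d NFS retransmissions in the last %d s "
                  "(%d calls)" % (retrans - prev_retrans, interval,
                                  calls - prev_calls))
        prev_calls, prev_retrans = calls, retrans

if __name__ == "__main__":
    watch()

Reporting the per-interval delta rather than the raw cumulative counter is what would make this actionable for the admin.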

Comment 3 Itamar Heim 2012-12-08 23:40:39 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.

Comment 4 Yaniv Kaul 2012-12-09 05:38:22 UTC
Still relevant.

Comment 6 Nir Soffer 2015-11-05 08:32:36 UTC
We need more data to make progress:

- Do we see slower readDelay values on such system?
- Do you know how to reproduce or simulate this issue?

Comment 7 Yaniv Kaul 2015-11-05 08:56:59 UTC
(In reply to Nir Soffer from comment #6)
> We need more data to make progress:
> 
> - Do we see slower readDelay values on such system?

Yes, we did.

> - Do you know how to reproduce or simulate this issue?

Using tc to drop packets. A 1% packet loss rate should suffice.

Comment 8 Nir Soffer 2015-11-05 10:43:35 UTC
Yaniv, can we reproduce this with current code?

See comment 7 for tips.

Comment 9 Yaniv Kaul 2015-11-05 11:19:51 UTC
(In reply to Nir Soffer from comment #8)
> Yaniv, can we reproduce this with current code?

Aharon is leading RHEV storage QE, moving Needinfo to him.

> 
> See comment 7 for tips.

Comment 10 Elad 2015-11-19 14:44:49 UTC
(In reply to Nir Soffer from comment #8)
> Yaniv, can we reproduce this with current code?
> 
> See comment 7 for tips.

Reproduced using the latest code (vdsm-4.17.10.1-0.el7ev.noarch).

NFS retransmission errors occur on the vdsm host, but they are not propagated to RHEVM.


Used tc to simulate 10% packet loss:


[root@green-vdsb ~]# tc qdisc add dev ovirtmgmt root netem loss 10%


[root@green-vdsb ~]# nfsstat -c
...
Client rpc stats:
calls      retrans    authrefrsh
61624      94         61564
...


Not sure if the following error in vdsm.log is relevant, but this is what I got while 10% packet loss was simulated by tc:

Thread-1912::ERROR::2015-11-19 14:38:41,999::monitor::250::Storage.Monitor::(_monitorDomain) Error monitoring domain ad1ecd8f-8822-4323-9ba4-4864e4d97297
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/monitor.py", line 246, in _monitorDomain
    self._performDomainSelftest()
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 774, in wrapper
    value = meth(self, *a, **kw)
  File "/usr/share/vdsm/storage/monitor.py", line 313, in _performDomainSelftest
    self.domain.selftest()
  File "/usr/share/vdsm/storage/fileSD.py", line 624, in selftest
    self.oop.os.statvfs(self.domaindir)
  File "/usr/share/vdsm/storage/outOfProcess.py", line 274, in statvfs
    return self._iop.statvfs(path)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 418, in statvfs
    resdict = self._sendCommand("statvfs", {"path": path}, self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 385, in _sendCommand
    raise Timeout(os.strerror(errno.ETIMEDOUT))
Timeout: Connection timed out

Comment 11 Elad 2015-11-19 15:20:25 UTC
Created attachment 1096718 [details]
engine, vdsm and messages logs

Comment 12 Yaniv Kaul 2015-12-01 11:19:45 UTC
Yaniv, I am not sure why this was moved to a future release again. This metric should be used (and should replace the useless readDelay, at least for NFS).
I think that is a mistake: we keep seeing such issues at customers, but we are not looking in the right place. NFS retransmissions are always a good indication of something wrong or too much load.

Anyway, if it has not been important enough to implement for more than 5 years, I am closing it.

Comment 13 Nir Soffer 2015-12-01 11:30:34 UTC
I think we should consider this for 4.0.

Comment 14 Nir Soffer 2015-12-01 11:45:31 UTC
We need to repeat the test in comment 10 with different packet loss rates
(or other network conditions that trigger retransmissions), and check
whether read delay reflects the issue.
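
One way to drive that matrix is a small script that applies each netem loss rate, keeps it for a fixed window, and records the retransmission delta; read delay for the same window would still have to be taken from vdsm's repoStats (or the engine) and compared by hand. A rough sketch, assuming the ovirtmgmt device from comment 10; the device name, rates and durations are placeholders:

# Rough sketch of the test matrix: apply several netem loss rates, keep each
# one for a while, and record how many NFS retransmissions happened in that
# window. Device name, rates and durations are placeholders.
import subprocess
import time

DEV = "ovirtmgmt"                 # interface carrying the NFS traffic
LOSS_RATES = ["0.5%", "1%", "5%"]
DURATION = 300                    # seconds per rate

def rpc_retrans():
    # Second field of the "rpc" line is the cumulative retrans counter.
    with open("/proc/net/rpc/nfs") as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == "rpc":
                return int(fields[2])
    return 0

for rate in LOSS_RATES:
    subprocess.check_call(
        "tc qdisc add dev %s root netem loss %s" % (DEV, rate), shell=True)
    before = rpc_retrans()
    try:
        time.sleep(DURATION)
    finally:
        subprocess.check_call("tc qdisc del dev %s root" % DEV, shell=True)
    after = rpc_retrans()
    print("loss %s: %d retransmissions in %d s" % (rate, after - before, DURATION))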

Comment 15 Yaniv Kaul 2016-10-30 09:59:43 UTC
(In reply to Nir Soffer from comment #14)
> We need to repeat the test in comment 10 with different packet loss rates
> (or other network conditions that trigger retransmissions), and check
> whether read delay reflects the issue.

Aharon - can you do it? 
- If read delay is not reflecting the issue (with drops of 0.5, 1, 5 percent, for example), then we need to fix read delay.
- If it does, we can probably close this.

TIA,
Y.

Comment 16 Yaniv Lavi 2016-11-03 10:50:12 UTC
This should be part of the new metrics store, via the nfs collectd plugin.
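
If the stock collectd nfs plugin does not expose the client retrans counter (it mostly reports per-procedure call counts), a read callback registered through collectd's python plugin could publish it. A hedged sketch; the plugin name, type and type_instance below are illustrative choices, not an existing schema:

# Sketch of a collectd python-plugin read callback that publishes the NFS
# client retransmission counter. Runs only inside collectd with the python
# plugin loaded; the names below are arbitrary.
import collectd

def read_nfs_retrans():
    retrans = None
    with open("/proc/net/rpc/nfs") as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == "rpc":
                retrans = int(fields[2])   # cumulative retransmissions
                break
    if retrans is None:
        return
    val = collectd.Values(plugin="nfs_client", type="derive",
                          type_instance="retrans")
    val.dispatch(values=[retrans])

collectd.register_read(read_nfs_retrans)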

Comment 17 Raz Tamir 2016-11-15 13:08:39 UTC
(In reply to Yaniv Kaul from comment #15)
> (In reply to Nir Soffer from comment #14)
> > We need to repeat the test in comment 10 with different packet loss rates
> > (or other network conditions that trigger retransmissions), and check
> > whether read delay reflects the issue.
> 
> Aharon - can you do it? 
> - If read delay is not reflecting the issue (with drops of 0.5, 1, 5
> percent, for example), then we need to fix read delay.
> - If it does, we can probably close this.
> 
> TIA,
> Y.

Elad, can you please test it again (comment #10) with the relevant values?

Comment 18 Elad 2016-11-17 08:37:11 UTC
Created attachment 1221504 [details]
logs-17.11.16

Used:
vdsm-4.18.999-894.git5c13e49.el7.centos.x86_64
ovirt-engine-4.0.5.4-0.1.el7ev.noarch
libvirt-2.0.0-10.el7.x86_64
qemu-kvm-rhev-2.6.0-27.el7.x86_64

=====================================================================
Setup: 
One host, one NFS domain.
Host with 1 NIC, with the default ovirtmgmt network configured as a bridge.
A running guest with 2 disks on the NFS domain, with RHEL installed.

Applied tc rules with various packet loss rates. Each rule was left in place for 5-10 minutes.

=====================================================================
5% packet loss applied:

[root@storage-ge2-vdsm4 ~]# tc qdisc add dev ovirtmgmt root netem loss 5%


[root@storage-ge2-vdsm4 ~]# nfsstat -c      
Client rpc stats:
calls      retrans    authrefrsh
13372      0          13372   


=======================
10% packet loss applied:

[root@storage-ge2-vdsm4 ~]# tc qdisc add dev ovirtmgmt root netem loss 10%


[root@storage-ge2-vdsm4 ~]# tc -s qdisc ls dev ovirtmgmt
qdisc netem 8004: root refcnt 2 limit 1000 loss 10%
 Sent 32711015 bytes 34760 pkt (dropped 2734, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 


[root@storage-ge2-vdsm4 ~]# nfsstat -c
Client rpc stats:
calls      retrans    authrefrsh
16084      0          16084 


=======================
20% packet loss applied:

[root@storage-ge2-vdsm4 ~]# tc qdisc add dev ovirtmgmt root netem loss 20%

[root@storage-ge2-vdsm4 ~]# nfsstat -c
Client rpc stats:
calls      retrans    authrefrsh
21659      0          21659   

[root@storage-ge2-vdsm4 ~]# tc -s qdisc ls dev ovirtmgmt
qdisc netem 8005: root refcnt 2 limit 1000 loss 20%
 Sent 12213810 bytes 36381 pkt (dropped 8607, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 


=====================================================================

Results:
While the packet loss rules were applied, I observed lag in the guest while connected through the console. No relevant logs were seen in vdsm.log and no warning was shown to the admin in RHEVM.


I added a log message to vdsm.log for each applied tc rule. For example:

----------------------------------------5% packet loss applied----------------------------------------------


Logs attached

Comment 19 Elad 2016-11-17 08:40:45 UTC
*Engine: ovirt-engine-4.1.0-0.0.master.20161112231308.git672bd31.el7.centos.noarch

Comment 20 Yaniv Kaul 2016-11-17 08:48:52 UTC
Elad - shouldn't the tc rule be applied to the physical interface itself, rather than to the bridge?

Comment 21 Elad 2016-11-17 08:53:27 UTC
The IP configuration is on the bridge, not on the physical interface:


[root@storage-ge2-vdsm4 ~]# ifconfig ovirtmgmt
ovirtmgmt: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.35.65.41  netmask 255.255.255.0  broadcast 10.35.65.255
        inet6 2620:52:0:2340:21a:4aff:fecc:ed0c  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::21a:4aff:fecc:ed0c  prefixlen 64  scopeid 0x20<link>
        ether 00:1a:4a:cc:ed:0c  txqueuelen 1000  (Ethernet)
        RX packets 46831491  bytes 241288586957 (224.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 46857952  bytes 4944680339 (4.6 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@storage-ge2-vdsm4 ~]# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:1a:4a:cc:ed:0c  txqueuelen 1000  (Ethernet)
        RX packets 64013875  bytes 278285054789 (259.1 GiB)
        RX errors 0  dropped 9  overruns 0  frame 0
        TX packets 46928098  bytes 4959421598 (4.6 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Comment 22 Yaniv Kaul 2016-11-17 09:00:37 UTC
(In reply to Elad from comment #21)
> The IP configuration is on the bridge, not on the physical interface:
> 
> 
> [root@storage-ge2-vdsm4 ~]# ifconfig ovirtmgmt
> ovirtmgmt: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         inet 10.35.65.41  netmask 255.255.255.0  broadcast 10.35.65.255
>         inet6 2620:52:0:2340:21a:4aff:fecc:ed0c  prefixlen 64  scopeid
> 0x0<global>
>         inet6 fe80::21a:4aff:fecc:ed0c  prefixlen 64  scopeid 0x20<link>
>         ether 00:1a:4a:cc:ed:0c  txqueuelen 1000  (Ethernet)
>         RX packets 46831491  bytes 241288586957 (224.7 GiB)
>         RX errors 0  dropped 0  overruns 0  frame 0
>         TX packets 46857952  bytes 4944680339 (4.6 GiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> 
> [root@storage-ge2-vdsm4 ~]# ifconfig eth0
> eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         ether 00:1a:4a:cc:ed:0c  txqueuelen 1000  (Ethernet)
>         RX packets 64013875  bytes 278285054789 (259.1 GiB)
>         RX errors 0  dropped 9  overruns 0  frame 0
>         TX packets 46928098  bytes 4959421598 (4.6 GiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

The fact that the IP is on an interface does not (confusingly) mean it is the one actually used for the connection... Can you try? I would like to understand why you did not get any NFS retransmissions, even though you set packet loss, we can see packets being dropped on the bridge, and you witnessed the delay in the console yourself.

Comment 23 Elad 2016-11-17 09:06:25 UTC
[root@storage-ge2-vdsm4 ~]# tc -s qdisc ls dev eth0
qdisc netem 8006: root refcnt 2 limit 1000 loss 20%
 Sent 160246 bytes 1607 pkt (dropped 466, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
[root@storage-ge2-vdsm4 ~]# nfsstat -c
Client rpc stats:
calls      retrans    authrefrsh
28581      0          28581

Comment 31 Yaniv Lavi 2017-08-02 15:40:00 UTC
Closing this; it is covered by the collectd-based metrics store (comment 16).