Bug 58041

Summary: report mode impatient with slow links, claims nonexistent loss
Product: [Retired] Red Hat Linux
Reporter: James Manning <jmm>
Component: mtr
Assignee: Phil Knirsch <pknirsch>
Status: CLOSED RAWHIDE
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: 7.2
CC: rvokal
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2002-01-29 12:44:20 UTC Type: ---

Description James Manning 2002-01-07 03:57:43 UTC
Description of Problem:
mtr probably doesn't wait long enough for the replies to the last ping packets in report mode

Version-Release number of selected component (if applicable):
mtr-0.44-1

How Reproducible:
100%

Steps to Reproduce:
1. Run mtr -r -c 10 to a host where one hop along the route adds a lot of latency.
2. Notice that every hop after that one has a packet loss % associated with it in the report, even though there is no loss.

Actual Results:
jmm@bp6:/home/jmm> mtr -r -c 10 152.1.1.22
HOST                                    LOSS  RCVD SENT    BEST     AVG   WORST
10.41.128.1                               0%    10   10    6.05    9.87   24.21
24.25.1.121                               0%    10   10    6.63    7.38    9.09
24.25.1.49                                0%    10   10    6.59    7.40    8.29
rdu26-33-177.nc.rr.com                    0%    10   10    7.03    8.22   11.63
ROC-ASBR-ROC-GSR.carolina.rr.com         10%     9   10  710.69  725.60  748.87
ncsugsr-gw.ncni.net                      10%     9   10  709.12  727.27  739.57
ncsudmz.ncni.net                         10%     9   10  716.10  730.63  753.70
ithub-6509msfc-1.ncstate.net             10%     9   10  721.74  731.41  743.22
uni00ns.unity.ncsu.edu                   10%     9   10  465.19  481.15  501.24

Note that when I just run mtr interactively to the same host, I *never* see 
dropped packets (latencies are still high - thanks rr! - ugh :), although the 
received packet count *will* seem to lag on the post-hog hops simply because 
the latencies to those hops are so high, like this:

                                           Packets               Pings
Hostname                                %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. 10.41.128.1                            0%  116  116     6    6    9    130
 2. 24.25.1.121                            0%  116  116    16    6    8     20
 3. 24.25.1.49                             0%  116  116     7    6    8     16
 4. rdu26-33-177.nc.rr.com                 0%  116  116     7    6    9     25
 5. ROC-ASBR-ROC-GSR.carolina.rr.com       0%  115  116   839  743  823    895
 6. ncsugsr-gw.ncni.net                    0%  115  116   832  748  824    900
 7. ncsudmz.ncni.net                       0%  115  116   837  743  824    909
 8. ithub-6509msfc-1.ncstate.net           0%  115  116   839  737  824    896
 9. uni00ns.unity.ncsu.edu                 0%  115  116   506  458  530    611

Notice that the display at this moment appears to show a dropped packet 
for each post-hog host, but those packets haven't been dropped, they just 
haven't come back yet.
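
To make the arithmetic concrete, here is a minimal sketch of why one in-flight reply shows up as 10% in the report but as 0% in the interactive display. It assumes loss is computed as (sent - received) / sent and shown as a truncated integer percentage; that formula and the loss_percent name are assumptions for illustration, not taken from the mtr source.

```python
def loss_percent(sent: int, received: int) -> int:
    """Integer loss percentage, truncated toward zero.

    Assumption: this mirrors how the LOSS column is derived; the real
    mtr code may compute or round it differently.
    """
    if sent == 0:
        return 0
    return (sent - received) * 100 // sent


# Report mode, -c 10: one reply still in flight when the report is printed.
print(loss_percent(10, 9))      # 10

# Interactive snapshot above: one reply outstanding out of 116 sent.
print(loss_percent(116, 115))   # 0
```

With only 10 probes per hop, a single reply that has not arrived yet is a full 10 percentage points, so it cannot hide in rounding the way it does after 116 probes.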

Expected Results:
mtr should wait longer (3-5 seconds, imho) to keep from mistakenly calling 
the last packets to each host along the way lost prematurely.
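
The effect of such a wait can be sketched with a toy model. This is not how mtr schedules probes; it simply assumes probe k is sent at k * interval, its reply arrives rtt seconds later, and the report is cut off grace seconds after the last send. The count_received function and the 0.1 second grace value are hypothetical.

```python
def count_received(count: int, interval: float, rtt: float, grace: float) -> int:
    """Count replies that arrive before the report is printed.

    Toy model (assumption, not mtr's real scheduling): probe k is sent at
    k * interval seconds, its reply arrives rtt seconds later, and the
    report is printed grace seconds after the last probe is sent.
    """
    cutoff = (count - 1) * interval + grace
    return sum(1 for k in range(count) if k * interval + rtt <= cutoff)


# Roughly 730 ms round trip, as on the slow hops in the report above.
print(count_received(10, 1.0, 0.73, 0.1))  # 9: last reply is still in flight
print(count_received(10, 1.0, 0.73, 3.0))  # 10: a 3 second grace period covers it
```

In this model any grace period at least as long as the slowest round trip is enough, which is why a fixed 3-5 second wait would cover links like the one shown here.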

Additional Information:

Comment 1 James Manning 2002-01-07 04:25:24 UTC
Looks like one workaround is to jack up the -i value (WaitTime in the source),
although since the ping times never exceed 1 second, I'm not sure why
that helps.  See this output, though:

jmm@bp6:/home/jmm> mtr -i 2 -r -c 10 152.1.1.22
HOST                                    LOSS  RCVD SENT    BEST     AVG   WORST
10.41.128.1                               0%    10   10    6.99    9.19   17.20
24.25.1.121                               0%    10   10    6.37    7.45    9.56
24.25.1.49                                0%    10   10    6.55    8.14   12.65
rdu26-33-177.nc.rr.com                    0%    10   10    7.31    8.40    9.42
ROC-ASBR-ROC-GSR.carolina.rr.com          0%    10   10  522.96  575.91  625.19
ncsugsr-gw.ncni.net                       0%    10   10  531.96  573.39  606.20
ncsudmz.ncni.net                          0%    10   10  535.15  572.72  593.38
ithub-6509msfc-1.ncstate.net             10%     9   10  529.26  566.75  593.22
uni00ns.unity.ncsu.edu                   10%     9   10  261.10  289.26  310.87
jmm@bp6:/home/jmm> mtr -i 3 -r -c 10 152.1.1.22
HOST                                    LOSS  RCVD SENT    BEST     AVG   WORST
10.41.128.1                               0%    10   10    6.16    8.40   12.93
24.25.1.121                               0%    10   10    6.17    7.50   10.92
24.25.1.49                                0%    10   10    6.74    7.76    8.54
rdu26-33-177.nc.rr.com                    0%    10   10    6.95    8.34    9.95
ROC-ASBR-ROC-GSR.carolina.rr.com          0%    10   10  485.51  544.41  585.24
ncsugsr-gw.ncni.net                       0%    10   10  504.86  548.03  579.10
ncsudmz.ncni.net                          0%    10   10  489.19  544.97  591.87
ithub-6509msfc-1.ncstate.net              0%    10   10  484.29  552.91  606.51
uni00ns.unity.ncsu.edu                    0%    10   10  228.03  263.22  317.88

Since the sends and receives are batched but separate, I'm a little confused 
about why the above behavior is exhibited, but for now I'll use -i 3 as a 
workaround, although it seems clear that I'll have to jack it up even more for 
mtr runs to sites with many more hops:

jmm@bp6:/home/jmm> mtr -i 3 -r -c 10 computerjobs.com
HOST                                    LOSS  RCVD SENT    BEST     AVG   WORST
10.41.128.1                               0%    10   10    6.22    8.82   14.23
24.25.1.121                               0%    10   10    6.46    7.23    7.90
24.25.1.49                                0%    10   10    6.58    7.91   10.41
rdu26-33-178.nc.rr.com                    0%    10   10  160.18  169.85  180.67
pop1-cha-P3-0.atdn.net                    0%    10   10  457.70  510.64  585.50
bb1-cha-P0-0.atdn.net                     0%    10   10  479.54  524.38  625.22
bb1-vie-P10-0.atdn.net                    0%    10   10  485.65  518.53  580.38
bb1-rtc-P0-2.atdn.net                     0%    10   10  219.18  258.01  386.07
pop2-rtc-P14-0.atdn.net                   0%    10   10  220.73  237.91  257.28
uunet-gw2.atdn.net                        0%    10   10  217.32  235.94  251.38
132.at-5-1-0.XR2.DCA6.ALTER.NET           0%    10   10  458.45  515.54  582.42
0.so-4-0-0.TR2.DCA6.ALTER.NET             0%    10   10  204.94  231.09  250.30
121.at-5-2-0.TR2.ATL5.ALTER.NET           0%    10   10  224.95  247.14  278.68
296.at-2-1-0.XR2.ATL5.ALTER.NET           0%    10   10  494.93  537.82  596.24
192.ATM7-0.SR1.ATL5.ALTER.NET             0%    10   10  490.61  536.50  606.03
???                                     100%     0   10    0.00    0.00    0.00
vlan3.msfc1.dr2.atl7.web.uu.net           0%    10   10  230.21  256.81  280.49
63.111.5.241                             10%     9   10  502.97  539.88  613.71
63.99.138.2                              10%     9   10  469.94  530.97  602.86


Comment 2 Phil Knirsch 2002-01-29 10:55:47 UTC
I've updated the package to 0.45 (which came out a couple of days ago) and it's
available via rawhide now.

It has quite a few updates (according to the author), so if you could give that
version a shot I'd greatly appreciate it.

Thanks,

Read ya, Phil

Comment 3 James Manning 2002-01-29 12:34:41 UTC
I must be looking in the wrong location.  Can you tell me what
the proper full url to fetch 0.45 is from rawhide?

ncftp ...whide/i386/RedHat/RPMS > pwd
  ftp://rawhide.redhat.com/pub/redhat/linux/rawhide/i386/RedHat/RPMS/
ncftp ...whide/i386/RedHat/RPMS > ls mtr*
mtr-0.44-1.i386.rpm       mtr-gtk-0.44-1.i386.rpm

I'll check again later today in case it's in the queue and just hasn't been 
pushed quite yet

Comment 4 James Manning 2002-01-29 12:44:15 UTC
weird - i was 99.44% sure i told it "Leave as NEEDINFO"

Comment 5 James Manning 2002-01-30 22:31:59 UTC
well, my latency problems were fixed a couple of weeks ago so I don't really 
have a viable test scenario any more.  I'll assume it works and close this bug 
and reopen down the line if/when I can see if it still breaks.

Comment 6 Phil Knirsch 2002-01-31 09:34:16 UTC
Sorry, it took a couple of days longer as the updated version didn't build
correctly with newer autoconf and gcc packages.

The new mtr 0.45 should appear over the next few days on rawhide, but I am not
sure whether it will fix your problem, or whether you will still be able to
reproduce it.

If yes, feel free to reopen the bug.

Thanks,

Read ya, Phil