Bug 199342

Summary: traceroute may continue past destination host
Product: [Fedora] Fedora Reporter: Daniel Kopeček <dkopecek>
Component: traceroute Assignee: Martin Bacovsky <mbacovsk>
Status: CLOSED NOTABUG QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 5 CC: dmitry, okir, redhat-bugzilla
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: 1.0.4-2 Doc Type: Bug Fix
Last Closed: 2006-11-24 09:17:19 UTC Type: ---
Attachments:
Description          Flags
Fix :]               none
Fix - new option -c  none

Description Daniel Kopeček 2006-07-19 00:10:47 UTC
Description of problem:
traceroute may continue past destination host

Version-Release number of selected component (if applicable):
traceroute-1.0.4

How reproducible:
always

Steps to Reproduce:
Run traceroute with dst host 24.221.130.104

Actual results:
$ traceroute 24.221.130.104
...
22  sl-bb23-sj-10-0.sprintlink.net (144.232.20.113)  211.476 ms sl-gw13-sj-9-0.sprintlink.net (144.232.3.170)  202.982 ms   199.562 ms
23  sl-bbwl-4-0-0.sprintlink.net (144.228.111.42)  249.675 ms sl-gw13-sj-9-0.sprintlink.net (144.232.3.170)  203.184 ms   200.161 ms
24  sl-bbwl-4-0-0.sprintlink.net (144.228.111.42)  219.684 ms   234.350 ms cpe-24-221-130-104.az.sprintbbd.net (24.221.130.104)  403.839 ms
25  * cpe-24-221-130-104.az.sprintbbd.net (24.221.130.104)  341.973 ms   339.329 ms
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *
$

Expected results:
$ traceroute 24.221.130.104
...
22  sl-gw13-sj-9-0.sprintlink.net (144.232.3.170)  200.680 ms sl-bb23-sj-10-0.sprintlink.net (144.232.20.113)  206.875 ms   212.309 ms
23  sl-gw13-sj-9-0.sprintlink.net (144.232.3.170)  208.670 ms   202.148 ms   215.811 ms
24  cpe-24-221-130-104.az.sprintbbd.net (24.221.130.104)  316.045 ms   316.795 ms   317.416 ms
$

Additional info:

Comment 1 Daniel Kopeček 2006-07-19 00:10:47 UTC
Created attachment 132630 [details]
Fix :]

Comment 2 Radek Vokál 2006-07-19 12:37:40 UTC
Checked in, thanks a lot

Comment 3 Robert Scheck 2006-07-19 21:16:41 UTC
Dan, Radek...was this fix also sent to upstream?

Comment 4 Radek Vokál 2006-07-20 11:40:32 UTC
Will do so .. 

Comment 5 Olaf Kirch 2006-07-20 13:24:29 UTC
I'm not sure this is a genuine bug in traceroute; to me it looks more like
a strange IP stack in that router. Apparently that router
decrements the TTL on every incoming packet before looking at it, which
violates the RFCs (you're supposed to decrement it when forwarding).
 
On top of that, the router is apparently configured to just drop any 
incoming UDP packet, so this means that instead of replying with a 
DEST_UNREACH to the next packet that comes along, it just shuts up. 
 
The patch you propose works around the problem, even though it seems a 
bit of a hack to me. I'll think about how to do this. 
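
The behaviour hypothesised in this comment can be illustrated with a small simulation. This is only a sketch, not traceroute code: the function names, the reply labels and the single path with 23 routers in front of the host are assumptions made for illustration (the real trace above shows more than one path).

```python
def ttl_on_arrival(sent_ttl, routers_before_host):
    """TTL left when a probe reaches the host (each router on the way takes 1).

    Returns None if the probe expires at a router before the host.
    """
    remaining = sent_ttl - routers_before_host
    return remaining if remaining >= 1 else None


def conforming_host(ttl):
    # A packet addressed to the host itself is delivered without a TTL
    # decrement; a closed UDP port then answers "port unreachable".
    return "port-unreachable"


def quirky_host(ttl):
    # Hypothesised behaviour: decrement the TTL before looking at the packet.
    ttl -= 1
    if ttl == 0:
        return "time-exceeded"
    # The packet survives the decrement, but incoming UDP is silently dropped.
    return None


ROUTERS_BEFORE_HOST = 23  # assumed: the host is the 24th hop
probe_ttls = (24, 25, 26)

conforming = [conforming_host(ttl_on_arrival(t, ROUTERS_BEFORE_HOST)) for t in probe_ttls]
quirky = [quirky_host(ttl_on_arrival(t, ROUTERS_BEFORE_HOST)) for t in probe_ttls]

print(conforming)
print(quirky)
```

Under these assumptions the conforming host answers every probe with a final reply, while the quirky host answers the first probe that reaches it with "time exceeded" and stays silent afterwards, which traceroute prints as "*".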

Comment 6 Daniel Kopeček 2006-07-21 06:45:23 UTC
Yes, it is obvious that the problem is the abnormal response from that
router. I was originally solving this problem in tracepath and
noticed that it behaves the same way. Both tools expect CONNREFUSED
upon reaching the target. That condition seems very inadequate to me,
but I guess tracerouting is supposed to work this way. The source
address of the response is available, and we probably can't get
anything better: how a machine replies to us is easily changed,
but its address stays the same. I don't see this modification as a hack
but as a feature traceroute should have, if not by default then behind a
dedicated option. I can supply a new patch which will add this
behaviour. Wouldn't this be more suitable?
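
The two stopping rules discussed here can be compared in a short sketch. It is not the attached patch, whose contents are not shown in this report; `hops_probed`, the reply labels and the one-reply-per-hop simplification are assumptions made for illustration. The "port-unreachable" label stands for the ICMP reply that a UDP socket reports as CONNREFUSED.

```python
def hops_probed(replies, destination, stop_on_destination_address, max_hops=30):
    """Return the hop number at which a simplified trace would stop.

    replies maps hop number -> (source address, reply kind); a missing hop
    means no reply was recorded ("*" in traceroute output).
    """
    for hop in range(1, max_hops + 1):
        reply = replies.get(hop)
        if reply is None:
            continue
        source, kind = reply
        if kind == "port-unreachable":
            return hop  # classic rule: only a final reply ends the trace
        if stop_on_destination_address and source == destination:
            return hop  # proposed rule: any reply from the target ends it
    return max_hops


DESTINATION = "24.221.130.104"
# One reply per hop, taken from the last hops of the reported trace.
# Earlier hops are elided in the report and are simply left out here.
replies = {
    22: ("144.232.3.170", "time-exceeded"),
    23: ("144.228.111.42", "time-exceeded"),
    24: (DESTINATION, "time-exceeded"),
}

classic = hops_probed(replies, DESTINATION, stop_on_destination_address=False)
by_address = hops_probed(replies, DESTINATION, stop_on_destination_address=True)
print(classic, by_address)
```

With this input the classic rule runs to the hop limit, as in the actual results, while the address-based rule stops at hop 24, as in the expected results.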

Comment 7 Daniel Kopeček 2006-08-14 11:49:09 UTC
Created attachment 134129 [details]
Fix - new option -c

Comment 8 Dmitry Butskoy 2006-11-21 17:09:45 UTC
About 24.221.130.104:

It seems to be a corner case. It affects all three implementations of
traceroute: the original, Olaf's and mine. The host "24.221.130.104"
answers "time exceeded" and then answers nothing. That is why none of the
traceroutes stop on it (all of them consider a "time exceeded" ICMP reply to be
a *non-final* reply).

> Do you find it to be a bug or not?

Nope.

IMO, one of the uses of traceroute is to detect network loops (where the
same host can appear several times in the output). I assume that the
"24.221.130.104" case is some kind of such a loop (just filtered early somewhere).
The probe packets have not reached this host as a final host. Probably they
passed through it (or someone has NATed the actual reply to its address), but
there is no "port unreach", "icmp echoreply" or "tcp reset" from this
host, which means that the actual destination "24.221.130.104" was not reached.
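
Spotting hosts that appear at more than one hop can be sketched as follows. The helper name `repeated_hosts` is hypothetical; the addresses are the ones from the actual results in the description, with `None` standing for a "*" probe. A repeat is only a hint: with several paths in play, as in this trace, it does not prove a routing loop.

```python
from collections import defaultdict


def repeated_hosts(hops):
    """Map each address seen at more than one hop to the hops it appeared at."""
    seen = defaultdict(list)
    for hop_number, addresses in hops.items():
        # Count each address once per hop and skip unanswered probes.
        for address in {a for a in addresses if a is not None}:
            seen[address].append(hop_number)
    return {a: sorted(h) for a, h in seen.items() if len(h) > 1}


# Responders for hops 22 to 25 of the reported trace, three probes per hop.
hops = {
    22: ["144.232.20.113", "144.232.3.170", "144.232.3.170"],
    23: ["144.228.111.42", "144.232.3.170", "144.232.3.170"],
    24: ["144.228.111.42", "144.228.111.42", "24.221.130.104"],
    25: [None, "24.221.130.104", "24.221.130.104"],
}

repeats = repeated_hosts(hops)
for address, hop_numbers in sorted(repeats.items()):
    print(address, hop_numbers)
```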

Regarding comment #7:
I don't think a special option would be useful here.
Users should see the loops. It follows the "spirit" of tracerouting. At least,
"traditional", "cmdline", "network admin" tracerouting...

If users have already seen a loop, they can interrupt the program early
(Ctrl-C). (Users already do so when they see a lot of "*" and don't want to wait
until the maximum number of probes is reached.)



Comment 9 Martin Bacovsky 2006-11-24 09:17:19 UTC
According to comments #5 and #8 I decided to close this as NOTABUG, but thanks
to Dan for his effort to create the patch for this problem.