Bug 539747 - connect_nb() uses a wrong timeout
Summary: connect_nb() uses a wrong timeout
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: autofs
Version: 5.4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Ian Kent
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 540329
Reported: 2009-11-20 23:14 UTC by Sachin Prabhu
Modified: 2018-10-27 14:11 UTC (History)

Fixed In Version: autofs-5.0.1-0.rc2.133.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 540329 (view as bug list)
Environment:
Last Closed: 2010-03-30 08:37:06 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to fix timeout in connect_nb() (687 bytes, patch)
2009-11-23 03:14 UTC, Ian Kent


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0265 | 0 | normal | SHIPPED_LIVE | autofs bug fix update | 2010-03-29 12:54:19 UTC

Description Sachin Prabhu 2009-11-20 23:14:58 UTC
poll() expects its timeout argument in milliseconds, but connect_nb() passes a value in seconds without converting it. The result is a timeout of 5 milliseconds instead of the intended 5 seconds.
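
The unit mismatch can be made explicit with a small conversion helper. This is only an illustrative sketch: secs_to_poll_ms() is a hypothetical name that does not exist in autofs, and the clamping and negative-value handling are assumptions added for safety rather than anything the bug report requires.

```c
#include <limits.h>

/*
 * Hypothetical helper (not part of autofs): convert a timeout in
 * seconds to the millisecond value that poll() expects.
 *   - a negative input is passed through as -1, which poll()
 *     treats as "wait indefinitely";
 *   - values too large for an int are clamped to INT_MAX.
 */
static int secs_to_poll_ms(long secs)
{
	if (secs < 0)
		return -1;
	if (secs > INT_MAX / 1000)
		return INT_MAX;
	return (int) (secs * 1000);
}
```

With such a helper, a 5 second timeout would reach poll() as 5000 rather than 5.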

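The problem was seen on a user's machine that had trouble connecting to the portmap server over TCP to obtain a list of exports. The tcpdump capture shows the client resetting the connection as soon as the server's SYN, ACK arrives, roughly 35 milliseconds after the SYN: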

 97  40.690541 47.130.183.57 -> 47.16.19.42  TCP 45790 > sunrpc [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=832443 TSER=0 WS=7
 98  40.725121  47.16.19.42 -> 47.130.183.57 TCP sunrpc > 45790 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=3 TSV=78457732 TSER=832443
 99  40.725141 47.130.183.57 -> 47.16.19.42  TCP 45790 > sunrpc [RST] Seq=1 Win=0 Len=0 

The corresponding strace shows poll() being called with a timeout argument of 5 and timing out:

3572  14:33:03 fcntl64(5, F_GETFL)      = 0x2 (flags O_RDWR)
3572  14:33:03 fcntl64(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
3572  14:33:03 connect(5, {sa_family=AF_INET, sin_port=htons(111), sin_addr=inet_addr("47.16.19.42")}, 16) = -1 EINPROGRESS (Operation now in progress)
3572  14:33:03 poll([{fd=5, events=POLLOUT}], 1, 5) = 0 (Timeout) 

A simple patch that converts the timeout to milliseconds fixed the problem for the user:

@@ -234,7 +234,7 @@ static int connect_nb(int fd, struct soc
 	pfd[0].fd = fd;
 	pfd[0].events = POLLOUT;
 
-	ret = poll(pfd, 1, timeout);
+	ret = poll(pfd, 1, timeout*1000);
 	if (ret <= 0) {
 		if (ret == 0)
 			ret = -ETIMEDOUT;
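
For context, the overall pattern visible in the strace and the patch (set O_NONBLOCK, connect(), poll() for POLLOUT, map a zero return to -ETIMEDOUT) can be sketched as a standalone program fragment. This is not the autofs implementation: connect_with_timeout() and tcp_probe() are hypothetical names, the SO_ERROR check and flag restoration are assumptions about how such a function would normally be completed, and timeout_secs is assumed small enough that multiplying by 1000 cannot overflow.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch: non-blocking connect with a timeout in seconds.
 * Returns 0 on success or a negative errno value on failure. */
static int connect_with_timeout(int fd, const struct sockaddr *addr,
				socklen_t len, int timeout_secs)
{
	struct pollfd pfd;
	socklen_t errlen;
	int flags, ret, err = 0;

	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -errno;
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -errno;

	ret = connect(fd, addr, len);
	if (ret == 0)
		goto done;
	if (errno != EINPROGRESS) {
		ret = -errno;
		goto done;
	}

	pfd.fd = fd;
	pfd.events = POLLOUT;
	pfd.revents = 0;

	/* poll() takes milliseconds, so convert from seconds here. */
	ret = poll(&pfd, 1, timeout_secs * 1000);
	if (ret == 0) {
		ret = -ETIMEDOUT;
		goto done;
	}
	if (ret < 0) {
		ret = -errno;
		goto done;
	}

	/* The socket is writable; SO_ERROR tells us whether the
	 * connection actually succeeded. */
	errlen = sizeof(err);
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
		ret = -errno;
	else
		ret = err ? -err : 0;
done:
	/* Restore the original file status flags; ret is already set. */
	fcntl(fd, F_SETFL, flags);
	return ret;
}

/* Hypothetical convenience wrapper: try a TCP connect to ip:port. */
static int tcp_probe(const char *ip, unsigned short port, int timeout_secs)
{
	struct sockaddr_in sin;
	int fd, ret;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return -EINVAL;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -errno;

	ret = connect_with_timeout(fd, (struct sockaddr *) &sin,
				   sizeof(sin), timeout_secs);
	close(fd);
	return ret;
}
```

A call such as tcp_probe("47.16.19.42", 111, 5) would then give the server a full 5 seconds to answer the SYN, rather than the 5 milliseconds seen in the strace.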

Comment 1 Ian Kent 2009-11-21 01:53:02 UTC
This is also a regression, introduced by the changes for
bug 487653.

I'll fix it.

Comment 3 Ian Kent 2009-11-23 03:14:08 UTC
Created attachment 373002
Patch to fix timeout in connect_nb()

This patch achieves the same result as the suggestion above.
This has been committed upstream and the Fedora releases
affected have been updated.

Comment 14 errata-xmlrpc 2010-03-30 08:37:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0265.html

