The timeout parameter passed to poll() in connect_nb() is expected in milliseconds. However, the timeout value in seconds is used instead, without converting it to milliseconds. This results in a short timeout of 5 milliseconds instead of the expected 5 seconds.

The problem was seen on a user machine which had problems connecting to the portmap server over TCP to obtain a list of exports.

The tcpdump seen in this case is:

 97 40.690541 47.130.183.57 -> 47.16.19.42 TCP 45790 > sunrpc [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=832443 TSER=0 WS=7
 98 40.725121 47.16.19.42 -> 47.130.183.57 TCP sunrpc > 45790 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=3 TSV=78457732 TSER=832443
 99 40.725141 47.130.183.57 -> 47.16.19.42 TCP 45790 > sunrpc [RST] Seq=1 Win=0 Len=0

The corresponding strace for this issue shows the following:

3572 14:33:03 fcntl64(5, F_GETFL) = 0x2 (flags O_RDWR)
3572 14:33:03 fcntl64(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
3572 14:33:03 connect(5, {sa_family=AF_INET, sin_port=htons(111), sin_addr=inet_addr("47.16.19.42")}, 16) = -1 EINPROGRESS (Operation now in progress)
3572 14:33:03 poll([{fd=5, events=POLLOUT}], 1, 5) = 0 (Timeout)

A simple patch to change the timeout to milliseconds fixed the problem in the user's case.

@@ -234,7 +234,7 @@ static int connect_nb(int fd, struct soc
 	pfd[0].fd = fd;
 	pfd[0].events = POLLOUT;
-	ret = poll(pfd, 1, timeout);
+	ret = poll(pfd, 1, timeout*1000);
 	if (ret <= 0) {
 		if (ret == 0)
 			ret = -ETIMEDOUT;
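
For illustration only, the unit conversion can be isolated in a small helper so the seconds/milliseconds mismatch cannot recur at the call site. This is a minimal sketch and not the committed patch: the names timeout_sec_to_ms() and wait_writable() are hypothetical, and the overflow clamp and the mapping of negative values to -1 (block indefinitely) are assumptions that go beyond the one-line fix above.

```c
#include <errno.h>
#include <limits.h>
#include <poll.h>

/* Hypothetical helper: convert a timeout in seconds to the milliseconds
 * that poll() expects. A negative input is mapped to -1 (wait forever),
 * and values that would overflow an int are clamped to INT_MAX. */
static int timeout_sec_to_ms(int seconds)
{
    if (seconds < 0)
        return -1;
    if (seconds > INT_MAX / 1000)
        return INT_MAX;
    return seconds * 1000;
}

/* Hypothetical wrapper: wait until fd becomes writable, with the timeout
 * given in seconds. Returns 0 when poll() reports an event, -ETIMEDOUT
 * on timeout, or -errno if poll() itself fails. */
static int wait_writable(int fd, int timeout_sec)
{
    struct pollfd pfd;
    int ret;

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    ret = poll(&pfd, 1, timeout_sec_to_ms(timeout_sec));
    if (ret == 0)
        return -ETIMEDOUT;
    if (ret < 0)
        return -errno;
    return 0;
}
```

With a helper like this, a 5 second timeout reaches poll() as 5000 rather than 5. After a non-blocking connect() a caller would still need to check SO_ERROR with getsockopt() to learn whether the connection actually succeeded; that step is omitted here.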
This is also a regression, introduced by the changes for bug 487653. I'll fix it.
Created attachment 373002
Patch to fix timeout in connect_nb()

This patch achieves the same result as the suggestion above. This has been committed upstream and the Fedora releases affected have been updated.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2010-0265.html