Bug 171267

Summary: Intermittent NFS problem on client reboot, iptables interaction
Product: [Fedora] Fedora
Reporter: josip
Component: kernel
Assignee: Steve Dickson <steved>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 4
CC: davej, wtogami
Target Milestone: ---
Target Release: ---
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2006-05-05 21:21:10 UTC

Description josip 2005-10-20 06:48:40 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
Intermittent NFS problem on client reboot.

Client successfully obtains NFS (v3 TCP) mount authorization, but intermittently fails to connect to the NFS server until the client's iptables rules are unloaded.  An Ethereal log of this problem shows the client (from port 800) connecting to the NFS server and then rejecting the server's ACK packets.

Are NFS clients allowed to connect from privileged ports?  This may also be an iptables implementation bug, because the client's iptables apparently blocks the server's ACK to the client's own SYN packet (i.e. the response to an outgoing connection is being blocked).



Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Reboot an NFS (v3 TCP) client with the default iptables rules active.

Actual Results:  NFS mount times out with message "could not read superblock."

Expected Results:  Normal NFS mount.

Additional info:

The relevant part of Ethereal log -- note that the client connects to NFS service from port 800, and that the server's ACK is being blocked by client's iptables (mount from port 800 succeeds after stopping client's iptables):

No.     Time        Source                Destination           Protocol Info
[...after successful NFS mount (v3 TCP)...]
     56 5.813171    client                SERVER                TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294938434 TSER=0 WS=2
     57 5.813329    SERVER                client                TCP      nfs > 800 [ACK] Seq=3478945320 Ack=3231224935 Win=16022 Len=0 TSV=422534057 TSER=136935
     58 5.813351    client                SERVER                ICMP     Destination unreachable (Host administratively prohibited)
     59 11.813138   client                SERVER                TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294939934 TSER=0 WS=2
     60 11.813299   SERVER                client                TCP      nfs > 800 [ACK] Seq=3478945320 Ack=3231224935 Win=16022 Len=0 TSV=422535557 TSER=136935
     65 23.813095   client                SERVER                TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294942934 TSER=0 WS=2
     66 23.813225   SERVER                client                TCP      nfs > 800 [ACK] Seq=3478945320 Ack=3231224935 Win=16022 Len=0 TSV=422538557 TSER=136935
     67 23.813245   client                SERVER                ICMP     Destination unreachable (Host administratively prohibited)
     76 47.816975   client                SERVER                TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294948934 TSER=0 WS=2
     77 47.817099   SERVER                client                TCP      nfs > 800 [ACK] Seq=3478945320 Ack=3231224935 Win=16022 Len=0 TSV=422544557 TSER=136935
     78 47.817119   client                SERVER                ICMP     Destination unreachable (Host administratively prohibited)
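
The pattern in the log above (a client SYN answered by an ACK that lacks the SYN bit) can be picked out mechanically. The following is a minimal Python sketch, not part of the original report; the function names and the regular expressions are illustrative assumptions about Ethereal's one-line summary format, and the sample rows are packets 56-58 copied from the log:

```python
import re

# One Ethereal summary row: number, time, source, destination, protocol, info.
ROW = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$")
# TCP flags appear in square brackets, e.g. [SYN] or [SYN, ACK].
FLAGS = re.compile(r"\[([A-Z, ]+)\]")

SAMPLE = """\
     56 5.813171    client                SERVER                TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294938434 TSER=0 WS=2
     57 5.813329    SERVER                client                TCP      nfs > 800 [ACK] Seq=3478945320 Ack=3231224935 Win=16022 Len=0 TSV=422534057 TSER=136935
     58 5.813351    client                SERVER                ICMP     Destination unreachable (Host administratively prohibited)
"""

def parse_rows(text):
    """Parse summary rows into dicts; lines that are not packet rows are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        no, time, src, dst, proto, info = m.groups()
        fm = FLAGS.search(info)
        flags = frozenset(f.strip() for f in fm.group(1).split(",")) if fm else frozenset()
        rows.append({"no": int(no), "time": float(time), "src": src,
                     "dst": dst, "proto": proto, "flags": flags})
    return rows

def bare_ack_replies(rows):
    """Return (syn_no, reply_no) pairs where a SYN was answered by ACK without SYN."""
    pairs = []
    for prev, cur in zip(rows, rows[1:]):
        if prev["proto"] != "TCP" or cur["proto"] != "TCP":
            continue
        is_syn = prev["flags"] == {"SYN"}
        is_reply = cur["src"] == prev["dst"] and cur["dst"] == prev["src"]
        if is_syn and is_reply and cur["flags"] == {"ACK"}:
            pairs.append((prev["no"], cur["no"]))
    return pairs

print(bare_ack_replies(parse_rows(SAMPLE)))  # [(56, 57)]
```

A reply carrying [SYN, ACK] is not reported, so a normal handshake produces an empty list.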



Also, client's /etc/sysconfig/iptables (Firewall configuration written by system-config-securitylevel, default plus ssh):

# Firewall configuration written by system-config-securitylevel
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p 50 -j ACCEPT
-A RH-Firewall-1-INPUT -p 51 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
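
To see why this rule set can reject the server's reply, here is a rough Python model of first-match evaluation over the RH-Firewall-1-INPUT chain. It is a sketch under stated assumptions, not the netfilter implementation: the packet dictionaries and the `verdict` helper are hypothetical, and it assumes that connection tracking classifies the bare ACK as INVALID (neither NEW nor ESTABLISHED), which the report itself does not confirm.

```python
# Simplified model of the RH-Firewall-1-INPUT chain quoted above.
# Each rule is (predicate, target); the first rule that matches decides.
RULES = [
    (lambda p: p.get("iface") == "lo", "ACCEPT"),
    (lambda p: p.get("proto") == "icmp", "ACCEPT"),
    (lambda p: p.get("proto") in ("esp", "ah"), "ACCEPT"),  # -p 50 / -p 51
    (lambda p: p.get("proto") == "udp" and p.get("dport") == 5353
               and p.get("dst") == "224.0.0.251", "ACCEPT"),
    (lambda p: p.get("proto") == "udp" and p.get("dport") == 631, "ACCEPT"),
    (lambda p: p.get("state") in ("ESTABLISHED", "RELATED"), "ACCEPT"),
    (lambda p: p.get("state") == "NEW" and p.get("proto") == "tcp"
               and p.get("dport") == 22, "ACCEPT"),
    (lambda p: True, "REJECT icmp-host-prohibited"),
]

def verdict(packet):
    """Return the target of the first matching rule."""
    for matches, target in RULES:
        if matches(packet):
            return target
    # Not reached here: the final rule matches everything. A real chain
    # without that rule would return to INPUT, whose policy is ACCEPT.
    return "ACCEPT"

# Normal reply to the client's SYN: tracked as part of the connection.
syn_ack = {"iface": "eth0", "proto": "tcp", "dport": 800, "state": "ESTABLISHED"}
# Bare ACK from the server. ASSUMPTION: conntrack marks it INVALID.
bare_ack = {"iface": "eth0", "proto": "tcp", "dport": 800, "state": "INVALID"}

print(verdict(syn_ack))   # ACCEPT
print(verdict(bare_ack))  # REJECT icmp-host-prohibited
```

Under that assumption the only rule the bare ACK can match is the final REJECT, which would produce the "Host administratively prohibited" ICMP messages seen in the capture.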

Comment 1 josip 2005-10-20 07:21:51 UTC
Kernel version kernel-smp-2.6.13-1.1526_FC4 was used by both client and server,
but earlier kernels (both uniprocessor and smp) had similar problems about
10-30% of the time.  The other 70-90% of the time, NFS mounts work correctly
with unchanged iptables configuration.

Comment 2 Steve Dickson 2005-10-21 23:58:50 UTC
Is SELinux in the picture?

Comment 3 josip 2005-10-22 02:46:30 UTC
No. SELinux is installed but disabled on the client, and not even installed on
the server.

Comment 4 Dave Jones 2005-11-10 22:01:11 UTC
Mass update to all FC4 bugs:

An update has been released (2.6.14-1.1637_FC4) which rebases to a new upstream
kernel (2.6.14.2). As there were ~3500 changes upstream between this and the
previous kernel, it's possible your bug has been fixed already.

Please retest with this update, and update this bug if necessary.

Thanks.



Comment 5 josip 2005-11-21 03:32:33 UTC
No improvement at reboot -- NFS still complains "could not read superblock".
Mounting from the command line after reboot works somewhat better, because of
retries.  Here is the Ethereal log of "mount -a -t nfs" run from the command
line, picking up at the attempt to establish the NFS v3 TCP connection after
successful NFS authorization:

Pkt No. Time        Source           Destination      Protocol Detail...
     51 0.025710    Client           Server           TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=5840 Len=0 MSS=1460 TSV=4294947169 TSER=0 WS=2
     52 0.025821    Server           Client           TCP      nfs > 800 [ACK] Seq=0 Ack=0 Win=16022 Len=0 TSV=191157096 TSER=156374
     53 0.025853    Client           Server           ICMP     Destination unreachable (Host administratively prohibited)

Note the ICMP rejection -- generated by Client's iptables despite the fact that
the Client originated the TCP connection to the NFS server.  This rejection
doesn't happen when iptables are disabled.

HINT: The TCP 3-way handshake should proceed as SYN, SYN-ACK, ACK, but in my
Ethereal logs the Server's response to the Client's SYN does not have the SYN
bit set.  This could be a bug on the NFS server side.  All machines are running
the latest software (kernel 2.6.14-1.1637_FC4 etc.).

The next mount retry also fails 3 seconds later:

     54 3.022518    Client           Server           TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294947919 TSER=0 WS=2
     55 3.022637    Server           Client           TCP      nfs > 800 [ACK] Seq=529232714 Ack=2998602303 Win=16022 Len=0 TSV=191157845 TSER=156374
     56 3.022678    Client           Server           ICMP     Destination unreachable (Host administratively prohibited)

The attempt at 9 seconds gets no response:

     61 9.022629    Client           Server           TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294949419 TSER=0 WS=2

The attempt at 21 seconds finally succeeds -- note that this time the Server's
response to Client's SYN is a [SYN,ACK] packet as required by TCP 3-way handshake:

     65 21.022848   Client           Server           TCP      800 > nfs [SYN] Seq=0 Ack=0 Win=23360 Len=0 MSS=1460 TSV=4294952419 TSER=0 WS=2
     66 21.023019   Server           Client           TCP      nfs > 800 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=191162345 TSER=4294952419 WS=2
     67 21.023074   Client           Server           TCP      800 > nfs [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=4294952419 TSER=191162345
     68 21.029047   Client           Server           NFS      V3 NULL Call (Reply In 70)
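
The SYN timestamps in both captures are spaced at doubling intervals (roughly 3, 6, 12 seconds in this comment and 6, 12, 24 seconds in the original description). The small Python sketch below only checks that arithmetic on the timestamps quoted in this bug; the helper names are illustrative, and the report does not establish whether the spacing comes from TCP SYN retransmission or from retries at a higher layer.

```python
def retry_intervals(syn_times):
    """Gaps in seconds between consecutive SYN timestamps."""
    return [round(later - earlier, 3)
            for earlier, later in zip(syn_times, syn_times[1:])]

def doubles(intervals, tolerance=0.05):
    """True if each gap is about twice the previous one."""
    return all(abs(later - 2 * earlier) <= tolerance
               for earlier, later in zip(intervals, intervals[1:]))

# SYN timestamps copied from the two captures in this bug.
description_syns = [5.813171, 11.813138, 23.813095, 47.816975]  # packets 56, 59, 65, 76
comment5_syns = [0.025710, 3.022518, 9.022629, 21.022848]       # packets 51, 54, 61, 65

for name, times in (("description", description_syns), ("comment 5", comment5_syns)):
    gaps = retry_intervals(times)
    print(name, gaps, "doubling" if doubles(gaps) else "not doubling")
```

With this backoff, a mount that fails on its first few attempts can take tens of seconds to succeed, which fits the boot-time mount giving up while a later manual mount gets through.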

BTW, during the above, the iptables configuration on the Client was quite basic
(default+ssh):

# Firewall configuration written by system-config-securitylevel
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p 50 -j ACCEPT
-A RH-Firewall-1-INPUT -p 51 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT

Finally, SELinux is disabled on both Client and Server.  This looks like a
problem involving iptables and NFS mount only.

Most likely, there is a bug in nfsd where clients reconnecting after reboot do
not get the normal TCP 3-way handshake (SYN, SYN-ACK, ACK).  Instead, the server
responds to the client's SYN with a pure ACK, so iptables rejects the unexpected
SYN, ACK sequence.





Comment 6 Dave Jones 2006-02-03 07:33:31 UTC
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.


Comment 7 John Thacker 2006-05-05 21:21:10 UTC
Closing per previous comment.