Bug 524118 - NFS4 connection to virtual guest NFS4 server fails over bridged interface
Summary: NFS4 connection to virtual guest NFS4 server fails over bridged interface
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Daniel Veillard
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-09-17 22:22 UTC by P Rauser
Modified: 2009-11-20 19:37 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-11-20 19:37:13 UTC
Type: ---
Embargoed:


Attachments
Results of test commands per request (11.12 KB, text/plain)
2009-09-21 21:33 UTC, P Rauser
Host's "iptables --list" and "iptables --list -t nat" (2.38 KB, text/plain)
2009-09-23 21:40 UTC, P Rauser

Description P Rauser 2009-09-17 22:22:47 UTC
Description of problem:

Attempt to connect to NFS4 server on virtual guest fails when virtual guest's interface is a network bridge from the host.  All other services -- Samba, ssh, etc. -- connect fine from the client to the virtual guest.

Client(192.168.1.1) --> [Host with br1] --> Guest(192.168.1.2:2049 on eth0, which is br1)

Version-Release number of selected component (if applicable):

libvirt-0.7.1-4.fc12 (also previous F11 and F12alpha/rawhide versions)


Steps to Reproduce:

1.  Configure NFS4 server on virtual guest with some test shares.  For testing purposes, we used a configuration that is known to work on a stand-alone, non-virtualized server;

2.  Virtual guest's interface is eth0, which is a bridged interface, br1, from the host machine.  IP address eth0 is static and on the same subnet as the client machine;

3.  For testing purposes, disable SELINUX/firewall/tcp-wrappers on client, host, and virtual guest machines;

4.  Try to mount a share on the virtual guest from the client using NFS4.  The mount times out and fails;

 
Actual results:

NFS4 mount of share fails.


Expected results:

NFS mount of share succeeds.


Additional info:

(a)  Note that running "showmount -e [VIRTUAL GUEST ETH0 IP]" on the client is _very_ slow -- response time is >30 seconds -- but eventually returns a list of shares on the server/virtual guest.  Mount still always fails, regardless of the timeout value.

(b)  Other attempts to connect from the client to the server -- SSH, samba, sftp, etc. -- succeed.

(c)  If a second virtual guest is established on the host using the same network bridge as the first virtual guest, that virtual guest can, as a client, NFS4 mount shares on the first virtual guest's NFS4 server.  Put differently, if virtual guest 1 (server) is on host br1 with eth0=192.168.1.1, and virtual guest 2 (client) is on host br1 with eth0=192.168.1.2, the client can NFS4 mount shares on the server.  "Outside" systems, including the host of the virtual guests itself, cannot mount the shares.

Comment 1 Mark McLoughlin 2009-09-21 17:01:57 UTC
Could you run these commands on the host:

http://fedoraproject.org/wiki/Reporting_virtualization_bugs#Networking

It would also be worthwhile running tcpdump/wireshark on the client, host and guest and comparing what each is seeing

Comment 2 P Rauser 2009-09-21 21:33:39 UTC
Created attachment 362009
Results of test commands per request

Comment 3 P Rauser 2009-09-21 21:43:13 UTC
Three additional data points:

(a) So that the results of ifconfig -a in the attached file are intelligible, I have the following interfaces on the host:

(i)  Regular ethernet interfaces eth0 and eth2 (unbonded, unbridged);
(ii)  Interface bond1 which bonds together eth1 and eth4;
(iii)  Interface br1 which bridges bond1 to virtual guests;
(iv)  Interface br3 which bridges eth3 to virtual guests;


(b)  The behavior reported in this bug with respect to NFS4 client connection failures also occurs with gluster clients.  When gluster clients attempt to connect to a gluster server running on the virtual guest (ip address 192.168.2.93 at port 6996, the default) the connection hangs and eventually times out.  During the hang, netstat on the client reports the following:

tcp        0      1 client:exp2                192.168.2.93:6996           SYN_SENT    
tcp        0      1 client:1023                192.168.2.93:6996           SYN_SENT


(c)  Notwithstanding the connection problems with NFS4 and gluster reported in this bug, I can telnet from the client to the virtual guest on ports 2049 (NFS4) and 6996 (gluster).
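The hung handshakes described in (b) can be counted rather than eyeballed. The following is a minimal sketch, assuming the column layout of Linux netstat output as quoted above; the helper name syn_sent_to is hypothetical and not part of this report.

```shell
# Count TCP sockets stuck in SYN_SENT toward one remote address:port.
# Reads netstat-style lines on stdin; $1 is the peer exactly as it appears
# in the "Foreign Address" column (for example 192.168.2.93:6996).
syn_sent_to() {
    awk -v peer="$1" '
        $1 ~ /^tcp/ && $5 == peer && $6 == "SYN_SENT" { n++ }
        END { print n + 0 }
    '
}

# Typical use on the client while the connection attempt is hanging:
#   netstat -tn | syn_sent_to 192.168.2.93:6996
```

A count that stays above zero for the whole attempt suggests the SYNs are never answered, which is worth comparing with the successful telnet test in (c).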

Comment 4 Mark McLoughlin 2009-09-23 16:21:14 UTC
Very strange that telnet to those ports works, but the client is hung in SYN_SENT

Looking at the file, the only interesting thing I see is:

 2908  564K REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited 

i.e. you're getting a lot of packets rejected by the host's iptables INPUT chain; does that figure go up when you try to connect from the client to the guest?

Again, tcpdump might help to shed some light on this
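To answer the question about the REJECT counter, the packet count can be sampled before and after a mount attempt. This is only a sketch: the helper name reject_packets is hypothetical, and it assumes the usual column order of "iptables -L INPUT -v -n -x" output (pkts, bytes, target, ...), where -x prints exact counts instead of rounded ones such as 564K.

```shell
# Sum the packet counters of every REJECT rule in verbose iptables
# listing output read from stdin (column 1 = pkts, column 3 = target).
reject_packets() {
    awk '$3 == "REJECT" { sum += $1 } END { print sum + 0 }'
}

# Typical use on the host, as root:
#   before=$(iptables -L INPUT -v -n -x | reject_packets)
#   ... attempt the NFS4 mount from the client ...
#   after=$(iptables -L INPUT -v -n -x | reject_packets)
#   echo "rejected during test: $((after - before))"
```

Other traffic hitting the same rule will inflate the difference, so the test is most meaningful on an otherwise quiet host.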

Comment 5 P Rauser 2009-09-23 20:51:26 UTC
Yes, the rejected-packets figure goes up when the client tries to connect to the virtual guest.

Comment 6 P Rauser 2009-09-23 21:06:55 UTC
Not a tcpdump user -- if somebody wants to craft a command for me in that or in wireshark, I'll run it & post.

Comment 7 P Rauser 2009-09-23 21:29:41 UTC
This may/may not shed some light on the problem.  I set up a second virtual guest server running NFS4 and, instead of using a bridged interface, used the "default" NAT-based networking (i.e. virbr0)

1.  The new guest, testbed2, has an ip address of 192.168.122.2;

2.  The host is able to mount a NFS4 share on the guest.  It was not able to mount the share when the guest used network bridge br1 (see "Description" note (c), above);

3.  Clients (who are not the host) cannot mount the virtual guest's share.  This is expected and makes sense.  However, executing the following two commands on the host should allow clients to mount NFS4 shares on the NAT-ed virtual guest:

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 2049 -j DNAT --to 192.168.122.2:2049

iptables -A FORWARD -i eth0 -p tcp --dport 2049 -d 192.168.122.2 -j ACCEPT

4.  Clients are still not able to mount shares on the NAT-ed virtual guest.  They should be able to.

Don't know whether this is related to the original bug, but it makes me think that there may be an iptables or routing component to it.

Comment 8 P Rauser 2009-09-23 21:40:03 UTC
Created attachment 362361
Host's "iptables --list" and "iptables --list -t nat"

Comment 9 Mark McLoughlin 2009-10-01 08:50:10 UTC
Okay, trying to get DNAT working will probably just confuse things

If the packets are being rejected because of the INPUT chain on the host, that sounds like they're addressed to the host e.g. to the br1 IP address

If the packet is addressed to the guest, it should only traverse the bridge and since net.bridge.bridge-nf-call-iptables is zero, it shouldn't even go through the host's FORWARD chain

Are you sure you're using the IP address of the guest, not the IP address of the bridge in the host?
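The bridge-netfilter setting mentioned above can be confirmed directly on the host. A minimal sketch follows, assuming the setting is exposed at the usual /proc/sys path; the helper name bridge_nf_status is hypothetical, and the file may be absent if the bridge netfilter code is not loaded.

```shell
# Report whether bridged frames are handed to iptables on this host.
# $1 optionally overrides the path to the sysctl file (useful for testing).
bridge_nf_status() {
    f="${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}"
    if [ -r "$f" ]; then
        case "$(cat "$f")" in
            0) echo "bridged frames bypass iptables" ;;
            *) echo "bridged frames traverse iptables" ;;
        esac
    else
        echo "setting not available"
    fi
}

bridge_nf_status
```

If this reports that bridged frames traverse iptables, the host's FORWARD chain becomes relevant to guest traffic after all.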

To use tcpdump/wireshark, do e.g.

  $> tcpdump -p -i eth0 -w t.dump
  $> wireshark t.dump
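A capture limited to the guest's address and the NFS port is easier to compare across client, host and guest than a full dump. The sketch below only prints a suggested command line; the helper name capture_cmd and the output file naming are hypothetical, while br1, 192.168.1.2 and port 2049 are taken from the description of this bug. The filter uses standard pcap syntax.

```shell
# Print a tcpdump command that captures one TCP port to or from one host.
# $1 = interface, $2 = guest IP address, $3 = TCP port
capture_cmd() {
    printf 'tcpdump -n -i %s -w %s-%s.pcap host %s and tcp port %s\n' \
        "$1" "$1" "$3" "$2" "$3"
}

# Example for the host side of the bridge:
capture_cmd br1 192.168.1.2 2049
```

Run the printed command as root on each machine, substituting that machine's own interface name, then repeat the mount attempt and open the resulting files in wireshark.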

Comment 10 Bug Zapper 2009-11-16 12:35:21 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 11 Mark McLoughlin 2009-11-20 19:37:13 UTC
No response to needinfo since 2009-10-01, closing

