Red Hat Bugzilla – Bug 524118
NFS4 connection to virtual guest NFS4 server fails over bridged interface
Last modified: 2009-11-20 14:37:13 EST
Description of problem:
Attempt to connect to NFS4 server on virtual guest fails when virtual guest's interface is a network bridge from the host. All other services -- Samba, ssh, etc. -- connect fine from the client to the virtual guest.
Client(192.168.1.1) --> [Host with br1] --> Guest(192.168.1.2:2049 on eth0, which is br1)
Version-Release number of selected component (if applicable):
libvirt-0.7.1-4.fc12 (also previous F11 and F12alpha/rawhide versions)
Steps to Reproduce:
1. Configure NFS4 server on virtual guest with some test shares. For testing purposes, we used a configuration that is known to work on a stand-alone, non-virtualized server;
2. Virtual guest's interface is eth0, which is a bridged interface, br1, from the host machine. IP address eth0 is static and on the same subnet as the client machine;
3. For testing purposes, disable SELinux/firewall/tcp-wrappers on client, host, and virtual guest machines;
4. Try to mount a share on the virtual guest from the client using NFS4. The mount times out and fails;
NFS4 mount of share fails.
NFS mount of share succeeds.
(a) Note that running "showmount -e [VIRTUAL GUEST ETH0 IP]" on the client is _very_ slow -- response time is >30 seconds -- but eventually returns a list of shares on the server/virtual guest. Mount still always fails, regardless of the timeout value.
(b) Other attempts to connect from the client to the server -- SSH, samba, sftp, etc. -- succeed.
(c) If a second virtual guest is established on the host using the same network bridge as the first virtual guest, that virtual guest can, as a client, NFS4 mount shares on the first virtual guest's NFS4 server. Put differently, if virtual guest 1 (server) is on host br1 with eth0=192.168.1.1, and virtual guest 2 (client) is on host br1 with eth0=192.168.1.2, the client can NFS4 mount shares on the server. "Outside" systems, including the host of the virtual guests itself, cannot mount the shares.
Could you run these commands on the host:
It would also be worthwhile running tcpdump/wireshark on the client, host, and guest, and comparing what each is seeing.
Created attachment 362009 [details]
Results of test commands per request
Three additional data points:
(a) So that the results of ifconfig -a in the attached file are intelligible, I have the following interfaces on the host:
(i) Regular ethernet interfaces eth0 and eth2 (unbonded, unbridged);
(ii) Interface bond1 which bonds together eth1 and eth4;
(iii) Interface br1 which bridges bond1 to virtual guests;
(iv) Interface br3 which bridges eth3 to virtual guests;
(b) The behavior reported in this bug with respect to NFS4 client connection failures also occurs with gluster clients. When gluster clients attempt to connect to a gluster server running on the virtual guest (ip address 192.168.2.93 at port 6996, the default) the connection hangs and eventually times out. During the hang, netstat on the client reports the following:
tcp 0 1 client:exp2 192.168.2.93:6996 SYN_SENT
tcp 0 1 client:1023 192.168.2.93:6996 SYN_SENT
(c) Notwithstanding the connection problems with NFS4 and gluster reported in this bug, I can telnet from the client to the virtual guest on ports 2049 (NFS4) and 6996 (gluster).
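For anyone who wants to repeat the telnet check in (c) without typing it by hand, here is a rough sketch. The helper name probe_port is made up for illustration, and it relies on bash's /dev/tcp feature and the coreutils timeout command being available. Like telnet, it connects from an ordinary ephemeral source port, so it only shows whether the port answers a plain TCP connect, not what the NFS4 or gluster client does differently.

```shell
# Hypothetical helper: succeeds (exit 0) if a TCP connection to HOST:PORT
# can be opened within 5 seconds, and fails otherwise.
# Assumes bash (for /dev/tcp) and coreutils timeout are installed.
probe_port() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example, using the guest address and ports from this report:
# for port in 2049 6996; do
#     if probe_port 192.168.2.93 "$port"; then
#         echo "port $port: connected"
#     else
#         echo "port $port: no connection"
#     fi
# done
```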
Very strange that telnet to those ports works, but the client is hung in SYN_SENT.
Looking at the file, the only interesting thing I see is:
2908 564K REJECT all -- * * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
i.e. you're getting a lot of packets rejected by the host's iptables INPUT chain; does that figure go up when you try to connect from the client to the guest?
Again, tcpdump might help to shed some light on this.
Yes, the rejected-packets figure goes up when the client tries to connect to the virtual guest.
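One way to put a number on that increase is to read the REJECT counter before and after a mount attempt. This is only a sketch: reject_pkts is a made-up helper name, and it assumes the column layout that "iptables -L INPUT -v -n -x" prints (packet count in the first column, target in the third).

```shell
# Hypothetical helper: reads `iptables -L INPUT -v -n -x` output on stdin
# and prints the summed packet count of all rules whose target is REJECT.
reject_pkts() {
    awk '$3 == "REJECT" { sum += $1 } END { print sum + 0 }'
}

# Example, run as root on the host:
# before=$(iptables -L INPUT -v -n -x | reject_pkts)
# ... attempt the NFS4 mount from the client ...
# after=$(iptables -L INPUT -v -n -x | reject_pkts)
# echo "packets rejected during the attempt: $((after - before))"
```

The -x flag asks iptables for exact counters instead of rounded values such as 564K, which keeps the subtraction meaningful.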
Not a tcpdump user -- if somebody wants to craft a command for me in that or in wireshark, I'll run it & post.
This may/may not shed some light on the problem. I set up a second virtual guest server running NFS4 and, instead of using a bridged interface, used the "default" NAT-based networking (i.e. virbr0)
1. The new guest, testbed2, has an ip address of 192.168.122.2;
2. The host is able to mount an NFS4 share on the guest. It was not able to mount the share when the guest used network bridge br1 (see "Description" note (c), above);
3. Clients (who are not the host) cannot mount the virtual guest's share. This is expected and makes sense. However, executing the following two commands on the host should allow clients to mount NFS4 shares on the NAT-ed virtual guest:
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 2049 -j DNAT --to 192.168.122.2:2049
iptables -A FORWARD -i eth0 -p tcp --dport 2049 -d 192.168.122.2 -j ACCEPT
4. Clients are still not able to mount shares on the NAT-ed virtual guest. They should be able to.
Don't know whether this is related to the original bug, but it makes me think that there may be an iptables or routing component to it.
Created attachment 362361 [details]
Host's "iptables --list" and "iptables --list -t nat"
Okay, trying to get DNAT working will probably just confuse things.
If the packets are being rejected because of the INPUT chain on the host, that sounds like they're addressed to the host, e.g. to the br1 IP address.
If the packet is addressed to the guest, it should only traverse the bridge, and since net.bridge.bridge-nf-call-iptables is zero, it shouldn't even go through the host's FORWARD chain.
Are you sure you're using the IP address of the guest, not the IP address of the bridge in the host?
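To double-check the bridge netfilter setting on the host, something along these lines should do. It is a sketch only: bridge_nf_settings is a made-up helper name, and the optional directory argument exists purely so the function can be pointed somewhere other than /proc/sys/net/bridge.

```shell
# Hypothetical helper: prints the bridge-nf-call-* settings found in the
# given directory (default: /proc/sys/net/bridge). A value of 0 means
# bridged traffic is not passed through the corresponding tables.
bridge_nf_settings() (
    dir=${1:-/proc/sys/net/bridge}
    for name in bridge-nf-call-iptables bridge-nf-call-ip6tables bridge-nf-call-arptables; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = unavailable\n' "$name"
        fi
    done
)

# Example, on the host:
# bridge_nf_settings
```

"unavailable" would simply mean the file could not be read, for example because the bridge module is not loaded.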
To use tcpdump/wireshark, do e.g.
$> tcpdump -p -i eth0 -w t.dump
$> wireshark t.dump
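If the full capture turns out to be too noisy, a capture filter can narrow it to the guest and the ports in question. A possible sketch follows; capture_filter is a made-up helper name, the address and ports are the ones mentioned in this report, and br1 is assumed to be the right interface on the host. ICMP is included so that any icmp-host-prohibited replies show up as well.

```shell
# Hypothetical helper: builds a pcap filter expression for one host and
# one or more TCP ports, plus ICMP. Needs at least one port argument.
capture_filter() (
    host=$1
    shift
    ports=""
    for p in "$@"; do
        ports="${ports:+$ports or }tcp port $p"
    done
    printf 'host %s and (%s or icmp)\n' "$host" "$ports"
)

# Example, run as root on the host while the client attempts the mount:
# tcpdump -p -n -i br1 -w guest.dump "$(capture_filter 192.168.1.2 2049)"
```

Running the same capture on the client and inside the guest, each with its own interface name, would give three files to compare.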
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.
More information and reason for this action is here:
No response to needinfo since 2009-10-01, closing