Bug 186207

Summary: SSH fails to complete with xen guest
Product: [Fedora] Fedora
Reporter: Deon George <dizzy>
Component: xen
Assignee: Daniel Veillard <veillard>
Status: CLOSED UPSTREAM
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 5
CC: katzj, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-11-20 15:15:28 UTC Type: ---
Bug Blocks: 179599    

Description Deon George 2006-03-22 08:58:57 UTC
Description of problem:
I am unable to SSH from a Xen guest to the outside world. This is not a
firewall problem, as I am able to SSH to hosts from another PC, where the
connection goes through the Linux box. (I am also able to SSH to the host from
the Xen hypervisor.)

I have tried using Xen BRIDGE and Xen ROUTE, with both configurations exhibiting
the same symptoms.

Here is my setup:

Internet -> ADSL Modem -> Eth1 -> PPPOE -> XEN Host
                                           |+ Eth0 Local LAN
                                           |+ vifx Xen Guests (Bridge or Route)
                                           |+ VPN to another network

In a Xen guest, when I connect I see the output below (you'll see that the TCP
session connects). I'm using masquerading on the Xen hypervisor for any host on
the Internet (other PCs on the network connect fine).

(The same problem occurs when I SSH to a VPN host from a Xen guest, with no
problem from the hypervisor, so in that case no masquerading or firewalling
comes into play.)

[deon@fc5dev ~]$ ssh -v user.net
OpenSSH_4.3p2, OpenSSL 0.9.8a 11 Oct 2005
debug1: Reading configuration data /home/admin/deon/.ssh/config
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Connecting to shell.sf.net [66.35.250.208] port 22.
debug1: Connection established.
debug1: identity file /home/admin/deon/.ssh/identity type -1
debug1: identity file /home/admin/deon/.ssh/id_rsa type -1
debug1: identity file /home/admin/deon/.ssh/id_dsa type -1
debug1: Remote protocol version 1.99, remote software version OpenSSH_3.6.1p2
debug1: match: OpenSSH_3.6.1p2 pat OpenSSH_3.*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_4.3
debug1: SSH2_MSG_KEXINIT sent
**PAUSE FOR A LONG TIME**
Connection closed by y.y.y.y

A TCPDUMP on the ppp0 interface shows the following:
19:42:46.793735 IP x.x.x.x.59937 > y.y.y.y.ssh: S 2831000511:2831000511(0) win
5840 <mss 1460,sackOK,timestamp 245870470,nop,wscale 2>
19:42:47.073026 IP y.y.y.y.ssh > x.x.x.x.59937: S 3460260840:3460260840(0) ack
2831000512 win 5792 <mss 1412,sackOK,timestamp 339701832 24587047,nop,wscale 2>

19:42:47.073998 IP x.x.x.x.59937 > y.y.y.y.ssh: . ack 1 win 1460
<nop,nop,timestamp 24587119 339701832>
19:42:47.349219 IP y.y.y.y.ssh > x.x.x.x.59937: P 1:26(25) ack 1 win 1448
<nop,nop,timestamp 339702111 24587119>

19:42:47.350918 IP x.x.x.x.59937 > y.y.y.y.ssh: . ack 26 win 1460
<nop,nop,timestamp 24587189 339702111>
19:42:47.352514 IP x.x.x.x.59937 > y.y.y.y.ssh: P 1:21(20) ack 26 win 1460
<nop,nop,timestamp 24587189 339702111>
19:42:48.214185 IP x.x.x.x.59937 > y.y.y.y.ssh: P 1:21(20) ack 26 win 1460
<nop,nop,timestamp 24587405 339702111>
19:42:49.966231 IP x.x.x.x.59937 > y.y.y.y.ssh: P 1:21(20) ack 26 win 1460
<nop,nop,timestamp 24587843 339702111>
19:42:53.422108 IP x.x.x.x.59937 > y.y.y.y.ssh: P 1:21(20) ack 26 win 1460
<nop,nop,timestamp 24588707 339702111>
19:43:00.333983 IP x.x.x.x.59937 > y.y.y.y.ssh: P 1:21(20) ack 26 win 1460
<nop,nop,timestamp 24590435 339702111>
19:43:02.465968 IP x.x.x.x.59935 > y.y.y.y.ssh: FP 4294967276:712(732) ack 1 win
1460 <nop,nop,timestamp 24590968 339658422>
19:43:14.157821 IP x.x.x.x.59937 > y.y.y.y.ssh: P 1:21(20) ack 26 win 1460
<nop,nop,timestamp 24593891 339702111>
19:43:41.805346 IP x.x.x.x.59937 > y.y.y.y.ssh: P 1:21(20) ack 26 win 1460
<nop,nop,timestamp 24600803 339702111>

19:44:02.252957 IP 10.1.3.66.59935 > y.y.y.y.ssh: FP 2785887801:2785888533(732)
ack 3400711808 win 1460 <nop,nop,timestamp 24605915 339658422>
19:44:03.661101 IP y.y.y.y.ssh > x.x.x.x.59935: F 1:1(0) ack 4294967276 win 1448
<nop,nop,timestamp 339778442 24576275>
19:44:03.661200 IP x.x.x.x.59935 > y.y.y.y.ssh: R 2785887801:2785887801(0) win 0
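
One way to read the trace above: the 20-byte segment `P 1:21(20)` on port 59937 is sent repeatedly and never acknowledged, and the gap between attempts roughly doubles each time, which is the usual TCP retransmission backoff. The Python sketch below simply recomputes those gaps from the timestamps in the trace. The function name `retransmit_gaps` is illustrative only, not part of any tool mentioned in this report.

```python
from datetime import datetime

# Timestamps of the repeated "P 1:21(20)" segments on port 59937,
# copied from the tcpdump output above.
SEND_TIMES = [
    "19:42:47.352514",
    "19:42:48.214185",
    "19:42:49.966231",
    "19:42:53.422108",
    "19:43:00.333983",
    "19:43:14.157821",
    "19:43:41.805346",
]

def retransmit_gaps(stamps):
    """Return the delay in seconds between consecutive transmissions."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

if __name__ == "__main__":
    gaps = retransmit_gaps(SEND_TIMES)
    # Compare each gap with the one before it to make the doubling visible.
    for previous, current in zip(gaps, gaps[1:]):
        print(f"{current:8.3f} s  ({current / previous:.2f}x the previous gap)")
```

Because the segment that goes unanswered carries only 20 bytes of payload, far below any MTU in this setup, the trace hints that packet size alone may not explain the stall.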

Version-Release number of selected component (if applicable): FC5

How reproducible:
SSH to an external host from a XEN guest.

Additional info:
I was using this fine with FC4.

Comment 1 Daniel Veillard 2006-03-22 23:27:24 UTC
Just tried it; it works for me in a slightly different configuration.
It looks like a problem with the ADSL router blocking TCP from the guest.
Please check the MTU size used on the interfaces. Also confirm that both
the xen0 and the xenU are running pristine FC5 installations, because that
works for me. If ssh out to the local network works but ssh over the ADSL link
does not, I would suspect IP trouble with the router; again, check the MTUs.

Daniel
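
As a rough aid to the MTU check suggested above, the following Python sketch works through the usual arithmetic for a PPPoE link. It assumes IPv4 with 20-byte IP and TCP headers (no options) and the standard 8 bytes of PPPoE/PPP overhead. The constant and function names are made up for illustration.

```python
ETHERNET_MTU = 1500      # default Ethernet MTU
PPPOE_OVERHEAD = 8       # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20         # assumption: no IP options
TCP_HEADER = 20          # assumption: no TCP options

def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER

pppoe_mtu = ETHERNET_MTU - PPPOE_OVERHEAD

print("Ethernet MSS:", mss_for_mtu(ETHERNET_MTU))  # matches "mss 1460" in the SYN of the trace
print("PPPoE MTU   :", pppoe_mtu)
print("PPPoE MSS   :", mss_for_mtu(pppoe_mtu))
```

If a sender advertises an MSS of 1460 while the PPPoE link can carry at most 1452 bytes of TCP payload per packet, full-size segments rely on path MTU discovery or on MSS clamping at the host working correctly. Small packets are not affected by this.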



Comment 2 Deon George 2006-03-22 23:57:56 UTC
Hmm..

OK, this box was a clean FC5Test3 install, which I yum updated to FC5 when it
was released.

Remember, ssh from xen-guest to xen-host works fine. SSH from xen-guest to
another host, whether that host is on the internet or reached via a VPN tunnel,
stalls.

I, too, thought of MTU. It's all defaults (1500), but I did drop it down to 1400
without success. The TCP trace that I performed didn't suggest an MTU size problem
(I would have thought I'd get "need fragment", or whatever the ICMP is).
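
For comparing the MTU on each hop (guest interface, vif, bridge, ppp0), a small Python sketch like the one below can list what the kernel reports. It assumes a current Linux system with Python 3 that exposes per-interface MTUs under `/sys/class/net`. The function name `interface_mtus` is hypothetical.

```python
from pathlib import Path

def interface_mtus(sysfs_root="/sys/class/net"):
    """Map each network interface name to its MTU, read from sysfs.

    Returns an empty dict if the directory does not exist
    (for example on a non-Linux system).
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}
    mtus = {}
    for iface in sorted(root.iterdir()):
        mtu_file = iface / "mtu"
        # Skip entries that are not interfaces (no "mtu" attribute).
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus

if __name__ == "__main__":
    for name, mtu in interface_mtus().items():
        print(f"{name:12s} {mtu}")
```

Running it on both the Xen host and the guest gives a quick side-by-side view of which interfaces are still at 1500 and which have been lowered.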

My ADSL router is in BRIDGE mode, so it is transparent here. Linux owns the
internet connection with PPPoE, so I don't believe the ADSL router plays any
part here.

Also remember I can SSH without any problems from the xen-host and other systems
on the network - just NOT from xen-guests.

I've just realised that the xen guest is running the FC5Test3 kernel, so I'll
change that and see how it goes. (I had this problem when it was all FC5Test3)...

Just so I know we did the same thing - were you able to ssh to user.net
from your xen guest - and you got a login/password prompt?

Comment 3 Daniel Veillard 2006-03-23 10:08:11 UTC
yes

$ ssh xen-fc5 -l root
Last login: Thu Mar 23 00:27:13 2006 from 10.0.0.11
[root@xen-fc5 ~]# ssh user.net
The authenticity of host 'shell.sf.net (66.35.250.208)' can't be established.
DSA key fingerprint is 4c:68:03:d4:5c:58:a6:1d:9d:17:13:24:14:48:ba:99.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'shell.sf.net,66.35.250.208' (DSA) to the list of
known hosts.
user.net's password:

Works for me... please try to reinstall the guest using the FC5 released
environment; it does work for me. What happens if your guest is running on
a machine other than the one driving the PPPoE connection?



Daniel

Comment 4 Deon George 2006-04-20 09:48:54 UTC
Hey Daniel - I'm stumped.

Finally got xenguest-install.py to install a guest. In the guest, SSH works
successfully to local ethernet devices (both to and from work fine); however, it
does not work across the PPPoE link on the xen host to "non local" SSH hosts. (I
have tried shell.sf.net as well as hosts behind IPSEC tunnels that use that link;
both fail at the same spot. I.e., I have network connectivity, but something
stalls the SSH connection.)

I can successfully connect to SSH hosts via other servers that use the same
PPPoE link as the xen host (i.e., I route through that host). So the problem has
to be in the path xenguest -> xenhost -> pppoe.

Have you got any ideas? Why does it work for other physical hosts, but not for
xen guests?

Comment 5 Deon George 2006-04-20 10:08:45 UTC
I've just noticed, and can confirm, that every time I SSH to a host via the PPPoE
link on the xenhost, the xen guest reports the following in syslog:

kernel: Received packet is 10 bytes before head.

I think this is related. Googling around doesn't show a workaround for this...

Got any ideas?

Comment 6 Brian Stein 2006-10-26 20:27:39 UTC
Please confirm this has been resolved with newer releases.

Comment 7 Deon George 2006-11-19 01:04:15 UTC
This is not a problem with FC6.

Using ssh from a xen guest, to a host via a PPPoE link (where the PPPoE link is
on the xen host) works fine.

Comment 8 Brian Stein 2006-11-20 15:15:28 UTC
Thanks ... closing.