Bug 1038213 - Have to ping first to be able to ssh.
Summary: Have to ping first to be able to ssh.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: async
Target Release: 4.0
Assignee: Brent Eagles
QA Contact: Ofer Blaut
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-12-04 15:51 UTC by lpeer
Modified: 2019-09-09 16:30 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1001725
Environment:
Last Closed: 2013-12-13 19:38:52 UTC
Target Upstream Version:



Comment 1 Brent Eagles 2013-12-09 18:47:04 UTC
After examining the QE environment, I realized that there was a key difference between my test setup and the environment where this was occurring: the provider network being used for the external gateway was a VLAN leg. It seemed plausible that there might be some issue with that, so I started a fresh test environment with two network interfaces, one for lab access and one for trunking VLAN networks. I created two OpenStack nodes: an "everything host" and a standalone compute host. The openvswitch plugin was configured to use the second interface (eth1) for the gateway network and the private networks, and each neutron network was configured with a segment ID to force VLAN mapping. Besides that, packstack was configured pretty much "normally". I also created a third host with two interfaces, one of which shared the trunked network with the OpenStack nodes, and on it I configured a fake bridge with a VLAN tag matching that of the public network (10). I booted a server instance, verified its DHCP allocation and allocated a floating IP address (I used the same commands as indicated in this bz). From the third host (call it my "experiment" or "forensic" host) I ssh'd to the VM through its public IP address without pinging first... and it worked.

Seeing as I've tried this a few different ways, I'm left wondering whether something about the switch configuration in the QE environment is interfering. When I was logged in, I tried pinging the gateway IPs from the router namespaces and it didn't work, while pinging from a different subnet seemed to. That seemed a little suspicious in that the reverse route may not be working properly... but then I might be making assumptions about how that is actually supposed to work. The fact that this behavior is also reported against nova-networking is suspect too. nova-networking and neutron are pretty different when it comes to external network access, so if it is failing with both, it is pretty peculiar.

In environments where traffic (including ARPs) is showing up somewhere other than where it is supposed to, I would take a look at the interface adapters, etc. I initially misconfigured one of my nodes to use the rtl8139 driver (why is that the default for the second interface? *shrug*) for the VLAN trunk interface, and of course that strips the VLAN tags off of everything, and chaos ensued.

Can we do a detailed and directed analysis of how the QE environment does the IP to VLAN mapping and eliminate it as a potential cause?

Comment 2 Brent Eagles 2013-12-10 18:43:50 UTC
Thanks to oblaut and adarazs on this issue!

After examining an environment that oblaut and adarazs made available, I discovered that ARP requests weren't being made until a ping, an HTTP request, or some other request was made.
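
For anyone wanting to check the same thing, here is a minimal sketch (not the procedure actually used in this bz) of testing whether an address has been resolved in the kernel neighbor table before and after sending traffic. It assumes a Linux host with iproute2 and parses the output of `ip neigh show`. The helper names, the sample addresses and the sample output are hypothetical.

```python
import subprocess

# Neighbor states in which the kernel holds a link-layer address for the peer.
RESOLVED_STATES = {"REACHABLE", "STALE", "DELAY", "PROBE", "PERMANENT"}

# Hypothetical sample of `ip neigh show` output, for illustration only.
SAMPLE = """\
192.0.2.1 dev eth1 lladdr 52:54:00:aa:bb:cc REACHABLE
192.0.2.20 dev eth1  FAILED
"""


def read_neighbor_table():
    """Return the raw output of `ip neigh show` on the local host."""
    result = subprocess.run(
        ["ip", "neigh", "show"], capture_output=True, text=True, check=True
    )
    return result.stdout


def neighbor_state(ip_neigh_output, ip):
    """Return the neighbor state recorded for `ip`, or None if there is no entry."""
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # The address is the first field and the state is the last one.
        if fields and fields[0] == ip:
            return fields[-1]
    return None


def is_resolved(ip_neigh_output, ip):
    """True if the neighbor table holds a usable MAC address for `ip`."""
    return neighbor_state(ip_neigh_output, ip) in RESOLVED_STATES
```

Calling `is_resolved(read_neighbor_table(), "<floating IP>")` once before and once after a ping, on whichever host is expected to resolve the address, would show whether the entry only appears once other traffic has been sent.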

Apparently there is some kind of known issue on some types of Juniper switch. It is referenced on a few pages I found:

http://forums.whirlpool.net.au/archive/2049819

http://showroute.net/juniper-ex-switch-arp-issues-with-re-filters/

http://www.juniper.net/techpubs/en_US/junos11.1/information-products/topic-collections/release-notes/11.1/index.html?topic-53333.html
 - look for 486443

Comment 3 Brent Eagles 2013-12-12 18:01:41 UTC
I should have cleared the NEEDINFO when adding that last comment.

Comment 4 Brent Eagles 2013-12-13 19:38:52 UTC
Issue caused by network switch behavior.

