1468868 – CI tests failed with ssh timeout error when ovs-security-group is enabled

Summary: CI tests failed with ssh timeout error when ovs-security-group is enabled

Keywords:
Status:	CLOSED DUPLICATE of bug 1508738
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-neutron
Sub Component:
Version:	11.0 (Ocata)
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	12.0 (Pike)
Assignee:	Jakub Libosvar
QA Contact:	Toni Freger
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-07-09 06:17 UTC by Eran Kuris
Modified:	2017-11-16 17:23 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-11-16 17:23:47 UTC
Target Upstream Version:
Embargoed:

Description Eran Kuris 2017-07-09 06:17:26 UTC

Description of problem:
The ssh timeout errors reproduce on our CI job only when the job is configured with OVS-security-group.

Debugging together with dev (Ihar) found the following:

1. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/tempest.scenario.test_network_v6/TestGettingAddress/test_dualnet_dhcp6_stateless_from_os_compute_id_76f26acd_9688_42b4_bc3e_cd134c4cb09e_network_slow_/

There, the test report claims a connectivity failure, but the tempest log shows that the node was successfully reached via ssh:

2017-07-04 05:25:57,102 21550 INFO     [paramiko.transport] Connected (version 2.0, client OpenSSH_6.6.1)
2017-07-04 05:25:57,221 21550 INFO     [paramiko.transport] Authentication (publickey) successful!
2017-07-04 05:25:57,253 21550 INFO     [tempest.lib.common.ssh] ssh connection to cloud-user.0.212 successfully created

The test verifies that everything works by logging in via ssh, running 'ip address', and checking that the expected address is in the output.
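
The check described above can be sketched roughly as follows. This is only an illustration, not the tempest code: the function name address_in_ip_output is hypothetical, and it assumes the 'ip address' output has already been fetched from the instance over ssh.

```python
import ipaddress
import re


def address_in_ip_output(ip_output: str, expected: str) -> bool:
    """Return True if `expected` is assigned to an interface in `ip address` output.

    Hypothetical helper for illustration; the real test does its own parsing.
    """
    want = ipaddress.ip_address(expected)
    # `ip address` prints lines such as "inet 10.0.0.5/24 ..." or "inet6 2001:db8::5/64 ...".
    for match in re.finditer(r"^\s*inet6?\s+(\S+)", ip_output, flags=re.MULTILINE):
        addr = match.group(1).split("/")[0]  # drop the prefix length
        try:
            if ipaddress.ip_address(addr) == want:
                return True
        except ValueError:
            # Skip anything that is not a plain address.
            continue
    return False
```

Comparing parsed addresses rather than raw strings avoids false negatives when the same IPv6 address is written in a different but equivalent form.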

2. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/neutron.tests.tempest.scenario.test_qos/QoSTest/test_qos_id_1f7ed39b_428f_410a_bd2b_db9f465680df_/

Again, the tempest output suggests that ssh connectivity is OK, but the test still fails to connect to the desired address to run the bandwidth measurement.

3. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/neutron_lbaas.tests.tempest.v2.scenario.test_listener_basic/TestListenerBasic/test_listener_basic_compute_network_/

This one explicitly reports a timeout, but again, ssh connectivity is fine. It seems that the 'nc' listener started in the instance can't be reached.
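
When reproducing this by hand, a probe along these lines could help separate "port unreachable" from other failures. It is a minimal sketch using only the Python standard library; the function name tcp_port_reachable and the default timeout are assumptions, not part of the lbaas tests.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts alike.
        return False
```

Running the probe against the ssh port and the 'nc' port of the same instance would show whether only the latter is being dropped.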

4. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/tempest.scenario.test_network_basic_ops/TestNetworkBasicOps/test_update_instance_port_admin_state_compute_id_f5dfcc22_45fd_409f_954c_5bd500d7890b_network_slow_/

The failure is in SSH (Error reading SSH protocol banner), but note that ping to the address was successful. Also, the console log shows "[   20.541718] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready".
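
"Error reading SSH protocol banner" is what paramiko reports when it does not receive the server's "SSH-" identification line in time. The sketch below reads that line directly, which can help tell a dropped or stalled connection apart from an authentication problem. The function name read_ssh_banner is hypothetical, and as a simplification it only inspects the first line the server sends.

```python
import socket
from typing import Optional


def read_ssh_banner(host: str, port: int = 22, timeout: float = 5.0) -> Optional[str]:
    """Return the server's SSH identification line, or None if none arrives in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = b""
            # Read until the first newline, with a small cap on the amount read.
            while b"\n" not in data and len(data) < 1024:
                chunk = sock.recv(256)
                if not chunk:
                    break  # peer closed the connection
                data += chunk
    except OSError:
        # Connection failure or timeout while waiting for the banner.
        return None
    line = data.split(b"\n", 1)[0].strip().decode("ascii", errors="replace")
    return line if line.startswith("SSH-") else None
```

A None result while ping succeeds would be consistent with the symptom above: ICMP passes, but the TCP stream to the ssh port does not deliver data.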


https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/

Version-Release number of selected component (if applicable):
python-neutron-10.0.2-1.el7ost.noarch
openstack-neutron-openvswitch-10.0.2-1.el7ost.noarch
puppet-neutron-10.3.1-1.el7ost.noarch
python-neutron-tests-10.0.2-1.el7ost.noarch
openstack-neutron-ml2-10.0.2-1.el7ost.noarch
openstack-neutron-10.0.2-1.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Run the job: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/

Actual results:


Expected results:


Additional info:

Comment 1 Assaf Muller 2017-07-17 18:27:45 UTC

We should get an OVS-FW job stable for OSP 12.

Comment 2 Jakub Libosvar 2017-11-16 17:23:47 UTC


*** This bug has been marked as a duplicate of bug 1508738 ***
