Bug 1468868 - CI tests failed with ssh timeout error when ovs-security-group is enabled
Status: CLOSED DUPLICATE of bug 1508738
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 12.0 (Pike)
Assigned To: Jakub Libosvar
QA Contact: Toni Freger
Keywords: Triaged
Depends On:
Blocks:
Reported: 2017-07-09 02:17 EDT by Eran Kuris
Modified: 2017-11-16 12:23 EST (History)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-16 12:23:47 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Eran Kuris 2017-07-09 02:17:26 EDT
Description of problem:
The SSH timeout errors reproduce on our CI job only when the job is configured with ovs-security-group.

Debugging together with a developer (Ihar) found the following:

1. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/tempest.scenario.test_network_v6/TestGettingAddress/test_dualnet_dhcp6_stateless_from_os_compute_id_76f26acd_9688_42b4_bc3e_cd134c4cb09e_network_slow_/

There, the test reports a connectivity failure, but the tempest log shows that it successfully reached the node via SSH:

2017-07-04 05:25:57,102 21550 INFO     [paramiko.transport] Connected (version 2.0, client OpenSSH_6.6.1)
2017-07-04 05:25:57,221 21550 INFO     [paramiko.transport] Authentication (publickey) successful!
2017-07-04 05:25:57,253 21550 INFO     [tempest.lib.common.ssh] ssh connection to cloud-user@10.0.0.212 successfully created

The test verifies that everything works by logging in via SSH, running 'ip address', and checking that the expected address appears in the output.
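
As a rough illustration of that kind of check (a sketch only, not the actual tempest code), the helper below scans 'ip address' output for an expected address. The function name address_in_ip_output and the sample output, which uses documentation-range addresses, are hypothetical.

```python
import ipaddress


def address_in_ip_output(ip_output, expected):
    """Return True if `expected` is listed as an inet/inet6 address in `ip address` output."""
    wanted = ipaddress.ip_address(expected)
    for line in ip_output.splitlines():
        fields = line.split()
        # Address lines look like "inet 192.0.2.10/24 ..." or "inet6 2001:db8::1/64 ..."
        if len(fields) >= 2 and fields[0] in ("inet", "inet6"):
            addr = fields[1].split("/")[0]
            try:
                if ipaddress.ip_address(addr) == wanted:
                    return True
            except ValueError:
                # Skip anything that does not parse as an IP address
                continue
    return False


# Hypothetical sample output, not taken from the failing job
SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP
    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0
    inet6 2001:db8::f816:3eff:fe00:1/64 scope global dynamic
"""
```

Comparing parsed addresses rather than raw strings means an IPv6 address still matches when it is written in a different but equivalent form.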

2. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/neutron.tests.tempest.scenario.test_qos/QoSTest/test_qos_id_1f7ed39b_428f_410a_bd2b_db9f465680df_/

Again, the tempest output suggests that SSH connectivity is fine, but the test still fails to connect to the desired address to run the bandwidth measurement.

3. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/neutron_lbaas.tests.tempest.v2.scenario.test_listener_basic/TestListenerBasic/test_listener_basic_compute_network_/

This one explicitly reports a timeout, but again SSH connectivity is fine. It seems that the 'nc' process started in the instance cannot be reached.
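
One way to separate "SSH works" from "the service port is unreachable" when reproducing this by hand is a plain TCP connect with a timeout. The following is a minimal sketch and not part of the failing test; tcp_port_reachable is a hypothetical helper, and the host and port would be whatever the listener inside the instance uses.

```python
import socket


def tcp_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts
        return False
```

If port 22 on the instance is reachable but the listener port is not, that would point at filtering between the client and the instance rather than at the guest being down.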

4. https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/tempest.scenario.test_network_basic_ops/TestNetworkBasicOps/test_update_instance_port_admin_state_compute_id_f5dfcc22_45fd_409f_954c_5bd500d7890b_network_slow_/

The failure is in SSH ("Error reading SSH protocol banner"), but note that ping to the address succeeded. The console log also shows "[   20.541718] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready".
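
A banner error of this kind means the client did not receive the server's identification line (the "SSH-2.0-..." string defined in RFC 4253) in time. The sketch below reads that line over a raw socket, which can help tell "no connection or no data arrives" apart from a working sshd. read_ssh_banner is a hypothetical helper, and as a simplification it only inspects the first line the server sends.

```python
import socket


def read_ssh_banner(host, port=22, timeout=5.0):
    """Return the server's SSH identification line, or None if none arrives in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = b""
            # Read until the first newline, with a small cap on the amount buffered
            while b"\n" not in data and len(data) < 1024:
                chunk = sock.recv(256)
                if not chunk:
                    break
                data += chunk
    except OSError:
        # Connection failure or timeout while waiting for the banner
        return None
    line = data.split(b"\n", 1)[0].strip().decode("ascii", errors="replace")
    return line if line.startswith("SSH-") else None
```

A None result against a host that answers ping matches the symptom described here: the address is reachable at the IP level, but no SSH identification line is received.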


Full test report for this run:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/10/testReport/

Version-Release number of selected component (if applicable):
python-neutron-10.0.2-1.el7ost.noarch
openstack-neutron-openvswitch-10.0.2-1.el7ost.noarch
puppet-neutron-10.3.1-1.el7ost.noarch
python-neutron-tests-10.0.2-1.el7ost.noarch
openstack-neutron-ml2-10.0.2-1.el7ost.noarch
openstack-neutron-10.0.2-1.el7ost.noarch

How reproducible:
always

Steps to Reproduce:
1. Run the job: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron-lbaas/job/DFG-network-neutron-lbaas-11_director-7.3-virthost-3cont_2comp-ipv4-vxlan-ovs-secgroups-with-custom-guest-image/

Actual results:


Expected results:


Additional info:
Comment 1 Assaf Muller 2017-07-17 14:27:45 EDT
We should get an OVS-FW job stable for OSP 12.
Comment 2 Jakub Libosvar 2017-11-16 12:23:47 EST

*** This bug has been marked as a duplicate of bug 1508738 ***
