Bug 1438662 - CI tests failed with ssh timeout error
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: z1
Target Release: 11.0 (Ocata)
Assigned To: Ihar Hrachyshka
QA Contact: Eran Kuris
Keywords: AutomationBlocker, Triaged, ZStream
Duplicates: 1433685 1433688 1433702 1433710 1433712 1438346
Depends On: 1450205 1450203
Blocks:
Reported: 2017-04-04 01:53 EDT by Eran Kuris
Modified: 2017-07-19 13:03 EDT (History)

See Also:
Fixed In Version: openstack-neutron-10.0.1-4.el7ost
Clones: 1457504
Environment:
Last Closed: 2017-07-19 13:03:14 EDT
Type: Bug


Attachments
log1 (97.43 KB, text/plain)
2017-04-04 01:59 EDT, Eran Kuris
log2 (42.63 KB, text/plain)
2017-04-04 02:01 EDT, Eran Kuris


External Trackers
Tracker ID Priority Status Summary Last Updated
OpenStack gerrit 463816 None None None 2017-05-15 23:03 EDT
OpenStack gerrit 464020 None None None 2017-05-15 23:03 EDT

Description Eran Kuris 2017-04-04 01:53:53 EDT
Description of problem:
 
When running tempest, the following tests fail:

1. tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_stop_start

2. tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_suspend_resume

3. tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details


Version-Release number of selected component (if applicable):
python-neutron-lib-1.1.0-0.20170213120052.9b3ea8f.el7ost.noarch
openstack-neutron-ml2-10.0.0-5.el7ost.noarch
openstack-neutron-lbaas-10.0.1-0.20170222151526.c6011fb.el7ost.noarch
python-neutronclient-6.1.0-0.20170208193918.1a2820d.el7ost.noarch
openstack-neutron-common-10.0.0-5.el7ost.noarch
openstack-neutron-10.0.0-5.el7ost.noarch
openstack-neutron-openvswitch-10.0.0-5.el7ost.noarch
puppet-neutron-10.3.0-1.el7ost.noarch
python-neutron-10.0.0-5.el7ost.noarch
python-neutron-lbaas-10.0.1-0.20170222151526.c6011fb.el7ost.noarch

How reproducible:
always

Steps to Reproduce:
1. Deploy RH-OSP 11 HA (3 controllers, 2 compute)
2. Run tempest network scenario tests
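
For reference, the three tests named in the description could be selected with a regex along the following lines. This is only a sketch: it assumes the `tempest run --regex` CLI is available in the deployed tempest version, and `build_tempest_command` is a made-up helper name, not something from the job configuration.

```python
import re

# Test IDs copied from the problem description in this report.
FAILING_TESTS = [
    "tempest.scenario.test_network_advanced_server_ops."
    "TestNetworkAdvancedServerOps.test_server_connectivity_stop_start",
    "tempest.scenario.test_network_advanced_server_ops."
    "TestNetworkAdvancedServerOps.test_server_connectivity_suspend_resume",
    "tempest.scenario.test_network_basic_ops."
    "TestNetworkBasicOps.test_subnet_details",
]


def build_tempest_command(test_ids):
    """Return an argv list that selects the given test IDs (hypothetical helper)."""
    # Escape each ID so the dots match literally; leave the end unanchored
    # because real test IDs carry a trailing "[...]" attribute suffix.
    pattern = "^(" + "|".join(re.escape(t) for t in test_ids) + ")"
    return ["tempest", "run", "--regex", pattern]


if __name__ == "__main__":
    print(" ".join(build_tempest_command(FAILING_TESTS)))
```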


Actual results:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/test.py", line 103, in wrapper
    return f(self, *func_args, **func_kwargs)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_network_advanced_server_ops.py", line 176, in test_server_connectivity_suspend_resume
    server, keypair, floating_ip)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_network_advanced_server_ops.py", line 101, in _wait_server_status_and_check_network_connectivity
    self._check_network_connectivity(server, keypair, floating_ip)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_network_advanced_server_ops.py", line 94, in _check_network_connectivity
    servers=[server])
  File "/usr/lib/python2.7/site-packages/tempest/scenario/manager.py", line 608, in check_public_network_connectivity
    mtu=mtu)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/manager.py", line 591, in check_vm_connectivity
    msg=msg)
  File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 678, in assertTrue
    raise self.failureException(msg)
AssertionError: False is not true : Timed out waiting for 10.0.0.211 to become reachable
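
The assertion above fires when a reachability poll runs out of time. A rough sketch of that poll-until-deadline pattern follows. It is not tempest's implementation: `wait_until_reachable`, the injected `probe` callable and the default timings are all assumptions made for illustration.

```python
import time


def wait_until_reachable(probe, timeout=120.0, interval=1.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or `timeout` seconds elapse.

    `probe` is any zero-argument callable, for example one that pings
    the floating IP. Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False  # the caller would turn this into the assertion failure
        sleep(interval)
```

If the address only becomes reachable after the deadline (for example because a stale ARP entry is still cached upstream), a poll like this reports a timeout even though the instance itself is healthy.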

Expected results:
All tests pass.

Additional info:
https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHOS/view/RHOS11/job/qe-DFG-neutron-11_director-rhel-7.3-virthost-3cont_2comp-ipv4-gre-lvm-lbaas/lastCompletedBuild/testReport/tempest.scenario.test_network_advanced_server_ops/TestNetworkAdvancedServerOps/test_server_connectivity_suspend_resume_compute_id_5cdf9499_541d_4923_804e_b9a60620a7f0_network_/
Comment 1 Eran Kuris 2017-04-04 01:59:07 EDT
The following tests fail with a different traceback, related to SSHTimeout:

1. tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_dhcpv6_stateless

2. tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os

3. tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac

4. tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_cross_tenant_traffic

5. neutron.tests.tempest.scenario.test_basic.NetworkBasicTest.test_basic_instance

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/test.py", line 103, in wrapper
    return f(self, *func_args, **func_kwargs)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_network_v6.py", line 246, in test_dualnet_dhcp6_stateless_from_os
    self._prepare_and_test(address6_mode='dhcpv6-stateless', dualnet=True)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_network_v6.py", line 161, in _prepare_and_test
    sshv4_1, ips_from_api_1, sid1 = self.prepare_server(networks=net_list)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_network_v6.py", line 134, in prepare_server
    username=username)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/manager.py", line 351, in get_remote_client
    linux_client.validate_authentication()
  File "/usr/lib/python2.7/site-packages/tempest/common/utils/linux/remote_client.py", line 55, in wrapper
    six.reraise(*original_exception)
  File "/usr/lib/python2.7/site-packages/tempest/common/utils/linux/remote_client.py", line 36, in wrapper
    return function(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/tempest/common/utils/linux/remote_client.py", line 100, in validate_authentication
    self.ssh_client.test_connection_auth()
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/ssh.py", line 206, in test_connection_auth
    connection = self._get_ssh_connection()
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/ssh.py", line 120, in _get_ssh_connection
    password=self.password)
tempest.lib.exceptions.SSHTimeout: Connection to the 10.0.0.221 via SSH timed out.
User: cirros, Password: None
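
When triaging an SSHTimeout like the one above, it can help to check whether the guest's SSH port accepts TCP connections at all, separately from key authentication. A minimal sketch, assuming the default port 22; `ssh_port_open` is a hypothetical helper and not part of tempest.

```python
import socket


def ssh_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection raises OSError (including socket.timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Floating IP taken from the traceback in this comment.
    print(ssh_port_open("10.0.0.221"))
```

A False result here points at L2/L3 connectivity to the floating IP rather than at the credentials.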


https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHOS/view/RHOS11/job/qe-DFG-neutron-11_director-rhel-7.3-virthost-1cont_2comp-ipv4-vxlan-lvm-lbaas/lastCompletedBuild/testReport/tempest.scenario.test_network_v6/TestGettingAddress/test_dualnet_dhcp6_stateless_from_os_compute_id_76f26acd_9688_42b4_bc3e_cd134c4cb09e_network_slow_/
Comment 2 Eran Kuris 2017-04-04 01:59 EDT
Created attachment 1268575
log1
Comment 3 Eran Kuris 2017-04-04 02:01 EDT
Created attachment 1268576
log2
Comment 4 Eran Kuris 2017-04-04 02:01:37 EDT
*** Bug 1433712 has been marked as a duplicate of this bug. ***
Comment 5 Eran Kuris 2017-04-04 02:02:11 EDT
*** Bug 1433710 has been marked as a duplicate of this bug. ***
Comment 6 Eran Kuris 2017-04-04 02:02:39 EDT
*** Bug 1433702 has been marked as a duplicate of this bug. ***
Comment 7 Eran Kuris 2017-04-04 02:03:02 EDT
*** Bug 1433688 has been marked as a duplicate of this bug. ***
Comment 8 Eran Kuris 2017-04-04 02:03:41 EDT
*** Bug 1433685 has been marked as a duplicate of this bug. ***
Comment 10 Ihar Hrachyshka 2017-05-11 15:55:52 EDT
I reported the following bugs against kernel:

https://bugzilla.redhat.com/show_bug.cgi?id=1450203
https://bugzilla.redhat.com/show_bug.cgi?id=1450205
Comment 12 Ihar Hrachyshka 2017-05-15 10:40:38 EDT
*** Bug 1438346 has been marked as a duplicate of this bug. ***
Comment 13 Ihar Hrachyshka 2017-05-18 13:15:47 EDT
A workaround has been applied on the neutron side. This should help CI runs.
Comment 16 Ihar Hrachyshka 2017-05-23 21:00:46 EDT
The jobs still fail in the gate. We probably need to force gratuitous ARP to override existing entries, but with the current kernel only ARP REQUESTs will work, so we need to backport https://review.openstack.org/#/c/463816/. We then also need to enable arp_accept = 1 on eth2 on the undercloud.
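
To make the terms in this comment concrete, here is a sketch of what a gratuitous ARP in REQUEST form looks like on the wire, plus the sysctl key behind arp_accept. It is illustrative only and is not the code from the review linked above. The function names and the example MAC address are assumptions.

```python
import socket
import struct


def gratuitous_arp_request(mac, ip):
    """Build an Ethernet frame carrying a gratuitous ARP REQUEST for `ip`.

    `mac` is a string such as "fa:16:3e:00:00:01"; `ip` is dotted-quad IPv4.
    """
    sha = bytes.fromhex(mac.replace(":", ""))
    spa = socket.inet_aton(ip)
    broadcast = b"\xff" * 6
    eth_header = broadcast + sha + struct.pack("!H", 0x0806)  # EtherType ARP
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                 # hardware type: Ethernet
        0x0800,            # protocol type: IPv4
        6, 4,              # hardware / protocol address lengths
        1,                 # opcode 1 = REQUEST (2 would be REPLY)
        sha, spa,          # sender MAC and IP
        b"\x00" * 6, spa,  # target MAC unset; target IP equals sender IP
    )
    return eth_header + arp


def arp_accept_sysctl(interface):
    """Return the sysctl key controlling arp_accept for `interface`."""
    return "net.ipv4.conf.%s.arp_accept" % interface


if __name__ == "__main__":
    frame = gratuitous_arp_request("fa:16:3e:00:00:01", "10.0.0.211")
    print(len(frame), arp_accept_sysctl("eth2"))
```

What makes the packet "gratuitous" is that the sender and target IP are the same address. For the undercloud setting mentioned above, the key would be `net.ipv4.conf.eth2.arp_accept`, set to 1 for example with `sysctl -w`.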
Comment 17 Ihar Hrachyshka 2017-05-31 17:00:38 EDT
This fix should be tested with https://bugzilla.redhat.com/show_bug.cgi?id=1457504 fixed on the tripleo side.
Comment 24 errata-xmlrpc 2017-07-19 13:03:14 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1785
