Bug 1965897 - 16.2 / 17 line jobs are failing on tempest tests with "ERROR tempest.lib.common.ssh paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on <IP>"
Keywords:
Status: CLOSED DUPLICATE of bug 1956748
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Slawek Kaplonski
QA Contact: Eran Kuris
URL:
Whiteboard:
Duplicates: 1967993 1967995
Depends On:
Blocks:
Reported: 2021-05-31 04:42 UTC by Sandeep Yadav
Modified: 2022-08-17 15:05 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-14 05:54:05 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-4307 0 None None None 2022-08-17 15:05:06 UTC

Description Sandeep Yadav 2021-05-31 04:42:56 UTC
Description of problem:

Multiple 16.2 / 17 line jobs are failing on tempest tests because tempest is unable to connect to the cirros instance. The error is:

"ERROR tempest.lib.common.ssh paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on <IP>"


Version-Release number of selected component (if applicable):

16.2/17 Integration line in Downstream CI


How reproducible:
The issue is intermittent but the frequency is high.


Actual results:
SSH to the instance fails, which causes the tempest test to fail.

Expected results:
SSH to the instance should succeed.

Additional info:

~~~
Captured traceback:
~~~~~~~~~~~~~~~~~~~
    b'Traceback (most recent call last):'
    b'  File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 113, in _get_ssh_connection'
    b'    sock=proxy_chan)'
    b'  File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 362, in connect'
    b'    raise NoValidConnectionsError(errors)'
    b'paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 192.168.24.147'
    b''
    b'During handling of the above exception, another exception occurred:'
    b''
    b'Traceback (most recent call last):'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/scenario/test_ipv6.py", line 169, in test_ipv6_hotplug_slaac'
    b'    self._test_ipv6_hotplug("slaac", "slaac")'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/scenario/test_ipv6.py", line 154, in _test_ipv6_hotplug'
    b'    self._test_ipv6_address_configured(ssh_client, vm, ipv6_port)'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/scenario/test_ipv6.py", line 114, in _test_ipv6_address_configured'
    b'    turn_nic6_on(ssh_client, ipv6_port)'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/scenario/test_ipv6.py", line 45, in turn_nic6_on'
    b"    nic = ip_command.get_nic_name_by_mac(ipv6_port['mac_address'])"
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/common/ip.py", line 153, in get_nic_name_by_mac'
    b'    nics = self.execute("-o", "link")'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/common/ip.py", line 50, in execute'
    b'    timeout=self.timeout).stdout'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/common/shell.py", line 69, in execute'
    b'    ssh_client=ssh_client)'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/common/shell.py", line 103, in execute_remote_command'
    b'    stdout = ssh_client.exec_command(command, timeout=timeout)'
    b'  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f'
    b'    return self.call(f, *args, **kw)'
    b'  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call'
    b'    do = self.iter(retry_state=retry_state)'
    b'  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 319, in iter'
    b'    return fut.result()'
    b'  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result'
    b'    return self.__get_result()'
    b'  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result'
    b'    raise self._exception'
    b'  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call'
    b'    result = fn(*args, **kwargs)'
    b'  File "/usr/lib/python3.6/site-packages/neutron_tempest_plugin/common/ssh.py", line 178, in exec_command'
    b'    return super(Client, self).exec_command(cmd=cmd, encoding=encoding)'
    b'  File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 159, in exec_command'
    b'    ssh = self._get_ssh_connection()'
    b'  File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 129, in _get_ssh_connection'
    b'    password=self.password)'
    b'tempest.lib.exceptions.SSHTimeout: Connection to the 192.168.24.147 via SSH timed out.'
    b'User: cirros, Password: None'
    b''


    b'2021-05-30 15:23:57.758 145023 ERROR tempest.lib.common.ssh Traceback (most recent call last):'
    b'2021-05-30 15:23:57.758 145023 ERROR tempest.lib.common.ssh   File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 113, in _get_ssh_connection'
    b'2021-05-30 15:23:57.758 145023 ERROR tempest.lib.common.ssh     sock=proxy_chan)'
    b'2021-05-30 15:23:57.758 145023 ERROR tempest.lib.common.ssh   File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 362, in connect'
    b'2021-05-30 15:23:57.758 145023 ERROR tempest.lib.common.ssh     raise NoValidConnectionsError(errors)'
    b'2021-05-30 15:23:57.758 145023 ERROR tempest.lib.common.ssh paramiko.ssh_exception.NoValidConnectionsError: [Errno None] Unable to connect to port 22 on 192.168.24.147'
    b'2021-05-30 15:23:57.758 145023 ERROR tempest.lib.common.ssh '
~~~
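
To tell an unreachable port 22 apart from a tempest-side problem, the port can be probed directly from the host that runs tempest. The following is a minimal sketch and not part of tempest; the helper name `can_connect` and the default timeout are assumptions made for illustration.

```python
import socket


def can_connect(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only checks TCP reachability and does not
    attempt any SSH handshake or authentication.
    """
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

For example, `can_connect("192.168.24.147")` could be run against the floating IP from the traceback above. A `False` result would suggest the failure is on the network path or in the guest rather than in the SSH key handling.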

Comment 2 bkopilov 2021-06-01 15:19:20 UTC
Hi all,
It looks like this issue also reproduces on other jobs in phase 3.

After troubleshooting, it looks like:
#1 Sometimes tempest cannot SSH to the instance with keys.
When that happens, nova console-log reports:
Initializing random number generator... [    4.561793] random: dd urandom read with 18 bits of entropy available
done.
Starting acpid: OK
Starting network...
udhcpc (v1.23.2) started
Sending discover...
Sending select for 10.100.0.4...
Lease of 10.100.0.4 obtained, lease time 43200
route: SIOCADDRT: File exists
WARN: failed: route add -net "0.0.0.0/0" gw "10.100.0.1"
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 4.74. request failed
failed 2/20: up 16.77. request failed
failed 3/20: up 28.80. request failed
failed 4/20: up 40.83. request failed
failed 5/20: up 52.85. request failed
failed 6/20: up 64.88. request failed
failed 7/20: up 76.91. request failed
failed 8/20: up 88.94. request failed
failed 9/20: up 100.96. request failed


I logged in to the instance and could not ping the 169.254.169.254 address.


In the cases where SSH works, I can ping the metadata address.


Benny
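
The repeated metadata failures in the console output above can be picked out of a saved console log with a small script. This is a sketch only; the function name `metadata_failures` and the regular expression are assumptions based on the cirros output quoted in this comment, and are not part of any OpenStack tool.

```python
import re

# Matches cirros lines such as "failed 3/20: up 28.80. request failed".
METADATA_FAIL_RE = re.compile(
    r"^failed (\d+)/(\d+): up ([\d.]+)\. request failed",
    re.MULTILINE,
)


def metadata_failures(console_log):
    """Return (attempt, max_attempts, uptime_seconds) for each failed request.

    Hypothetical helper: an empty list means no failed metadata request
    was found in the given console log text.
    """
    return [
        (int(attempt), int(total), float(uptime))
        for attempt, total, uptime in METADATA_FAIL_RE.findall(console_log)
    ]
```

A non-empty result for an instance that tempest could not reach would be consistent with the symptom described here: the guest never fetched its SSH key from the metadata service, so key-based login fails.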

Comment 5 Kashyap Chamarthy 2021-06-07 12:53:52 UTC
*** Bug 1967995 has been marked as a duplicate of this bug. ***

Comment 6 Kashyap Chamarthy 2021-06-07 12:55:35 UTC
*** Bug 1967993 has been marked as a duplicate of this bug. ***

Comment 8 Sandeep Yadav 2021-06-14 05:54:05 UTC
Hello Slaweq,

Yes, we can close this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1956748 (the Fixed In Version mentioned there is ovn2.13-20.12.0-120.el7fdp).

As we are using ovn-2021, the fix for us is in ovn-2021-21.03.0-40.el8fdp.x86_64.rpm.

*** This bug has been marked as a duplicate of bug 1956748 ***

