Description of problem:
41 of the neutron API/scenario tests fail on OSP 12.

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP 12
2. Run neutron API/scenario tests

Actual results:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_security_groups_basic_ops.py", line 191, in setUp
    self._verify_mac_addr(self.primary_tenant)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_security_groups_basic_ops.py", line 448, in _verify_mac_addr
    access_point_ssh = self._connect_to_access_point(tenant)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/test_security_groups_basic_ops.py", line 372, in _connect_to_access_point
    access_point_ssh, private_key=private_key)
  File "/usr/lib/python2.7/site-packages/tempest/scenario/manager.py", line 361, in get_remote_client
    linux_client.validate_authentication()
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 57, in wrapper
    six.reraise(*original_exception)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 30, in wrapper
    return function(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 113, in validate_authentication
    self.ssh_client.test_connection_auth()
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/ssh.py", line 207, in test_connection_auth
    connection = self._get_ssh_connection()
  File "/usr/lib/python2.7/site-packages/tempest/lib/common/ssh.py", line 121, in _get_ssh_connection
    password=self.password)
tempest.lib.exceptions.SSHTimeout: Connection to the 10.0.0.213 via SSH timed out.
User: cloud-user, Password: None

Expected results:
All tests finish successfully.
On the provided environment, I ran all network scenario tests and only the following failed:

failure: tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details[compute,id-d8bb918e-e2df-48b2-97cd-b73c95450980,network,slow]

The error was the following:

Traceback (most recent call last):
  File "tempest/common/utils/__init__.py", line 89, in wrapper
    return f(self, *func_args, **func_kwargs)
  File "tempest/scenario/test_network_basic_ops.py", line 618, in test_subnet_details
    renew_delay),
  File "tempest/lib/common/utils/test_utils.py", line 103, in call_until_true
    if func():
  File "tempest/scenario/test_network_basic_ops.py", line 610, in check_new_dns_server
    dhcp_client=CONF.scenario.dhcp_client)
  File "tempest/common/utils/linux/remote_client.py", line 140, in renew_lease
    return getattr(self, '_renew_lease_' + dhcp_client)(fixed_ip=fixed_ip)
  File "tempest/common/utils/linux/remote_client.py", line 116, in _renew_lease_udhcpc
    format(path=file_path, nic=nic_name))
  File "tempest/lib/common/utils/linux/remote_client.py", line 30, in wrapper
    return function(self, *args, **kwargs)
  File "tempest/lib/common/utils/linux/remote_client.py", line 105, in exec_command
    return self.ssh_client.exec_command(cmd)
  File "tempest/lib/common/ssh.py", line 202, in exec_command
    stderr=err_data, stdout=out_data)
tempest.lib.exceptions.SSHExecCommandFailed: Command 'set -eu -o pipefail; PATH=$PATH:/sbin; cat /var/run/udhcpc.eth0.pid', exit status: 1, stderr:
cat: /var/run/udhcpc.eth0.pid: No such file or directory

This means SSH was successful: the test logged in and then failed while looking for a udhcpc PID file, so the error seems to be related to the image used.
failure: tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os[compute,id-76f26acd-9688-42b4-bc3e-cd134c4cb09e,network,slow]
failure: tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_dhcpv6_stateless[compute,id-cf1c4425-766b-45b8-be35-e2959728eb00,network,slow]
failure: tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac[compute,id-9178ad42-10e4-47e9-8987-e02b170cc5cd,network]
failure: tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_slaac_from_os[compute,id-b6399d76-4438-4658-bcf5-0d6c8584fde2,network,slow]

Those failures were not SSH related but rather caused by the issue tracked by bug 1486324.

Is there any specific job I need to run to get a failure related to the OVS firewall?
*** Bug 1468868 has been marked as a duplicate of this bug. ***
This option exists: "dhcp_client = dhclient" under the [scenario] section in your tempest.conf file. I am using it on an SRIOV setup where the instances are RHEL.
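For anyone who wants to script this setting rather than edit the file by hand, here is a minimal sketch using Python's standard configparser. The helper name set_dhcp_client and the path passed to it are hypothetical; only the [scenario] section and the dhcp_client option come from this bug. Note that configparser drops comments when it rewrites the file, so this may not suit a heavily annotated tempest.conf.

```python
import configparser


def set_dhcp_client(conf_path, client="dhclient"):
    """Set [scenario] dhcp_client in a tempest.conf-style INI file.

    Hypothetical helper, not part of tempest or python-tempestconf.
    Returns the value that was written.
    """
    # interpolation=None so that literal '%' characters in other
    # option values are not treated as interpolation markers.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(conf_path)

    # Create the section if the file does not have it yet.
    if not parser.has_section("scenario"):
        parser.add_section("scenario")
    parser.set("scenario", "dhcp_client", client)

    # Rewrite the whole file; comments in the original are lost.
    with open(conf_path, "w") as handle:
        parser.write(handle)
    return parser.get("scenario", "dhcp_client")
```

Usage would look like set_dhcp_client("/path/to/tempest.conf"), where the path is whatever your deployment uses.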
(In reply to Eran Kuris from comment #13)
> This option exist - "dhcp_client = dhclient" under [scenario] section in
> your tempest.conf file. I am using it on SRIOV setup when instances are RHEL.

Perhaps this bugzilla is a NOTABUG then, and I just didn't find where it is set in puppet. Can you confirm the option wasn't added there by something else? I know that Infrared has support for configuring tempest.conf. As Infrared is used by automation, I'm removing the AutomationBlocker keyword for now.
As we discussed via IRC, it's a valid bug. The option that I used was added manually, so we should keep the BZ open to implement it via director.
What is the source that selects the RHEL guest image that ends up being consumed by the tempest tests? Does TripleO or puppet upload and configure this image? Is it done by a CI framework or some other tool?

Whichever tool decides to use this image is the correct place to configure this option. E.g. if TripleO were indeed the one obtaining the RHEL guest image, uploading it to glance and configuring it in tempest.conf (maybe via deployer input), then yes, it should set the scenario.dhcp_client option at the same time. If it's some other tool/framework (e.g. oooq or infrared), it's their responsibility to configure this. If it's the user's choice (e.g. via parameters to the tools mentioned above), it's obviously also up to the user to configure any specifics for that choice.

It does not seem reasonable to me to expect TripleO or puppet to know which tests the user will run and with what image. It could be Ubuntu, RHEL, ..., cirros or a completely custom-built LFS image, and it could have any arbitrary DHCP client (dhcpcd, dhclient, systemd-networkd ...).

Possible autodetection would have to inspect the image configured in tempest.conf (e.g. by booting the VM or using libguestfs/guestfish etc.) and look for the most common DHCP client binaries, hopefully in the path. If the tool also handles the upload of this image, the detection could possibly be done at that time.
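To make the autodetection idea concrete, here is a rough sketch of the "look for common DHCP client binaries" step. It assumes the image's root filesystem has already been mounted read-only at some local directory (for example with guestmount, which is not shown here). The function name, the candidate list and the search directories are all assumptions made for illustration, and nothing like this exists in tempest or python-tempestconf as far as this bug shows.

```python
import os

# Assumed candidates, checked in this order. A real tool would need to
# restrict these to the values its tempest version accepts.
CANDIDATE_CLIENTS = ("dhclient", "udhcpc", "dhcpcd")

# Assumed binary locations, relative to the mounted image root.
SEARCH_DIRS = ("sbin", "usr/sbin", "bin", "usr/bin")


def detect_dhcp_client(image_root):
    """Return the first known DHCP client found under image_root, or None.

    Hypothetical helper: image_root is a directory where the guest
    image's root filesystem is already mounted.
    """
    for client in CANDIDATE_CLIENTS:
        for directory in SEARCH_DIRS:
            candidate = os.path.join(image_root, directory, client)
            # lexists() rather than exists(): in busybox-based images
            # the client may be an absolute symlink (e.g. to
            # /bin/busybox) that does not resolve from the host side.
            if os.path.lexists(candidate):
                return client
    return None
```

The result could then be written to scenario.dhcp_client by whichever tool uploads the image. A client that is installed but not actually used by the guest (or a guest using systemd-networkd) would still be misdetected, which is part of why this is only a heuristic.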
Summarizing the issue: There was a problem with the SSH connection to an instance booted during tempest testing. At that time there were some issues in the mentioned Jenkins jobs, as I pointed out in a comment above, so this might have been the root cause. If an installer (none has been specified in the bug) used its default image (for example cirros) and the tests failed with that image, then a bug against that installer may be opened. If a custom image was used, then it's the user's responsibility to have that image prepared correctly and to set the correct options in tempest.conf. python-tempestconf can't analyse images; that is not among its capabilities. Based on this, the comments above, and the lack of response to the previous comment by Pavel Sedlak, I'm marking this bug as NOTABUG. If you think otherwise, feel free to reopen the bug and change the component accordingly.