Bug 1809634
Summary: | [osp16] ERROR neutron.agent.dhcp.agent neutron_lib.exceptions.ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Chris Janiszewski <cjanisze> |
Component: | tripleo-ansible | Assignee: | Rodolfo Alonso <ralonsoh> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Candido Campos <ccamposr> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 16.1 (Train) | CC: | bcafarel, bhaley, chrisw, jschluet, kholden, mweetman, oblaut, racedoro, ralonsoh, rdiwakar, scohen, stchen, vkoul |
Target Milestone: | --- | Keywords: | Reopened, TestOnly, Triaged |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | tripleo-ansible-0.5.1-0.20200528013427.9da6a00.el8ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-07-01 10:33:01 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1810475 |
Description
Chris Janiszewski
2020-03-03 15:19:10 UTC
One thing that have happened in this environment is the dhcp neutron port have disappeared and maybe it's related to this bug (chrisj-osp16) [stack@undercloud-osp16 ~]$ openstack subnet show baremetal-sub +-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | allocation_pools | 172.31.10.70-172.31.10.199 | | cidr | 172.31.10.0/24 | | created_at | 2020-02-27T21:14:55Z | | description | | | dns_nameservers | 172.31.8.1 | | enable_dhcp | True | | gateway_ip | 172.31.10.254 | | host_routes | | | id | e69757d3-e3c7-4634-8657-726c19883017 | | ip_version | 4 | | ipv6_address_mode | None | | ipv6_ra_mode | None | | location | cloud='', project.domain_id=, project.domain_name='Default', project.id='f349b269b5bd470f83ffb18fdc1cc049', project.name='admin', region_name='regionOne', zone= | | name | baremetal-sub | | network_id | 45566c04-9a73-4736-acb5-abd040e63bed | | prefix_length | None | | project_id | f349b269b5bd470f83ffb18fdc1cc049 | | revision_number | 0 | | segment_id | None | | service_types | | | subnetpool_id | None | | tags | | | updated_at | 2020-02-27T21:14:55Z | +-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+ (chrisj-osp16) [stack@undercloud-osp16 ~]$ for i in `openstack port list | awk '/172.31.10/ {print $2}'`; do openstack port show $i | grep device_owner; done | device_owner | baremetal:none | | device_owner | compute:nova | | device_owner | Hello Chris: I debugged this issue with Slawek and we found that, in case the namespace is not found, the haproxy kill script will fail and trigger again the resync process. This resync process will loop again and again returning always the same error: """ 2020-03-03 13:59:42.147 6397 INFO neutron.agent.dhcp.agent [req-7555cc29-6a4a-4010-9211-25d45da0ea5c - - - - -] Synchronizing state complete 2020-03-03 13:59:42.147 6397 INFO neutron.agent.dhcp.agent [req-7555cc29-6a4a-4010-9211-25d45da0ea5c - - - - -] Synchronizing state 2020-03-03 13:59:42.438 6397 INFO neutron.agent.dhcp.agent [req-51828289-eee2-404c-9243-06be02eaca70 - - - - -] All active networks have been fetched through RPC. 2020-03-03 13:59:42.450 6397 ERROR neutron.agent.linux.utils [req-51828289-eee2-404c-9243-06be02eaca70 - - - - -] Exit code: 1; Stdin: ; Stdout: ; Stderr: 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent [req-51828289-eee2-404c-9243-06be02eaca70 - - - - -] Unable to sync network state on deleted network 1b2621fe-38c7-4572-9b5f-45b2f0687f94: neutron_lib.exceptions.ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent Traceback (most recent call last): 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/neutron/agent/dhcp/agent.py", line 218, in sync_state 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent self.disable_dhcp_helper(deleted_id) 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent result = f(*args, **kwargs) 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/neutron/agent/dhcp/agent.py", line 357, in disable_dhcp_helper 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent self.disable_isolated_metadata_proxy(network) 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/osprofiler/profiler.py", line 160, in wrapper 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent result = f(*args, **kwargs) 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/neutron/agent/dhcp/agent.py", line 700, in disable_isolated_metadata_proxy 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent self._process_monitor, uuid, self.conf, network.namespace) 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/neutron/agent/metadata/driver.py", line 250, in destroy_monitored_metadata_proxy 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent pm.disable() 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/neutron/agent/linux/external_process.py", line 113, in disable 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent utils.execute(cmd, run_as_root=self.run_as_root) 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 147, in execute 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent returncode=returncode) 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent neutron_lib.exceptions.ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: 2020-03-03 13:59:42.451 6397 ERROR neutron.agent.dhcp.agent """ At the same time, this is the log of the kill scripts: """ Tue Mar 3 13:59:29 UTC 2020 Deleting container neutron-dnsmasq-qdhcp-3d1b1411-f8d2-4348-a500-e7af340f2d1c (6a2ac5c799fb7175b20ffa768b2b297f2c9de8e3f5271d99cf41ccff3dc94b64) 6a2ac5c799fb7175b20ffa768b2b297f2c9de8e3f5271d99cf41ccff3dc94b64 Tue Mar 3 13:59:30 UTC 2020 No network namespace detected, exiting Tue Mar 3 13:59:30 UTC 2020 No network namespace detected, exiting Tue Mar 3 13:59:31 UTC 2020 No network namespace detected, exiting """ I'm going to propose a patch for tripleo-ansible (tripleo-heat-templates in OSP16) to solve this issue in the haproxy kill script. Do you still have this deployment? Is it possible to debug this issue in your environment? Regards and thank you in advance. Hi Rodolfo, Thanks for looking into this and finding underlying issue. Unfortunately I had to use this environment for other workload and it's currently not available to put a patch. I apologize for the inconvenience. *** Bug 1810475 has been marked as a duplicate of this bug. *** patch is included in latest released build in OSP 16.0 According to our records, this should be resolved by tripleo-ansible-0.4.2-0.20200404124615.67005aa.el8ost. This build is available now. Closing EOL, OSP 16.0 has been retired as of Oct 27, 2020 Re-opening targeted at 16.1 |