Bug 1106489
| Summary: | neutron-*-agent child processes can die unnoticed | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Miguel Angel Ajo <majopela> |
| Component: | openstack-neutron | Assignee: | Miguel Angel Ajo <majopela> |
| Status: | CLOSED ERRATA | QA Contact: | Ofer Blaut <oblaut> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 5.0 (RHEL 7) | CC: | chrisw, dron, lpeer, nyechiel, sclewis, stoner, yeylon |
| Target Milestone: | z2 | Keywords: | Regression, ZStream |
| Target Release: | 5.0 (RHEL 7) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-neutron-2014.1.3-4.el7ost | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-11-03 08:38:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1065172, 1106457 | ||
| Bug Blocks: | 1083890 | ||
Description
Miguel Angel Ajo
2014-06-09 12:54:40 UTC
How to test this:

1) With a working deployment, modify l3_agent.ini and dhcp_agent.ini to include:

    check_child_processes_action = respawn
    check_child_processes_interval = 5

2) Restart the l3 and dhcp agents.

3) Spawn resources (a VM connected to a private tenant network).

4) Follow both agent logs:

    tail -f /var/log/neutron/dhcp_agent.log & \
    tail -f /var/log/neutron/l3_agent.log &

5) sudo killall dnsmasq

   You should then see something like:

    2014-10-09 04:31:46.434 9651 ERROR neutron.agent.linux.external_process [-] dnsmasq for dhcp with uuid 67f3c1d9-5861-4466-899f-f166aa97a173 not found. The process should not have died
    2014-10-09 04:31:46.434 9651 ERROR neutron.agent.linux.external_process [-] respawning dnsmasq for uuid 67f3c1d9-5861-4466-899f-f166aa97a173

6) sudo killall neutron-ns-metadata-proxy

   You should see something like:

    2014-10-09 04:33:06.564 9656 ERROR neutron.agent.linux.external_process [-] default-service for router with uuid a539a2f8-a6ec-41d1-91b0-bf2ca780b644 not found. The process should not have died
    2014-10-09 04:33:06.564 9656 ERROR neutron.agent.linux.external_process [-] respawning None for uuid a539a2f8-a6ec-41d1-91b0-bf2ca780b644

7) Modify l3_agent.ini and dhcp_agent.ini to include:

    check_child_processes_action = exit
    check_child_processes_interval = 5

8) Repeat steps 4-6; in this case the agent should exit.

9) Repeat all of the above with check_child_processes_interval = 0. Nothing should happen: no service is restarted automatically and no message is logged.

In between step 7 and 8, it should say to restart the neutron-l3-agent and neutron-dhcp-agent.

Otherwise, I ran through these steps and verified the expected behavior.

(In reply to Sean Toner from comment #7)
> In between step 7 and 8, it should say to restart the neutron-l3-agent and
> neutron-dhcp-agent.
>
> Otherwise, I ran through these steps and verified the expected behavior.

Correct, I forgot to mention that step. Thank you for testing!
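When repeating the verification, it can be easier to check the agent logs with a script than by eye. The following is a minimal sketch, not part of neutron: the function name `summarize_monitor_events` and the regular expressions are assumptions derived only from the two log messages quoted in this report, so they may need adjusting if the message wording differs in another build.

```python
import re

# Patterns are modelled on the log lines quoted in this report (assumption:
# other neutron versions may word these messages differently).
PREFIX = r"ERROR neutron\.agent\.linux\.external_process \[-\] "
UUID = r"(?P<uuid>[0-9a-f-]{36})"
DIED_RE = re.compile(PREFIX + r"(?P<service>\S+) for (?P<kind>\S+) with uuid " + UUID + r" not found")
RESPAWN_RE = re.compile(PREFIX + r"respawning (?P<name>\S+) for uuid " + UUID)


def summarize_monitor_events(log_text):
    """Return the uuids reported dead, respawned, and dead but not respawned."""
    died, respawned = set(), set()
    for line in log_text.splitlines():
        match = DIED_RE.search(line)
        if match:
            died.add(match.group("uuid"))
            continue
        match = RESPAWN_RE.search(line)
        if match:
            respawned.add(match.group("uuid"))
    return {
        "died": died,
        "respawned": respawned,
        "not_respawned": died - respawned,
    }


if __name__ == "__main__":
    # Path taken from step 4; adjust for the l3 agent log as needed.
    with open("/var/log/neutron/dhcp_agent.log") as handle:
        print(summarize_monitor_events(handle.read()))
```

With check_child_processes_action = respawn, every uuid in `died` would be expected to appear in `respawned` as well, so `not_respawned` should be empty. With check_child_processes_interval = 0, both sets should stay empty after the killall commands.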
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2014-1786.html