Bug 1229135
Summary: | [BUG] Neutron systemd unit file doesn't kill all processes | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Pablo Iranzo Gómez <pablo.iranzo> | |
Component: | openstack-neutron | Assignee: | Jakub Libosvar <jlibosva> | |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Ofer Blaut <oblaut> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 6.0 (Juno) | CC: | amuller, chrisw, jlibosva, jschwarz, mschuppe, nyechiel, pablo.iranzo, tfreger, yeylon, zshujuan | |
Target Milestone: | --- | Keywords: | ZStream | |
Target Release: | 7.0 (Kilo) | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1249181 1249197 | Environment: | |
Last Closed: | 2015-12-15 12:51:10 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1249181, 1249192, 1249197 |
Description
Pablo Iranzo Gómez
2015-06-08 07:23:42 UTC
(In reply to Pablo Iranzo Gómez from comment #0)
> How reproducible:
>
> Kill one of the neutron processes, the remaining ones will still take use of
> ports or other resources, not allowing neutron to start again

You shouldn't send signals to any of the Neutron worker processes. There is one parent process that forks into multiple processes. systemd should send the signal only to this parent process, which should take care of its child processes; we avoid using control-group on purpose.

Pablo, can you please confirm that what happens is that the systemd restart sends SIGTERM to the parent process but the child processes hang? Also, to be sure, this happens in the latest RHOS 6, right?

Hi Jakub,

The versions are:

openstack-neutron-ml2-2014.2.1-6.el7ost.noarch
openstack-neutron-2014.2.1-6.el7ost.noarch
openstack-neutron-openvswitch-2014.2.1-6.el7ost.noarch
python-neutronclient-2.3.9-1.el7ost.noarch
openstack-neutron-metering-agent-2014.2.1-6.el7ost.noarch
python-neutron-2014.2.1-6.el7ost.noarch

The restart process has been tested by the customer via:

- 1. Simulate the systemd kill by 'kill -STOP <any neutron-server child PID>'
- 2. systemctl restart neutron-server

Until it complains about in-use resources

Regards,
Pablo

(In reply to Pablo Iranzo Gómez from comment #4)
> Hi Jakub,
> The versions are:
>
> openstack-neutron-ml2-2014.2.1-6.el7ost.noarch
> openstack-neutron-2014.2.1-6.el7ost.noarch
> openstack-neutron-openvswitch-2014.2.1-6.el7ost.noarch
> python-neutronclient-2.3.9-1.el7ost.noarch
> openstack-neutron-metering-agent-2014.2.1-6.el7ost.noarch
> python-neutron-2014.2.1-6.el7ost.noarch
>

This is the GA version; there are 3 other minor releases. Can you please ask the customer to upgrade to the latest and try to reproduce? I suspect the issue they hit is https://bugs.launchpad.net/neutron/+bug/1387053 which basically means you can't stop rpc workers.

I reproduced locally with the GA version that the restart fails if you manually send SIGTERM to one of the workers.
I can't reproduce this issue with the A3 release. What I discovered is that 'systemctl stop' hangs and is eventually killed by SIGKILL. But that also kills all child processes, so the next start of the service is successful.
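
For illustration, here is a minimal Python sketch of the stop behaviour discussed in this bug: a parent sends SIGTERM to its workers and falls back to SIGKILL for any worker that does not exit in time, which is what happens to a worker frozen with 'kill -STOP'. This is not Neutron's or systemd's code. The names spawn_workers and stop_workers and the timeout values are made up for the example, and it assumes a POSIX system where os.fork is available.

```python
import os
import signal
import time


def spawn_workers(count):
    """Fork `count` idle worker processes and return their PIDs."""
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: use the default SIGTERM action and idle until signalled.
            try:
                signal.signal(signal.SIGTERM, signal.SIG_DFL)
                while True:
                    signal.pause()
            finally:
                os._exit(0)
        pids.append(pid)
    return pids


def stop_workers(pids, timeout=5.0):
    """SIGTERM every worker, then SIGKILL whatever is still alive.

    Returns the set of PIDs that had to be ended with SIGKILL.
    """
    for pid in pids:
        os.kill(pid, signal.SIGTERM)

    remaining = set(pids)
    deadline = time.monotonic() + timeout
    while remaining and time.monotonic() < deadline:
        for pid in list(remaining):
            # WNOHANG: returns (0, 0) while the child has not exited yet.
            done, _ = os.waitpid(pid, os.WNOHANG)
            if done:
                remaining.discard(pid)
        time.sleep(0.05)

    killed = set(remaining)
    for pid in killed:
        # A stopped process keeps SIGTERM pending, but SIGKILL still ends it.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
    return killed


if __name__ == "__main__":
    workers = spawn_workers(3)
    # Freeze one worker, as in the 'kill -STOP' reproduction step, and wait
    # until it is really stopped so the later SIGTERM cannot overtake SIGSTOP.
    os.kill(workers[0], signal.SIGSTOP)
    os.waitpid(workers[0], os.WUNTRACED)
    print("needed SIGKILL:", stop_workers(workers, timeout=1.0))
```

In this sketch the frozen worker is the only one that needs SIGKILL; the others exit on SIGTERM. If the parent skipped the SIGKILL fallback, the frozen worker would outlive it and could keep holding ports or other resources, which matches the "in-use resources" symptom reported above.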