Bug 2290473 - Neutron does not gracefully shutdown when receiving SIGTERM
Summary: Neutron does not gracefully shutdown when receiving SIGTERM
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z4
Target Release: 17.1
Assignee: Terry Wilson
QA Contact: Fiorella Yanac
URL:
Whiteboard:
Depends On: 2290472
Blocks:
 
Reported: 2024-06-04 21:01 UTC by Terry Wilson
Modified: 2024-11-21 09:41 UTC

Fixed In Version: openstack-neutron-18.6.1-17.1.20240821210749.85ff760.el9ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-11-21 09:41:10 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 911625 | 0 | None | MERGED | Use oslo_service's SignalHandler for signals | 2024-08-21 19:25:08 UTC
Red Hat Issue Tracker | OSP-32209 | 0 | None | None | None | 2024-06-04 21:05:57 UTC
Red Hat Product Errata | RHBA-2024:9974 | 0 | None | None | None | 2024-11-21 09:41:13 UTC

Description Terry Wilson 2024-06-04 21:01:15 UTC
Description of problem:
When Neutron is sent SIGTERM (for example via systemctl) while using ML2/OVN, the neutron workers do not exit. Instead, they are eventually killed with SIGKILL when the graceful timeout is reached (often around 1 minute).

This is caused by how the signal handlers for SIGTERM are installed. There are multiple issues:

1) oslo_service, the ml2/ovn mech_driver, and ml2/ovo_rpc.py all call signal.signal(signal.SIGTERM, ...), overwriting each other's signal handlers.
2) SIGTERM is handled in the main thread, and running blocking code there causes AssertionErrors in eventlet, which also prevent the process from exiting.
3) The ml2/ovn cleanup code doesn't cause the process to end, so it interrupts the killing of the process.
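
Issue 1 can be illustrated with the Python standard library alone. The sketch below is not Neutron code: the three handler functions are hypothetical stand-ins named after the components listed above, and the only point is that each signal.signal() call replaces the handler installed by the previous one.

```python
import signal

calls = []


# Hypothetical stand-ins for the three components that each install a handler.
def oslo_service_handler(signum, frame):
    calls.append("oslo_service")


def ovn_mech_driver_handler(signum, frame):
    calls.append("ml2/ovn")


def ovo_rpc_handler(signum, frame):
    calls.append("ml2/ovo_rpc")


# signal.signal() returns the handler it displaced, which makes the
# overwrite visible.
original = signal.signal(signal.SIGTERM, oslo_service_handler)
displaced_first = signal.signal(signal.SIGTERM, ovn_mech_driver_handler)
displaced_second = signal.signal(signal.SIGTERM, ovo_rpc_handler)

# Deliver SIGTERM to this process (requires Python 3.8+).
signal.raise_signal(signal.SIGTERM)
print(calls)

# Put back whatever was installed before the demo.
signal.signal(signal.SIGTERM, signal.SIG_DFL if original is None else original)
```

After the signal is delivered, calls holds only the entry for the last handler registered. The two earlier handlers never run.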

oslo_service has a singleton SignalHandler class that solves all of these issues.
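
To show why a single shared registry helps with the overwriting problem, here is a minimal stdlib-only sketch of the idea. MiniSignalHandler and its add_handler method are hypothetical names for illustration. This is not oslo_service's implementation, and it makes no attempt to address the eventlet blocking issue (2) or the process-exit issue (3).

```python
import signal
from collections import defaultdict


class MiniSignalHandler:
    """Hypothetical singleton: one OS-level handler, many callbacks."""

    _instance = None

    def __new__(cls):
        # Every caller gets the same object, so registrations accumulate
        # instead of replacing one another.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._handlers = defaultdict(list)
        return cls._instance

    def add_handler(self, signum, callback):
        # Install the real handler only once per signal number.
        if not self._handlers[signum]:
            signal.signal(signum, self._dispatch)
        self._handlers[signum].append(callback)

    def _dispatch(self, signum, frame):
        # Call every registered callback in registration order.
        for callback in list(self._handlers[signum]):
            callback(signum, frame)


seen = []
original = signal.getsignal(signal.SIGTERM)

registry = MiniSignalHandler()
registry.add_handler(signal.SIGTERM, lambda s, f: seen.append("oslo_service"))
MiniSignalHandler().add_handler(signal.SIGTERM, lambda s, f: seen.append("ml2/ovn"))
MiniSignalHandler().add_handler(signal.SIGTERM, lambda s, f: seen.append("ml2/ovo_rpc"))

signal.raise_signal(signal.SIGTERM)  # requires Python 3.8+
print(seen)

# Restore the handler that was active before the demo.
signal.signal(signal.SIGTERM, signal.SIG_DFL if original is None else original)
```

With this arrangement all three callbacks run on a single SIGTERM, because only the registry itself ever calls signal.signal().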

Version-Release number of selected component (if applicable):
17.1.3

How reproducible:
95%-ish

Steps to Reproduce:
1. Start neutron_api
2. Stop neutron_api

Actual results:
Neutron only exits after ~42 seconds

Expected results:
Neutron exits after a few seconds
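
The difference between the actual and expected results can be pictured with a small POSIX-only sketch of the SIGTERM-then-SIGKILL sequence described in this report. It does not start Neutron: the child process is a hypothetical stand-in that ignores SIGTERM, and stop_with_timeout and the timeout value are assumptions made for illustration.

```python
import signal
import subprocess
import sys
import time

# Stand-in for a worker that never exits on SIGTERM (assumption for the demo).
STUBBORN_CHILD = (
    "import signal, time;"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
    "print('ready', flush=True);"
    "time.sleep(60)"
)


def stop_with_timeout(proc, graceful_timeout):
    """Send SIGTERM, wait up to graceful_timeout seconds, then SIGKILL."""
    started = time.monotonic()
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=graceful_timeout)
        escalated = False
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        proc.wait()
        escalated = True
    return escalated, time.monotonic() - started


def demo(graceful_timeout=1.0):
    proc = subprocess.Popen(
        [sys.executable, "-c", STUBBORN_CHILD], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # wait until the child has set up SIGTERM handling
    escalated, elapsed = stop_with_timeout(proc, graceful_timeout)
    proc.stdout.close()
    return escalated, elapsed, proc.returncode


if __name__ == "__main__":
    print(demo())
```

A process that handles SIGTERM correctly would return well before the timeout with escalated set to False. The stand-in child instead runs out the full timeout and is then killed.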

Additional info:

Comment 14 errata-xmlrpc 2024-11-21 09:41:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHOSP 17.1.4 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9974

