Bug 1792898

Summary: Kill neutron-keepalived-state-change-monitor fails
Product: Red Hat OpenStack
Reporter: Slawek Kaplonski <skaplons>
Component: openstack-neutron
Assignee: Slawek Kaplonski <skaplons>
Status: CLOSED ERRATA
QA Contact: Alex Katz <akatz>
Severity: high
Priority: high
Version: 16.0 (Train)
CC: amuller, bcafarel, bperkins, ccamposr, chrisw, njohnston, scohen
Target Milestone: z1
Keywords: AutomationBlocker, Triaged
Target Release: 16.0 (Train on RHEL 8.1)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-neutron-15.0.2-0.20200131145709.de4b7a5.el8ost
Doc Type: No Doc Update
Last Closed: 2020-03-03 09:41:29 UTC
Type: Bug
Bug Blocks: 1787632

Description Slawek Kaplonski 2020-01-20 10:44:39 UTC
When a graceful shutdown of neutron-keepalived-state-change-monitor with SIGTERM fails, Neutron tries to kill it with SIGKILL. Because there is no matching rootwrap rule that allows killing it with -9, the attempt fails with an error like:

2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent [-] Error while deleting router f2613902-6ea2-4f09-9fae-9d5a933c744e: multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Unserializable message: Traceback (most recent call last):
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 283, in serve_client
    send(msg)
  File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 128, in send
    s = self.dumps(obj)
  File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 170, in dumps
    return json.dumps(obj, cls=RpcJSONEncoder).encode('utf-8')
  File "/usr/lib64/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib64/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 43, in default
    return super(RpcJSONEncoder, self).default(o)
  File "/usr/lib64/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'ValueError' is not JSON serializable

---------------------------------------------------------------------------
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/common/utils.py", line 702, in wait_until_true
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent eventlet.sleep(sleep)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/eventlet/greenthread.py", line 36, in sleep
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent hub.switch()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent return self.greenlet.switch()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent eventlet.timeout.Timeout: 10 seconds
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent During handling of the above exception, another exception occurred:
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/l3/ha_router.py", line 420, in destroy_state_change_monitor
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent timeout=SIGTERM_TIMEOUT)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/common/utils.py", line 707, in wait_until_true
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent raise WaitTimeout(_("Timed out after %d seconds") % timeout)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent neutron.common.utils.WaitTimeout: Timed out after 10 seconds
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent During handling of the above exception, another exception occurred:
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 506, in _safe_router_removed
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent self._router_removed(ri, router_id)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 542, in _router_removed
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent self.router_info[router_id] = ri
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent self.force_reraise()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent raise value
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/l3/agent.py", line 539, in _router_removed
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent ri.delete()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/l3/ha_router.py", line 478, in delete
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent self.destroy_state_change_monitor(self.process_monitor)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/l3/ha_router.py", line 422, in destroy_state_change_monitor
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent pm.disable(sig=str(int(signal.SIGKILL)))
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/linux/external_process.py", line 113, in disable
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent utils.execute(cmd, run_as_root=self.run_as_root)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 122, in execute
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent execute_rootwrap_daemon(cmd, process_input, addl_env))
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 109, in execute_rootwrap_daemon
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent LOG.error("Rootwrap error running command: %s", cmd)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent self.force_reraise()
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent raise value
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/neutron/agent/linux/utils.py", line 106, in execute_rootwrap_daemon
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent return client.execute(cmd, process_input)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_rootwrap/client.py", line 154, in execute
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent res = self._run_one_command(proxy, cmd, stdin)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_rootwrap/client.py", line 139, in _run_one_command
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent res = proxy.run_one_command(cmd, stdin)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "<string>", line 2, in run_one_command
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent raise convert_to_error(kind, result)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent multiprocessing.managers.RemoteError:
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent ---------------------------------------------------------------------------
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent Unserializable message: Traceback (most recent call last):
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib64/python3.6/multiprocessing/managers.py", line 283, in serve_client
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent send(msg)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 128, in send
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent s = self.dumps(obj)
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib/python3.6/site-packages/oslo_rootwrap/jsonrpc.py", line 170, in dumps
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent return json.dumps(obj, cls=RpcJSONEncoder).encode('utf-8')
2020-01-20 10:35:37.337 382811 ERROR neutron.agent.l3.agent File "/usr/lib64/python3.6/json/__init__.py", line 238, in dumps....
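
For reference, the shutdown sequence described above (SIGTERM first, then SIGKILL once the wait times out) can be sketched roughly as follows. This is only an illustration of the general pattern, not Neutron's code: stop_process, demo and STUBBORN_CHILD are hypothetical names, and the 0.5 second timeout is an assumption chosen to keep the example fast (the traceback shows Neutron waiting 10 seconds). The sketch signals its own child process directly, so no rootwrap is involved; in this bug, the equivalent of the SIGKILL step is the one that rootwrap rejected.

```python
import signal
import subprocess
import sys

# Assumed value for this sketch only; the log above shows a 10 second wait.
SIGTERM_TIMEOUT = 0.5

# Hypothetical stand-in for a monitor that does not exit on SIGTERM.
STUBBORN_CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_process(proc, timeout=SIGTERM_TIMEOUT):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL.

    Returns the last signal that was sent to the process.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return signal.SIGTERM
    except subprocess.TimeoutExpired:
        # Graceful shutdown failed, so escalate to SIGKILL (kill -9).
        proc.send_signal(signal.SIGKILL)
        proc.wait()
        return signal.SIGKILL


def demo():
    """Start a child that ignores SIGTERM and stop it with stop_process()."""
    proc = subprocess.Popen(
        [sys.executable, "-c", STUBBORN_CHILD], stdout=subprocess.PIPE
    )
    # Wait until the child has installed its SIGTERM handler.
    proc.stdout.readline()
    used = stop_process(proc)
    proc.stdout.close()
    return used, proc.returncode


if __name__ == "__main__":
    used, returncode = demo()
    print(used.name, returncode)
```

On a POSIX system the child ignores SIGTERM, so the fallback path is taken and the process only ends once SIGKILL is sent. If that second step is not permitted, as in this report, the process is left running and the router deletion fails.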

Comment 1 Bernard Cafarelli 2020-01-30 13:21:48 UTC
The Train backport should merge upstream soon (gates permitting).

Comment 4 Alex McLeod 2020-02-19 12:39:26 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.

Comment 6 Slawek Kaplonski 2020-03-02 08:23:05 UTC
Hi Alex,

This bug doesn't require any doc changes.

Comment 8 errata-xmlrpc 2020-03-03 09:41:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0654