Bug 1397017

Summary: Parallel executions of AppManager.close()
Product: Red Hat OpenStack Reporter: Marian Krcmarik <mkrcmari>
Component: python-ryu Assignee: RHOS Maint <rhos-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Ofer Blaut <oblaut>
Severity: medium Docs Contact:
Priority: medium    
Version: 10.0 (Newton) CC: amuller, apevec, ihrachys, jlibosva, jschluet, lhh, mburns, mkrcmari, nlevinki
Target Milestone: z7 Keywords: TestOnly, Triaged, ZStream
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-ryu-4.9-2.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-03-22 13:25:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1450223    

Description Marian Krcmarik 2016-11-21 12:17:46 UTC
Description of problem:
If an AppManager.close call is started and all AppManager.services are stopped, AppManager.run_apps starts another close() call, resulting in a KeyError exception in close() (*1).  Prevent that using a semaphore.
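
The race and the semaphore guard described above can be sketched roughly as follows. This is a minimal illustration, not the upstream patch: ToyAppManager, the _close_sem attribute name, and the close_in_parallel helper are hypothetical, and threading.Semaphore stands in for whatever primitive ryu uses in its own hub module.

```python
import threading


class ToyAppManager:
    """Hypothetical stand-in for ryu's AppManager (not the real class)."""

    def __init__(self, names):
        self.applications = {name: object() for name in names}
        # Assumed attribute name; it serializes concurrent close() calls.
        self._close_sem = threading.Semaphore()

    def uninstantiate(self, name):
        # Raises KeyError if another close() has already removed the app.
        return self.applications.pop(name)

    def close(self):
        # The snapshot of app names is taken while holding the semaphore,
        # so a second caller only ever sees an already-empty dict.
        with self._close_sem:
            for name in list(self.applications.keys()):
                self.uninstantiate(name)
            assert not self.applications


def close_in_parallel(manager, workers=2):
    """Call close() from several threads and collect any KeyError raised."""
    errors = []

    def worker():
        try:
            manager.close()
        except KeyError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


if __name__ == "__main__":
    manager = ToyAppManager(["ofctl_service", "other_app"])
    print(close_in_parallel(manager), manager.applications)
```

Without the guard, two callers could each snapshot the same app names and the slower one would call pop() on a name that is already gone, which matches the KeyError in the traceback below.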

The upstream patch:
https://github.com/osrg/ryu/commit/b0ab4f16028c452374c5a0f22bd970038194f142

I hit this problem when stopping the neutron-openvswitch-agent service on OpenStack computes/controllers in order to set up Instance HA (where the service is first stopped and then put under pacemaker management on computes and left under systemd management on controllers).

The traceback with which the openvswitch agent failed was the following:
2016-11-20 07:30:59.464 102285 ERROR neutron.agent.linux.async_process [-] Error received from [ovsdb-client monitor Interface name,ofport,external_ids --format=json]: None
2016-11-20 07:30:59.465 102285 ERROR neutron.agent.linux.async_process [-] Process [ovsdb-client monitor Interface name,ofport,external_ids --format=json] dies due to the error: None
2016-11-20 07:30:59.465 102285 DEBUG neutron.agent.linux.async_process [-] Output received from [ovsdb-client monitor Interface name,ofport,external_ids --format=json]: None _read_stdout /usr/lib/python2.7/site-packages/neutron/agent/linux/async_process.py:237
2016-11-20 07:30:59.466 102285 ERROR ryu.lib.hub [-] hub: uncaught exception: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ryu/lib/hub.py", line 54, in _launch
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ryu/base/app_manager.py", line 545, in close
    self.uninstantiate(app_name)
  File "/usr/lib/python2.7/site-packages/ryu/base/app_manager.py", line 528, in uninstantiate
    app = self.applications.pop(name)
KeyError: 'ofctl_service'

Version-Release number of selected component (if applicable):
python-ryu-4.3-2.el7ost.noarch.rpm  

How reproducible:
Sometimes

Steps to Reproduce:
1. Stop neutron-openvswitch-agent service

Actual results:
The service fails. I am not sure about other consequences; some bound ports may remain active.

Expected results:
The service stops cleanly, without an uncaught exception.

Additional info:

Comment 1 Mike Burns 2017-05-11 11:54:09 UTC
Is this already fixed? Has 4.9-2 already shipped?

Comment 2 Marian Krcmarik 2017-05-11 12:12:28 UTC
(In reply to Mike Burns from comment #1)
> Is this already fixed? Has 4.9-2 already shipped?

It does seem so in RHOSP 11.

Comment 3 Ihar Hrachyshka 2017-05-15 18:20:53 UTC
I believe this will be fixed for OSP10 with: https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=534727

Comment 6 Lon Hohberger 2017-09-06 19:58:10 UTC
According to our records, this should be resolved by python-ryu-4.9-2.1.el7ost.  This build is available now.