Bug 1090421 - neutron-agent-watch fails when deleting failed resource
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z5
Target Release: 4.0
Assigned To: Miguel Angel Ajo
QA Contact: yfried
Keywords: OtherQA, ZStream
Reported: 2014-04-23 05:50 EDT by yfried
Modified: 2016-04-27 00:49 EDT
Fixed In Version: openstack-neutron-2013.2.3-17.el6ost
Doc Type: Bug Fix
Doc Text:
The watcher agent (neutron-agent-watch) periodically polls the OpenStack Networking service to get the list of networks for a specific host. If a network is no longer available (for example, because it was deleted), the watcher agent is supposed to remove it from the 'known' resources dictionary, which lists the networks that are available for scheduling. In previous releases, the watcher agent did not actually remove deleted networks from the 'known' resources dictionary. As a result, the agent could crash if a network that was scheduled to a host's DHCP agent or L3 agent was deleted. With this update, the watcher agent cleans the 'known' resources dictionary as expected, so deleting networks no longer causes the agent to crash.
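The cleanup described above can be illustrated with a minimal sketch. It is not the code shipped in neutron-agent-watch: the function name `prune_known`, the network ids, and the pid file paths are all hypothetical, and the 'known' resources dictionary is assumed to map a network id to some per-network value.

```python
def prune_known(known, current_ids):
    """Remove entries from `known` whose id is absent from `current_ids`.

    `known` is mutated in place; the sorted list of removed ids is returned.
    Both names are illustrative assumptions, not the real agent's API.
    """
    stale = sorted(set(known) - set(current_ids))
    for net_id in stale:
        # pop() with a default does not raise if the key is already gone
        known.pop(net_id, None)
    return stale


# Hypothetical data: two known networks, one of which has since been deleted.
known = {"net-a": "/tmp/net-a/pid", "net-b": "/tmp/net-b/pid"}
removed = prune_known(known, ["net-a"])
```

After the call, `known` holds only the networks that the latest poll still reported, so a later pass has nothing stale left to act on.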
Last Closed: 2014-10-22 13:22:58 EDT
Type: Bug


Attachments
Patch for the error (878 bytes, patch)
2014-06-02 05:14 EDT, Miguel Angel Ajo
Patch (1.90 KB, text/plain)
2014-09-18 01:52 EDT, Miguel Angel Ajo


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2014:1686
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: openstack-neutron security and bug fix update
Last Updated: 2014-10-22 17:21:18 EDT

Description yfried 2014-04-23 05:50:41 EDT
On RHEL 6.5, with the following packages installed:

python-neutron-2013.2.3-4.el6ost.noarch
python-neutronclient-2.3.4-1.el6ost.noarch
openstack-neutron-openvswitch-2013.2.3-4.el6ost.noarch
openstack-neutron-2013.2.3-4.el6ost.noarch

Description of problem:
After following the steps listed in https://bugzilla.redhat.com/show_bug.cgi?id=1051444#c15, deleting the failed network (the one whose dnsmasq process was killed) writes the following to agent_watch.log:

2014-04-23 12:46:09.030 118482 ERROR root [-] Unexpected exception occurred 51 time(s)... retrying.
2014-04-23 12:46:09.030 118482 TRACE root Traceback (most recent call last):
2014-04-23 12:46:09.030 118482 TRACE root   File "/usr/lib/python2.6/site-packages/neutron/openstack/common/excutils.py", line 62, in inner_func
2014-04-23 12:46:09.030 118482 TRACE root     return infunc(*args, **kwargs)
2014-04-23 12:46:09.030 118482 TRACE root   File "/usr/bin/neutron-agent-watch", line 627, in run
2014-04-23 12:46:09.030 118482 TRACE root     watcher.run(context)
2014-04-23 12:46:09.030 118482 TRACE root   File "/usr/bin/neutron-agent-watch", line 426, in run
2014-04-23 12:46:09.030 118482 TRACE root     self._run()  # run method implemented in child class
2014-04-23 12:46:09.030 118482 TRACE root   File "/usr/bin/neutron-agent-watch", line 515, in _run
2014-04-23 12:46:09.030 118482 TRACE root     self._remove_old_known_pidfiles(expected_pid_files)
2014-04-23 12:46:09.030 118482 TRACE root   File "/usr/bin/neutron-agent-watch", line 388, in _remove_old_known_pidfiles
2014-04-23 12:46:09.030 118482 TRACE root     self._remove_expected_pid_file(known)
2014-04-23 12:46:09.030 118482 TRACE root AttributeError: 'DhcpAgentWatcher' object has no attribute '_remove_expected_pid_file'

This happens even though the pid file at /var/lib/neutron/dhcp/<network-id>/pid was removed when the network was deleted.

In addition, the DHCP agent status check reports a failure:
# /etc/init.d/neutron-dhcp-agent status ; echo $?
neutron-dhcp-agent (pid  20898) is running...
neutron-dhcp-agent health is not good
150

The status check fails even though the agents themselves are now OK.
Only restarting the agent-watch service returns it to a normal state.
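
The failure mode in the traceback above can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the real neutron-agent-watch code or the attached patch: the class `SketchWatcher` and its methods are invented for illustration. It shows why a call to an undefined method only fails once a known pid file stops being expected, which is consistent with the error appearing only after a network is deleted.

```python
class SketchWatcher(object):
    """Hypothetical stand-in for the watcher; all names here are assumptions."""

    def __init__(self, known):
        self.known = set(known)

    def _forget_pid_file(self, pid_file):
        # Drop a pid file that is no longer expected.
        self.known.discard(pid_file)

    def remove_old_buggy(self, expected):
        for pid_file in sorted(self.known - set(expected)):
            # This method is not defined anywhere on the class, so the call
            # raises AttributeError, but only when a stale entry exists.
            self._remove_expected_pid_file(pid_file)

    def remove_old_fixed(self, expected):
        for pid_file in sorted(self.known - set(expected)):
            self._forget_pid_file(pid_file)


watcher = SketchWatcher(["net-a/pid", "net-b/pid"])
watcher.remove_old_buggy(["net-a/pid", "net-b/pid"])  # nothing stale, no error

try:
    watcher.remove_old_buggy(["net-a/pid"])  # net-b/pid is now stale
    failed = False
except AttributeError:
    failed = True

watcher.remove_old_fixed(["net-a/pid"])  # stale entry is dropped cleanly
```

Because the stale entry is never removed in the buggy path, every later pass hits the same exception, which would match the repeated "retrying" messages in the log.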
Comment 2 Miguel Angel Ajo 2014-06-02 05:14:31 EDT
Created attachment 901387
Patch for the error

It seems that this case was not covered by my manual testing.
Comment 3 Miguel Angel Ajo 2014-06-02 05:18:17 EDT
This agent should go away as soon as I get the agent-status patches merged upstream, which will have proper functional and unit testing.
Comment 6 Miguel Angel Ajo 2014-09-18 01:52:10 EDT
Created attachment 938753
Patch
Comment 12 errata-xmlrpc 2014-10-22 13:22:58 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2014-1686.html
