Bug 1827276 - Sync script for sidecar containers can't spawn dnsmasq processes for all networks
Summary: Sync script for sidecar containers can't spawn dnsmasq processes for all netw...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Alex Katz
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks: 1824397
Reported: 2020-04-23 14:13 UTC by Slawek Kaplonski
Modified: 2020-07-29 07:52 UTC (History)
CC List: 5 users

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-0.20200513033425.a90c03e.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-29 07:52:07 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary                  Last Updated
Launchpad               1874470         0        None      None    None                     2020-04-23 15:22:18 UTC
OpenStack gerrit        725162          0        None      MERGED  Revert systemd sidecars  2020-07-19 05:58:39 UTC
Red Hat Product Errata  RHBA-2020:3148  0        None      None    None                     2020-07-29 07:52:31 UTC

Description Slawek Kaplonski 2020-04-23 14:13:04 UTC
I found this while investigating a failure in the Tobiko tests: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/DFG-network-neutron-16_director-rhel-virthost-3cont_2comp-ipv4-vxlan-tobiko/24/testReport/junit/tobiko.tests.faults.agents.test_neutron_agents/DHCPAgentTest/test_dhcp_lease_served_when_dhcp_agent_down/

It seems that neutron-dhcp-agent can spawn a dnsmasq process (sidecar container) for only one network. For all other networks, the processes aren't started at all.

How to reproduce:
1. Create 2 networks (I had 3 in my env) with DHCP-enabled subnets.
2. Restart the neutron-dhcp-agent process on the controller.
3. Check the dnsmasq processes running on the host: there will be only one such process, but there should be one per network.
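
The check in step 3 can be sketched as a small Python helper. This is only an illustration, not part of the Tobiko test. It assumes that each dnsmasq command line contains a per-network path of the form /var/lib/neutron/dhcp/<network-id>/ (for example in its --pid-file argument); the function names are hypothetical.

```python
import re

# Assumption: each dnsmasq command line carries a per-network path such as
# --pid-file=/var/lib/neutron/dhcp/<network-id>/pid
NETWORK_ID_RE = re.compile(r"/dhcp/([0-9a-f-]{36})/")


def networks_with_dnsmasq(cmdlines):
    """Return the set of network IDs that appear in a dnsmasq command line."""
    found = set()
    for line in cmdlines:
        if "dnsmasq" not in line:
            continue
        match = NETWORK_ID_RE.search(line)
        if match:
            found.add(match.group(1))
    return found


def missing_dnsmasq(expected_network_ids, cmdlines):
    """Return the expected network IDs that have no dnsmasq process, sorted."""
    return sorted(set(expected_network_ids) - networks_with_dnsmasq(cmdlines))
```

The command lines could come from `ps -eo args` on the controller and the expected IDs from `openstack network list -f value -c ID`. With this bug, missing_dnsmasq() would be expected to return every network except one.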

Comment 1 Bogdan Dobrelya 2020-04-24 11:07:57 UTC
The race with merging events probably has something to do with https://github.com/systemd/systemd/issues/5770

Comment 2 Brent Eagles 2020-04-29 12:25:25 UTC
There are two facets to this bug, each of which may deserve its own bug report:

- The sync and wrapper scripts are not using the same lock file, resulting in a race on the shared "processes" file. The solution for this is straightforward.

- The mechanism used to trigger the sync process is based on systemd.path notifications, which do not appear to queue individual notifications. This may result in some sidecars not being launched.
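
For the first facet, a minimal Python sketch of what "using the same lock file" means. This is not the actual sync or wrapper script. The lock path, the function names, and the one-request-per-line file layout are assumptions made for illustration; only the shared "processes" file comes from the description above.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def append_request(processes_path, lock_path, line):
    """Wrapper side: record one sidecar request under the shared lock."""
    with locked(lock_path):
        with open(processes_path, "a") as f:
            f.write(line + "\n")


def drain_requests(processes_path, lock_path):
    """Sync side: read and clear all pending requests under the same lock."""
    with locked(lock_path):
        if not os.path.exists(processes_path):
            return []
        with open(processes_path) as f:
            lines = f.read().splitlines()
        open(processes_path, "w").close()  # truncate after reading
        return lines
```

Both sides must pass the same lock_path. If each side locks a different file, a request appended between the read and the truncate in drain_requests() is silently lost, which matches the race described.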

Bogdan's patch looks promising (see https://review.opendev.org/#/c/723373/) but will need extensive testing.
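
To illustrate the second facet (this is not the patch under review), here is a toy Python model of a trigger that merges notifications instead of queuing them. The class and function names are hypothetical, and the model only assumes the behaviour described above: several notifications that arrive before the handler runs produce a single wake-up.

```python
class CoalescingTrigger:
    """Toy stand-in for a non-queuing notifier: many signals, at most one pending wake-up."""

    def __init__(self):
        self.pending = False
        self.wakeups = 0

    def notify(self):
        self.pending = True  # a second notify before the handler runs is merged

    def run_handler(self, handler):
        if self.pending:
            self.pending = False
            self.wakeups += 1
            handler()


def simulate(requests, drain_all):
    """Return the requests started after a burst of writes followed by handler passes."""
    queue, started = [], []
    trigger = CoalescingTrigger()

    def handler():
        if drain_all:
            while queue:  # reconcile everything that is pending
                started.append(queue.pop(0))
        elif queue:  # handle exactly one request per wake-up
            started.append(queue.pop(0))

    for request in requests:
        queue.append(request)
        trigger.notify()
    trigger.run_handler(handler)
    trigger.run_handler(handler)  # no-op: nothing is pending any more
    return started
```

In this model, a handler that starts one sidecar per wake-up launches only the first of three requested networks, while a handler that drains all pending requests on each wake-up launches all of them.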

Comment 3 Brent Eagles 2020-05-07 14:24:33 UTC
The decision was to revert the patches that removed the old sidecar mechanism and modified the neutron templates to use the systemd wrappers.

Patch to master is here: https://review.opendev.org/#/c/725162/
After this merges, we will backport it to Train so that it is available for 16.1.

Comment 4 Bernard Cafarelli 2020-05-13 08:01:26 UTC
Train backport merged and package built for 16.1

Comment 10 Alex McLeod 2020-06-16 12:29:32 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.

Comment 12 errata-xmlrpc 2020-07-29 07:52:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148

