Bug 1272368
Summary: | Race condition on systemd-run --scope --slice=foo | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Petr Horáček <phoracek> | |
Component: | systemd | Assignee: | systemd-maint | |
Status: | CLOSED ERRATA | QA Contact: | Branislav Blaškovič <bblaskov> | |
Severity: | urgent | Docs Contact: | ||
Priority: | urgent | |||
Version: | 7.2 | CC: | bblaskov, bmcclain, danken, gcheresh, gklein, jkurik, jscotka, lmiksik, lnykryn, snagar, systemd-maint-list, systemd-maint, tlavigne, ylavi | |
Target Milestone: | rc | Keywords: | ZStream | |
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | systemd-219-20.el7 | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
By the time PID 1 adds our process to the scope unit, the process might already have died if it is short-running (such as an invocation of /bin/true).
Consequence:
When systemd picked a recycled name for the scope, the following error could appear:
'Failed to start transient scope unit: Unit run-XXXXX.scope already exists.'
Fix:
Synchronously wait until the scope unit we create is started.
Result:
systemd-run --scope should no longer fail with the error above.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1283192 (view as bug list) | Environment: | ||
Last Closed: | 2016-11-04 00:44:19 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1154205, 1259468, 1283192, 1283245, 1289485 |
Description
Petr Horáček
2015-10-16 09:21:52 UTC
Run with systemd from https://brewweb.devel.redhat.com/taskinfo?taskID=9969138 and it looks like the problem of bringing interface up is solved.

Lukáš, would you explain why we cannot rush this fix into 0day errata of rhel-7.2.0? If not part of 7.2.0, may we have it in a near z-stream?

Note this mail from Lennart Poettering. This patch may not be 100% solution, but it's better than nothing:

On Thu, 15.10.15 13:25, Petr Horacek (phoracek) wrote:

> Hello,
>
> recently we encountered strange systemd problem on automated tests of
> networking
> part of oVirt VDSM project.
>
> Sometimes this happens:
> $ /usr/sbin/ifdown enp1s0f1
> $ /usr/bin/systemd-run --scope --slice=vdsm-dhclient /usr/sbin/ifup enp1s0f1
> Failed to start transient scope unit: Unit run-13034.scope already exists.
>
> systemd-run should create a new scope every time it's called, should not
> it? Could it be
> a racefull bug in systemd?

The code for this is actually really naive... the number is just the PID of the caller, and there's no check at all to ensure it is unique. PIDs overrun easily, hence this is not nice at all...

What's even worse: when you use -H or -M to invoke things remotely we still pick the client side PID for the name....

I figure we should rework this to pick some sufficiently large random token instead, so that this is unlikely to conflict without actually having to check for conflicts.

In the meantime, you should be able to fix this by explicitly picking a randomized name for the scope using --unit=. For example, consider just adding --unit=`uuidgen` to your command line, and the clashes should not happen anymore.

> I found recently added issue [1] which describes similar problem,
> but with --unit instead of --slice. Note that our machine which
> reproduced it has systemd older than v220.
>
> Is it possible, that this is the same case as described in [1] and
> therefore it should be
> fixed in systemd 220?
>
> Is it possible to backport [1]'s fix to EL7?

Well, there are still cases where we unable to clean up scope units properly, because we don't get any notifications for them when they run empty. But yeah the current upstream versions should be better than older versions.

Lennart

--
Lennart Poettering, Red Hat

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2216.html
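
For systems that do not yet have the fixed package, the interim workaround from Lennart's mail above (passing an explicit randomized name via --unit=) could be wrapped roughly as follows. This is only a sketch: the helper names unique_scope_name and run_in_scope, the DRY_RUN switch, and the fallback to /proc/sys/kernel/random/uuid when uuidgen is missing are assumptions made for illustration and do not come from the bug report.

```shell
#!/bin/sh
# Sketch of the "--unit=`uuidgen`" workaround quoted in the mail.
# unique_scope_name and run_in_scope are hypothetical helper names.

unique_scope_name() {
    # Prefer uuidgen, as suggested in the mail; the /proc fallback is an
    # assumption for hosts where uuidgen is not installed.
    if command -v uuidgen >/dev/null 2>&1; then
        token=$(uuidgen)
    else
        token=$(cat /proc/sys/kernel/random/uuid)
    fi
    printf '%s\n' "$token"
}

run_in_scope() {
    # $1 is the slice name; the remaining arguments are the command to run.
    slice=$1
    shift
    unit=$(unique_scope_name)
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only print the command line (hypothetical switch, handy for checking).
        echo systemd-run --scope --slice="$slice" --unit="$unit" "$@"
    else
        systemd-run --scope --slice="$slice" --unit="$unit" "$@"
    fi
}

# Usage: print the command that would replace the failing call from the report.
DRY_RUN=1 run_in_scope vdsm-dhclient /usr/sbin/ifup enp1s0f1
```

Because the name is drawn from a random token rather than the caller's PID, two invocations are very unlikely to pick the same scope name, which is the property the mail asks for. It does not address the cleanup cases mentioned at the end of the mail.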