Bug 893015 - NOT_IN_SYSTEMD: DefaultControllers=cpu makes RT unavailable to system services
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-01-08 13:04 UTC by Fabio Massimo Di Nitto
Modified: 2015-06-17 23:43 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-17 23:43:56 UTC
Type: Bug
Embargoed:



Description Fabio Massimo Di Nitto 2013-01-08 13:04:58 UTC
As an upstream developer, I have a small daemon that calls sched_setscheduler(2) to set its scheduling policy to SCHED_RR.

When the daemon runs standalone, there is no problem at all.

As soon as I try to start the daemon via systemctl, I get an error that the process can't set the correct scheduling priority.

The C code is:

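#include <sched.h>

/* Switch the calling process to SCHED_RR at the highest available priority. */
static int set_scheduler(void)
{
        struct sched_param sched_param;
        int err;

        /* On success, err holds the maximum SCHED_RR priority. */
        err = sched_get_priority_max(SCHED_RR);
        if (err < 0) {
                log_warn("Could not get maximum scheduler priority");
                return err;
        }

        sched_param.sched_priority = err;
        /* A pid of 0 applies the policy to the calling process. */
        err = sched_setscheduler(0, SCHED_RR, &sched_param);
        if (err < 0)
                log_warn("could not set SCHED_RR priority %d",
                           sched_param.sched_priority);

        return err;
}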
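
The function above logs the failure but not the reason. When reproducing this, it may help to report errno as well. What follows is only a minimal sketch, not the kronosnetd code: the names rt_failure_hint() and try_sched_rr() are hypothetical, and the hint text assumes a kernel built with RT group scheduling, where sched_setscheduler(2) can fail with EPERM if the task's cpu cgroup has no RT budget.

```c
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map an errno value from sched_setscheduler(2)
 * to a short hint about the likely cause. */
const char *rt_failure_hint(int errnum)
{
        switch (errnum) {
        case EPERM:
                return "permission denied: check RLIMIT_RTPRIO/CAP_SYS_NICE, "
                       "or the RT budget of the process's cpu cgroup";
        case EINVAL:
                return "invalid policy or priority";
        default:
                return "unexpected error";
        }
}

/* Hypothetical helper: try to switch the calling process to SCHED_RR at
 * the maximum priority. Returns 0 on success or the errno value on failure. */
int try_sched_rr(void)
{
        struct sched_param sp;
        int prio = sched_get_priority_max(SCHED_RR);

        if (prio < 0)
                return errno;

        memset(&sp, 0, sizeof(sp));
        sp.sched_priority = prio;

        if (sched_setscheduler(0, SCHED_RR, &sp) < 0) {
                int saved = errno; /* keep errno before calling stdio */

                fprintf(stderr, "could not set SCHED_RR priority %d: %s (%s)\n",
                        prio, strerror(saved), rt_failure_hint(saved));
                return saved;
        }

        return 0;
}
```

If try_sched_rr() returns EPERM when started via systemctl but succeeds from a root shell, that would be consistent with the cgroup explanation rather than with missing privileges.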

The systemd unit file is:

[Unit]
Description=kronosnetd
Requires=network.target
After=network.target syslog.target

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/kronosnetd
ExecStart=/usr/sbin/kronosnetd $KNETD_OPTS

[Install]
WantedBy=multi-user.target

---

A couple of related questions:

1) Why does running the daemon under systemd fail to set the scheduler?
   Is this intentional or a bug in systemd?
2) What's the magic keyword in the unit file to make it work?
   (Also note that running the daemon from an LSB init script has the same issue and will fail.)

Comment 1 Michal Schmidt 2013-01-08 13:45:30 UTC
Quoting Lennart (http://lists.freedesktop.org/archives/systemd-devel/2011-July/002902.html):

I figure this fails due to the fact that by default we place every
service in its own cgroup in the "cpu" hierarchy, in order to distribute
the available CPU time evenly among the available processes. However,
because the "cpu" controller currently isn't that nice to use this
breaks RT, since if you create a group to allow it RT you need to assign
an RT budget to it, which we however cannot really do, since we cannot
come up with any sane default for it.

There are two ways out of this:

a) disable the implicit sorting into separate cpu cgroups globally, by
setting "DefaultControllers=" (i.e. setting it to the empty string) in
/etc/systemd/system.conf

or

b) disable the automatic creation of a "cpu" cgroup only for this one
service, by placing "ControlGroup=cpu:/" in it.

I recommend the latter.

That this is necessary is very unfortunate and I really hope the cpu
controller can be fixed one day, so that RT budgets and normal
scheduling budgets are independent.
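
To see whether a process has been placed in its own "cpu" cgroup as described above, one can look at /proc/self/cgroup. The following is a minimal sketch, assuming the cgroup v1 line format "ID:controllers:path". The function names are hypothetical and are not part of systemd or kronosnetd.

```c
#include <stdio.h>
#include <string.h>

/* Parse one cgroup v1 line of /proc/self/cgroup ("ID:controllers:path").
 * Returns 1 and copies the path into out if the controller list names
 * "cpu" exactly (not "cpuset" or "cpuacct"); returns 0 otherwise.
 * outlen must be greater than 0. */
int cpu_cgroup_path(const char *line, char *out, size_t outlen)
{
        const char *ctrl, *path, *p;
        size_t len;
        int found = 0;

        ctrl = strchr(line, ':');
        if (!ctrl)
                return 0;
        ctrl++;

        path = strchr(ctrl, ':');
        if (!path)
                return 0;

        /* Walk the comma-separated controller names between ctrl and path. */
        for (p = ctrl; p < path; ) {
                const char *end = memchr(p, ',', (size_t)(path - p));

                if (!end)
                        end = path;
                if (end - p == 3 && strncmp(p, "cpu", 3) == 0) {
                        found = 1;
                        break;
                }
                p = end + 1;
        }
        if (!found)
                return 0;

        path++; /* skip the second ':' */
        len = strcspn(path, "\n");
        if (len >= outlen)
                len = outlen - 1;
        memcpy(out, path, len);
        out[len] = '\0';
        return 1;
}

/* Returns 1 if the calling process sits in a non-root "cpu" cgroup. */
int in_nonroot_cpu_cgroup(void)
{
        char line[512], path[512];
        int result = 0;
        FILE *f = fopen("/proc/self/cgroup", "r");

        if (!f)
                return 0;
        while (fgets(line, sizeof(line), f))
                if (cpu_cgroup_path(line, path, sizeof(path)) &&
                    strcmp(path, "/") != 0)
                        result = 1;
        fclose(f);
        return result;
}
```

A path of "/" means the process is in the root cpu cgroup, which is the situation that option b) aims for. Any other path means the service has its own cpu group and, on kernels with RT group scheduling, would need an RT budget there.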

Comment 2 Michal Schmidt 2013-01-08 13:49:39 UTC
I'm closing this as NOTABUG, because it is not a bug in systemd. It can be considered a kernel bug, but reassigning this to kernel is not likely to help. This needs to be resolved upstream.

Comment 3 Fabio Massimo Di Nitto 2013-01-08 14:15:33 UTC
Hi Michal,

thanks for your quick reply and explanation.

(In reply to comment #1)

> b) disable the automatic creation of a "cpu" cgroup only for this one
> service, by placing "ControlGroup=cpu:/" in it.

This change only solves part of the problem, and only for packages that ship systemd unit files.

The problem is still present for packages that have not switched to systemd service/unit files (sysVinit compat layer... ISV.. you name it). We cannot expect all ISVs to switch or patch their code because of an integration issue on our side.

How do we plan to handle ISV integration where one ISV needs one setting and another needs something else? We cannot blindly assume that all ISVs will follow our guidelines, or that users will change defaults all the time.

(In reply to comment #2)
> I'm closing this as NOTABUG, because it is not a bug in systemd. It can be
> considered a kernel bug, but reassigning this to kernel is not likely to
> help. This needs to be resolved upstream.

I have to disagree here. Either systemd or kernel, this is an integration problem between 2 components and should be documented as issue.

Is our kernel team informed of this need systemd has? Is there anybody actively looking into the problem? or are we assuming that it will eventually happen?

thanks again
Fabio

Comment 4 Lennart Poettering 2013-01-11 13:23:40 UTC
(In reply to comment #3)
> 
> (In reply to comment #2)
> > I'm closing this as NOTABUG, because it is not a bug in systemd. It can be
> > considered a kernel bug, but reassigning this to kernel is not likely to
> > help. This needs to be resolved upstream.
> 
> I have to disagree here. Either systemd or kernel, this is an integration
> problem between 2 components and should be documented as issue.
> 
> Is our kernel team informed of this need systemd has? Is there anybody
> actively looking into the problem? or are we assuming that it will
> eventually happen?

We have brought this up numerous times. So far nobody has done anything about it. It's not a burning issue for many people since only the fewest services actually require RT. RT is primarily used in user code, not services.

I have no high hopes this will be fixed anytime soon. Quite the opposite, from what I hear most RT folks would do anything to make systemd's life harder...

Comment 5 Fabio Massimo Di Nitto 2013-01-11 13:38:03 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > 
> > (In reply to comment #2)
> > > I'm closing this as NOTABUG, because it is not a bug in systemd. It can be
> > > considered a kernel bug, but reassigning this to kernel is not likely to
> > > help. This needs to be resolved upstream.
> > 
> > I have to disagree here. Either systemd or kernel, this is an integration
> > problem between 2 components and should be documented as issue.
> > 
> > Is our kernel team informed of this need systemd has? Is there anybody
> > actively looking into the problem? or are we assuming that it will
> > eventually happen?
> 
> We have brought this up numerous times. So far nobody has done anything
> about it. It's not a burning issue for many people since only the fewest
> services actually require RT. RT is primarily used in user code, not
> services.

Right, but the integration breaks user code. How much is irrelevant; that's a subjective perception of the system, based on the use somebody makes of it.

Please be aware that I am not blaming either systemd or the kernel here.

There is an integration issue that at some point will have to be resolved.

> I have no high hopes this will be fixed anytime soon.

I am sure somebody will make it a requirement at some point (hint ;))

> Quite the opposite,
> from what I hear most RT folks would do anything to make systemd's life
> harder...

Not interested in political pressures either ways. We can keep those outside of bugzilla.

I filed this bugzilla as "upstream" :) and "something" is breaking known to be working code on all distros, but Fedora (that's how any random upstream developer would see it).

Now, to be more constructive, while we can hope for a more general fix, I think that at least the LSB compat layer should be executed with the equivalent of 
"ControlGroup=cpu:/"
This would let say keep a better compatibility layer with older applications that haven't made the transition to systemd model.

As for the others, we know there is a way to control it via unit/service files (or whatever they are called in the systemd world).

Comment 6 Lennart Poettering 2013-01-11 13:48:19 UTC
(In reply to comment #5)

> > I have no high hopes this will be fixed anytime soon.
> 
> I am sure somebody will make it a requirement at some point (hint ;))
> 
> > Quite the opposite,
> > from what I hear most RT folks would do anything to make systemd's life
> > harder...
> 
> Not interested in political pressures either ways. We can keep those outside
> of bugzilla.
> 
> I filed this bugzilla as "upstream" :) and "something" is breaking known to
> be working code on all distros, but Fedora (that's how any random upstream
> developer would see it). 
> 
> Now, to be more constructive, while we can hope for a more general fix, I
> think that at least the LSB compat layer should be executed with the
> equivalent of 
> "ControlGroup=cpu:/"
> This would let say keep a better compatibility layer with older applications
> that haven't made the transition to systemd model.

So, there are a number of incompatibilities with SysV anyway, maybe we should just add this to the list? i.e. treat this as documentation issue rather than as a bug to fix for now.

http://www.freedesktop.org/wiki/Software/systemd/Incompatibilities

> As for the others, we know there is a way to control it via unit/service
> files (or whatever they are called in the systemd world).

Yeah, and I probably should add this to the FAQ actually, as it came up before, and we currently have no good documentation for this. Will add it here:

http://www.freedesktop.org/wiki/Software/systemd/FrequentlyAskedQuestions

Comment 8 Fabio Massimo Di Nitto 2013-01-11 13:58:24 UTC
(In reply to comment #6)
> (In reply to comment #5)
> 
> > > I have no high hopes this will be fixed anytime soon.
> > 
> > I am sure somebody will make it a requirement at some point (hint ;))
> > 
> > > Quite the opposite,
> > > from what I hear most RT folks would do anything to make systemd's life
> > > harder...
> > 
> > Not interested in political pressures either ways. We can keep those outside
> > of bugzilla.
> > 
> > I filed this bugzilla as "upstream" :) and "something" is breaking known to
> > be working code on all distros, but Fedora (that's how any random upstream
> > developer would see it). 
> > 
> > Now, to be more constructive, while we can hope for a more general fix, I
> > think that at least the LSB compat layer should be executed with the
> > equivalent of 
> > "ControlGroup=cpu:/"
> > This would let say keep a better compatibility layer with older applications
> > that haven't made the transition to systemd model.
> 
> So, there are a number of incompatibilities with SysV anyway, maybe we
> should just add this to the list? i.e. treat this as documentation issue
> rather than as a bug to fix for now.
> 
> http://www.freedesktop.org/wiki/Software/systemd/Incompatibilities
> 
> > As for the others, we know there is a way to control it via unit/service
> > files (or whatever they are called in the systemd world).
> 
> Yeah, and I probably should add this to the FAQ actually, as it came up
> before, and we currently have no good documentation for this. Will add it
> here:
> 
> http://www.freedesktop.org/wiki/Software/systemd/FrequentlyAskedQuestions

I agree that having it documented is good. It would have spared me the time to file a bugzilla, since Google returned only that same message on the devel list from 2011.

I would still prefer to keep a bugzilla open, as many others might look for similar symptoms. It costs nothing and it would avoid duplicates :)

Comment 9 Lennart Poettering 2013-01-11 18:56:28 UTC
Made the changes to the wiki. There's now a new document:

http://www.freedesktop.org/wiki/MyServiceCantGetRealtime

And I referenced it from the FAQ and Incompatibilities page.

I'll leave the bug open as requested, but I tagged it "NOT_IN_SYSTEMD" to clarify that we can't do much about this without kernel support.

Comment 10 Fabio Massimo Di Nitto 2013-01-11 18:59:23 UTC
(In reply to comment #9)
> Made the changes to the wiki. There's now a new document:
> 
> http://www.freedesktop.org/wiki/MyServiceCantGetRealtime
> 
> And I referenced it from the FAQ and Incompatibilities page.

Hi Lennart, thanks a lot for the effort, I am sure others will appreciate it too.

> 
> I'll leave the bug open as requested, but I tagged it "NOT_IN_SYSTEMD" to
> clarify that we can't do much about this without kernel support.

Sure, it's a fair compromise. Though maybe we should ask for Bugzilla to allow a bug to be assigned to multiple components :) Cloning wouldn't help much here; it would only scatter the info around.

Comment 14 Fedora End Of Life 2013-07-04 06:33:04 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 15 Lennart Poettering 2015-06-17 23:43:56 UTC
Rawhide kernels now turn off CONFIG_RT_GROUP_SCHED again, which should make the problem go away. See bug 1229700 for details.

