1016548 – Systemd limits manual starts

Keywords:
Status:	CLOSED DUPLICATE of bug 821723
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	systemd
Sub Component:
Version:	7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	systemd-maint
QA Contact:	qe-baseos-daemons
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-10-08 10:39 UTC by Nikolai Kondrashov
Modified:	2013-10-16 10:29 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-10-08 13:08:43 UTC
Target Upstream Version:
Embargoed:

Description Nikolai Kondrashov 2013-10-08 10:39:10 UTC

Description of problem:
Systemd applies StartLimitInterval and StartLimitBurst to services started "manually", e.g. from command line.

This interferes with scripting and automatic tests in particular, where many different service configurations need to be tried in a short time.

Version-Release number of selected component (if applicable):
systemd-sysv-207-2.el7.x86_64
systemd-libs-207-2.el7.x86_64
systemd-207-2.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
Run the following commands as root:

    for i in `seq 1 6`; do systemctl restart named; done
    journalctl -xn | grep refusing

Actual results:

Output:
    Job for named.service failed. See 'systemctl status named.service' and 'journalctl -xn' for details.
    Oct 08 06:35:31 sgi-xe250-02.rhts.eng.bos.redhat.com systemd[1]: named.service start request repeated too quickly, refusing to start.

Service is not started.

Expected results:
No output. Service started.

Additional info:
This reproduces with the default systemd start limits. Seen with sssd service as well.

Comment 2 Michal Schmidt 2013-10-08 13:08:43 UTC

This has come up before. In bug 821723, for instance.
An old upstream mailing list posting:
http://lists.freedesktop.org/archives/systemd-devel/2012-September/006530.html

In short, "systemctl reset-failed named" can be used in the loop to reset the burst limit counter (documented in 'man systemd.service', section about StartLimitInterval=, StartLimitBurst=).
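As a rough sketch of that workaround, the loop from the reproduction steps could call reset-failed before each restart. The function name restart_no_limit and the SYSTEMCTL override variable are made up for illustration and are not part of systemd; the override exists only so the loop can be dry-run against a stub.

```shell
#!/bin/sh
# Sketch only: restart a unit several times in a row, clearing its
# failed state and start-limit counter before each attempt.
# SYSTEMCTL is a hypothetical override hook (not a systemd feature);
# it defaults to the real systemctl binary.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

restart_no_limit() {
    unit="$1"     # unit name, e.g. "named"
    count="$2"    # number of restarts to perform
    i=1
    while [ "$i" -le "$count" ]; do
        # Reset the burst counter; ignore the result, since this is
        # only a precaution before the restart itself.
        $SYSTEMCTL reset-failed "$unit" || :
        # Stop at the first restart that fails.
        $SYSTEMCTL restart "$unit" || return 1
        i=$((i + 1))
    done
}
```

Calling "restart_no_limit named 6" as root would mirror the six restarts from the reproduction steps. Whether reset-failed clears the counter on a given systemd version should be checked against that version's man page.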

*** This bug has been marked as a duplicate of bug 821723 ***

Comment 3 Nikolai Kondrashov 2013-10-08 14:05:43 UTC

This is very unintuitive and I consider the above a workaround. I would like to have this fixed still. This is breaking "service" command and thus RHEL6 compatibility and makes our tests unnecessarily complicated.

I would say that this is logically possible to implement. I.e. limit restarts initiated by automatic restart on death, socket-triggered start, etc., but don't limit otherwise perfectly valid administrative restarts.

Comment 4 Lennart Poettering 2013-10-13 19:49:56 UTC

(In reply to Nikolai Kondrashov from comment #3)
> This is very unintuitive and I consider the above a workaround. I would like
> to have this fixed still. This is breaking "service" command and thus RHEL6
> compatibility and makes our tests unnecessarily complicated.
> 
> I would say that this is logically possible to implement. I.e. limit
> restarts initiated by automatic restart on death, socket-triggered start,
> etc., but don't limit otherwise perfectly valid administrative restarts.

We cannot distinguish that really. Also, your tests are *not* administrative restarts, they are scripts, which by your own admission should be rate limited.

We simply want to avoid busy loops, and to protect against that it doesn't matter what exactly is causing a restart.

Comment 5 Nikolai Kondrashov 2013-10-14 10:37:17 UTC

I'm sorry, you haven't provided any arguments, and I still consider it logically possible to distinguish restarts initiated with the "service" ("systemctl") command compared to other methods. However, I can understand if the current design of systemd doesn't allow implementing that easily.

Why can't administrative restarts be done by a script? Would it be allowed to do them if it was reimplemented in C?

Sorry, I don't see where I admitted that my scripts should be limited on how often they restart a service. On the contrary, I need them to restart the service as fast as possible.

And no, I don't want to make my scripts change a service description file or execute a special command so they're allowed to restart a service as often as needed. Not least because these are incompatible with RHEL6 (and other distributions not using systemd), which I (may) still have to test.

I still insist that administrative restarts, i.e. run by an administrator, using administrative interface (i.e. "service" or "systemctl" commands) don't need to be protected against. Otherwise systemd is trying to protect administrators from themselves, thus limiting its own usability, akin to a hammer padded with feather cushions for "safety".

Comment 6 Lennart Poettering 2013-10-15 21:28:02 UTC

(In reply to Nikolai Kondrashov from comment #5)
> I'm sorry, you haven't provided any arguments, and I still consider it
> logically possible to distinguish restarts initiated with the "service"
> ("systemctl") command compared to other methods. However, I can understand
> if the current design of systemd doesn't allow implementing that easily.
> 
> Why can't administrative restarts be done by a script? Would it be allowed
> to do them if it was reimplemented in C?

No, that's the point, scripts using systemctl, users running systemctl from the command line and C programs all use the same interface: the bus APIs, and to systemd they are all the same.

Also, how would you distinguish systemctl run by a script and run by the admin anyway?

To avoid busy loops we rate limit everything. And that's the right thing to do.

Note that you can reset the rate counter with "systemctl reset-failed" on the specific unit. If you really really think that busy loops are awesome you can invoke that before each "systemctl start", and no rate limits will ever apply.

> Sorry, I don't see where I admitted that my scripts should be limited on how
> often they restart a service. On the contrary, I need them to restart the
> service as fast as possible.

That's a very bad idea. Note that even sysvinit employed rate limits for its services. 

> And no, I don't want to make my scripts change a service description file or
> execute a special command so they're allowed to restart a service as often
> as needed. Not least because these are incompatible with RHEL6 (and
> other distributions not using systemd), which I (may) still have to test.
> 
> I still insist that administrative restarts, i.e. run by an administrator,
> using administrative interface (i.e. "service" or "systemctl" commands)
> don't need to be protected against. Otherwise systemd is trying to protect
> administrators from themselves, thus limiting its own usability, akin to a
> hammer padded with feather cushions for "safety".

Well, your script is not the admin, and we cannot distinguish that and it's a bad idea anyway and there's a work-around using systemctl reset-failed. Sorry, but this is really not going to change. It's a safety and robustness feature, and it's going to stay.

Comment 7 Nikolai Kondrashov 2013-10-16 10:29:59 UTC

(In reply to Lennart Poettering from comment #6)
> (In reply to Nikolai Kondrashov from comment #5)
> > Why can't administrative restarts be done by a script? Would it be allowed
> > to do them if it was reimplemented in C?
> 
> No, that's the point, scripts using systemctl, users running systemctl from
> the command line and C programs all use the same interface: the bus APIs,
> and to systemd they are all the same.
> 
> Also, how would you distinguish systemctl run by a script and run by the
> admin anyway?

Nohow, and that's not what I'm asking to do.
I'm asking for systemd not to limit how it is used.

The mechanisms for limiting re-spawning of a dying service or starting based
on socket connections have their merit, since it is done by systemd itself and
the triggers are potentially outside of administrator's control.

However, systemd limiting *outside* API (such as systemctl interface, or maybe
some D-Bus methods) is just assuming too much responsibility. There are valid
use cases where systemd might be driven by other software, it shouldn't be
limited to a slow human interface only. And it should be the driving
software's decision on how often a service can be restarted.

If the internal restart mechanisms use the same API (D-Bus methods?) as the
outside control is supposed to use, then make a separate API for the former
and make *that* limited, but please don't limit outside control.

> Note that you can reset the rate counter with "systemctl reset-failed" on
> the specific unit. If you really really think that busy loops are awesome
> you can invoke that before each "systemctl start", and no rate limits will
> ever apply.

No, I don't think busy loops are "awesome" or "awful", they're just loops, and
"busy" is relative. When I need to try 100 different configurations of a
service in a test I would naturally want it to try them faster than 5 per 10
seconds, i.e. completing in *less* than 3.3 minutes, preferably as quickly as
possible. And, no, I don't want to call "systemctl reset-failed", not least
because it doesn't exist in the "service" command interface, nor on RHEL6
and other distros.

> > Sorry, I don't see where I admitted that my scripts should be limited on how
> > often they restart a service. On the contrary, I need them to restart the
> > service as fast as possible.
> 
> That's a very bad idea. Note that even sysvinit employed rate limits for its
> services. 

You probably mean services defined in /etc/inittab and (re-)spawned
automatically by "init". Limiting those is right. However, there was never any
limit on how often a service could be restarted using scripts from /etc/init.d
or the "service" command.
