Bug 1068542 - socket stays active even though its service cannot start (is masked)
Summary: socket stays active even though its service cannot start (is masked)
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 25
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/systemd/systemd/is...
Whiteboard:
Depends On: 917503
Blocks: 1069306
Reported: 2014-02-21 13:23 UTC by Karel Volný
Modified: 2017-12-12 10:22 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 917503
: 1069306 (view as bug list)
Environment:
Last Closed: 2017-12-12 10:22:27 UTC
Type: Bug
Embargoed:



Description Karel Volný 2014-02-21 13:23:32 UTC
seems we have some regression here with systemd-208-9.fc20

I managed to get into an unbootable state on one of the systems, but I cannot reproduce that; what I can reproduce is that I am unable to log in

steps to reproduce:
1. qemu-img create -f qcow2 F20-minimal.qcow2 10G
2. qemu-kvm -m 2048 -cdrom Fedora-20-x86_64-netinst.iso F20-minimal.qcow2
3. do the minimal install
4. reboot, login as root
5. systemctl mask systemd-journald.service
6. reboot
7. try to login

after typing in the password, you do not get a shell on the console; instead you are returned to the login prompt after some timeout


+++ This bug was initially created as a clone of Bug #917503 +++

If you mask systemd-journald.service (and/or systemd-journald.socket, but masking the service is sufficient), you get an unbootable system.


Please note that rsyslog is installed on my system:
$ systemctl status rsyslog.service
rsyslog.service - System Logging Service
          Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled)
          Active: active (running) since Mon 2013-03-04 09:18:08 CET; 6min ago
        Main PID: 736 (rsyslogd)
          CGroup: name=systemd:/system/rsyslog.service
                  `-736 /sbin/rsyslogd -n -c 7


Can you change systemd so it can boot without working/available journald?

--- Additional comment from Michal Schmidt on 2013-03-04 08:19:58 EST ---

With the following upstream changes, the system will boot without a working journald:

http://cgit.freedesktop.org/systemd/systemd/commit/?id=9d246da3c630559924a0262769c8493fa22c7acc

http://cgit.freedesktop.org/systemd/systemd/commit/?id=47c1d80d844689c81faf2eede95803c1ed6eb4af

I don't think it's going to get more optional than that.

--- Additional comment from Bill Nottingham on 2013-03-04 11:39:20 EST ---

Note that the journal isn't intended to be that optional in RHEL 7 - it's going to be what's feeding events to syslog, even if it's not storing data persistently.

--- Additional comment from Michal Schmidt on 2013-03-05 07:44:21 EST ---

Yes, it's not really that optional. systemd will still report any failures to connect stdout/stderr of services to the journal socket as errors. It's not meant as a supported configuration. But at least it will boot.

--- Additional comment from Lennart Poettering on 2013-03-07 22:14:50 EST ---

I'd actually suggest we turn persistence on in RHEL7, too, but limit it to something relatively small. That shouldn't hurt anybody, and is probably better than only having a tiny buffer in tmpfs...

I turned persistency for the journal on in F19 today. This will be on, and rsyslog stays in the default install, but that should be OK.

--- Additional comment from Fedora Update System on 2013-04-10 16:10:43 EDT ---

systemd-201-2.fc18.1 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/systemd-201-2.fc18.1

--- Additional comment from Fedora Update System on 2013-04-11 19:23:34 EDT ---

Package systemd-201-2.fc18.2:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-201-2.fc18.2'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-5452/systemd-201-2.fc18.2
then log in and leave karma (feedback).

--- Additional comment from Fedora Update System on 2013-04-15 19:58:11 EDT ---

Package systemd-201-2.fc18.4:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-201-2.fc18.4'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-5452/systemd-201-2.fc18.3
then log in and leave karma (feedback).

--- Additional comment from Fedora Update System on 2013-04-17 22:35:27 EDT ---

Package systemd-201-2.fc18.5:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-201-2.fc18.5'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-5452/systemd-201-2.fc18.5
then log in and leave karma (feedback).

--- Additional comment from Fedora Update System on 2013-05-07 09:37:55 EDT ---

systemd-201-2.fc18.6 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/FEDORA-2013-5452/systemd-201-2.fc18.6

--- Additional comment from Fedora Update System on 2013-05-09 06:00:47 EDT ---

Package systemd-201-2.fc18.6:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-201-2.fc18.6'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-5452/systemd-201-2.fc18.6
then log in and leave karma (feedback).

--- Additional comment from Fedora Update System on 2013-05-15 22:56:36 EDT ---

systemd-201-2.fc18.6 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 1 Karel Volný 2014-02-21 13:43:08 UTC
p.s. I forgot to mention that the same happens after choosing rescue in grub - I thought that this option should somehow resemble the old "single" behaviour, allowing one to fix the system ... but I cannot fix it if I cannot log in ...?

Comment 2 Lennart Poettering 2014-02-23 15:23:04 UTC
systemd-journald is not optional and will not be optional. We connect all service stdout/stderr to it, and that makes it an integral part of service management, and even if you dislike it so much it will still be useful to redirect that to syslog. You may turn off local storage if you think the journal is an abomination; do that with Storage=none in journald.conf.

Comment 3 Karel Volný 2014-02-23 22:57:42 UTC
(In reply to Lennart Poettering from comment #2)
> systemd-journald is not optional and will not be optional.

so ... why was bug #917503 fixed in the first place?

> We connect all service stdout/stderr to it,

I'm perfectly fine with /dev/null

I'm not fine with making system completely unusable (and unrepairable without external boot medium!) just because of logging daemon failure

Comment 4 Michal Schmidt 2014-02-24 14:04:56 UTC
(In reply to Karel Volný from comment #3)
> unrepairable without external boot medium!

Not quite. Booting with the "emergency" option still works in this case. Or as a last resort, "init=/bin/sh".

By the way, you can boot without systemd-journald by masking its *.socket unit as well. Not recommended of course, but whatever...

The problem is when the *.service unit is masked and the *.socket unit is not. I'm not sure why the socket unit did not enter a failed state when the activation of the service is not possible. This may be an actual bug.

Comment 5 Michal Schmidt 2014-02-24 15:04:28 UTC
systemd-journald is a bit special in the way its socket survives the switch-root from the initramfs. Nevertheless, here's a reproducer that involves neither initramfs nor systemd-journald:

1. Make sure a socket and its service are both "active (running)":
     systemctl start avahi-daemon.{socket,service}
2. Mask only the service:
     systemctl mask avahi-daemon.service
3. Stop the service:
     systemctl stop avahi-daemon.service
4. Check the resulting statuses:
     systemctl status avahi-daemon.{socket,service}

Actual result: Though the service is now inactive, the socket unit is still "active (running)". This is an inconsistency.
Expected result: The socket should be at most "active (listening)", so it can go into "failed" by itself when a connection is attempted.

I think the socket failed to notice the death of the service because socket_trigger_notify() returns early if the service's load state is UNIT_MASKED (!= UNIT_LOADED).
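
The inconsistency from the reproducer above can also be checked mechanically. A minimal sketch, assuming "systemctl show -p" prints "Name=value" lines; the helper names unit_prop and socket_pair_consistent are made up for illustration and are not part of systemd:

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the value of one property of a unit,
# e.g.: unit_prop avahi-daemon.socket SubState
unit_prop() {
    out=$(systemctl show -p "$2" "$1") || return 1
    printf '%s\n' "${out#*=}"    # strip the leading "Name=" part
}

# Return 1 if the socket still claims "running" although its service
# is inactive or failed, 0 otherwise, 2 if systemctl itself failed.
socket_pair_consistent() {
    base=$1
    sock_sub=$(unit_prop "$base.socket" SubState) || return 2
    svc_active=$(unit_prop "$base.service" ActiveState) || return 2
    case "$svc_active" in
        inactive|failed)
            if [ "$sock_sub" = "running" ]; then
                echo "INCONSISTENT: $base.socket is running, $base.service is $svc_active"
                return 1
            fi
            ;;
    esac
    echo "ok: $base.socket=$sock_sub $base.service=$svc_active"
    return 0
}
```

After steps 1-3 of the reproducer, "socket_pair_consistent avahi-daemon" would be expected to report the inconsistent state on an affected system.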

Comment 6 Karel Volný 2014-02-24 17:09:33 UTC
(In reply to Michal Schmidt from comment #4)
> (In reply to Karel Volný from comment #3)
> > unrepairable without external boot medium!
> 
> Not quite. Booting with the "emergency" option still works in this case.

er, I thought this is what "rescue" should do ...

unfortunately, I cannot find anything useful about it in the docs(.fedoraproject.org) ... is it just me or ... => docs bug?

anyways, thanks for the info, this works for me, so I'm quitting the panic mode  :-)

(or ... not that fast, I should investigate also what if there is only ssh access?)

> Or as a last resort, "init=/bin/sh".

as for this one, I thought it was no longer supported in Fedora ... the current situation is that I cannot see what I write, the command output lacks newlines and systemctl does not work at all (to run the "unmask" action)

> By the way, you can boot without systemd-journald by masking its *.socket
> unit as well. Not recommended of course, but whatever...

"not recommended" ... well, ask Jirka Eischmann about his Pidora usecase

I'm not sure about his requirements, but I've briefly checked the memory consumption and it doesn't seem that low ... possibly a RFE material too

> The problem is when the *.service unit is masked and the *.socket unit is
> not. I'm not sure why the socket unit did not enter a failed state when the
> activation of the service is not possible. This may be an actual bug.

cool, thanks for the investigation, well done!

Comment 7 Michal Schmidt 2014-02-25 13:21:47 UTC
(In reply to Karel Volný from comment #6)
> (In reply to Michal Schmidt from comment #4)
> > Not quite. Booting with the "emergency" option still works in this case.
> 
> er, I thought this is what "rescue" should do ...
> 
> unfortunately, I cannot find anything useful about it in the
> docs(.fedoraproject.org) ... is it just me or ... => docs bug?

There are two rescue-like targets that can be booted into using the following kernel command line options:
- "systemd.unit=rescue.target" or "single" or "1" - similar to the traditional single-user mode. Still has quite a few services running.
- "systemd.unit=emergency.target" or "emergency" - a more minimal target.

There is no "rescue" option. I've just tried it on RHEL6 and it does not have any effect there either.
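
To illustrate the mapping just described, here is a rough sketch that reads a kernel command line and prints the target it would request. It covers only the options listed in this comment; the function name requested_target is hypothetical, and letting the last matching word win is an assumption that may not match systemd's real precedence rules:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# The body runs in a subshell, so "set -f" does not leak to the caller.
requested_target() (
    set -f                      # no globbing while splitting into words
    target=default.target       # what systemd boots when nothing is requested
    for word in $1; do
        case "$word" in
            systemd.unit=*) target=${word#systemd.unit=} ;;
            single|1)       target=rescue.target ;;
            emergency)      target=emergency.target ;;
        esac
    done
    printf '%s\n' "$target"
)

# Usage on a running system:
#   requested_target "$(cat /proc/cmdline)"
```

A bare "rescue" word falls through to default.target in this sketch, matching the observation above that it has no effect.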

The supported options are documented in the man pages kernel-command-line(7) and systemd(1). The upstream wiki page on debugging systemd issues mentions them as well: http://freedesktop.org/wiki/Software/systemd/Debugging

I do not see a relevant book in docs.fedoraproject.org. There used to be a System Administrators Guide, where advice on rescuing non-booting systems would be nice to have, but it seems the book is not published anymore for current Fedora releases.
 
> > Or as a last resort, "init=/bin/sh".
> 
> as for this one, I thought it was no longer supported in Fedora ... the
> current situation is that I cannot see what I write, the command output
> lacks newlines

I'm not seeing this issue. I guess plymouth is screwing your terminal. Make sure you drop "rhgb" from the command line. Maybe try also adding "plymouth.enable=0".

> and systemctl does not work at all (to run the "unmask" action)

Yes. By default it wants to talk to the running systemd to perform the operation. It's possible to let it go directly to the filesystem by adding "--root=/" on its command line. I myself find it easier to delete the masking symlink to /dev/null manually: rm /etc/systemd/system/systemd-journald.service
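
The manual removal can be made a little safer by first checking that the file really is a masking symlink. A minimal sketch; the function name unmask_offline and its root argument are hypothetical and exist only for illustration:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Undo "systemctl mask" by hand when no systemd is running.
# A masked unit is a symlink to /dev/null under etc/systemd/system.
#   $1 = root of the installed system ("/" or e.g. a mounted disk image)
#   $2 = unit name, e.g. systemd-journald.service
unmask_offline() {
    root=${1%/}                 # "/" becomes "", "/mnt/" becomes "/mnt"
    link="$root/etc/systemd/system/$2"
    if [ -L "$link" ] && [ "$(readlink "$link")" = "/dev/null" ]; then
        rm -- "$link" && echo "unmasked $2"
    else
        echo "$2 is not masked under ${root:-/}" >&2
        return 1
    fi
}

# Usage from an "init=/bin/sh" shell (root filesystem mounted read-write):
#   unmask_offline / systemd-journald.service
```

Refusing to delete anything that is not a symlink to /dev/null avoids removing a real unit file or an administrator's override by mistake.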
 
> > By the way, you can boot without systemd-journald by masking its *.socket
> > unit as well. Not recommended of course, but whatever...
> 
> "not recommended" ... well, ask Jirka Eischmann about his Pidora usecase

Hm, I haven't talked to him yet, but I wonder if he considered either
- disabling only the persistent storage (Storage=volatile), or
- disabling all storage (Storage=none).
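
Either choice comes down to one Storage= line in the [Journal] section of the journald configuration. A hedged sketch that writes it as a drop-in file; the function name set_journald_storage and the file name 10-storage.conf are hypothetical, and the journald.conf.d drop-in directory is assumed to be supported by the installed systemd (older versions may only read /etc/systemd/journald.conf itself, in which case the line would go there):

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
#   $1 = root of the installed system ("/" for the live system)
#   $2 = one of the Storage= values: volatile, persistent, auto, none
set_journald_storage() {
    root=${1%/}
    mode=$2
    case "$mode" in
        volatile|persistent|auto|none) ;;
        *) echo "unknown Storage= value: $mode" >&2; return 1 ;;
    esac
    dir="$root/etc/systemd/journald.conf.d"
    mkdir -p "$dir" || return 1
    printf '[Journal]\nStorage=%s\n' "$mode" > "$dir/10-storage.conf"
}

# Usage (as root):
#   set_journald_storage / volatile
```

systemd-journald would need to be restarted (or the machine rebooted) before the new setting takes effect.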

> I'm not sure about his requirements, but I've briefly checked the memory
> consumption and it doesn't seem that low ... possibly a RFE material too

Looking at systemd-journald's memory consumption can be misleading in a couple of ways:
- Some people look at virtual memory size, which is entirely irrelevant.
- Even when looking at RSS one can easily overestimate the memory stress caused by systemd-journald, because the easily discardable clean mappings of the on-disk journal files take a large fraction of the total size.
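
One rough way to see the difference is to compare the total Rss with the total Private_Dirty reported in the process's smaps file. A small sketch; the function name smaps_sum is hypothetical, and treating Private_Dirty as the "hard to discard" part is a simplification:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Sum one field (in kB) over all mappings in a smaps-format file.
#   $1 = path to the smaps file, $2 = field name without the colon
smaps_sum() {
    awk -v field="$2:" '$1 == field { total += $2 } END { print total + 0 }' "$1"
}

# Usage (as root):
#   pid=$(pidof systemd-journald)
#   smaps_sum "/proc/$pid/smaps" Rss
#   smaps_sum "/proc/$pid/smaps" Private_Dirty
```

If most of the Rss total is not Private_Dirty, much of it is likely to be clean file-backed pages that the kernel can drop under memory pressure.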

Comment 8 Fedora End Of Life 2015-05-29 11:00:27 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 20 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Lukáš Nykrýn 2015-06-11 14:13:37 UTC
Most certainly still an issue.

Comment 10 Jan Kurik 2015-07-15 14:42:47 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.

(As we did not run this process for some time, it could affect also pre-Fedora 23 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23

Comment 11 Lennart Poettering 2016-02-10 14:05:37 UTC
Fixed in git now.

Comment 12 Mike McCune 2016-03-28 23:38:32 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune with any questions

Comment 13 Jan Synacek 2016-11-21 11:42:23 UTC
(In reply to Lennart Poettering from comment #11)
> Fixed in git now.

I can still use the reproducer from comment 5 and successfully reproduce the issue with systemd-231-10.fc25.x86_64.

Comment 14 Jan Synacek 2016-11-21 11:50:29 UTC
https://github.com/systemd/systemd/issues/4708

Comment 15 Fedora End Of Life 2016-11-24 11:07:28 UTC
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '23'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 23 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 16 Fedora End Of Life 2016-12-20 12:45:23 UTC
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 17 Karel Volný 2017-02-02 12:43:06 UTC
reopening as per #c13
just retested with systemd-231-12.fc25.x86_64 => updating version

Comment 18 Fedora End Of Life 2017-11-16 18:52:30 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 19 Fedora End Of Life 2017-12-12 10:22:27 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

