Bug 1853261 - sssd fails to start on read-only filesystem
Summary: sssd fails to start on read-only filesystem
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: sssd
Version: 33
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: SSSD Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-07-02 09:41 UTC by Zbigniew Jędrzejewski-Szmek
Modified: 2021-11-30 16:08 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-30 16:08:05 UTC
Type: Bug
Embargoed:



Description Zbigniew Jędrzejewski-Szmek 2020-07-02 09:41:26 UTC
Description of problem:
sssd refuses to start when /var/log is read-only. This is obviously a problem when trying to recover
from a file system issue. Since logging is non-essential, sssd should just continue. At most it
should emit a warning.

$ touch /var/tmp/foo
touch: cannot touch '/var/tmp/foo': Read-only file system
$ sudo systemctl start sssd.service
Job for sssd.service failed because the control process exited with error code.
See "systemctl status sssd.service" and "journalctl -xe" for details.
[fedora@workstation-uefi ~]$ 
Broadcast message from systemd-journald@workstation-uefi (Thu 2020-07-02 11:16:46 CEST):
sssd[900]: Could not open file [/var/log/sssd/sssd.log]. Error: [30][Read-only file system]

Broadcast message from systemd-journald@workstation-uefi (Thu 2020-07-02 11:16:47 CEST):
sssd[903]: Could not open file [/var/log/sssd/sssd.log]. Error: [30][Read-only file system]

Broadcast message from systemd-journald@workstation-uefi (Thu 2020-07-02 11:16:47 CEST):
sssd[905]: Could not open file [/var/log/sssd/sssd.log]. Error: [30][Read-only file system]

Broadcast message from systemd-journald@workstation-uefi (Thu 2020-07-02 11:16:47 CEST):
sssd[906]: Could not open file [/var/log/sssd/sssd.log]. Error: [30][Read-only file system]

Broadcast message from systemd-journald@workstation-uefi (Thu 2020-07-02 11:16:47 CEST):
sssd[907]: Could not open file [/var/log/sssd/sssd.log]. Error: [30][Read-only file system]

We see multiple issues here:
- the main one is that sssd should not fail to start as described above
- but also, why is sssd not just logging to the journal? Why is it spamming with broadcast messages?
  If sssd would just log to the journal like any modern service, all those problems
  would be avoided.

Version-Release number of selected component (if applicable):
sssd-2.3.0-2.fc33.x86_64

How reproducible:
Deterministic

Steps to Reproduce:
1. Make /var/log read-only, either by not remounting the filesystems rw, or by using udev.blockdev-read-only with systemd-246+
2. Try to start sssd

Comment 1 Lukas Slebodnik 2020-07-02 12:10:07 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #0)
> 
> We see multiple issues here:
> - the main one is that sssd should not fail to start as described above

I would say you should use root to recover from a file system issue,
and the root user is not handled by sssd.

> - but also, why is sssd not just logging to the journal? Why is it spamming
> with broadcast messages?

By default it tries to log to files in /var/log/sssd. Obviously that failed due to the read-only file system.
That's the reason for the broadcast messages.

>   If sssd would just log to the journal like any modern service, all those
> problems would be avoided.

sssd can log to journald; you just need to explicitly configure it.

sh$ systemctl cat sssd | head
# /usr/lib/systemd/system/sssd.service
[Unit]
Description=System Security Services Daemon
# SSSD must be running before we permit user sessions
Before=systemd-user-sessions.service nss-user-lookup.target
Wants=nss-user-lookup.target

[Service]
Environment=DEBUG_LOGGER=--logger=files
EnvironmentFile=-/etc/sysconfig/sssd

Just set DEBUG_LOGGER in /etc/sysconfig/sssd (details in `man sssd`)

But I doubt it will help if the system is read-only.
sssd will still try to open files read-write in /var/lib/sss/.
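
As a minimal sketch of the setting mentioned above, assuming the stock unit file shown (which reads /etc/sysconfig/sssd through EnvironmentFile), the override could look like the following. The value `journald` is one of the logger choices described in `man sssd`. This only moves the debug logs; it does not address the writes to /var/lib/sss/.

```shell
# /etc/sysconfig/sssd
# Overrides the unit's default of DEBUG_LOGGER=--logger=files,
# so sssd debug output goes to the journal instead of /var/log/sssd.
DEBUG_LOGGER=--logger=journald
```

The service has to be restarted (`systemctl restart sssd.service`) for the new value to be picked up.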

Comment 2 Alexey Tikhonov 2020-07-02 12:23:44 UTC
Hi,


1) "Basic" logs are written to files (/var/log/sssd) by default. This can be controlled via the "--logger" option, which can also redirect them entirely to journald or stderr.
2) "Important" messages are additionally logged to journald.
What you see as "Broadcast message from systemd-journald@workstation-uefi" is a broadcast from journald, not directly from SSSD.
You can control those broadcast messages via journald settings (namely `ForwardToWall` and `MaxLevelWall`).
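
For illustration, a journald drop-in along these lines should stop the wall broadcasts. The file name is an arbitrary choice made for this sketch; `ForwardToWall` is the journald.conf option named above.

```ini
# /etc/systemd/journald.conf.d/10-no-wall.conf  (hypothetical file name)
[Journal]
# Do not forward journal messages to the terminals of logged-in users.
ForwardToWall=no
```

As far as I know, `MaxLevelWall` already defaults to `emerg` (see `man journald.conf`), so it cannot filter out a message that is logged at emergency priority; turning forwarding off is the blunt alternative. journald has to be restarted (`systemctl restart systemd-journald`) to pick up the change.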


So the questions here are:
1) Does this message have an appropriate log level (*perhaps* the severity of this specific message should be lowered)?
2) As you wrote, perhaps SSSD should fall back to "/dev/null" for "basic" log output if it fails to open the files...
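
The fallback idea in 2) can be sketched in a few lines of shell. This is only an illustration of the behaviour being proposed, not SSSD code (SSSD is written in C); the function name `pick_log_target` is made up for this example.

```shell
# Hypothetical helper, not part of SSSD: print the path that log output
# should go to. Falls back to /dev/null when the preferred file cannot
# be opened for appending (for example on a read-only file system).
pick_log_target() {
    preferred=$1
    # The subshell attempts an append-mode open without writing any data.
    # Side effect: the file is created if it is missing and the directory
    # is writable.
    if ( : >>"$preferred" ) 2>/dev/null; then
        printf '%s\n' "$preferred"
    else
        # Logging is non-essential: warn on stderr and carry on.
        printf 'warning: cannot open %s, file logging disabled\n' "$preferred" >&2
        printf '%s\n' /dev/null
    fi
}
```

A caller would use it as `log_target=$(pick_log_target /var/log/sssd/sssd.log)` and keep starting up either way, which is the "warn and continue" behaviour requested in the description.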

Comment 3 Zbigniew Jędrzejewski-Szmek 2020-07-02 13:25:39 UTC
The goal of my bug report is to nudge Fedora toward a system which is resilient to failure.
In particular this is motivated by making it easier for users to recover from disk corruption
and other file system issues. Users are confused and unhappy when they boot to a tty prompt.
I'm going through all services which reasonably should run fine with readonly fs. Obviously,
in really bad cases, logging in at the tty will be necessary. But read-only root is not one
of them.

There is really no reason to log to a private log file nowadays. Please just log to the
journal by default. This is much nicer for users because they expect 'journalctl' to give
them system logs, and not to have to hunt for some obscure log file. This also makes this
*much* more robust. In particular, journald is completely fine with read-only root.
Log files are stored in /run, and if /var/log becomes available, logs are flushed there.
Logs should "just work" from user's POV.

Comment 4 Pavel Březina 2020-07-02 13:39:21 UTC
Well, I don't think users would be happy given the amount of messages sssd produces. But I agree that being unable to produce logs should not be considered fatal.

Comment 5 Zbigniew Jędrzejewski-Szmek 2020-07-02 13:59:15 UTC
(In reply to Pavel Březina from comment #4)
> Well, I don't think users would be happy given the amount of messages sssd produces.

I didn't consider this... On my machine sssd seems to be very quiet, but that's
probably because I have just one local user. Users hate "chatty" services with a
passion, so if sssd were to log to the journal, it indeed shouldn't log too much
(by default).

> But I agree that being unable to produce logs should not be considered fatal.

Comment 6 Pavel Březina 2020-07-03 08:29:48 UTC
Since you have only one local user, you use sssd only marginally.

We use journald (well, syslog) for the most important error messages (such as something that requires admin attention), but the rest goes to files when logging is enabled (unless --logger=journald is used).

Comment 7 Ben Cotton 2020-08-11 13:45:09 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 33 development cycle.
Changing version to 33.

Comment 8 Michael Romero 2021-03-09 17:53:06 UTC
Any chance this is going to be addressed?  I'm trying to get us completely away from an older LDAP environment, but whenever a filesystem goes read-only, the only account we can get in to investigate with is either the "root" user via console, or a legacy LDAP user via "SSH".   While these r/o FS issues don't happen that often, when they do, it would be GREAT if authentication could continue to function and not require logging in as root over console to resolve.

Comment 9 Alexey Tikhonov 2021-03-09 19:56:40 UTC
(In reply to Michael Romero from comment #8)
> Any chance this is going to be addressed?  I'm trying to get us completely
> away from an older LDAP environment, but whenever a filesystem goes
> read-only, the only account we can get in to investigate with is either the
> "root" user via console, or a legacy LDAP user via "SSH".   While these r/o
> FS issues don't happen that often, when they do - it would be GREAT if
> authentication could continue to function and not require logging in as root
> over console to resolve.

Solving this for logs won't help: sssd needs a writable /var/lib/sss/ anyway.

Comment 10 Paweł Poławski 2021-04-08 13:31:48 UTC
After some discussion it looks like there is no point in SSSD working on a read-only filesystem.
Users can already redirect logs to the journal using existing SSSD options.
The problem is that our databases located at /var/lib/sss/db/* still need to be RW for SSSD to work.

Comment 11 Zbigniew Jędrzejewski-Szmek 2021-04-09 08:38:37 UTC
Sorry, but that is not an acceptable answer.

> After some discussion it looks like there is no point in SSSD working on a read-only filesystem.

Let me quote from comment 8:
> but whenever a filesystem goes read-only, the only account we can get in to investigate with is either the "root" user via console, or a legacy LDAP user via "SSH".

If sssd was an opt-in thing, throwing up your hands and saying this is too hard to support would be fine.
But it is a default component in Fedora. This means that it also needs to solve the hard edge cases.
The default authentication mechanism simply cannot stop working in a scenario which is a fairly
common failure mode.

Comment 12 Ben Cotton 2021-11-04 13:44:58 UTC
This message is a reminder that Fedora 33 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '33'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 33 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 15 Ben Cotton 2021-11-30 16:08:05 UTC
Fedora 33 changed to end-of-life (EOL) status on 2021-11-30. Fedora 33 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

