Bug 915912 - %u/%U/%h/%s requires NSS lookups when loading systemd unit configuration, but that's totally *not* OK
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-02-26 18:43 UTC by Anthony Messina
Modified: 2014-06-23 04:03 UTC

Last Closed: 2014-06-23 04:03:10 UTC
Type: Bug

Attachments
dmesg out of memory error (204.26 KB, text/plain)
2013-02-26 18:43 UTC, Anthony Messina
out of memory error with systemd metainfo (3.45 KB, text/plain)
2013-02-26 18:45 UTC, Anthony Messina

Description Anthony Messina 2013-02-26 18:43:36 UTC
Created attachment 703060
dmesg out of memory error

Using the following unit file, I receive the above stated "Cannot allocate memory" error from systemd.

I've started a conversation on the systemd-devel list (http://lists.freedesktop.org/archives/systemd-devel/2013-February/009179.html) with little helpful response.  I have also attached the dmesg output as well as the raw systemd output.

# Unit file
[Unit]
Description=k5start Kerberos ticket service for %i
Documentation=man:k5start(1)
Before=display-manager.service httpd.service mythbackend.service
After=network.target sssd.service

[Service]
User=%i
Type=forking
PIDFile=/run/user/%U/k5start.pid
#ConditionPathExists=/etc/k5start.d/%u.keytab
Environment=KRB5CCNAME=DIR:/run/user/%U/krb5cc
ExecStartPre=/usr/bin/mkdir -p -m 0700 /run/user/%U ; /usr/bin/mkdir -p -m 0700 /run/user/%U/krb5cc ; /bin/chown -R %u:%u /run/user/%U ; /usr/bin/chcon -R -t user_tmp_t /run/user/%U
ExecStart=/usr/bin/k5start -b -f /etc/k5start.d/%u.keytab -K 60 -p /run/user/%U/k5start.pid -L -v -U
ExecReload=/bin/kill -ALRM $MAINPID
PermissionsStartOnly=true

[Install]
WantedBy=multi-user.target

Comment 1 Anthony Messina 2013-02-26 18:45:12 UTC
Created attachment 703061
out of memory error with systemd metainfo

Comment 2 Michal Schmidt 2013-02-27 14:49:59 UTC
There's a bug in error handling. All failures of unit_full_printf() are treated as Out of memory errors, but it may fail for different reasons. For instance, NIS or LDAP lookup may fail for the user name.
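
The failure mode described here can be sketched in a few lines. This is a hypothetical Python model, not systemd's C code: the function names are invented for illustration, and the KNOWN_USERS table stands in for the real user database. It contrasts a helper that returns a bare "nothing" on any failure (so the caller can only guess at the reason) with one that returns an error code alongside the value.

```python
import errno
import os

# Hypothetical stand-in for the user database; not a real NSS query.
KNOWN_USERS = {"root": 0}


def expand_lossy(user_name):
    """Return the UID as a string, or None on *any* failure (cause is lost)."""
    uid = KNOWN_USERS.get(user_name)
    return None if uid is None else str(uid)


def expand_with_errno(user_name):
    """Return (error_code, value) so the caller can report the real cause."""
    uid = KNOWN_USERS.get(user_name)
    if uid is None:
        return errno.ENOENT, None
    return 0, str(uid)


def load_unit_lossy(user_name):
    # A bare None carries no reason, so the caller can only assume ENOMEM.
    if expand_lossy(user_name) is None:
        return os.strerror(errno.ENOMEM)
    return "loaded"


def load_unit_precise(user_name):
    # The error code travels with the result, so the message matches the cause.
    err, _value = expand_with_errno(user_name)
    return os.strerror(err) if err else "loaded"
```

With an unknown user such as "mythtv", load_unit_lossy() yields the ENOMEM message even though no allocation failed, while load_unit_precise() yields the ENOENT message.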

Comment 3 Zbigniew Jędrzejewski-Szmek 2013-02-27 15:08:35 UTC
> There's a bug in error handling. All failures of unit_full_printf() are treated
> as Out of memory errors, but it may fail for different reasons.
Yeah, adding a log_error in specifier_user_name would help a lot, but
probably the proper solution is to rework all functions to return an
exit code and set an output variable. Lots of work :(

> For instance, NIS or LDAP lookup may fail for the user name.
Anthony, is the 'mythtv' user defined in the local password database?

Zbyszek

Comment 4 Anthony Messina 2013-02-27 16:09:08 UTC
(In reply to comment #3)
> > For instance, NIS or LDAP lookup may fail for the user name.
> Anthony, is the 'mythtv' user defined in the local password database?
> 
> Zbyszek

Eventually, I would like to be able to use a unit file like the one above for the following user accounts: apache, mythtv (the backend user), mythtv-fe1 (a frontend user), mythtv-fe2 (another frontend user).

In all of those cases, I have the user accounts defined in FreeIPA, though the 'apache' user is duplicated on the local machine with the same uidnumber/gidnumber.

In the particular case with the logs I have submitted, the 'mythtv' user account is stored in FreeIPA and all of my systems use SSSD, which is why I had the dependency to start After=sssd.service

Comment 5 Zbigniew Jędrzejewski-Szmek 2013-02-27 16:19:33 UTC
So the issue of the unclear (to put it nicely) logging is clear. We'll probably have to fix that.

Another question is why it is failing at all.
Just to verify, does it fail if you add something like
  ExecStartPre=/bin/id %u
before the other ExecStartPre line?

Comment 6 Anthony Messina 2013-02-27 17:09:13 UTC
(In reply to comment #5)
> So the issue of the unclear (to put it nicely) logging is clear. We'll
> probably have to fix that.
> 
> Another question is why it is failing at all.
> Just to verify, does it fail if you add something like
>   ExecStartPre=/bin/id %u
> before the other ExecStartPre line?

Unfortunately, that doesn't make it work either.

Comment 7 Zbigniew Jędrzejewski-Szmek 2013-02-27 17:24:33 UTC
>> Another question is why it is failing at all.
>> Just to verify, does it fail if you add something like
>>   ExecStartPre=/bin/id %u
>> before the other ExecStartPre line?

> Unfortunately, that doesn't make it work either.

This was just supposed to be a diagnostic: if /bin/id fails, then it means that resolution through sssd is not working yet. If /bin/id succeeds, then it might be a problem with systemd.
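
The same diagnostic can be sketched outside a unit file. The helper below is an illustrative assumption (resolve_user is not part of systemd or sssd); it relies on Python's standard pwd module, which queries the system user database through the C library, and reports whether a name can currently be resolved.

```python
import pwd


def resolve_user(user_name):
    """Return (uid, gid, home, shell) if the name resolves, else None."""
    try:
        entry = pwd.getpwnam(user_name)
    except KeyError:
        # No configured source (local files, sssd, LDAP, ...) knows this name.
        return None
    return entry.pw_uid, entry.pw_gid, entry.pw_dir, entry.pw_shell
```

Calling resolve_user("mythtv") before and after sssd is up would show whether the account is only visible once the network-backed source is available. The UID, home directory, and shell fields are the values that %U, %h, and %s would need.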

Comment 8 Anthony Messina 2013-02-27 18:00:06 UTC
(In reply to comment #7)
> >> Another question is why it is failing at all.
> >> Just to verify, does it fail if you add something like
> >>   ExecStartPre=/bin/id %u
> >> before the other ExecStartPre line?
> 
> > Unfortunately, that doesn't make it work either.
> 
> This was just supposed to be a diagnostic: if /bin/id fails, then it means
> that resolution through sssd is not working yet. If /bin/id succeeds, then
> it might be a problem with systemd.

I see.  What I meant was that I still get the out of memory error, so the service doesn't even attempt to start later on: it's excluded early in the boot process, I think when systemd is compiling its list of work and dependencies.

Comment 9 Anthony Messina 2013-03-01 21:47:19 UTC
Lennart applied an upstream patch to help get more detailed information about this situation.  Would it be possible to bring that patch into the next F18 RPM so that I can test this out?

http://cgit.freedesktop.org/systemd/systemd/commit/?id=487060c2394b7703e59650ef332053645ffae2a3

Comment 10 Zbigniew Jędrzejewski-Szmek 2013-03-01 21:51:54 UTC
A release is imminent.

Comment 11 Anthony Messina 2013-04-03 12:27:56 UTC
(In reply to comment #10)
> A release is imminent.

If possible, and in order for me to test this out with the updates, I would need an updated F18 RPM with the modifications to improve logging, which I think are http://cgit.freedesktop.org/systemd/systemd/commit/?id=487060c2394b7703e59650ef332053645ffae2a3

Comment 12 Fedora Update System 2013-04-10 20:12:06 UTC
systemd-201-2.fc18.1 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/systemd-201-2.fc18.1

Comment 13 Fedora Update System 2013-04-11 23:25:02 UTC
Package systemd-201-2.fc18.2:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-201-2.fc18.2'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-5452/systemd-201-2.fc18.2
then log in and leave karma (feedback).

Comment 14 Anthony Messina 2013-04-13 13:15:59 UTC
(In reply to comment #13)
> Package systemd-201-2.fc18.2:
> * should fix your issue,
> * was pushed to the Fedora 18 testing repository,
> * should be available at your local mirror within two days.
> Update it with:
> # su -c 'yum update --enablerepo=updates-testing systemd-201-2.fc18.2'
> as soon as you are able to.
> Please go to the following url:
> https://admin.fedoraproject.org/updates/FEDORA-2013-5452/systemd-201-2.fc18.2
> then log in and leave karma (feedback).

Unfortunately, this update does not resolve this issue:

systemd[1]: Cannot add dependency job for unit k5start, ignoring: Unit k5start failed to load: Cannot allocate memory.

Comment 15 Zbigniew Jędrzejewski-Szmek 2013-04-15 18:07:30 UTC
To sum up:

1. a unit has User=<user>, where the <user> is defined through freeipa
   and is not available when systemd is started.
2. systemd disables the unit because it cannot resolve the user.
3. systemd reports the error as 'out of memory'.

So far, the only change in systemd behaviour is what Lennart added:
the real cause is logged to the journal. Nevertheless lower in the
stack systemd still shows 'out of memory' as the reason for failure.
Changing that would require some refactoring of the specifier (%u, %U,
etc.) resolver code. This will be done, but so far nobody has worked
on that.

> Unfortunately, this update does not resolve this issue:
>
> systemd[1]: Cannot add dependency job for unit k5start,
> ignoring: Unit k5start failed to load: Cannot allocate memory.
This is expected, as I wrote above. The only change from before is that
there's additional information in the logs.

Going forward:
1. Changing the way that the error is reported is on the TODO list.
2. Supporting units which require other units (e.g. network access and
   other services) to be active when loading them: probably not
   going to happen.

So the bug should maybe be retitled 'systemd incorrectly reports unit specifier
errors as oom' and left open, but it should be acknowledged that the unit is
incorrect.

Comment 16 Anthony Messina 2013-04-15 20:30:00 UTC
(In reply to comment #15)
> To sum up:
> 
> 1. a unit has User=<user>, where the <user> is defined through freeipa
>    and is not available when systemd is started.
> 2. systemd disables the unit because it cannot resolve the user.
> 3. systemd reports the error as 'out of memory'.
> 
> So far, the only change in systemd behaviour is what Lennart added:
> the real cause is logged to the journal. Nevertheless lower in the
> stack systemd still shows 'out of memory' as the reason for failure.
> Changing that would require some refactoring of the specifier (%u, %U,
> etc.) resolver code. This will be done, but so far nobody has worked
> on that.
> 
> > Unfortunately, this update does not resolve this issue:
> >
> > systemd[1]: Cannot add dependency job for unit k5start,
> > ignoring: Unit k5start failed to load: Cannot allocate memory.
> This is expected, as I wrote above. The only change from before is that
> there's additional information in the logs.
> 
> Going forward:
> 1. Changing the way that the error is reported is on the TODO list.
> 2. Supporting units which require other units (e.g. network access and
>    other services) to be active when loading them: probably not
>    going to happen.
> 
> So the bug should maybe be retitled 'systemd incorrectly reports unit
> specifier errors as oom' and left open, but it should be acknowledged that
> the unit is incorrect.

That is an excellent summation.  Thank you.  However, I am not certain which part of the unit is incorrect.  What recommendations can be made for rewriting the unit?  For example, I have specified that the unit is:

After=network.target sssd.service

Until going through this process and joining the systemd mailing list, I was not aware that a unit's properties are evaluated long before they are executed.  I would think that if a user specified that a unit depends on network and sssd, then that unit would function properly so long as network and sssd are started before the specified unit.

I am not opposed to changing the title of the bug, but note that I, as the reporter/end user am not really concerned with whether or not the error is reported as "out of memory" or "some other underlying cause."

I *am* concerned with the fact that the unit as written doesn't accomplish what it appears it should.  Perhaps this type of unit file is not technically possible with systemd--that's fine, but then I would ask what is the "systemd way" a person running a systemd-based system would reliably accomplish the task of acquiring Kerberos credentials for a service which runs as a user, before the service itself starts.

Again, thank you for your feedback.

Comment 17 Zbigniew Jędrzejewski-Szmek 2013-10-09 02:31:58 UTC
Logging has been fixed upstream in http://cgit.freedesktop.org/systemd/systemd/commit/?id=19f6d71 and is now in F20. This is not going to be backported, since the patch is pretty large.

> Until going through this process and joining the systemd mailing list, I was
> not aware that a unit's properties are evaluated long before they are
> executed.  I would think that if a user specified that a unit depends on
> network and sssd, then that unit would function properly so long as network
> and sssd are started before the specified unit.
That's a good question. It is currently impossible, but maybe wouldn't be that hard to add: we *could* delay the evaluation of some fields. I'm retitling this bug and leaving it open, as food for thought. Please don't count on a quick resolution :)

Comment 18 Lennart Poettering 2013-10-13 19:17:32 UTC
Hmm. So, originally we made sure we never do NSS from PID 1 ever, so that things can never block in PID 1 on external services. By the introduction of %u, %U, %h, %s we lost that. I am pretty sure that was a big mistake.

Note that for User= we only resolved the user name after forking, before execing, so that the NSS request would not be done in PID1.

Not sure what we can do about %u/%U/%h/%s now. This was a big mistake to allow. I am not sure how I let this pass and get in without noticing. :-(

%u could be salvaged to simply resolve to the same string as specified in User=. But the others? We could limit their purposes and declare they only resolve to the specific credentials of the running systemd instance, rather than the service. But I have no idea at all how we could make them safely work for arbitrary system services...
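
The fork-then-resolve ordering mentioned for User= can be sketched as follows. This is a simplified illustration in Python, not systemd's implementation: run_as_user is a hypothetical name, supplementary groups are ignored, and the parent waits for the child only so that the example can report a result (a real service manager would not block here).

```python
import os
import pwd


def run_as_user(user_name, argv):
    """Fork, resolve user_name in the child only, then exec argv.

    Returns the child's exit status, or -1 if it did not exit normally.
    """
    pid = os.fork()
    if pid == 0:
        # Child: the only process that queries the user database.
        try:
            entry = pwd.getpwnam(user_name)
            if entry.pw_uid != os.getuid():
                os.setgid(entry.pw_gid)
                os.setuid(entry.pw_uid)
            os.execvp(argv[0], argv)
        finally:
            # Reached only if the lookup, privilege drop, or exec failed.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

If the lookup hangs or fails, only the forked child is affected. Expanding %U, %h, or %s while the unit is being loaded has no such isolation, because the lookup then happens in the manager process itself.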

Comment 19 Lennart Poettering 2013-10-13 19:20:33 UTC
Oh, and in general, system users must be available without the network. They must be served from /etc/passwd, not from sssd/ldap or anything else. We simply do not support any setups like that, and we never will.

Thus, renaming the bug.

Comment 20 Zbigniew Jędrzejewski-Szmek 2013-10-13 19:39:33 UTC
Right, but getpwnam doesn't have a non-blocking mode, so we would need to do ugly hacks all the time. I think that having the user in the local database is not enough, since NSS might be configured to try something over the network, causing arbitrary delays when loading units. So we would really need to forbid using %U/%h/%s or forbid using networked NSS. The second is not an option, and the first also sucks.

Comment 21 Lennart Poettering 2013-10-13 19:42:57 UTC
I am tempted to just remove %U, %h, %s support from systemd's codebase now, accept that this is a regression, and declare that %u resolves to the very same value set with User= or, if that isn't set, to the user name of the user running systemd.
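
A minimal sketch of what such a lookup-free %u could look like, assuming the caller already knows the literal User= value and the manager's own user name. The function name and signature are hypothetical, and the treatment of %% as a literal percent sign is an assumption of this sketch; it is not systemd code.

```python
def expand_user_specifier(text, user_setting, manager_user):
    """Expand %u without consulting the user database.

    user_setting is the literal User= value, or None if unset;
    manager_user is the name of the user running the manager.
    """
    replacement = user_setting if user_setting else manager_user
    out = []
    i = 0
    while i < len(text):
        if text[i] == "%" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "u":
                out.append(replacement)
                i += 2
                continue
            if nxt == "%":
                out.append("%")
                i += 2
                continue
            if nxt in "Uhs":
                # These would need a database lookup, so this sketch rejects them.
                raise ValueError("%" + nxt + " is not supported in this sketch")
        # Any other character (or unrelated specifier) is copied through as-is.
        out.append(text[i])
        i += 1
    return "".join(out)
```

For example, expanding "/etc/k5start.d/%u.keytab" with User=mythtv needs only string substitution, whereas "/run/user/%U" cannot be expanded this way because the numeric UID is not known without a lookup.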

Comment 22 Zbigniew Jędrzejewski-Szmek 2013-10-13 19:52:07 UTC
Hm, do we actually know what people are using it for? None of the systemd files that I have on an F19 system use %U/%h/%s, except for user@.service with SHELL=%s, which is already removed in git.

If we remove it, how is the kerberos service supposed to look?

Comment 23 Lennart Poettering 2013-10-13 20:10:59 UTC
Doesn't kerberos use the kernel keyring now?

Comment 24 Zbigniew Jędrzejewski-Szmek 2013-10-13 20:23:25 UTC
Yes, but here it is also used for the pid file.

Comment 25 Lennart Poettering 2013-10-13 21:01:39 UTC
What does it need a PID file for? 

The whole Kerberos kernel keyring thing was done so that we don't have to run fake user services outside of user contexts.

The unit file suggested in the original bug is something we really cannot support anyway: if it drops privileges to normal users it *must* go through PAM, otherwise pam_limits and all the other stuff is not applied, and that's just wrong.

Comment 26 Jóhann B. Guðmundsson 2013-10-14 07:58:03 UTC
(In reply to Lennart Poettering from comment #19)
> Oh, and in general, system users must be available without the network. They
> must be served from /etc/passwd, not from sssd/ldap or anything else. We
> simply do not support any setups like that, and we never will.

To give another example: even if you are not looking to use central authentication such as LDAP, you can still run into problems with inconsistent UID and GID numbers.

For example, suppose you have a SAN LUN mapped to ServerA. This LUN might have thousands of files stored on it. Each file stored on the LUN has its owner and group stored as UID and GID numbers.

So if you unmap this LUN from ServerA and map it to ServerB, you will have issues if the UID and GID numbers are not consistent between ServerA and ServerB.

In this scenario, you could have a couple of problems. If apache was UID 80 on ServerA, and samba is UID 80 on ServerB, then after moving the LUN samba owns all of apache's files.

If there is no UID 80 on ServerB, then the files have no named owner on ServerB, and you simply see the bare numeric UID as the owner when you run an ls -al command. You can run into the same problem with inconsistent UID/GID numbers when exporting NFS shares between servers.

So, as one can see, this cannot be properly supported (even outside systemd) until static system UIDs/GIDs are unified across *nix systems (or even just Linux distros), and storing system users and groups anywhere other than locally cannot be supported.

Comment 27 Lennart Poettering 2014-01-23 14:01:29 UTC
This is fixed in git, where the specifiers are not available anymore for the systemd system instance.

Comment 28 Zbigniew Jędrzejewski-Szmek 2014-06-23 04:03:10 UTC
The changes described in comment #27 went into systemd-209.

