Bug 988068 - getpwnam_r fails for non-existing users when sssd is not running
Product: Fedora
Classification: Fedora
Component: sssd
Hardware: All Linux
Priority: unspecified, Severity: medium
Assigned To: Jakub Hrozek
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-07-24 12:32 EDT by Jochen De Smet
Modified: 2015-02-14 09:43 EST

Doc Type: Bug Fix
Last Closed: 2015-02-14 09:43:51 EST
Type: Bug

Attachments
SSS client hacky workaround (1.28 KB, patch)
2013-08-20 13:36 EDT, Stephen Gallagher

Description Jochen De Smet 2013-07-24 12:32:34 EDT
Description of problem:
In Bug 867473, "sss" was added to the default nsswitch.conf. This causes getpwnam_r to report an error when queried for a non-existing user, instead of just returning "user not found".  This breaks things like postfix's luser_relay functionality.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install F19; notice that sssd is installed by default, but not enabled
2. Compile this short test program:
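#include <sys/types.h>
#include <pwd.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
  struct passwd pwd;
  char buf[4096];
  int err;
  struct passwd *res;

  if (argc < 2) {
    fprintf(stderr, "usage: %s <username>\n", argv[0]);
    return 1;
  }

  /* Per the man page, returns 0 both on success and when no entry matches. */
  err = getpwnam_r(argv[1], &pwd, buf, sizeof(buf), &res);

  printf("<%s> err: <%d>\n", argv[1], err);
  return 0;
}
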
3. Run it with a non-existing user:  ./t unknown-user

Actual results:
# ./t unknown-user
<unknown-user> err: <2>

Expected results:
# ./t unknown-user
<unknown-user> err: <0>

Additional info:
Removing the sss from nsswitch.conf results in the expected behaviour
Comment 1 Lukas Slebodnik 2013-07-25 06:09:08 EDT
I don't think this is an sssd bug.

I reproduced it successfully with nss-pam-ldapd.

Steps to Reproduce:
1. Install F19
2. Install nss-pam-ldapd, but leave the nslcd service inactive
3. Configure nss to use nss-pam-ldapd
   -- the file /etc/nsswitch.conf has to contain the line "passwd:     files ldap"
4. Compile short test program (from bug description)
5. Run it with a non-existing user:  ./t unknown-user

# ./t unknown-user
<unknown-user> err: <2>

Additional info:
Removing the ldap from nsswitch.conf results in your expected behaviour
Comment 2 Lukas Slebodnik 2013-07-25 06:23:16 EDT
According to the NSS section of the libc manual:

While the user-level function returns a pointer to the result the reentrant function return an enum nss_status value: 
NSS_STATUS_TRYAGAIN (numeric value -2)
NSS_STATUS_UNAVAIL (numeric value -1)
NSS_STATUS_NOTFOUND (numeric value 0)
NSS_STATUS_SUCCESS (numeric value 1)

In case the interface function has to return an error it is important that the correct error code is stored in *errnop. Some return status value have only one associated error code, others have more.

NSS_STATUS_UNAVAIL 	ENOENT 	A necessary input file cannot be found. 

In your case, sssd (_nss_sss_getpwnam_r) returned NSS_STATUS_UNAVAIL and
set *errnop to ENOENT. ENOENT is defined as 2, and 2 is the value returned from getpwnam_r in your example code.
sssd behaves exactly as described in the NSS-Modules-Interface documentation.

Possible solutions:
* NSS_STATUS_UNAVAIL should be handled in nss code
* manual pages of getpwnam_r should be updated
Comment 3 Jakub Hrozek 2013-07-25 06:32:53 EDT
Seems like a glibc issue. I'm adding the glibc maintainer to the CC list.
Comment 4 Jochen De Smet 2013-07-25 09:09:33 EDT
To be clear, my main issue is that the default F19 configuration breaks postfix.

Whether there's an actual bug in sssd/postfix/glibc, or if it's simply a matter of needing to remove or properly configure sssd in the default install, I'll leave to you to decide.
Comment 5 Simo Sorce 2013-07-25 12:19:44 EDT
(In reply to Jochen De Smet from comment #4)
> To be clear, my main issue is that the default F19 configuration breaks
> postfix.


> Whether there's an actual bug in sssd/postfix/glibc, or if it's simply a
> matter of needing to remove or properly configure sssd in the default
> install, I'll leave to you to decide.

You can certainly remove sss locally to work around the issue. However, this bug seems to be primarily a glibc inconsistency and, to a lesser degree, a postfix issue, in the sense that postfix is being a little too strict.

This is what the man page says about return values:

       The  getpwnam()  and  getpwuid() functions return a pointer to a passwd
       structure, or NULL if the matching entry  is  not  found  or  an  error
       occurs.   If an error occurs, errno is set appropriately.  If one wants
       to check errno after the call, it should be  set  to  zero  before  the
       call.

       On  success, getpwnam_r() and getpwuid_r() return zero, and set *result
       to pwd.  If no matching password  record  was  found,  these  functions
       return  0 and store NULL in *result.  In case of error, an error number
       is returned, and NULL is stored in *result.

Now, for getpwnam_r() it is true that the doc says 0 is returned if the user is not found, and this is where I think glibc's bug is: it does not respect that when sss returns NSS_STATUS_UNAVAIL / ENOENT.
However, postfix could also be a little more lenient and treat 0 and ENOENT the same, as this is also in the man page:

       0 or ENOENT or ESRCH or EBADF or EPERM or ...
              The given name or uid was not found.
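
A minimal sketch of what such a lenient caller could look like, in C. The names lookup_result, classify_lookup and lookup_user are hypothetical and not taken from postfix. The assumption made here is that ENOENT and ESRCH are treated as "not found", following the man page list above; the other codes in that list could be added the same way.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical result type for this sketch; not part of any library. */
enum lookup_result { USER_FOUND, USER_NOT_FOUND, USER_LOOKUP_ERROR };

/* Classify the outcome of getpwnam_r() from its return value and *result.
   Assumption: ENOENT and ESRCH are treated like "not found". */
enum lookup_result classify_lookup(int err, const struct passwd *res)
{
    if (err == 0)
        return res != NULL ? USER_FOUND : USER_NOT_FOUND;
    if (err == ENOENT || err == ESRCH)
        return USER_NOT_FOUND;
    return USER_LOOKUP_ERROR;
}

/* Look up a user by name with a fixed-size buffer, as in the test program.
   A too-small buffer (ERANGE) is reported as an error, not retried. */
enum lookup_result lookup_user(const char *name)
{
    struct passwd pwd;
    struct passwd *res = NULL;
    char buf[4096];

    assert(name != NULL);
    int err = getpwnam_r(name, &pwd, buf, sizeof buf, &res);
    return classify_lookup(err, res);
}
```

The trade-off is that a caller written this way can no longer tell "user does not exist" apart from "a lookup service was unavailable", which is the diagnostic the error code was carrying.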
Comment 6 Carlos O'Donell 2013-08-20 02:15:48 EDT
I've read and re-read this issue a couple of times now and I come up with the same answer each time: this is a problem with the nss module for sss.

If you are the last lookup in a list of chained lookups and you return NSS_STATUS_UNAVAIL / ENOENT, that error will be propagated to the caller.

The error might have been hidden if you had a long list of lookups. The interface provides no way to inspect the failures of the lookups in the middle of the list. Thus if sss is in the middle of a list of lookups with the last lookup returning NSS_STATUS_NOTFOUND / ENOENT, then no error is returned.

There are many ways to resolve this problem, but someone needs to make a choice amongst them. I see no problem with glibc's behaviour. If you do, please explain why you think there is a problem and what is inconsistent about it.

I think that it is correct for the sss nss modules to return NSS_STATUS_UNAVAIL / ENOENT since that is exactly the problem. The sssd daemon is not running and the service is unavailable and that is a helpful diagnostic.
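
The chaining behaviour described above can be illustrated with a small model in C. This is a simplified sketch, not glibc's code: the enum mirrors the status values quoted in comment 2, struct model_module and model_lookup are hypothetical names, and the model assumes the default nsswitch.conf actions (stop on success, otherwise continue to the next service).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local mirror of the status values quoted in comment 2. */
enum model_status {
    MODEL_TRYAGAIN = -2,
    MODEL_UNAVAIL  = -1,
    MODEL_NOTFOUND =  0,
    MODEL_SUCCESS  =  1
};

/* Hypothetical stand-in for the answer of one nsswitch.conf service. */
struct model_module {
    const char *name;
    enum model_status status;
    int err;                 /* what the module stored in *errnop */
};

/* Walk the chain: stop on success, otherwise continue. The last module
   consulted decides what the caller sees. */
int model_lookup(const struct model_module *chain, size_t n)
{
    assert(chain != NULL && n > 0);
    enum model_status last = MODEL_NOTFOUND;
    int last_err = 0;

    for (size_t i = 0; i < n; i++) {
        last = chain[i].status;
        last_err = chain[i].err;
        if (last == MODEL_SUCCESS)
            return 0;
    }
    /* NOTFOUND maps to "no result, no error"; anything else leaks errno. */
    return last == MODEL_NOTFOUND ? 0 : last_err;
}

/* "files sss" with sssd stopped: the unavailable service is last. */
const struct model_module files_then_sss[] = {
    { "files", MODEL_NOTFOUND, ENOENT },
    { "sss",   MODEL_UNAVAIL,  ENOENT },
};

/* Same services in the other order: the failure is hidden. */
const struct model_module sss_then_files[] = {
    { "sss",   MODEL_UNAVAIL,  ENOENT },
    { "files", MODEL_NOTFOUND, ENOENT },
};
```

In this model, model_lookup(files_then_sss, 2) returns ENOENT while model_lookup(sss_then_files, 2) returns 0, which matches the difference between sss being last and sss being in the middle of the list.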

Comment 13 Carlos O'Donell 2013-08-20 11:58:48 EDT
I'm going to take this issue upstream to get a policy decision made and the documentation updated to clarify the exact situation we are facing here. Once we have a policy decision we can file specific bugs to fix issues.

In the meantime the glibc team is going to look at:

* Can we make glibc more conservative in Fedora and not return an error for service failures? The API would instead return no result and no error. Errors would only be returned for critical internal failures.

I suggest others look at:

* What would it take to work around this in the nss_sss module, e.g. do the wrong thing for the right reasons and not return NSS_STATUS_UNAVAIL / ENOENT?

* Have a default config for sssd such that the service can be started right away.

Both of these solutions would be temporary until we can fix glibc.
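
As a rough illustration of the module-side workaround, the idea could be sketched in C as below. This is a hypothetical helper written for this thread only: it is not the attached patch and not sssd code, and the enum is a local mirror of the status values quoted in comment 2.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local mirror of the status values quoted in comment 2. */
enum model_status {
    MODEL_TRYAGAIN = -2,
    MODEL_UNAVAIL  = -1,
    MODEL_NOTFOUND =  0,
    MODEL_SUCCESS  =  1
};

/* Hypothetical helper: downgrade "service unavailable / ENOENT" to
   "not found / ENOENT" so that a caller at the end of the lookup chain
   gets "no result, no error". Every other status passes through. */
enum model_status soften_unavailable(enum model_status status, int *errnop)
{
    assert(errnop != NULL);
    if (status == MODEL_UNAVAIL && *errnop == ENOENT) {
        /* Deliberately hides the outage from the caller. */
        return MODEL_NOTFOUND;
    }
    return status;
}
```

The cost is the one already noted in this bug: the caller loses the diagnostic that the daemon is not running.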
Comment 14 Carlos O'Donell 2013-08-20 12:00:54 EDT
I don't have an ETA for fixing this as we have quite a bit of other work, but I'll keep this issue updated.
Comment 15 Stephen Gallagher 2013-08-20 13:36:19 EDT
Created attachment 788567 [details]
SSS client hacky workaround

I'm not necessarily condoning this approach, but if we *do* decide to hack together a workaround in nss_sss.so.2, the attached patch should cover it.
Comment 19 Lukas Slebodnik 2014-12-18 05:58:02 EST
After a long discussion, the workaround was merged in sssd upstream.

Therefore reassigning to sssd
Comment 20 Lukas Slebodnik 2014-12-18 06:00:02 EST
The question is whether we want the patch in Fedora 19 or whether the ticket should be moved to Fedora 20.

Fedora 19 will reach end of life in a month.
Comment 21 Jakub Hrozek 2015-01-07 11:23:41 EST
Fedora 19 is EOL now.

But I think it still makes sense to track that this issue was resolved in sssd upstream.
Comment 22 Jakub Hrozek 2015-01-07 11:24:17 EST
Upstream ticket:
Comment 23 Fedora Update System 2015-01-19 08:34:42 EST
sssd-1.12.3-3.fc21 has been submitted as an update for Fedora 21.
Comment 24 Fedora Update System 2015-01-20 16:00:57 EST
Package sssd-1.12.3-3.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing sssd-1.12.3-3.fc21'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
Comment 25 Fedora Update System 2015-01-22 05:41:08 EST
sssd-1.12.3-4.fc21 has been submitted as an update for Fedora 21.
Comment 26 Fedora Update System 2015-02-02 12:21:01 EST
sssd-1.12.3-4.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.
