1089738 – 'groupadd -r' is painfully slow when SSSD is running

Bug 1089738 - 'groupadd -r' is painfully slow when SSSD is running

Summary: 'groupadd -r' is painfully slow when SSSD is running

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	shadow-utils
Sub Component:
Version:	20
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Tomas Mraz
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1115476 (view as bug list)
Depends On:
Blocks:	1279321
TreeView+	depends on / blocked

Reported:	2014-04-21 20:25 UTC by Stephen Gallagher
Modified:	2016-02-03 14:29 UTC (History)
CC List:	18 users (show)
Fixed In Version:	shadow-utils-4.1.5.1-13.fc21
Clone Of:
Clones:	1279321 (view as bug list)
Environment:
Last Closed:	2014-07-02 12:48:36 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Redesign automatic GID allocation (12.03 KB, patch) 2014-06-09 15:25 UTC, Stephen Gallagher	no flags	Details \| Diff
Redesign automatic GID allocation (13.76 KB, patch) 2014-06-09 18:46 UTC, Stephen Gallagher	no flags	Details \| Diff
Redesign automatic GID allocation (19.90 KB, patch) 2014-06-16 17:45 UTC, Stephen Gallagher	tmraz: review+	Details \| Diff
Show Obsolete (2) View All

Description Stephen Gallagher 2014-04-21 20:25:45 UTC

Description of problem:
Running 'groupadd -r <groupname>' on a Fedora 20 system with SSSD configured and running appears as if it is hung. The reason for this is that 'groupadd -r' attempts to look up each ID in the system range. Every ID results in a round-trip to LDAP through SSSD since they are not cached. This causes issues when installing software such as docker-io, which has the following %pre statemtent:


%pre
getent group docker > /dev/null || %{_sbindir}/groupadd -r docker
exit 0


Version-Release number of selected component (if applicable):
sssd-1.11.4-3.fc20.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Configure SSSD for LDAP IDs
2. Run 'groupadd -r docker'


Actual results:
Runs for a very long time (didn't wait to find out how long).

Expected results:
Should ideally optimize for this case.

Additional info:
Given that this is likely to hit during software install and results in minutes (or more) of lag, this looks bad.

Comment 1 Simo Sorce 2014-04-21 23:51:54 UTC

Sounds like a bug in groupadd to me.

Comment 2 Alexander Bokovoy 2014-04-22 06:44:59 UTC

This is really a bug 1063468, corresponding option is ldap_min_id

Can you try a solution outlined there?

Comment 3 Jakub Hrozek 2014-04-23 07:25:27 UTC

(In reply to Simo Sorce from comment #1)
> Sounds like a bug in groupadd to me.

Yes, I could never understand why useradd/groupadd scans the ID space so aggressively with '-r'.

Comment 4 Jakub Hrozek 2014-04-23 07:26:59 UTC

Also, should we accelerate raising the min_id defaults for the ID provider? Does this bug hit the Fedora Server's Domain Controller role?

Comment 5 Jakub Hrozek 2014-04-23 07:27:56 UTC

Upstream ticket:
https://fedorahosted.org/sssd/ticket/2238

Comment 6 Stephen Gallagher 2014-06-09 12:16:34 UTC

Let me see if I can explain the problem more completely.

1) There are numerous packages containing the command 'groupadd -r' in the %post section of the RPM installation (including docker-io as noted in this bug).
2) 'groupadd -r' is overly-agressive in seeking out available GIDs to use. It calls getgrgid() repeatedly for every entry in the SYS_ID_MIN->SYS_ID_MAX range, creating an array of available IDs and then iterates through all of those available entries the local database manually with gr_rewind() (according to the comments, this is in case there are entries that have been created but not committed yet).
3) As I was looking into the bug, I discovered that the implementation is also broken, as it will iterate through 1-SYS_ID_MAX, not starting at SYS_ID_MIN.

The algorithm used here is definitely inefficient and should be fixed. It's clearly designed to protect against a race-condition issue when nss_db is in play, but it fails to take any remote name-service library properly into account. I'm going to prepare a patch that improves the performance here for the common case, possibly at the expense of performance in the rare case of nss_db with uncommitted changes.

Reassigning this ticket to shadow-utils.

Comment 7 Stephen Gallagher 2014-06-09 15:25:34 UTC

Created attachment 904763 [details]
Redesign automatic GID allocation

Previously, this allocation was optimized for an outdated
deployment style (that of /etc/group alongside nss_db). The issue
here is that this results in extremely poor performance when using
SSSD, Winbind or nss_ldap.

There were actually three serious bugs here that have been addressed:

1) Running getgrent() loops won't work in most SSSD or Winbind
environments, as full group enumeration is disabled by default.
This could easily result in auto-allocating a group that was
already in use. (This might result in a security issue as well, if
the shared GID is a privileged group).

2) For system groups, the loop was always iterating through the
complete SYS_GID_MIN->SYS_GID_MAX range. On SSSD and Winbind, this
means hundreds of round-trips to LDAP (unless the GIDs were
specifically configured to be ignored by the SSSD or winbindd).
To a user with a slow connection to their LDAP server, this would
appear as if groupadd -r was hung. (Though it would eventually
complete).

3) This patch also adds better error-handling for errno from
getgrgid(), since if this function returns an unexpected error, we
should not be treating it as "ID is available". This could result
in assigning a GID that was already in use, with all the same
issues as 1) above.

This patch changes the algorithm to be more favorable for LDAP
environments, at the expense of some performance when using nss_db.
Given that the DB is a local service, this should have a negligible
effect from a user's perspective.

With the new algorithm, we simply iterate down from [SYS_]GID_MAX
with getgrgid() (and gr_locate_gid()) until we come to the first
unused GID. We then select that and return it.

Comment 8 Stephen Gallagher 2014-06-09 15:27:29 UTC

In my own personal testing, this improved the creation time of 'groupadd -r' from tens of seconds (with a remote LDAP server and SSSD in use) down to nearly instantaneous.

Comment 9 Simo Sorce 2014-06-09 16:29:53 UTC

Patch looks mostly good to me although there seem to be some indenting issues ?

Anyway there is one thing that need simprovement, it seem to me you cycle from max_gid down to min_gid in all cases. This is ok (and expected) for the system gids. But it is not what is done normally for non-system gids, which instead grow monotonically. If I read the patch right you do cucle from max to min also for the normal user case.

Comment 10 Stephen Gallagher 2014-06-09 18:20:46 UTC

(In reply to Simo Sorce from comment #9)
> Patch looks mostly good to me although there seem to be some indenting
> issues ?
> 

The original code was all tabs, my editor switched the lines I modified to spaces.


> Anyway there is one thing that need simprovement, it seem to me you cycle
> from max_gid down to min_gid in all cases. This is ok (and expected) for the
> system gids. But it is not what is done normally for non-system gids, which
> instead grow monotonically. If I read the patch right you do cucle from max
> to min also for the normal user case.

You are correct, I misread the original code and thought it was following the descending approach for both. I'll correct that and send an updated patch shortly.

Comment 11 Stephen Gallagher 2014-06-09 18:46:25 UTC

Created attachment 905729 [details]
Redesign automatic GID allocation

Updated patch restores the divergent behavior of system groups and non-system groups. System groups will now search down from SYS_GID_MAX and non-system groups will search up from GID_MIN.

Comment 12 Simo Sorce 2014-06-09 19:46:45 UTC

(In reply to Stephen Gallagher from comment #10)
> (In reply to Simo Sorce from comment #9)
> > Patch looks mostly good to me although there seem to be some indenting
> > issues ?
> > 
> 
> The original code was all tabs, my editor switched the lines I modified to
> spaces.
Ugh, makes it a little hard to review, but ok..

 
> > Anyway there is one thing that need simprovement, it seem to me you cycle
> > from max_gid down to min_gid in all cases. This is ok (and expected) for the
> > system gids. But it is not what is done normally for non-system gids, which
> > instead grow monotonically. If I read the patch right you do cucle from max
> > to min also for the normal user case.
> 
> You are correct, I misread the original code and thought it was following
> the descending approach for both. I'll correct that and send an updated
> patch shortly.

I am wondering if we should use a bisect search across the range, rather than a linear search, in the worst case the linear search is always O(N) which is quite bad if you have to check 60k gids...
The only issue would be that there is a chance to take the "wrong" turn if there holes in the range and not fill the gaps, but perhaps the code can fall back to a linear search if the bisect finds no ids ...

Comment 13 Stephen Gallagher 2014-06-09 20:09:30 UTC

(In reply to Simo Sorce from comment #12)
> (In reply to Stephen Gallagher from comment #10)
> > (In reply to Simo Sorce from comment #9)
> > > Patch looks mostly good to me although there seem to be some indenting
> > > issues ?
> > > 
> > 
> > The original code was all tabs, my editor switched the lines I modified to
> > spaces.
> Ugh, makes it a little hard to review, but ok..
> 

Yeah, but at least my additions look good :)

>  
> > > Anyway there is one thing that need simprovement, it seem to me you cycle
> > > from max_gid down to min_gid in all cases. This is ok (and expected) for the
> > > system gids. But it is not what is done normally for non-system gids, which
> > > instead grow monotonically. If I read the patch right you do cucle from max
> > > to min also for the normal user case.
> > 
> > You are correct, I misread the original code and thought it was following
> > the descending approach for both. I'll correct that and send an updated
> > patch shortly.
> 
> I am wondering if we should use a bisect search across the range, rather
> than a linear search, in the worst case the linear search is always O(N)
> which is quite bad if you have to check 60k gids...
> The only issue would be that there is a chance to take the "wrong" turn if
> there holes in the range and not fill the gaps, but perhaps the code can
> fall back to a linear search if the bisect finds no ids ...

That really won't solve the problem and will just add more complexity. We have to assume that many if not most of the groups added will have been done via this same algorithm. So if we're following the same binary search for all creations, we're going to end up with the same O(n) search time, plus the occasional fixed entry in there throwing the whole search off. So at the end of it, we probably end up with 1) IDs all over the range and 2) no improvement in performance.

I agree that it's pretty bad to get stuck checking 60k GIDs though. I think at that point, the admin may seriously want to just consider setting the min/max range on SSSD manually so that it flips back a quick "not found" result. We can't engineer a perfect solution for every possible setup, so we should probably fix the common case and recognize that if an environment has that large a group set, they are probably 1) not managing many local groups with groupadd (probably either using LDAP groups or else using puppet to control /etc/group directly) and 2) capable of tweaking performance knobs.

Comment 14 Simo Sorce 2014-06-09 21:47:17 UTC

(In reply to Stephen Gallagher from comment #13)
> (In reply to Simo Sorce from comment #12)

> > I am wondering if we should use a bisect search across the range, rather
> > than a linear search, in the worst case the linear search is always O(N)
> > which is quite bad if you have to check 60k gids...
> > The only issue would be that there is a chance to take the "wrong" turn if
> > there holes in the range and not fill the gaps, but perhaps the code can
> > fall back to a linear search if the bisect finds no ids ...
> 
> That really won't solve the problem and will just add more complexity. We
> have to assume that many if not most of the groups added will have been done
> via this same algorithm. So if we're following the same binary search for
> all creations, we're going to end up with the same O(n) search time, plus
> the occasional fixed entry in there throwing the whole search off. So at the
> end of it, we probably end up with 1) IDs all over the range and 2) no
> improvement in performance.
> 
> I agree that it's pretty bad to get stuck checking 60k GIDs though. I think
> at that point, the admin may seriously want to just consider setting the
> min/max range on SSSD manually so that it flips back a quick "not found"
> result. We can't engineer a perfect solution for every possible setup, so we
> should probably fix the common case and recognize that if an environment has
> that large a group set, they are probably 1) not managing many local groups
> with groupadd (probably either using LDAP groups or else using puppet to
> control /etc/group directly) and 2) capable of tweaking performance knobs.

A bisect search has a fixed cost og O(log2(N)), which is much better than O(N) for the worst case. log2(60k) ~= 16 (60k < 2^16)
So in max 16 steps you always have the right answer which means a bisect will be better as soon as you have more than 16 gid allocated in the range and you are not asking for a specific gid. By always following the left branch if empty you get actually a linear allocation.

It is a little bit more complex but the savings are huge and the occasional hit doies not really throw off anything, it just causes an allocation beyond a hole, no big deal if it happens IMO.

The max number of lookups for a max gid of 2^32 is 32 of course, which beats 2^32 lookups done linearly any time :-)

Comment 15 Dmitri Pal 2014-06-09 21:59:11 UTC

AFAIU the situation groupadd it trying to find an unused GID for a group, right?
Can we have a file that would define the range that should be searched instead of searching everything including remote sources? We already have first X ids left out by default in SSSD so this is the range that groupadd should scan. 
May be we need to teach groupadd to read sssd.conf and get range from there. If it is there - check only this range. If not use same defaults as SSSD. Then we would never spend time scanning central LDAP.

Comment 16 Tomas Mraz 2014-06-10 09:17:07 UTC

I'd very much prefer keeping the patch as simple as possible and without complexities such as bisect search (which is actually changing behaviour of groupadd in case of holes in group uid assignment) and reading sssd.conf.

In general I agree with Stephen's comment#13.

Comment 17 Stephen Gallagher 2014-06-10 09:59:10 UTC

I misunderstood Simo's recommendation originally. I still see three problems with it:

1) As Simo and Tomas noted, it may result in holes in the range if any of the bisect points have a manually-assigned group (worst-case at (GID_MAX - GID_MIN)/2) which would result in changing the expected behavior of groupadd. I suspect this is a conversation worth having with upstream, however. This can cause difficult-to-find bugs if the user is expecting the old behavior (such as during a system provision) and occasionally has outliers because one system or another happens to have one of those manually-assigned groups in the "wrong place". However, there is a strong argument to be made that auto-assignment is the wrong approach for such users.

2) We still have to fall back on this approach if we fail to find an open ID, since that will be the only way to fill the holes in the range, which means that the worst-case time is still 2^32 + 16 either way.

One more thing to note: in common environments, the lookups will mostly be happening locally. So while there may be a lot of lookups, most of them will be happening on the local system, not over the network. If we go to a bisect search, all of the misses will result in network lookups. This may end up taking more wall clock time, not less, even if it is fewer total lookups.

I'd suggest we avoid premature optimization at this point and go with the simple solution for now and discuss with upstream if it becomes a major bottleneck.

Comment 18 Tomas Mraz 2014-06-10 13:29:33 UTC

A few minor problems and nitpicks with the patch:

1.
Do not use underscore at the beginning of static functions - this convention is not used in shadow utils code and I do not think it is good idea to introduce it with this patch.

2.
The indentation should be kept (with using tabs, etc.) as Simo already noted.

3.
+	result = _get_ranges(sys_group, &gid_min, &gid_max, &preferred_min);
+	if (result == EINVAL) {
+        (void) fprintf (stderr,
+                        _("%s: Invalid configuration: GID_MIN (%lu), GID_MAX (%lu)\n"),
+                        Prog, (unsigned long) gid_min, (unsigned long) gid_max);
+        return -1;
 	}

The error reporting here is less precise than in the original code - please either move the fprintf() into the get_ranges function or ensure by other means that the error reporting remains the same as in original code.

4.
+                /* An unexpected error occurred. Report it. */
+                fprintf (stderr,
+                         _("%s: Can't get unique system GID (%s)\n"),
+                         Prog, strerror(result));
+                SYSLOG ((LOG_ERR,
+                         "Error checking available GIDs: %s",
+                         strerror(result)));
+                /*
+                 * We will continue anyway. Hopefully a later GID
+                 * will work properly.
+                 */

This looks like a recipe for spamming stderr and syslog, I'd prefer either failing or marking that the error was already reported and not to report it if it happens repeatedly.

5.
+    fprintf (stderr,
+             _("%s: Can't get unique system GID (no more available GIDs)\n"),
+             Prog);

The 'system' word should be printed only when adding system group.

But there is one more problem with this patch and approach. The original code does the expensive "equilibristics" because it tries to avoid reusing gaps in the gid assignment which might be caused by previous group being deleted. And so it avoids reusing deleted group's gids which might have for example some files owned on the system. This heuristics of course fails if you delete the group with the largest (or for system group smallest) gid last. But that is something sysadmins are aware of and it is also useful if you make a mistake in your last groupadd command and want to correct it by groupdel and a new groupadd.

What I would propose is to use the already opened local group file support (gr_next(), etc. functions) to search for the largest (smallest for system) group id and only then use the getgrgid() to check whether there is any conflict with possible remote group source. (And if there is then continue incrementally (decrementally for system)).

This would have the best performance in case the remote gids and local gids do not conflict.

Comment 19 Stephen Gallagher 2014-06-16 17:45:12 UTC

Created attachment 909203 [details]
Redesign automatic GID allocation

(In reply to Tomas Mraz from comment #18)
> A few minor problems and nitpicks with the patch:
> 
> 1.
> Do not use underscore at the beginning of static functions - this convention
> is not used in shadow utils code and I do not think it is good idea to
> introduce it with this patch.
> 

Fixed

> 2.
> The indentation should be kept (with using tabs, etc.) as Simo already noted.
> 

Fixed (though I find it ugly).

> 3.
> +	result = _get_ranges(sys_group, &gid_min, &gid_max, &preferred_min);
> +	if (result == EINVAL) {
> +        (void) fprintf (stderr,
> +                        _("%s: Invalid configuration: GID_MIN (%lu),
> GID_MAX (%lu)\n"),
> +                        Prog, (unsigned long) gid_min, (unsigned long)
> gid_max);
> +        return -1;
>  	}
> 
> The error reporting here is less precise than in the original code - please
> either move the fprintf() into the get_ranges function or ensure by other
> means that the error reporting remains the same as in original code.
> 

You are correct. I moved it into get_ranges().

> 4.
> +                /* An unexpected error occurred. Report it. */
> +                fprintf (stderr,
> +                         _("%s: Can't get unique system GID (%s)\n"),
> +                         Prog, strerror(result));
> +                SYSLOG ((LOG_ERR,
> +                         "Error checking available GIDs: %s",
> +                         strerror(result)));
> +                /*
> +                 * We will continue anyway. Hopefully a later GID
> +                 * will work properly.
> +                 */
> 
> This looks like a recipe for spamming stderr and syslog, I'd prefer either
> failing or marking that the error was already reported and not to report it
> if it happens repeatedly.
> 

I added a flag to print this only once, though I'm unsure we actually want to avoid spamming. If there's a problem, that will get it noticed quickly...

> 5.
> +    fprintf (stderr,
> +             _("%s: Can't get unique system GID (no more available
> GIDs)\n"),
> +             Prog);
> 
> The 'system' word should be printed only when adding system group.
> 

Fixed.


> But there is one more problem with this patch and approach. The original
> code does the expensive "equilibristics" because it tries to avoid reusing
> gaps in the gid assignment which might be caused by previous group being
> deleted. And so it avoids reusing deleted group's gids which might have for
> example some files owned on the system. This heuristics of course fails if
> you delete the group with the largest (or for system group smallest) gid
> last. But that is something sysadmins are aware of and it is also useful if
> you make a mistake in your last groupadd command and want to correct it by
> groupdel and a new groupadd.
> 

I hadn't realized this was the purpose behind that. It *kind of* makes sense, although it's an extremely flawed premise. There are many reasons that the highest or lowest GID could be close to one extreme or the other. Frankly, I think that if filling in deleted GIDs is a serious problem, and environment should be using explicit GID assignments, but... (see below)


> What I would propose is to use the already opened local group file support
> (gr_next(), etc. functions) to search for the largest (smallest for system)
> group id and only then use the getgrgid() to check whether there is any
> conflict with possible remote group source. (And if there is then continue
> incrementally (decrementally for system)).
> 
> This would have the best performance in case the remote gids and local gids
> do not conflict.

I don't really want to change expected behavior, even if it *is* terribly flawed. I've implemented the algorithm you have described above in my new patch.

I've tested it by creating four new groups, two system groups, two non-system groups, then removing the first of each pair and re-adding them to confirm that they end getting a new GID and not refilling the range. I also tested this with a configured LDAP server that had IDs sitting in the ranges, invisible to the gr_next() loop (to verify that this properly handles those cases).

Comment 20 Tomas Mraz 2014-06-30 13:19:03 UTC

Comment on attachment 909203 [details]
Redesign automatic GID allocation

The only issue I've found is that you forgot to set the nospam to 1 once you print the error message. I'll fix it myself and apply the patch in rawhide.

Comment 21 Brian Lane 2014-07-01 22:14:46 UTC

This change breaks groupadd. Using 4.1.5.1-12 on a rawhide system that was previously working.

Output is:

groupadd: Can't get unique GID (No such file or directory). Suppressing additional messages.

[long delay]

groupadd: Can't get unique GID (no more available GIDs)


4.1.5.1-11 works just fine.

Comment 22 Tomas Mraz 2014-07-02 10:45:35 UTC

What's in your /etc/nsswitch.conf?

This error is returned from getgrgid().

Unfortunately I am afraid that we will have to modify the patch to allow adding groups in case of some errors returned otherwise for example having sss in nsswitch.conf with sssd not running would cause breaking groupadd.

Comment 23 Stephen Gallagher 2014-07-02 11:20:36 UTC

Ok, I'll take a look at it. I'm not entirely certain offhand whether the bug is in groupadd, SSSD or glibc. The SSSD is *supposed* to be returning an error code that says "Ignore me, I'm not running" that getgrgid() should honor.

I'll dig into it this morning.

Tomas, in order to unbreak Rawhide, can we push a -13 that reverts this patch for the time being?

Comment 24 Tomas Mraz 2014-07-02 11:26:09 UTC

Instead of reverting I'll just make it to ignore the getgrgid error return for now. I think this change will not make it more broken than the original implementation was.

Comment 25 Stephen Gallagher 2014-07-02 12:17:27 UTC

So after some investigation, there are three things here that are interacting to cause this problem. We can "solve" the issue in groupadd, SSSD *or* glibc, with varying degrees of correctness.

SSSD:
When SSSD is not running, the nss_sss.so.2 module will return NSS_STATUS_UNAVAIL as required by glibc and documented at http://www.gnu.org/software/libc/manual/html_node/NSS-Modules-Interface.html

glibc:
When glibc gets NSS_STATUS_UNAVAIL, it will proceed with the next item in the nsswitch.conf list and if that works (getting an NSS_STATUS_SUCCESS or NSS_STATUS_NOTFOUND message), the NSS_STATUS_UNAVAIL is simply ignored. However, if-and-only-if the last module in the nsswitch.conf line returns NSS_STATUS_UNAVAIL, that error will be transmitted down to the caller as errno 111 (ECONNREFUSED)

groupadd:
The check_gid() function I wrote is fairly simple: it checks whether getgrgid() returns NON-NULL (the GID exists) or it returns NULL and errno, then it passes the errno up the chain.


Possible solutions from top to bottom:

1) Modify groupadd so that we check for ECONNREFUSED and ignore it, knowing that this means that a network-facing daemon was unable to make a connection. The problem with this approach is that it's a band-aid. We will have to do this in every application that has this issue.

2) Modify SSSD so that we "lie" and return NSS_STATUS_NOTFOUND instead of NSS_STATUS_UNAVAIL. This is technically a violation of the API contract. I don't like lying to the consumers, but this would provide a band-aid for all consumers of SSSD. It still wouldn't help other providers like nss_ldap or winbind, which will continue returning NSS_STATUS_UNAVAIL when appropriate. This may be *good enough* for now, since only 'sss' is part of the default nsswitch configuration on Fedora.

3) Modify glibc so that its behavior is consistent regardless of where in the list the NSS_STATUS_NOTFOUND occurred. I assume that the reason for reporting it is because it could indicate that the results of the call should not fully be trusted, since they won't include information from the remote store. But in that case, I would expect that NULL/ECONNREFUSED should be returned no matter where the module resides in the nsswitch.conf order.

Recommendations welcome.

Comment 26 Stephen Gallagher 2014-07-02 12:46:00 UTC

OK, so after much discussion on IRC we've decided to go with the groupadd band-aid where we ignore all errno values for now. The reasoning is this:

1) The errno value that's returned is actually the one provided by the NSS module itself, not one translated by glibc. We therefore cannot provide a "whitelist" of safe-to-ignore errno values.

2) Because glibc is inconsistent about whether it even bothers sending these errors down to the caller, we cannot trust whether an earlier provider in the nsswitch.conf order *should* have done so, but was cleared by a later provider. So failing here is giving a false sense of safety regarding potential ID overlap.

3) The situation is no worse than it was prior to this bug and its patch. Before the new allocation patch, we were never consulting SSSD (or other remote sources) before creating a new GID, so there was always a risk of overlap. Even in this situation, we've minimized that risk to happen only when the SSSD isn't running. So that's still a large improvement.

Comment 27 Brian Lane 2014-07-03 01:26:33 UTC

*** Bug 1115476 has been marked as a duplicate of this bug. ***

Comment 28 Carlos O'Donell 2016-01-12 14:37:58 UTC

I noticed some things in this description which I wanted to clarify from a glibc perspective.

(In reply to Stephen Gallagher from comment #25)
> So after some investigation, there are three things here that are
> interacting to cause this problem. We can "solve" the issue in groupadd,
> SSSD *or* glibc, with varying degrees of correctness.
> 
> SSSD:
> When SSSD is not running, the nss_sss.so.2 module will return
> NSS_STATUS_UNAVAIL as required by glibc and documented at
> http://www.gnu.org/software/libc/manual/html_node/NSS-Modules-Interface.html

This is not what SSSD should be doing and was discussed in another issue.

Upstream glibc documentation has actually changed to document what module/daemons should be doing.

You must return NSS_STATUS_NOTFOUND, and I'll explain why below.

> glibc:
> When glibc gets NSS_STATUS_UNAVAIL, it will proceed with the next item in
> the nsswitch.conf list and if that works (getting an NSS_STATUS_SUCCESS or
> NSS_STATUS_NOTFOUND message), the NSS_STATUS_UNAVAIL is simply ignored.
> However, if-and-only-if the last module in the nsswitch.conf line returns
> NSS_STATUS_UNAVAIL, that error will be transmitted down to the caller as
> errno 111 (ECONNREFUSED)

Which is the correct behaviour.

The status of the last module is always returned, just like "|" works in a shell. The analogy is identical.

> groupadd:
> The check_gid() function I wrote is fairly simple: it checks whether
> getgrgid() returns NON-NULL (the GID exists) or it returns NULL and errno,
> then it passes the errno up the chain.
 
OK.
 
> Possible solutions from top to bottom:
> 
> 1) Modify groupadd so that we check for ECONNREFUSED and ignore it, knowing
> that this means that a network-facing daemon was unable to make a
> connection. The problem with this approach is that it's a band-aid. We will
> have to do this in every application that has this issue.

It is not a band-aid.

You have an NSS module that can't get an answer from a data source, it must assume the daemon is down and has no data, so the answer is transiently whatever it has in the cache.

> 2) Modify SSSD so that we "lie" and return NSS_STATUS_NOTFOUND instead of
> NSS_STATUS_UNAVAIL. This is technically a violation of the API contract. I
> don't like lying to the consumers, but this would provide a band-aid for all
> consumers of SSSD. It still wouldn't help other providers like nss_ldap or
> winbind, which will continue returning NSS_STATUS_UNAVAIL when appropriate.
> This may be *good enough* for now, since only 'sss' is part of the default
> nsswitch configuration on Fedora.

You are not lying. This is exactly the return you must give.

Consider the case where SSSD is in /etc/nsswich.conf with no daemon started. In that case you also get the same problem and must return NSS_STATUS_NOTFOUND. You are not connected to a data source and have no data and so there is no answer. The module must operate correctly independent of the data source.

> 3) Modify glibc so that its behavior is consistent regardless of where in
> the list the NSS_STATUS_NOTFOUND occurred. I assume that the reason for
> reporting it is because it could indicate that the results of the call
> should not fully be trusted, since they won't include information from the
> remote store. But in that case, I would expect that NULL/ECONNREFUSED should
> be returned no matter where the module resides in the nsswitch.conf order.

Please file a bug against glibc if you think there is inconsistent behaviour.

Comment 29 Lukas Slebodnik 2016-01-12 15:00:21 UTC

(In reply to Carlos O'Donell from comment #28)
> I noticed some things in this description which I wanted to clarify from a
> glibc perspective.
> 
> (In reply to Stephen Gallagher from comment #25)
> > So after some investigation, there are three things here that are
> > interacting to cause this problem. We can "solve" the issue in groupadd,
> > SSSD *or* glibc, with varying degrees of correctness.
> > 
> > SSSD:
> > When SSSD is not running, the nss_sss.so.2 module will return
> > NSS_STATUS_UNAVAIL as required by glibc and documented at
> > http://www.gnu.org/software/libc/manual/html_node/NSS-Modules-Interface.html
> 
> This is not what SSSD should be doing and was discussed in another issue.
> 
This behaviour was changed in sssd-1.12.3-3.fc21
https://bugzilla.redhat.com/show_bug.cgi?id=988068#c23

But it is not enabled by default in upstream.
https://git.fedorahosted.org/cgit/sssd.git/commit/?id=5bb0c0596765dd5dd1973b7fc2d1e830bca3e345
https://git.fedorahosted.org/cgit/sssd.git/commit/?id=d51bc5f43fffa516446ef62c2b860be9fa939c9d

There is a configure time option "--enable-sss-default-nss-plugin".
Should we enable it by default so it will be available also for other
distribution? or should we completely remove old behaviour.
Would there be a problem on rhel 6?

Comment 30 Stephen Gallagher 2016-01-12 15:07:10 UTC

(In reply to Carlos O'Donell from comment #28)
> > 2) Modify SSSD so that we "lie" and return NSS_STATUS_NOTFOUND instead of
> > NSS_STATUS_UNAVAIL. This is technically a violation of the API contract. I
> > don't like lying to the consumers, but this would provide a band-aid for all
> > consumers of SSSD. It still wouldn't help other providers like nss_ldap or
> > winbind, which will continue returning NSS_STATUS_UNAVAIL when appropriate.
> > This may be *good enough* for now, since only 'sss' is part of the default
> > nsswitch configuration on Fedora.
> 
> You are not lying. This is exactly the return you must give.
> 
> Consider the case where SSSD is in /etc/nsswich.conf with no daemon started.
> In that case you also get the same problem and must return
> NSS_STATUS_NOTFOUND. You are not connected to a data source and have no data
> and so there is no answer. The module must operate correctly independent of
> the data source.

OK, the documentation prior to https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d4e301c5c65393837e438b6d81feabfbfde7b9c7;hp=23256f5ed889266223380c02b2750d19e3fe666b was very ambiguous on this point. SSSD (and others) generally return NSS_STATUS_UNAVAIL for the cases where SSSD has been configured but is not running. My understanding now is that NSS_STATUS_UNAVAIL really should only ever be returned in true *error* cases and that if SSSD or another service is functioning normally it should be returning NSS_STATUS_NOTFOUND.

This probably needs a new bug opened against SSSD (and winbind... and nss-pam-ldapd...)

Thank you for providing a clarifying note in the glibc documentation, Carlos.

Comment 31 Stephen Gallagher 2016-01-12 15:07:30 UTC

(In reply to Carlos O'Donell from comment #28)
> > 2) Modify SSSD so that we "lie" and return NSS_STATUS_NOTFOUND instead of
> > NSS_STATUS_UNAVAIL. This is technically a violation of the API contract. I
> > don't like lying to the consumers, but this would provide a band-aid for all
> > consumers of SSSD. It still wouldn't help other providers like nss_ldap or
> > winbind, which will continue returning NSS_STATUS_UNAVAIL when appropriate.
> > This may be *good enough* for now, since only 'sss' is part of the default
> > nsswitch configuration on Fedora.
> 
> You are not lying. This is exactly the return you must give.
> 
> Consider the case where SSSD is in /etc/nsswich.conf with no daemon started.
> In that case you also get the same problem and must return
> NSS_STATUS_NOTFOUND. You are not connected to a data source and have no data
> and so there is no answer. The module must operate correctly independent of
> the data source.

OK, the documentation prior to https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d4e301c5c65393837e438b6d81feabfbfde7b9c7;hp=23256f5ed889266223380c02b2750d19e3fe666b was very ambiguous on this point. SSSD (and others) generally return NSS_STATUS_UNAVAIL for the cases where SSSD has been configured but is not running. My understanding now is that NSS_STATUS_UNAVAIL really should only ever be returned in true *error* cases and that if SSSD or another service is functioning normally it should be returning NSS_STATUS_NOTFOUND.

This probably needs a new bug opened against SSSD (and winbind... and nss-pam-ldapd...)

Thank you for providing a clarifying note in the glibc documentation, Carlos.

Comment 32 Thomas Bétrancourt 2016-02-03 14:29:02 UTC

Hello,

Any chance that this fix will also fix useradd -r issue?

Currently, useradd myuser takes about 5 seconds and useradd -r myuser about 6-7 minutes...

Note You need to log in before you can comment on or make changes to this bug.