We have a number of network printers shared through a print server running F21. At our school, staff are allowed to print, but students aren't. Staff are members of a group called 'allstaff'. In cups, we've configured the printers to default to deny and AllowUser @allstaff. The print server uses sssd to connect to a FreeIPA server that has all the user information. There are currently 388 members of 'allstaff'. Most of the time, everything works correctly, but, occasionally members of 'allstaff' are denied access to the printers with the error message 'Denying user "xxxxx" access to printer "SSR"...' I've tracked the problem to cupsdCheckGroup() in scheduler/auth.c. After looking at the code, it became obvious that its method of checking whether a user is a member of a group is to iterate through every member of the group until it finds a match. The problem arises when getgrnam() occasionally returns a group struct with fewer than 388 member entries. I suspect the problem is that cups is using getgrnam() rather than getgrnam_r(), but I decided to try a patch that would approach the problem from a completely different angle. Using getgrouplist(), we check which gids the user is a member of and compare those with the gid of the group we want to check. This should be faster in the case of count(groups user is in) < count(users in group), which I believe is the more common case. If getgrouplist() isn't available, we go back to using getgrnam(). I've put the patch into production on our print server, and we've had no more invalid denials. I'm attaching two copies of the patch, one against branch-1.7 that can be applied against the latest 1.7.5 rpms and one against git master.
Created attachment 1004841 [details] Use getgrouplist to reduce lookups - git master
Created attachment 1004842 [details] Use getgrouplist to reduce lookups - 1.7
Possibly the one controversial part of the patch is this: getgrouplist() writes data to a buffer, and returns -1 while setting the value of an int passed by reference to the number of groups if the buffer isn't big enough. There is, of course, the possibility that that new value won't be correct the next time getgrouplist() is checked if the user has just been added to a group. I wasn't comfortable using a while loop because there is the (incredibly small) possibility that someone could cause a DOS by continually adding a user to new groups, but I might be overthinking things. My current solution is to try ten times and then give up, but I would happily simplify if you think I'm being ridiculously paranoid.
Ok, I've found a case in the patch where we were dereferencing a NULL pointer, and, upon review, I really don't like the for loop on getgrouplist(), so I'm posting a revised patch that is cleaner. This patch is in production here.
Created attachment 1005359 [details] Use getgrouplist to reduce lookups - git master
Created attachment 1005360 [details] Use getgrouplist to reduce lookups - 1.7
Thanks. As a first quick review, this part of the loop could be changed: + free(gids); + gids = malloc(ngroups * sizeof (gid_t)); This uses free() and malloc(), which could be combined into realloc(), but also doesn't check for integer overflow. Is getgrouplist()'s ngroup return value guaranteed not to overflow a 32-bit integer when multiplied by sizeof gid_t? I'd suggest either using calloc(), or using realloc() after checking INT_MAX / sizeof (gid_t) > ngroups.
Reported upstream for feedback on this general approach.
The reason I used free()/malloc() rather than realloc() was that this seemed the easiest way to avoid a memory leak if realloc() failed. ngroup is not guaranteed not to overflow when multiplied by sizeof(gid_t), so I've switched both malloc()'s to calloc().
Created attachment 1005556 [details] Use getgrouplist to reduce lookups - git master
Created attachment 1005557 [details] Use getgrouplist to reduce lookups - 1.7
This message is a reminder that Fedora 21 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 21. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '21'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 21 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.