Bug 798655 - Password logins failing due to a process with high UID
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.2
Hardware: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assigned To: Stephen Gallagher
QA Contact: Kaushik Banerjee
Reported: 2012-02-29 09:09 EST by Mike
Modified: 2015-03-30 09:47 EDT (History)
Fixed In Version: sssd-1.8.0-11.el6
Doc Type: Bug Fix
Doc Text: No documentation required
Last Closed: 2012-06-20 07:55:20 EDT


Attachments:
Tool to reproduce the issue (357 bytes, text/plain), 2012-02-29 10:46 EST, Stephen Gallagher

Description Mike 2012-02-29 09:09:44 EST
Description of problem:

I'm not sure if this is a bug in the kernel or an issue with sssd, but the problem is exhibited in sssd, so I'm starting there.  Please reassign as necessary.

When I log into a system with a password using Kerberos authentication, the first attempt succeeds, but subsequent attempts fail (once a ccache entry exists).  The failure is in get_uid_from_pid() (find_uid.c), specifically in the call to strtouint32(): while looping through processes and checking the Uid line in /proc/<pid>/status, it encounters a UID of -1.

        num = strtouint32(p, &endptr, 10);
        error = errno;
        if (error != 0) {
            DEBUG(1, ("strtol failed [%s].\n", strerror(error)));
            return error;
        }
                                          

(Tue Feb 28 14:44:46 2012) [sssd[be[EMPLOYEES]]] [get_uid_from_pid] (1): strtol failed [Numerical result out of range].
(Tue Feb 28 14:44:46 2012) [sssd[be[EMPLOYEES]]] [get_active_uid_linux] (1): get_uid_from_pid failed.
(Tue Feb 28 14:44:46 2012) [sssd[be[EMPLOYEES]]] [check_if_uid_is_active] (1): get_uid_table failed.
(Tue Feb 28 14:44:46 2012) [sssd[be[EMPLOYEES]]] [check_if_ccache_file_is_used] (1): check_if_uid_is_active failed.
(Tue Feb 28 14:44:46 2012) [sssd[be[EMPLOYEES]]] [krb5_auth_send] (1): check_if_ccache_file_is_used failed.
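
The ERANGE at the top of this chain can be reproduced outside SSSD. The following is a minimal sketch, not SSSD's actual strtouint32() implementation: strict_strtouint32() is a hypothetical stand-in that parses through a signed 64-bit intermediate and rejects anything outside 0..UINT32_MAX, which is the behavior the log suggests.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a strict unsigned 32-bit parser.
 * This is NOT the SSSD implementation; it only mimics the observed
 * behavior: a value outside [0, UINT32_MAX] sets errno to ERANGE. */
static uint32_t strict_strtouint32(const char *nptr, char **endptr, int base)
{
    long long val;

    errno = 0;
    val = strtoll(nptr, endptr, base);
    if (errno != 0) {
        return UINT32_MAX;      /* strtoll itself reported an error */
    }
    if (val < 0 || val > (long long)UINT32_MAX) {
        errno = ERANGE;         /* the "out of range" case from the log */
        return UINT32_MAX;
    }
    return (uint32_t)val;
}
```

With this helper, parsing "500" leaves errno at 0, while parsing "-1" leaves errno set to ERANGE. On glibc, strerror(ERANGE) is the "Numerical result out of range" text seen in the log.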



It encounters a Uid of -1 because an nrpe process is running with a UID of 2^32 - 1 (4294967295), which as far as I can tell is an acceptable UID since it fits in the unsigned 32-bit range.  For that process, /proc/<pid>/status shows -1 instead of 4294967295.

[root@host tmp]$ ps -ef | grep nrpe
4294967295 32590   1  0 Feb28 ?        00:00:01 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d

[root@host tmp]$ grep ^Uid /proc/32590/status
Uid:    -1      -1      -1      -1
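
The two outputs above are consistent with one 32-bit pattern being printed two ways. A small sketch of that effect follows. The function names are illustrative, and the idea that this kernel prints the field with a signed conversion is inferred from the output, not confirmed from the kernel source.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a 32-bit UID through a signed "%d" conversion.
 * Converting a value above INT32_MAX to int32_t is implementation-defined
 * in ISO C; on the two's-complement platforms Linux runs on, it wraps. */
static int format_uid_signed(uint32_t uid, char *buf, size_t len)
{
    return snprintf(buf, len, "%d", (int)(int32_t)uid);
}

/* Format the same value through an unsigned "%u" conversion. */
static int format_uid_unsigned(uint32_t uid, char *buf, size_t len)
{
    return snprintf(buf, len, "%u", (unsigned int)uid);
}
```

Passing 4294967295 to format_uid_signed() yields the string "-1", while format_uid_unsigned() yields "4294967295", matching the /proc and ps output respectively.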



Version-Release number of selected component (if applicable):

kernel-2.6.32-220.el6.x86_64
sssd-1.5.1-66.el6_2.3.x86_64

How reproducible:

Steps to Reproduce:
1. Run a process with a UID of 2^32-1
2. While using Kerberos for authentication, log in to the host twice
  
Actual results:
Login fails.


Expected results:
Login succeeds.
Comment 2 Mike 2012-02-29 09:48:23 EST
I checked the nrpe source, and it defaults to calling setuid(-1) when it drops privileges and the 'nrpe' user (more specifically, the nrpe_user as defined in nrpe.cfg) doesn't exist on the system.  So the -1 in /proc/<pid>/status makes sense.
Comment 3 Stephen Gallagher 2012-02-29 10:11:02 EST
Ok, the problem here is that SSSD assumes that UIDs are printed as unsigned 32-bit integers, but /proc/<pid>/status is actually showing this one as a *signed* 32-bit integer.

What's happening is that we're using strtouint32(), which internally converts the string to a signed long long and then checks that it's > 0.

Apparently we were working under a faulty assumption that UIDs were guaranteed to be positive. I'll switch this conversion to use strtoint32() instead of strtouint32() (and then cast the result to uint32_t).

Thanks for the bug report!
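
The approach described in this comment (parse as signed, then cast to unsigned) can be sketched as follows. This is an illustration under stated assumptions, not the actual SSSD patch: uid_from_status_text() is a hypothetical helper, and it accepts both the signed and the unsigned spelling of a 32-bit UID, which is slightly more permissive than a plain signed 32-bit parse.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract the real UID from the text of a
 * /proc/<pid>/status file. Returns 0 on success or an errno value. */
static int uid_from_status_text(const char *status, uint32_t *uid)
{
    static const char key[] = "\nUid:";
    const char *p;
    char *endptr;
    long long val;

    /* Accept the field at the start of the buffer or after a newline. */
    if (strncmp(status, "Uid:", 4) == 0) {
        p = status + 4;
    } else {
        p = strstr(status, key);
        if (p == NULL) {
            return ENOENT;
        }
        p += sizeof(key) - 1;
    }

    errno = 0;
    val = strtoll(p, &endptr, 10);  /* skips whitespace, accepts '-' */
    if (errno != 0) {
        return errno;
    }
    if (endptr == p) {
        return EINVAL;              /* no digits found */
    }
    /* Accept both spellings of a 32-bit UID: signed and unsigned. */
    if (val < INT32_MIN || val > (long long)UINT32_MAX) {
        return ERANGE;
    }
    *uid = (uint32_t)val;           /* -1 becomes 4294967295 */
    return 0;
}
```

Given a status buffer containing "Uid:\t-1\t-1\t-1\t-1", the helper returns 0 and stores 4294967295, so the nrpe process from the original report no longer aborts the scan.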
Comment 4 Stephen Gallagher 2012-02-29 10:18:03 EST
Upstream ticket:
https://fedorahosted.org/sssd/ticket/1216
Comment 5 Stephen Gallagher 2012-02-29 10:46:56 EST
Created attachment 566574 [details]
Tool to reproduce the issue

You can use the attached file nobody.c to reproduce this issue.

Build it with:
gcc -o nobody nobody.c

To run it:
setenforce 0
./nobody

If it works, you will see a message telling you that it's going into an infinite loop.


So, to reproduce this issue:

1) Configure SSSD for Kerberos auth
2) Start SSSD (do not start ./nobody until later)
3) Log in online with a Kerberos user
4) Start the "nobody" tool
5) Try to restart SSSD

Actual results:
SSSD fails to start completely, and the following log message appears in sssd_DOMAIN.log:
(Tue Feb 28 14:44:46 2012) [sssd[be[DOMAIN]]] [get_uid_from_pid] (1): strtol failed [Numerical result out of range].

Expected results:
SSSD should start as expected.


I wasn't able to duplicate the original situation where the login would fail (might be due to differences between SSSD on RHEL 6.2 and 6.3), but the same behavior causes issues with restart, which would cause an outage if the monitor had to restart the sssd_be process.
Comment 8 Kaushik Banerjee 2012-05-02 14:35:44 EDT
Verified in version sssd-1.8.0-25


Output of Beaker automation run:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: verify bz 798655
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [   PASS   ] :: Running '> /var/log/sssd/sssd_LDAP-KRB5.log'
Stopping sssd: [  OK  ]
:: [   PASS   ] :: Running 'service sssd stop'
:: [   PASS   ] :: Running 'rm -fr /var/lib/sss/db/*.ldb'
Starting sssd: [  OK  ]
[  OK  ]
:: [   PASS   ] :: Running 'service sssd start'
:: [   PASS   ] :: napping for 5 secs...
:: [   PASS   ] :: Running 'restart_clearing_cache'
spawn ssh -q -l puser1 localhost echo 'login successful'
puser1@localhost's password: 
login successful
:: [   PASS   ] :: Authentication successful, as expected
:: [   PASS   ] :: Running 'auth_success puser1 12345678'
:: [   PASS   ] :: Running 'gcc -o /root/nobody /root/nobody.c'
:: [   PASS   ] :: Running '/root/nobody &'
spawn ssh -q -l puser1 localhost echo 'login successful'
puser1@localhost's password: 
login successful
:: [   PASS   ] :: Authentication successful, as expected
:: [   PASS   ] :: Running 'auth_success puser1 12345678'
:: [   PASS   ] :: File '/var/log/sssd/sssd_LDAP-KRB5.log' should not contain 'strtol failed \[Numerical result out of range\]'
./bugzilla-automation.sh: line 257: 21804 Killed                  /root/nobody
Stopping sssd: [  OK  ]
:: [   PASS   ] :: Running 'service sssd stop'
:: [   PASS   ] :: Running 'rm -fr /var/lib/sss/db/*.ldb'
Starting sssd: [  OK  ]
[  OK  ]
:: [   PASS   ] :: Running 'service sssd start'
:: [   PASS   ] :: napping for 5 secs...
:: [   PASS   ] :: Running 'restart_clearing_cache'
'6cf818f6-cb75-4699-8445-dc11feb60f90'
verify-bz-798655 result: PASS
Comment 9 Stephen Gallagher 2012-06-12 09:38:54 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No documentation required
Comment 11 errata-xmlrpc 2012-06-20 07:55:20 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0747.html
