Bug 860119

Summary: tomcat crashing when attempting to log in after enabling pam authentication
Product: Red Hat Satellite 5
Component: Server
Version: 541
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: high
Reporter: Mark Huth <mhuth>
Assignee: Jan Pazdziora <jpazdziora>
QA Contact: Martin Korbel <mkorbel>
CC: cperry, jpazdziora, mkorbel, mmraka, mzazrivec, tlestach
Keywords: Patch
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-10-01 21:58:06 UTC
Bug Blocks: 924171
Attachments:
Simpler reproducer (flags: none)
Removes the expected password prompts from PAM_conv (flags: none)

Description Mark Huth 2012-09-25 03:05:50 UTC
Description of problem:
tomcat crashing after enabling pam authentication in Satellite:

*** glibc detected *** /usr/lib/jvm/java/bin/java: double free or corruption (out): 0x00007f6c8c84d410 ***
======= Backtrace: =========
/lib64/libc.so.6[0x39bf675916]
/lib64/libc.so.6[0x39bf678443]
/lib64/security/pam_krb5.so(+0xa30e)[0x7f6c9df7a30e]
/lib64/security/pam_krb5.so(+0xafa5)[0x7f6c9df7afa5]
/lib64/security/pam_krb5.so(pam_sm_authenticate+0x2af)[0x7f6c9df741df]
/lib64/libpam.so.0[0x342d202cee]
/lib64/libpam.so.0(pam_authenticate+0x40)[0x342d202600]
/usr/lib/libjpam.so(Java_net_sf_jpam_Pam_authenticate+0x17a)[0x7f6d301fc1f6]
/usr/lib/jvm/java-1.6.0-ibm-1.6.0.9.1.x86_64/jre/lib/amd64/default/libj9vm24.so(+0x2a2a3)[0x7f6d4e7462a3]

Version-Release number of selected component (if applicable):
Satellite 5.4.1 (haven't tried on other versions including 5.5)

How reproducible:
Always

Steps to Reproduce:
1. Set up Satellite to use pam authentication.  Note, it doesn't matter which pam module is used for authentication, e.g. pam_krb5 or pam_unix.
2. Use a non-English locale, e.g. LANG="de_DE.UTF-8" in /etc/sysconfig/i18n
3. Restart tomcat after setting up the new locale
4. Try to log in via the WebUI with a Satellite user configured for pam auth

Actual results:
The first login will always fail, even if the correct password is specified, but tomcat doesn't crash.  However, when you try to log in again, tomcat crashes.

Expected results:
tomcat doesn't crash

Additional info:
This is only reproducible when using a non-English locale, e.g. German or French.

Comment 1 Mark Huth 2012-09-25 03:08:38 UTC
Cores are on 10.64.0.122 (root/redhat) in /root/00698552

# gdb -c core.20120917.072819.11504.0001.dmp /usr/lib/jvm/java/bin/java
(gdb) bt
...
#9  0x00000039bf6328a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#10 0x00000039bf634085 in abort () at abort.c:92
#11 0x00000039bf66ffe7 in __libc_message (do_abort=2, fmt=0x39bf7577c0 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#12 0x00000039bf675916 in malloc_printerr (action=3, str=0x39bf757b00 "double free or corruption (out)", ptr=<value optimized out>) at malloc.c:6311
#13 0x00000039bf678443 in _int_free (av=0x39bf98ee80, p=0x7fe918079570, have_lock=0) at malloc.c:4811
#14 0x00007fe94239b30e in _pam_krb5_maybe_free_responses (responses=0x7fe918079410, n_responses=1) at prompter.c:63
#15 0x00007fe94239bfa5 in _pam_krb5_prompt_for (pamh=<value optimized out>, prompt=<value optimized out>, response=0x7fe9d4699600) at prompter.c:540
#16 0x00007fe9423951df in pam_sm_authenticate (pamh=0x7fe91805dae0, flags=0, argc=1, argv=<value optimized out>) at auth.c:132
#17 0x000000342d202cee in _pam_dispatch_aux (pamh=0x7fe91805dae0, flags=0, choice=1) at pam_dispatch.c:110
#18 _pam_dispatch (pamh=0x7fe91805dae0, flags=0, choice=1) at pam_dispatch.c:407
#19 0x000000342d202600 in pam_authenticate (pamh=0x7fe91805dae0, flags=0) at pam_auth.c:34
#20 0x00007fe9d44591f6 in Java_net_sf_jpam_Pam_authenticate (pEnv=0x7fe92401e700, pObj=0x7fe918009c80, pServiceName=0x7fe918009c78, pUsername=0x7fe918009c70, pPassword=0x7fe918009c68, debug=0 '\000') at Pam.c:267
#21 0x00007fe9f2ae92a3 in VMprJavaSendNative () from /usr/lib/jvm/java-1.6.0-ibm-1.6.0.9.1.x86_64/jre/lib/amd64/default/libj9vm24.so
...
(gdb) frame 14
#14 0x00007fe94239b30e in _pam_krb5_maybe_free_responses (responses=0x7fe918079410, n_responses=1) at prompter.c:63
63					xstrfree(responses[i].resp);
(gdb) p responses
$1 = (struct pam_response *) 0x7fe918079410
(gdb) p responses[0]
$2 = {resp = 0x7fe918079580 "", resp_retcode = 403150080}
(gdb) p responses[0].resp
$3 = 0x7fe918079580 ""
(gdb) p/c responses[0].resp
$4 = 128 '\200'
(gdb) p/c responses[0].resp[0]
$5 = 0 '\000'

Comment 6 Mark Huth 2012-09-27 02:04:08 UTC
Yep, the problem is present on RHEL5 too.  Simply change the locale to something other than en_US, eg de_DE, restart tomcat, try to login via the WebUI and boom!

*** glibc detected *** /usr/lib/jvm/java/bin/java: free(): invalid pointer: 0x0000003937f52a38 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3937c711df]
/lib64/libc.so.6(cfree+0x4b)[0x3937c7163b]
/lib64/security/pam_krb5.so[0x2aaacc654dad]
/lib64/security/pam_krb5.so[0x2aaacc654f03]
/lib64/security/pam_krb5.so(pam_sm_authenticate+0x372)[0x2aaacc650e72]
/lib64/libpam.so.0(_pam_dispatch+0x277)[0x393b402dc7]
/lib64/libpam.so.0(pam_authenticate+0x42)[0x393b4026d2]
/usr/lib/libjpam.so(Java_net_sf_jpam_Pam_authenticate+0x149)[0x2aaac39a816f]
/usr/lib/jvm/java-1.6.0-ibm-1.6.0.9.1.x86_64/jre/lib/amd64/default/libj9vm24.so[0x2aaaaacfd2a3]

Going to try using sssd...

Comment 7 Mark Huth 2012-10-03 20:45:52 UTC
Created attachment 621149 [details]
Simpler reproducer

This is a simpler reproducer than going via the Satellite WebUI.  Found it attached to https://bugzilla.redhat.com/show_bug.cgi?id=219916 "RHN Satellite and pam hangs when accounts have password expired" and made some small modifications to it.

To test with the reproducer:

1) Set up pam authentication on a Satellite (my test Satellite was 5.4.1 on RHEL6 64bit) and use pam_unix in /etc/pam.d/rhn-satellite, like so:

[root@rhel6sat ~]# cat /etc/pam.d/rhn-satellite
auth        required      pam_env.so
auth        sufficient    pam_unix.so
auth        required      pam_deny.so
account     required      pam_unix.so

2) Create a testuser
# useradd testuser
# passwd testuser 
... and make the password redhat

3) Download SimplePam.java onto the Satellite

4) Compile the reproducer
# javac -extdirs /usr/share/java SimplePam.java 

5) Run the reproducer with an English locale and it works fine:
# LC_ALL=en_US java -cp .:/usr/share/java/* SimplePam

<output>
Current locale = en_US
Logging start
log4j:WARN No appenders could be found for logger (net.sf.jpam.Pam).
log4j:WARN Please initialize the log4j system properly.
service_name is rhn-satellite
password is redhat
username is testuser
Trying to get a handle to the PAM service...
...Service handle was created.
Trying to see if the user is a valid system user...
...User testuser is a real user.
Trying to pass info to the pam_acct_mgmt function...
...User testuser is permitted access.
LOGIN SUCCESSFUL
</output>

6) Run the reproducer with non-English locale and it falls over:
# LC_ALL=de_DE java -cp .:/usr/share/java/* SimplePam

<output>
Current locale = de_DE
Logging start
log4j:WARN No appenders could be found for logger (net.sf.jpam.Pam).
log4j:WARN Please initialize the log4j system properly.
service_name is rhn-satellite
password is redhat
username is testuser
Trying to get a handle to the PAM service...
...Service handle was created.
Trying to see if the user is a valid system user...
*** glibc detected *** java: free(): invalid pointer: 0x00007f5c08000078 ***
======= Backtrace: =========
/lib64/libc.so.6[0x36f7a75916]
/lib64/security/pam_unix.so(+0x4e14)[0x7f5be078ce14]
/lib64/security/pam_unix.so(pam_sm_authenticate+0x1f3)[0x7f5be078b353]
/lib64/libpam.so.0[0x36fe602cee]
/lib64/libpam.so.0(pam_authenticate+0x40)[0x36fe602600]
/usr/lib/libjpam.so(Java_net_sf_jpam_Pam_authenticate+0x17a)[0x7f5be0fc71f6]
/usr/lib/jvm/java-1.6.0-ibm-1.6.0.9.1.x86_64/jre/lib/amd64/default/libj9vm24.so(+0x2a2a3)[0x7f5c0f0bd2a3]
======= Memory map: ========
...
</output>

Comment 8 Mark Huth 2012-10-04 02:14:52 UTC
Looks like using sssd might be a workaround.  I just have to work out how to use sssd now.

[root@rhel6sat ~]# cat /etc/pam.d/rhn-satellite
auth        required      pam_env.so
auth        sufficient    pam_sss.so
auth        required      pam_deny.so
account     required      pam_sss.so

[root@rhel6sat ~]# LC_ALL=de_DE java -cp .:/usr/share/java/* SimplePam
Current locale = de_DE
Logging start
2012-10-04 12:03:01,323 [main] DEBUG net.sf.jpam.Pam - Debug mode active.
service_name is rhn-satellite
password is redhat
username is sssduser
Trying to get a handle to the PAM service...
...Service handle was created.
Trying to see if the user is a valid system user...
...Failed to authenticate for an unknown error: 7
...cs_password error: User sssduser is not authenticated
...Call returned with error: 7
LOGIN UNSUCCESSFUL

I wasn't able to log in (coz I haven't figured out how to set up sssd yet) but at least the JVM didn't crash this time when using a non-English locale.  This also seems to show the problem isn't in Satellite but in one of the underlying components.

I'll continue to try to get sssd working and will close this out if the reproducer and the Satellite WebUI don't crash when using sssd.

Comment 10 Mark Huth 2012-10-04 20:27:20 UTC
I found that this statement in my previous update is incorrect:

In all the stacktraces I've seen, the same error is reported:
*** glibc detected *** java: free(): invalid pointer: 0x00007f6714000078 ***
... with the same pointer address causing the problem. 

The pointer addresses are similar but are NOT the same.  I wasn't looking closely enough at the middle few characters.

Comment 11 Mark Huth 2012-10-05 05:12:59 UTC
Ok, I've identified the problem and have a resolution for it.  It may not be the best resolution yet and could probably do with some fine-tuning but it works in my testing.

Ok, first for the problem and yes, it was in the jpam code.  After tracing the reproducer via gdb I was able to see where the problem was happening.

The problem is in jpam-0.4/src/c/Pam.c in PAM_conv

The jpam code has built-in password prompts it expects to receive back from pam(?) when receiving the password:

 97 /* Expected prompts on pam_authenticate / pam_conv */
 98 #define PASS_PROMPT_SECUREID    "Enter PASSCODE: "
 99 #define PASS_PROMPT_DEFAULT "Password: "
100 #define PASS_PROMPT_MACOSX  "Password:"
..

And they are used here in PAM_conv:

156     /* jpam does not yet support password changing, we just copy password */
157     case PAM_PROMPT_ECHO_OFF:
158     case PAM_PROMPT_ECHO_ON:
159         /* Get a prompt, fill the password */
160         if (    (! strcmp(msg[replies]->msg, PASS_PROMPT_SECUREID)) ||
161             (! strcmp(msg[replies]->msg, PASS_PROMPT_DEFAULT)) ||
162             (! strcmp(msg[replies]->msg, PASS_PROMPT_MACOSX)) )
163         {
164             if (debug)
165                 printf("***Sending password\n");
166             reply[replies].resp = COPY_STRING(password);
167         }
168         break;

So if the prompt returned from pam matches any of the expected password prompts then we enter lines 164->166.  Line 166 is where we make a copy of the password to send back as the response.  However, if we don't receive a password prompt that matches one of the expected prompts then we don't copy the password and reply[replies].resp is left uninitialized.  reply then gets copied to resp:

188    *resp = reply;

and then later in do_pam_conversation in src/sss_client/pam_sss.c (which is what calls PAM_conv) we try to free resp[0].resp:

395	                    answer = strndup(resp[0].resp, MAX_AUTHTOK_SIZE);
396	                    _pam_overwrite((void *)resp[0].resp);
397	                    free(resp[0].resp); 

And since resp[0].resp was never initialized we get the invalid pointer error!

(gdb) p/x resp[0]
$20 = {resp = 0x7f45c8000078, resp_retcode = 0xc8000078}
...
*** glibc detected *** java: free(): invalid pointer: 0x00007f45c8000078 ***
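
For illustration only, here is a minimal sketch of one defensive pattern. It is not the patch attached to this bug. If the reply array is zero-filled when it is allocated, an unanswered slot holds a NULL resp, and free(NULL) is a no-op, so the caller's free() cannot trip over garbage. struct fake_response, alloc_replies and free_replies are hypothetical stand-ins (the real type is struct pam_response), and how jpam actually allocates reply is not shown above, so treating it as an uncleared allocation is an assumption.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct pam_response, so the sketch builds without libpam. */
struct fake_response {
    char *resp;
    int   resp_retcode;
};

/* Allocate a zero-filled reply array: every resp starts out as NULL
 * (assuming, as on common platforms, that all-bits-zero is a null pointer). */
struct fake_response *alloc_replies(int num_msg)
{
    if (num_msg <= 0)
        return NULL;
    return calloc((size_t)num_msg, sizeof(struct fake_response));
}

/* Mimics what a PAM module does with the replies: free each resp, then the array. */
void free_replies(struct fake_response *replies, int num_msg)
{
    int i;

    if (replies == NULL)
        return;
    for (i = 0; i < num_msg; i++) {
        free(replies[i].resp);   /* safe even when resp is still NULL */
        replies[i].resp = NULL;
    }
    free(replies);
}
```

With an allocation that is not cleared, the same free_replies loop would hand whatever bytes happened to be in resp to free(), which is consistent with the "invalid pointer" abort above.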

Now, why wouldn't the password prompt be the same as one of the expected password prompts?  When it is in a language other than English, that's when!

If the locale is fr_FR the password prompt returned is:
(gdb) p	msg[replies].msg
$15 = 0x7f3e20019950 "Mot de passe\240:	"

... and for es_ES:
(gdb) p msg[0].msg
$2 = 0x7fb6247aebb0 "Contrase\361a: "

And thus we fail these 3 tests:
160         if (    (! strcmp(msg[replies]->msg, PASS_PROMPT_SECUREID)) ||
161             (! strcmp(msg[replies]->msg, PASS_PROMPT_DEFAULT)) ||
162             (! strcmp(msg[replies]->msg, PASS_PROMPT_MACOSX)) )
... and don't copy the password, end up with an uninitialized pointer and then get the invalid pointer error when trying to free it.
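
The failing comparison can be sketched in isolation. The three prompt macros are quoted from the Pam.c excerpt above; is_expected_prompt is a hypothetical helper written for this illustration and does not exist in jpam.

```c
#include <string.h>

/* Prompt strings quoted from jpam's Pam.c. */
#define PASS_PROMPT_SECUREID "Enter PASSCODE: "
#define PASS_PROMPT_DEFAULT  "Password: "
#define PASS_PROMPT_MACOSX   "Password:"

/* Returns 1 if the prompt is one of the three hard-coded English strings, else 0. */
int is_expected_prompt(const char *prompt)
{
    return strcmp(prompt, PASS_PROMPT_SECUREID) == 0 ||
           strcmp(prompt, PASS_PROMPT_DEFAULT)  == 0 ||
           strcmp(prompt, PASS_PROMPT_MACOSX)   == 0;
}
```

Calling is_expected_prompt("Password: ") returns 1, while a localized prompt such as "Contrase\361a: " returns 0, so the password copy guarded by this test would be skipped.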

I removed these 3 lines from the code and lo and behold it all worked.  I tested with the reproducer and with Satellite WebUI logins and it worked for the non-English locales I tested (fr, de and es) and for the pam auth modules I tested (pam_unix, pam_krb5 and pam_sss).  By worked I mean the JVM didn't crash.
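
A rough sketch of the idea behind that change, under stated assumptions: every echo-on/echo-off prompt gets a copy of the password, whatever the prompt text says. This is not the actual patch. The fake_* types and constants and the answer_prompts function are hypothetical stand-ins so the example builds without libpam, and the message list is a flat array rather than the pointer array that a real PAM conversation function receives.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the libpam message styles and structs. */
enum { FAKE_PROMPT_ECHO_OFF = 1, FAKE_PROMPT_ECHO_ON = 2, FAKE_TEXT_INFO = 4 };

struct fake_message  { int msg_style; const char *msg; };
struct fake_response { char *resp; int resp_retcode; };

/*
 * Answer every echo-on/echo-off prompt with a copy of the password,
 * ignoring the prompt text. Returns 0 on success, -1 on bad arguments
 * or allocation failure. On success the caller owns *out and each resp.
 */
int answer_prompts(int num_msg, const struct fake_message *msgs,
                   struct fake_response **out, const char *password)
{
    struct fake_response *reply;
    int i;

    if (num_msg <= 0 || msgs == NULL || out == NULL || password == NULL)
        return -1;

    /* Zero-filled, so slots for non-prompt messages keep resp == NULL. */
    reply = calloc((size_t)num_msg, sizeof(*reply));
    if (reply == NULL)
        return -1;

    for (i = 0; i < num_msg; i++) {
        if (msgs[i].msg_style == FAKE_PROMPT_ECHO_OFF ||
            msgs[i].msg_style == FAKE_PROMPT_ECHO_ON) {
            size_t len = strlen(password) + 1;

            reply[i].resp = malloc(len);
            if (reply[i].resp == NULL) {
                /* Undo the copies made so far before bailing out. */
                while (i-- > 0)
                    free(reply[i].resp);
                free(reply);
                return -1;
            }
            memcpy(reply[i].resp, password, len);
        }
    }

    *out = reply;
    return 0;
}
```

The trade-off is that any prompt, including one that is not asking for the password, receives the password as its answer.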

Comment 12 Mark Huth 2012-10-05 05:26:52 UTC
Created attachment 621935 [details]
Removes the expected password prompts from PAM_conv

As I mentioned this is how I was able to get this to work - that is, no JVM crashes and authentication succeeding - by simply removing the tests comparing the expected password prompts.   This may not be the best resolution but it did work in my testing.  I'm also kind of surprised - this code seems to have been in jpam (and Satellite) for a while and it doesn't seem to have caused any issues before this.

Alternately, we could just alter this page:
http://jpam.sourceforge.net/documentation/limitations.html
... and add 'doesn't support non-English locales and potentially other 3rd party pam mechanisms, eg CentrifyDC that don't return an expected password prompt'.  

Just kidding by the way on that last suggestion.

Thoughts?

Comment 17 Jan Pazdziora 2012-10-17 05:47:14 UTC
Making bugzilla public.

Comment 18 Jan Pazdziora 2012-10-17 09:17:15 UTC
Patch applied to Spacewalk master, 38970f47172edaa9fb93ed9a103a7eb93d3639ad.

Comment 22 Clifford Perry 2013-10-01 21:58:06 UTC
Satellite 5.6 has been released. This bug was tracked under the release.  

This bug was either VERIFIED or RELEASE_PENDING (re-verified shortly
before release). 

Moving to CLOSED CURRENT_RELEASE. 

Text from Upgrade Erratum follows:

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2013-1395.html