Bug 1700849

Summary:	gssproxy-0.8.1-11.fc31 crashed
Product:	[Fedora] Fedora	Reporter:	Lukas Slebodnik <lslebodn>
Component:	gssproxy	Assignee:	Robbie Harwood <rharwood>
Status:	CLOSED RAWHIDE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	rawhide	CC:	abokovoy, gdeschner, jpazdziora, rharwood, ssorce
Target Milestone:	---	Keywords:	Regression
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	gssproxy-0.8.2-1.fc31	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-04-18 15:33:55 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Lukas Slebodnik 2019-04-17 13:24:12 UTC

Description of problem:
I could see some random failures in installation of freeIPA server.
And it started with the new release of gssproxy

I assume it might be related to following crasd

Version-Release number of selected component (if applicable):
sh$ rpm -q gssproxy
gssproxy-0.8.1-11.fc31.x86_64

Steps to Reproduce:
I am not sure hot to trigger the crash


Additional info:
           PID: 27637 (gssproxy)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Wed 2019-04-17 15:09:18 CEST (12min ago)
  Command Line: /usr/sbin/gssproxy -D --debug-level=3
    Executable: /usr/sbin/gssproxy
 Control Group: /system.slice/gssproxy.service
          Unit: gssproxy.service
         Slice: system.slice
       Boot ID: 182b63df397a49c084973e851b70a2c8
    Machine ID: b307ff79cbff4dadb25ac31bbdb1ebaf
      Hostname: kvm-02-guest24.testrelm.test
       Storage: /var/lib/systemd/coredump/core.gssproxy.0.182b63df397a49c084973e851b70a2c8.27637.1555506558000000.lz4
       Message: Process 27637 (gssproxy) of user 0 dumped core.
                
                Stack trace of thread 27642:
                #0  0x00007fe6b244bea5 raise (libc.so.6)
                #1  0x00007fe6b24368d5 abort (libc.so.6)
                #2  0x00007fe6b248eed7 __libc_message (libc.so.6)
                #3  0x00007fe6b24957cc malloc_printerr (libc.so.6)
                #4  0x00007fe6b24984ac _int_malloc (libc.so.6)
                #5  0x00007fe6b2499d5f malloc (libc.so.6)
                #6  0x000055836f6ffc14 gp_export_gssx_cred (gssproxy)
                #7  0x000055836f701adf gp_accept_sec_context (gssproxy)
                #8  0x000055836f6fdd1d gp_rpc_process_call (gssproxy)
                #9  0x000055836f6f6adc gp_worker_main (gssproxy)
                #10 0x00007fe6b25df4c0 start_thread (libpthread.so.0)
                #11 0x00007fe6b250e933 __clone (libc.so.6)
                
                Stack trace of thread 27637:
                #0  0x00007fe6b250ec6e epoll_wait (libc.so.6)
                #1  0x00007fe6b1528ee7 n/a (libev.so.4)
                #2  0x00007fe6b152b292 ev_run (libev.so.4)
                #3  0x000055836f6f3194 main (gssproxy)
                #4  0x00007fe6b2437f73 __libc_start_main (libc.so.6)
                #5  0x000055836f6f327e _start (gssproxy)
                
                Stack trace of thread 27638:
                #0  0x00007fe6b25e5515 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x000055836f6f6a43 gp_worker_main (gssproxy)
                #2  0x00007fe6b25df4c0 start_thread (libpthread.so.0)
                #3  0x00007fe6b250e933 __clone (libc.so.6)
                
                Stack trace of thread 27639:
                #0  0x00007fe6b25e5515 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x000055836f6f6a43 gp_worker_main (gssproxy)
                #2  0x00007fe6b25df4c0 start_thread (libpthread.so.0)
                #3  0x00007fe6b250e933 __clone (libc.so.6)


(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fe6b24368d5 in __GI_abort () at abort.c:79
#2  0x00007fe6b248eed7 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fe6b259ca98 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007fe6b24957cc in malloc_printerr (str=str@entry=0x7fe6b259eb98 "malloc(): invalid size (unsorted)") at malloc.c:5345
#4  0x00007fe6b24984ac in _int_malloc (av=av@entry=0x7fe6a8000020, bytes=bytes@entry=2408) at malloc.c:3748
#5  0x00007fe6b2499d5f in __GI___libc_malloc (bytes=2408) at malloc.c:3073
#6  0x000055836f6ffc14 in gp_encrypt_buffer (out=0x7fe6a806fb80, buf=0x7fe6a80a1ec0, len=<optimized out>, key=0x5583701203d0, context=0x5583700a9f40) at src/gp_export.c:239
#7  gp_export_gssx_cred (min=min@entry=0x7fe6af2d74e0, gpcall=gpcall@entry=0x7fe6af3d7960, in=in@entry=0x7fe6af2d7508, out=0x7fe6a806fb10) at src/gp_export.c:392
#8  0x000055836f701adf in gp_accept_sec_context (gpcall=0x7fe6af3d7960, arg=0x7fe6af2d7730, res=0x7fe6af2d7690) at src/gp_rpc_accept_sec_context.c:146
#9  0x000055836f6fdd1d in gp_rpc_execute (res=0x7fe6af2d7690, arg=0x7fe6af2d7730, proc=9, gpcall=0x7fe6af3d7960) at src/gp_rpc_process.c:321
#10 gp_rpc_process_call (gpcall=gpcall@entry=0x7fe6af3d7960, inbuf=0x5583700b0550 "_\302\062+", inlen=2752, outbuf=outbuf@entry=0x7fe6af3d7950, outlen=outlen@entry=0x7fe6af3d7958)
    at src/gp_rpc_process.c:381
#11 0x000055836f6f6adc in gp_handle_query (w=<optimized out>, q=0x5583700f6d40) at src/gp_workers.c:437
#12 gp_worker_main (pvt=0x5583700f4af0) at src/gp_workers.c:389
#13 0x00007fe6b25df4c0 in start_thread (arg=<optimized out>) at pthread_create.c:479
#14 0x00007fe6b250e933 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I would say that "malloc(): invalid size (unsorted)" is not typical abort message from glibc. Might be caused  by improper handling of memory in software.

Comment 1 Robbie Harwood 2019-04-17 20:28:02 UTC

Do you have a coredump I could look at here?  I'll keep looking, but if you can share one that might help.

Comment 2 Lukas Slebodnik 2019-04-18 10:36:39 UTC

(In reply to Robbie Harwood from comment #1)
> Do you have a coredump I could look at here?  I'll keep looking, but if you
> can share one that might help.

Coredump is useless; you can see it in backtrace.

(In reply to Lukas Slebodnik from comment #0)
> (gdb) bt
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007fe6b24368d5 in __GI_abort () at abort.c:79
> #2  0x00007fe6b248eed7 in __libc_message (action=action@entry=do_abort,
> fmt=fmt@entry=0x7fe6b259ca98 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
> #3  0x00007fe6b24957cc in malloc_printerr (str=str@entry=0x7fe6b259eb98
> "malloc(): invalid size (unsorted)") at malloc.c:5345
> #4  0x00007fe6b24984ac in _int_malloc (av=av@entry=0x7fe6a8000020,
> bytes=bytes@entry=2408) at malloc.c:3748
> #5  0x00007fe6b2499d5f in __GI___libc_malloc (bytes=2408) at malloc.c:3073

You can see that malloc was called from gp_encrypt_buffer and it should return
buffer with 2408 bytes. Which is not anything unusual. And should pass.

Usually, it fail in case of currupted heap/glibc metadata (double free) ...
 
> #6  0x000055836f6ffc14 in gp_encrypt_buffer (out=0x7fe6a806fb80,
> buf=0x7fe6a80a1ec0, len=<optimized out>, key=0x5583701203d0,
> context=0x5583700a9f40) at src/gp_export.c:239
> #7  gp_export_gssx_cred (min=min@entry=0x7fe6af2d74e0,
> gpcall=gpcall@entry=0x7fe6af3d7960, in=in@entry=0x7fe6af2d7508,
> out=0x7fe6a806fb10) at src/gp_export.c:392
> #8  0x000055836f701adf in gp_accept_sec_context (gpcall=0x7fe6af3d7960,
> arg=0x7fe6af2d7730, res=0x7fe6af2d7690) at
> src/gp_rpc_accept_sec_context.c:146
> #9  0x000055836f6fdd1d in gp_rpc_execute (res=0x7fe6af2d7690,
> arg=0x7fe6af2d7730, proc=9, gpcall=0x7fe6af3d7960) at
> src/gp_rpc_process.c:321
> #10 gp_rpc_process_call (gpcall=gpcall@entry=0x7fe6af3d7960,
> inbuf=0x5583700b0550 "_\302\062+", inlen=2752,
> outbuf=outbuf@entry=0x7fe6af3d7950, outlen=outlen@entry=0x7fe6af3d7958)
>     at src/gp_rpc_process.c:381
> #11 0x000055836f6f6adc in gp_handle_query (w=<optimized out>,
> q=0x5583700f6d40) at src/gp_workers.c:437
> #12 gp_worker_main (pvt=0x5583700f4af0) at src/gp_workers.c:389
> #13 0x00007fe6b25df4c0 in start_thread (arg=<optimized out>) at
> pthread_create.c:479
> #14 0x00007fe6b250e933 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment 3 Simo Sorce 2019-04-18 12:49:58 UTC

This is a bug in a patch we rushed through to fix some other issue :-/
The problem should be eliminated by the following PR that completely changes the code that causes this issue:
https://pagure.io/gssproxy/pull-request/246#

Comment 4 Lukas Slebodnik 2019-04-18 13:42:26 UTC

(In reply to Simo Sorce from comment #3)
> This is a bug in a patch we rushed through to fix some other issue :-/
> The problem should be eliminated by the following PR that completely changes
> the code that causes this issue:
> https://pagure.io/gssproxy/pull-request/246#

and because it is in official release we might even consider it as a CVE (index out of bound)

Comment 5 Simo Sorce 2019-04-18 15:09:28 UTC

No it is just a crash bug, the issue is not exploitable by an attacker, as the buffer is encrypted in a key controlled by gssproxy itself.

Comment 6 Robbie Harwood 2019-04-18 15:15:34 UTC

A coredump isn't useless - it would have told me what the value of `len` actually is, which is what I immediately wanted to know.  And Simo's right that it's just a crash.

Comment 7 Lukas Slebodnik 2019-04-18 15:28:25 UTC

(In reply to Simo Sorce from comment #5)
> No it is just a crash bug, the issue is not exploitable by an attacker, as
> the buffer is encrypted in a key controlled by gssproxy itself.

Are you sure?

 387     ret_maj = gss_export_cred(&ret_min, *in, &token);
 388     if (ret_maj) {
 389         goto done;
 390     }
 391 
 392     ret = gp_encrypt_buffer(handle->context, handle->key,
 393                             token.length, token.value,
 394                             &out->cred_handle_reference);

I am not very familiar with gssapi but I assume user might control content of
token. Or is there some checking (CRC, ...) in gss_export_cred?

 199 static int gp_encrypt_buffer(krb5_context context, krb5_keyblock *key,
 200                              size_t len, void *buf, octet_string *out)
 201 {
 202     int ret;
 203     krb5_data data_in;
 204     krb5_enc_data enc_handle;
 205     size_t cipherlen;
 206     char *packed = NULL;
 207     uint32_t netlen;
 208 
 209     if (len > (uint32_t)(-1)) {
 210         /* Needs to fit in 4 bytes of payload, so... */
 211         ret = ENOMEM;
 212         goto done;
 213     }
 214 
 215     packed = malloc(len);
 216     if (!packed) {
 217         ret = errno;
 218         goto done;
 219     }
 220 
 221     netlen = htonl(len);
 222     memcpy(packed, (uint8_t *)&netlen, 4);
 223     memcpy(packed + 4, buf, len);

And token is directly passed to memcpy.

Sure it is a crashes (sometimes :-). But seeing error from glibc "malloc(): invalid size (unsorted)" makes me wonder
whether it is just a "crash" bug. Fortunately, such version is just in rawhide

Comment 8 Lukas Slebodnik 2019-04-18 15:29:54 UTC

(In reply to Robbie Harwood from comment #6)
> A coredump isn't useless - it would have told me what the value of `len`
> actually is, which is what I immediately wanted to know.  And Simo's right
> that it's just a crash.

It was useless because you could see the same info in the description of this bug.
I would recommend to read description more carefully next time.

Comment 9 Simo Sorce 2019-04-18 15:43:19 UTC

Just for the record, the user cannot manipulate the token.

The token is generated by gssproxy itself, then encrypted and handed to the user for storage.
It's then returned by the user to gssproxy in encrypted form, where decryption happens.

The process is fully internal, only storage is deferred to users, but they do not have the key to manipulate the contents.
At most what a userspace program can do is send random data, this generally will end up in a decryption error.

Comment 10 Lukas Slebodnik 2019-04-18 15:59:29 UTC

(In reply to Simo Sorce from comment #9)
> Just for the record, the user cannot manipulate the token.
> 
> The token is generated by gssproxy itself, then encrypted and handed to the
> user for storage.
> It's then returned by the user to gssproxy in encrypted form, where
> decryption happens.
> 
> The process is fully internal, only storage is deferred to users, but they
> do not have the key to manipulate the contents.
> At most what a userspace program can do is send random data, this generally
> will end up in a decryption error.

OK, thank you for explanation and fixing the bug.
PR works for me.