Bug 1566782

Summary: memory management issue in the sssd_nss_ex interface can cause the ns-slapd process on IPA server to crash
Product: Red Hat Enterprise Linux 7 Reporter: Marc Sauton <msauton>
Component: sssd Assignee: Sumit Bose <sbose>
Status: CLOSED ERRATA QA Contact: ipa-qe <ipa-qe>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.5 CC: abokovoy, admin_eng, aheverle, anazmy, apeddire, cobrown, databases-maint, ddas, dpal, gparente, grajaiya, hkhot, jhrozek, jvilicic, knoel, ldelouw, lslebodn, michael.ward, mkosek, mreznik, mrhodes, mrichter, mzidek, nkinder, pbonzini, pbrezina, rharwood, sali, sbose, sgoveas, sumenon, tmihinto, tscherf, vashirov, vmishra
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: sssd-1.16.0-20.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1570527 (view as bug list) Environment:
Last Closed: 2018-10-30 10:42:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1570527    
Attachments:
Description    Flags
core dump      none
A test case    none
core dump      none

Comment 7 Corey Brown 2018-04-13 21:19:22 UTC
Created attachment 1421591
core dump

Comment 32 Sumit Bose 2018-04-20 07:43:50 UTC
Upstream ticket:
https://pagure.io/SSSD/sssd/issue/3715

Comment 36 Jakub Hrozek 2018-04-20 12:15:02 UTC
Created attachment 1424497
A test case

Comment 37 Jakub Hrozek 2018-04-20 12:18:59 UTC
To run the test:
 - have a user in a directory (any kind of directory is fine) who is a member of more than 10 groups
 - compile the program with:
   $ gcc test.c -lsss_nss_idmap -o test
 - run the program under valgrind and specify the user as the first argument:
   $ valgrind ./test mytestuser
 - run the test again so that the second run hits the memory cache:
   $ valgrind ./test mytestuser

With the unpatched code, you should see valgrind errors; with the patched code, you should not.

Comment 38 Jakub Hrozek 2018-04-20 12:46:02 UTC
* master:
 * 2c4dc7a4d98c439c69625f12ba4c3c8253f4cc5b
 * 46a4c265629d9b725c41f22849741ce7342bdd85

Comment 44 Jakub Hrozek 2018-05-28 07:19:39 UTC
*** Bug 1576746 has been marked as a duplicate of this bug. ***

Comment 47 Sudhir Menon 2018-06-13 09:45:28 UTC
Here are the observations.

1. The errors are reproducible with sssd-1.16.0-19.el7.x86_64 and ipa-server-4.5.4-10.el7.x86_64

[root@master ~]# sss_cache -E
No cache object matched the specified search

[root@master ~]# valgrind ./test ipauser1
==16598== Memcheck, a memory error detector
==16598== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16598== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==16598== Command: ./test ipauser1
==16598== 
==16598== Invalid read of size 8
==16598==    at 0x4C2E060: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==16598==    by 0x4E38E61: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598==  Address 0x582d090 is 0 bytes inside a block of size 40 free'd
==16598==    at 0x4C2BB58: realloc (vg_replace_malloc.c:785)
==16598==    by 0x4E3899A: ??? (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x4E38E1C: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598==  Block was alloc'd at
==16598==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==16598==    by 0x4E38DE4: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598== 
==16598== Invalid read of size 8
==16598==    at 0x4C2E06E: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==16598==    by 0x4E38E61: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598==  Address 0x582d0a0 is 16 bytes inside a block of size 40 free'd
==16598==    at 0x4C2BB58: realloc (vg_replace_malloc.c:785)
==16598==    by 0x4E3899A: ??? (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x4E38E1C: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598==  Block was alloc'd at
==16598==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==16598==    by 0x4E38DE4: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598== 
==16598== Invalid free() / delete / delete[] / realloc()
==16598==    at 0x4C2ACBD: free (vg_replace_malloc.c:530)
==16598==    by 0x4E38E6E: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598==  Address 0x582d090 is 0 bytes inside a block of size 40 free'd
==16598==    at 0x4C2BB58: realloc (vg_replace_malloc.c:785)
==16598==    by 0x4E3899A: ??? (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x4E38E1C: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598==  Block was alloc'd at
==16598==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==16598==    by 0x4E38DE4: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==16598==    by 0x400671: main (in /root/test)
==16598== 
ipauser1: sss_nss_getgrouplist_ex 34: Numerical result out of range
==16598== 
==16598== HEAP SUMMARY:
==16598==     in use at exit: 56 bytes in 1 blocks
==16598==   total heap usage: 7 allocs, 7 frees, 308 bytes allocated
==16598== 
==16598== LEAK SUMMARY:
==16598==    definitely lost: 56 bytes in 1 blocks
==16598==    indirectly lost: 0 bytes in 0 blocks
==16598==      possibly lost: 0 bytes in 0 blocks
==16598==    still reachable: 0 bytes in 0 blocks
==16598==         suppressed: 0 bytes in 0 blocks
==16598== Rerun with --leak-check=full to see details of leaked memory
==16598== 
==16598== For counts of detected and suppressed errors, rerun with: -v
==16598== ERROR SUMMARY: 6 errors from 3 contexts (suppressed: 0 from 0)
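
Reading the trace above: the 40-byte block is allocated by malloc in sss_nss_getgrouplist_timeout, released by a realloc in an unnamed callee, and then read by memcpy and passed to free again in the caller. That is consistent with a caller that keeps using its old pointer after a callee has reallocated the buffer. A minimal sketch of that pattern, and one way to avoid it, could look like the following. The names grow_list and copy_ids are hypothetical and are not taken from the sssd source; the initial size of ten 4-byte entries is only chosen to mirror the 40-byte block in the trace.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grow *list to new_count entries.
 * The pointer is passed by address so that the caller's copy is updated
 * to whatever block realloc() returns. Passing the pointer by value here
 * would leave the caller with a stale pointer whenever realloc() moves
 * the block, which is the kind of error the trace above reports. */
int grow_list(uint32_t **list, size_t new_count)
{
    uint32_t *tmp = realloc(*list, new_count * sizeof(**list));
    if (tmp == NULL) {
        return -1;          /* *list is untouched and still owned by the caller */
    }
    *list = tmp;
    return 0;
}

/* Hypothetical caller: builds a list of `total` ids (1..total), copies at
 * most out_len of them into out, and reports how many were copied. */
int copy_ids(uint32_t *out, size_t out_len, size_t total, size_t *copied)
{
    size_t have = 10;       /* 10 * 4 bytes = 40, as in the trace */
    uint32_t *list = malloc(have * sizeof(*list));
    if (list == NULL) {
        return -1;
    }

    if (total > have && grow_list(&list, total) != 0) {
        free(list);         /* old block is still valid after a failed realloc */
        return -1;
    }

    for (size_t i = 0; i < total; i++) {
        list[i] = (uint32_t)(i + 1);
    }

    /* list now refers to the current block, so reading and freeing it is safe. */
    size_t n = total < out_len ? total : out_len;
    memcpy(out, list, n * sizeof(*list));
    free(list);

    *copied = n;
    return 0;
}
```

Calling copy_ids with a total above 10 exercises the realloc path; under valgrind this version should report no invalid reads or frees.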

2. With the latest version of sssd, the errors no longer appear.

[root@master ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.5 (Maipo)
[root@master ~]# rpm -q ipa-server sssd
ipa-server-4.5.4-10.el7_5.2.x86_64
sssd-1.16.0-19.el7_5.5.x86_64

ipauser1 is a member of more than 10 local IPA groups:
[root@master ~]# id ipauser1
uid=687800001(ipauser1) gid=687800001(ipauser1) groups=687800001(ipauser1),687800004(group3),687800003(group2),687800005(group4),687800007(group1),687800009(group7),687800006(group5),687800008(group6),687800011(group9),687800012(group10),687800002(editors),687800010(group8),687800013(group11)

[root@master ~]# valgrind ./test ipauser1
==16657== Memcheck, a memory error detector
==16657== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16657== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==16657== Command: ./test ipauser1
==16657== 
ipauser1: sss_nss_getgrouplist_ex 34: Numerical result out of range
==16657== 
==16657== HEAP SUMMARY:
==16657==     in use at exit: 0 bytes in 0 blocks
==16657==   total heap usage: 6 allocs, 6 frees, 411 bytes allocated
==16657== 
==16657== All heap blocks were freed -- no leaks are possible
==16657== 
==16657== For counts of detected and suppressed errors, rerun with: -v
==16657== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
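
Both runs end with error 34 (ERANGE), with and without the fix. That would be expected if the test deliberately passes a group buffer smaller than the user's membership list, which fits the requirement of a user in more than 10 groups. The test source is not reproduced in this report, so this is an assumption. The sketch below shows the usual grow-and-retry handling of a too-small buffer, using glibc's standard getgrouplist(3) rather than the sssd sss_nss_getgrouplist_timeout call; fetch_groups is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns a malloc'd list of the user's group ids and
 * stores the number of entries in *count. Returns NULL on failure.
 * The caller frees the returned list. */
gid_t *fetch_groups(const char *user, gid_t primary, int *count)
{
    int n = 1;              /* deliberately small to force the retry path */
    gid_t *groups = malloc((size_t)n * sizeof(*groups));
    if (groups == NULL) {
        return NULL;
    }

    for (int attempt = 0; attempt < 4; attempt++) {
        int wanted = n;
        /* On success getgrouplist() returns the number of groups; if the
         * buffer is too small it returns -1 and stores the required
         * count in wanted. */
        if (getgrouplist(user, primary, groups, &wanted) != -1) {
            *count = wanted;
            return groups;
        }
        if (wanted <= n) {
            wanted = n * 2; /* defensive: always make progress */
        }
        gid_t *tmp = realloc(groups, (size_t)wanted * sizeof(*groups));
        if (tmp == NULL) {
            break;          /* groups is still valid here and freed below */
        }
        groups = tmp;       /* keep using the block realloc() returned */
        n = wanted;
    }

    free(groups);
    return NULL;
}
```

A caller would pass the user name and primary gid, for example fetch_groups("root", 0, &count), and free the result when done.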

Comment 48 Sudhir Menon 2018-06-13 09:52:52 UTC
Moving this bug back to ONQA: it was marked VERIFIED by mistake, and the bug that should have been marked is bz1570527.
Bug 1570527 is now marked as verified.

Comment 50 Corey Brown 2018-06-25 17:58:35 UTC
Created attachment 1454445
core dump

Comment 54 Vinay Mishra 2018-07-09 17:45:36 UTC
Hello Team,

The customer is also facing this issue in case 02107061. We are trying to collect a core dump there.

Thanks
Vinay

Comment 59 Michal Reznik 2018-07-20 10:36:36 UTC
Verified.

I was able to reproduce the issue on sssd-1.16.0-19:

[root@kvm-02-guest13 ~]# rpm -qa | grep sssd
sssd-proxy-1.16.0-19.el7.x86_64
sssd-ad-1.16.0-19.el7.x86_64
[root@kvm-02-guest13 ~]#
[root@kvm-02-guest13 ~]# rpm -qa | grep ipa-server
ipa-server-common-4.6.4-2.el7.noarch
ipa-server-4.6.4-2.el7.x86_64
[root@kvm-02-guest13 ~]# 
[root@kvm-02-guest13 ~]# id foo
uid=1200600001(foo) gid=1200600001(foo) groups=1200600001(foo),1200600004(group2),1200600005(group3),1200600007(group5),1200600009(group7),1200600010(group8),1200600012(group10),1200600013(group11),1200600008(group6),1200600003(group1),1200600006(group4),1200600011(group9)
[root@kvm-02-guest13 ~]#
[root@kvm-02-guest13 ~]# sss_cache -E
[root@kvm-02-guest13 ~]# valgrind ./test foo
==9230== Memcheck, a memory error detector
==9230== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9230== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==9230== Command: ./test foo
==9230== 
==9230== Invalid read of size 8
==9230==    at 0x4C2E060: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==9230==    by 0x4E38E61: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230==  Address 0x582d090 is 0 bytes inside a block of size 40 free'd
==9230==    at 0x4C2BB58: realloc (vg_replace_malloc.c:785)
==9230==    by 0x4E3899A: ??? (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x4E38E1C: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230==  Block was alloc'd at
==9230==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==9230==    by 0x4E38DE4: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230== 
==9230== Invalid read of size 8
==9230==    at 0x4C2E06E: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1022)
==9230==    by 0x4E38E61: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230==  Address 0x582d0a0 is 16 bytes inside a block of size 40 free'd
==9230==    at 0x4C2BB58: realloc (vg_replace_malloc.c:785)
==9230==    by 0x4E3899A: ??? (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x4E38E1C: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230==  Block was alloc'd at
==9230==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==9230==    by 0x4E38DE4: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230== 
==9230== Invalid free() / delete / delete[] / realloc()
==9230==    at 0x4C2ACBD: free (vg_replace_malloc.c:530)
==9230==    by 0x4E38E6E: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230==  Address 0x582d090 is 0 bytes inside a block of size 40 free'd
==9230==    at 0x4C2BB58: realloc (vg_replace_malloc.c:785)
==9230==    by 0x4E3899A: ??? (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x4E38E1C: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230==  Block was alloc'd at
==9230==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==9230==    by 0x4E38DE4: sss_nss_getgrouplist_timeout (in /usr/lib64/libsss_nss_idmap.so.0.4.0)
==9230==    by 0x400671: main (in /root/test)
==9230== 
foo: sss_nss_getgrouplist_ex 34: Numerical result out of range
==9230== 
==9230== HEAP SUMMARY:
==9230==     in use at exit: 48 bytes in 1 blocks
==9230==   total heap usage: 7 allocs, 7 frees, 287 bytes allocated
==9230== 
==9230== LEAK SUMMARY:
==9230==    definitely lost: 48 bytes in 1 blocks
==9230==    indirectly lost: 0 bytes in 0 blocks
==9230==      possibly lost: 0 bytes in 0 blocks
==9230==    still reachable: 0 bytes in 0 blocks
==9230==         suppressed: 0 bytes in 0 blocks
==9230== Rerun with --leak-check=full to see details of leaked memory
==9230== 
==9230== For counts of detected and suppressed errors, rerun with: -v
==9230== ERROR SUMMARY: 6 errors from 3 contexts (suppressed: 0 from 0)

On sssd-1.16.2-7 the issue is fixed.

[root@kvm-02-guest13 ~]# rpm -qa | grep sssd-1
sssd-1.16.2-7.el7.x86_64
[root@kvm-02-guest13 ~]# 
[root@kvm-02-guest13 ~]# sss_cache -E
[root@kvm-02-guest13 ~]# valgrind ./test foo
==9797== Memcheck, a memory error detector
==9797== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9797== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==9797== Command: ./test foo
==9797== 
foo: sss_nss_getgrouplist_ex 34: Numerical result out of range
==9797== 
==9797== HEAP SUMMARY:
==9797==     in use at exit: 0 bytes in 0 blocks
==9797==   total heap usage: 6 allocs, 6 frees, 384 bytes allocated
==9797== 
==9797== All heap blocks were freed -- no leaks are possible
==9797== 
==9797== For counts of detected and suppressed errors, rerun with: -v
==9797== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[root@kvm-02-guest13 ~]#

Comment 61 errata-xmlrpc 2018-10-30 10:42:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3158