Bug 1010844 - if pam_fprintd.so is included in the pam config file, pam leaks file descriptors from pipe2
if pam_fprintd.so is included in the pam config file, pam leaks file descript...
Status: CLOSED DUPLICATE of bug 1024825
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: fprintd (Show other bugs)
6.5
Unspecified Unspecified
medium Severity high
: rc
: ---
Assigned To: Bastien Nocera
Desktop QE
:
Depends On:
Blocks: 1075802 1020045
  Show dependency treegraph
 
Reported: 2013-09-23 03:47 EDT by Ján Rusnačko
Modified: 2014-08-21 07:12 EDT (History)
8 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1020045 (view as bug list)
Environment:
Last Closed: 2014-08-21 07:12:15 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Ján Rusnačko 2013-09-23 03:47:38 EDT
Description of problem:
During performance testing of Red Hat Directory Server we found server aborting from g_logv function. This is undesirable, as we would not want to see Directory Server abort due to configuration issue.

Version-Release number of selected component (if applicable):
glib2-2.26.1-3

How reproducible:
consistently

Abort occurs when g_logv is trying to log this error:

g_logv (log_domain=0x7f82c40cde06 "GLib", log_level=<value optimized out>, format=
    0x7f82c40d45d8 "Cannot create pipe main loop wake-up: %s\n", args1=0x7f82d7ffc290) at gmessages.c:557

And it aborts at this line 557.
539          if ((test_level & G_LOG_FLAG_FATAL) && !masquerade_fatal)
540                {
[...]
556    #else /* !G_ENABLE_DEBUG || !SIGTRAP */
557              abort ();
558    #endif /* !G_ENABLE_DEBUG || !SIGTRAP */
559    #endif /* !G_OS_WIN32 */

The errno is 24 "Too many open files".
(gdb) p errno
$3 = 24

Stacktrace:

Core was generated by `./ns-slapd -D /etc/dirsrv/slapd-ibm-x3655-01 -i /var/run/dirsrv/slapd-ibm-x3655'.
Program terminated with signal 6, Aborted.
#0  0x00007f6fa0c0b925 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64      return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f6fa0c0b925 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f6fa0c0d105 in abort () at abort.c:92
#2  0x00007f6f704da98a in g_logv (log_domain=0x7f6f70531e06 "GLib", log_level=<value optimized out>, format=
     0x7f6f705385d8 "Cannot create pipe main loop wake-up: %s\n", args1=0x7f6f71ddf290) at gmessages.c:557
#3  0x00007f6f704daa23 in g_log (log_domain=<value optimized out>, log_level=<value optimized out>, format=<value optimized out>) at gmessages.c:577
#4  0x00007f6f704d0ba0 in g_main_context_init_pipe (context=0x7f6ef80690b0) at gmain.c:520
#5  0x00007f6f704d0c53 in g_main_context_new () at gmain.c:615
#6  0x00007f6f704d0e1d in g_main_context_default () at gmain.c:653
#7  0x00007f6f903bbe51 in dbus_connection_setup_with_g_main (connection=0x7f6ef80674d0, context=0x0) at dbus-gmain.c:568
#8  0x00007f6f903bbf8a in dbus_g_bus_get (type=<value optimized out>, error=0x0) at dbus-gmain.c:763
#9  0x00007f6f905d6fc1 in create_manager (pamh=0x7f6ef80577e0, flags=<value optimized out>, argc=0, argv=<value optimized out>) at pam_fprintd.c:126
#10 do_auth (pamh=0x7f6ef80577e0, flags=<value optimized out>, argc=0, argv=<value optimized out>) at pam_fprintd.c:379
#11 pam_sm_authenticate (pamh=0x7f6ef80577e0, flags=<value optimized out>, argc=0, argv=<value optimized out>) at pam_fprintd.c:430
#12 0x00007f6f98a80cee in _pam_dispatch_aux (pamh=0x7f6ef80577e0, flags=0, choice=1) at pam_dispatch.c:110
#13 _pam_dispatch (pamh=0x7f6ef80577e0, flags=0, choice=1) at pam_dispatch.c:407
#14 0x00007f6f98a80600 in pam_authenticate (pamh=0x7f6ef80577e0, flags=0) at pam_auth.c:34
#15 0x00007f6f98c8fef4 in do_one_pam_auth (pb=0x299fac0, method=<value optimized out>, final_method=1, pam_service=<value optimized out>,
     map_ident_attr=<value optimized out>, fallback=0, pw_response_requested=1) at ldap/servers/plugins/pam_passthru/pam_ptimpl.c:314
#16 0x00007f6f98c9039a in pam_passthru_do_pam_auth (pb=0x299fac0, cfg=<value optimized out>) at ldap/servers/plugins/pam_passthru/pam_ptimpl.c:465
#17 0x00007f6f98c9111a in pam_passthru_bindpreop (pb=0x299fac0) at ldap/servers/plugins/pam_passthru/pam_ptpreop.c:433
#18 0x00007f6fa31a180a in plugin_call_func (list=0x2742210, operation=401, pb=0x299fac0, call_one=0) at ldap/servers/slapd/plugin.c:1453
#19 0x00007f6fa31a1a3f in plugin_call_list (pb=0x299fac0, whichfunction=401) at ldap/servers/slapd/plugin.c:1415
#20 plugin_call_plugins (pb=0x299fac0, whichfunction=401) at ldap/servers/slapd/plugin.c:398
#21 0x000000000040f6d2 in do_bind (pb=0x299fac0) at ldap/servers/slapd/bind.c:706
#22 0x0000000000414134 in connection_dispatch_operation () at ldap/servers/slapd/connection.c:569
#23 connection_threadmain () at ldap/servers/slapd/connection.c:2339
#24 0x00007f6fa15d1b53 in _pt_root (arg=0x29c01f0) at ../../../mozilla/nsprpub/pr/src/pthreads/ptthread.c:156
#25 0x00007f6fa0f749d1 in start_thread (arg=0x7f6f71de2700) at pthread_create.c:301
#26 0x00007f6fa0cc1a8d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Manipulating limits did not prevent this error from occuring:

     echo "fs.file-max = 64000" >> /etc/sysctl.conf      
     echo "* soft nofile 8192" >> /etc/security/limits.conf      
     echo "* hard nofile 8192" >> /etc/security/limits.conf      
     echo "ulimit -n 8192" >> /etc/profile

Additional info:
Comment 1 Noriko Hosoi 2013-09-23 20:00:33 EDT
Hi Colin, 
Do you think this problem is related to Bug 1003940 (reassigned to fprintd)?  Once we get a patched fprintd (fprintd-0.1-20.git04fd09cfa.el6.x86_64 + http://cgit.freedesktop.org/libfprint/fprintd/commit/?id=00ac488cde0f9b916d58af2c1d4f1ae9e05293d0), could this bug be fixed, as well?  Or is it a totally different issue?
Thanks!
--noriko
Comment 2 RHEL Product and Program Management 2013-10-13 22:00:21 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 3 Noriko Hosoi 2013-10-16 16:31:10 EDT
Original summary: Directory Server crashes in g_logv logging "Cannot create pipe main loop wake-up"

Directory Server (389-ds-base) uses pam module from its pam plug-in.  When the pam passthrough authentication is enabled on the Directory Server, it calls pam_authenticate, in which pipe2 is called eventually.

Stacktrace stopped in pipe2:
Breakpoint 5, pipe2 () at ../sysdeps/unix/syscall-template.S:82
82    T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0  pipe2 () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f56c98d2b04 in g_main_context_init_pipe (context=0x7f56c40188e0) at gmain.c:515
#2  0x00007f56c98d2c53 in g_main_context_new () at gmain.c:615
#3  0x00007f56c98d2e1d in g_main_context_default () at gmain.c:653
#4  0x00007f56ec229e51 in dbus_connection_setup_with_g_main () from /usr/lib64/libdbus-glib-1.so.2
#5  0x00007f56ec229f8a in dbus_g_bus_get () from /usr/lib64/libdbus-glib-1.so.2
#6  0x00007f56ec451fd1 in create_manager (pamh=0x7f56c4001130, flags=<value optimized out>, argc=0,
    argv=<value optimized out>) at pam_fprintd.c:126
#7  do_auth (pamh=0x7f56c4001130, flags=<value optimized out>, argc=0, argv=<value optimized out>)
    at pam_fprintd.c:379
#8  pam_sm_authenticate (pamh=0x7f56c4001130, flags=<value optimized out>, argc=0, argv=<value optimized out>)
    at pam_fprintd.c:430
#9  0x00007f56f2741cee in _pam_dispatch_aux (pamh=0x7f56c4001130, flags=0, choice=1) at pam_dispatch.c:110
#10 _pam_dispatch (pamh=0x7f56c4001130, flags=0, choice=1) at pam_dispatch.c:407
#11 0x00007f56f2741600 in pam_authenticate (pamh=0x7f56c4001130, flags=0) at pam_auth.c:34
#12 0x00007f56f2950ef4 in do_one_pam_auth (pb=0x152c9a0, method=<value optimized out>, final_method=1,
    pam_service=<value optimized out>, map_ident_attr=<value optimized out>, fallback=0, pw_response_requested=1)
    at ldap/servers/plugins/pam_passthru/pam_ptimpl.c:314
#13 0x00007f56f295139a in pam_passthru_do_pam_auth (pb=0x152c9a0, cfg=<value optimized out>)
    at ldap/servers/plugins/pam_passthru/pam_ptimpl.c:465
#14 0x00007f56f295211a in pam_passthru_bindpreop (pb=0x152c9a0)
    at ldap/servers/plugins/pam_passthru/pam_ptpreop.c:433
[...]

When calling pipe2 in g_main_context_init_pipe, a flag O_CLOEXEC is passed to the system call, which is a close-on-exec flag.  pam_authenticate execs "/sbin/unix_chkpwd".

    (gdb) l 515  (in g_main_context_init_pipe (context=0x7f56c40188e0) at gmain.c:515  --  GLIB - Library of useful routines for C programming)
    510      if (context->wake_up_pipe[0] != -1)
    511        return;
    512   
    513    #ifdef HAVE_PIPE2
    514      /* if this fails, we fall through and try pipe */
    515      pipe2 (context->wake_up_pipe, O_CLOEXEC);
    516    #endif
    517      if (context->wake_up_pipe[0] == -1)
    518        {
    519          if (pipe (context->wake_up_pipe) < 0)

I followed the pam code in gdb but the pipe2 file descriptors were not closed with the close system call in the parent process (Directory server), which makes the Directory Server abort when it reaches the max file descriptor limit (1024, by default).

Also, valgrind --track-fds=yes reports the file descriptor leaks:
==11184== Warning: invalid file descriptor 1024 in syscall pipe2()
[...]
==11184== FILE DESCRIPTORS: 1008 open at exit.
==11184== Open file descriptor 1022:
==11184==    at 0x3568EDBEC7: pipe2 (syscall-template.S:82)

This "open at exit" is reported as many as the pam auth request is sent.  (In my test 998 times.)

I'm running the test using this Distro 	RHEL6.5-20131009.0 Server x86_64.

This is the pam related package list on the system.
# rpm -qa | egrep pam
pam_passwdqc-1.0.5-6.el6.x86_64
pam_passwdqc-debuginfo-1.0.5-6.el6.x86_64
pam-devel-1.1.1-17.el6.x86_64
pam-debuginfo-1.1.1-17.el6.x86_64
fprintd-pam-0.1-21.git04fd09cfa.el6.x86_64
pam-1.1.1-17.el6.x86_64

This is the glib related package list on the system.
# rpm -qa | egrep glib
glibc-headers-2.12-1.130.el6.x86_64
glib2-debuginfo-2.26.1-3.el6.x86_64
glibc-debuginfo-common-2.12-1.130.el6.x86_64
glibc-common-2.12-1.130.el6.x86_64
glibc-2.12-1.130.el6.x86_64
dbus-glib-0.86-6.el6_4.x86_64
glibc-devel-2.12-1.130.el6.x86_64
glibc-debuginfo-2.12-1.130.el6.x86_64
glib2-2.26.1-3.el6.x86_64
glib2-devel-2.26.1-3.el6.x86_64

Please note that this leak is not observed if the following line is not in the config file.
auth        sufficient    pam_fprintd.so
Comment 4 Colin Walters 2013-10-16 17:05:20 EDT
(In reply to Noriko Hosoi from comment #1)
> Hi Colin, 
> Do you think this problem is related to Bug 1003940 (reassigned to fprintd)?
> Once we get a patched fprintd (fprintd-0.1-20.git04fd09cfa.el6.x86_64 +
> http://cgit.freedesktop.org/libfprint/fprintd/commit/
> ?id=00ac488cde0f9b916d58af2c1d4f1ae9e05293d0), could this bug be fixed, as
> well?

Most likely it would also be fixed by that patch, yes.
Comment 5 Colin Walters 2013-10-16 17:07:32 EDT
I will note once again that 389DS is to the best of my knowledge the only component in RHEL which performs multiple pam conversations per process, and this greatly increases its exposure to issues from attempting to dynamically unload libraries and reload them again.
Comment 6 Bastien Nocera 2013-10-17 15:15:41 EDT
I'll put this on needinfo until you can test the packages that are now in brew, as mentioned in bug 1003940.
Comment 7 Noriko Hosoi 2013-10-17 15:26:25 EDT
I tested using this combination and found the fd leaks are still observed.
# rpm -qa | egrep fprintd
fprintd-0.1-21.git04fd09cfa.el6.x86_64
fprintd-debuginfo-0.1-21.git04fd09cfa.el6.x86_64
fprintd-pam-0.1-21.git04fd09cfa.el6.x86_64

Unfortunately, the answer was 'no'...
> (In reply to Noriko Hosoi from comment #1)
>> Hi Colin, 
>> Do you think this problem is related to Bug 1003940 (reassigned to fprintd)?
>> Once we get a patched fprintd (fprintd-0.1-20.git04fd09cfa.el6.x86_64 +
>> http://cgit.freedesktop.org/libfprint/fprintd/commit/
>> ?id=00ac488cde0f9b916d58af2c1d4f1ae9e05293d0), could this bug be fixed, as
>> well?

> Most likely it would also be fixed by that patch, yes.
Comment 8 Colin Walters 2013-10-17 18:15:52 EDT
(In reply to Noriko Hosoi from comment #7)
> I tested using this combination and found the fd leaks are still observed.
> # rpm -qa | egrep fprintd
> fprintd-0.1-21.git04fd09cfa.el6.x86_64
> fprintd-debuginfo-0.1-21.git04fd09cfa.el6.x86_64
> fprintd-pam-0.1-21.git04fd09cfa.el6.x86_64

Hmm.  We could change pam-fprintd.so to create a private DBus connection with a private GMainContext which it then tears down.  That'd ensure we leak neither the DBus fd nor the main context wakeup pipe.

But...why is this code getting run twice?  My understanding of the -znodelete fix is that pam-fprintd.so along with all of its dependencies should remain in memory, and so we shouldn't keep creating main contexts.
Comment 9 Noriko Hosoi 2013-10-18 01:58:36 EDT
Thank you for your response, Colin.

(In reply to Colin Walters from comment #8)
> (In reply to Noriko Hosoi from comment #7)
> > I tested using this combination and found the fd leaks are still observed.
> > # rpm -qa | egrep fprintd
> > fprintd-0.1-21.git04fd09cfa.el6.x86_64
> > fprintd-debuginfo-0.1-21.git04fd09cfa.el6.x86_64
> > fprintd-pam-0.1-21.git04fd09cfa.el6.x86_64
> 
> Hmm.  We could change pam-fprintd.so to create a private DBus connection
> with a private GMainContext which it then tears down.  That'd ensure we leak
> neither the DBus fd nor the main context wakeup pipe.
> 
> But...why is this code getting run twice?  My understanding of the
> -znodelete fix is that pam-fprintd.so along with all of its dependencies
> should remain in memory, and so we shouldn't keep creating main contexts.

Sorry, I think that's a part of pam implementation.  We, Directory Sever, are just calling pam_authenticate for auth...

We decided to recommend DS users not to include fprintd in the pam configuration, for now.  So, we could lower the priority and severity.  Could you please talk to our manager, Nathan Kinder, if it is needed?  Thanks!
Comment 10 Nathan Kinder 2013-10-18 12:19:48 EDT
(In reply to Noriko Hosoi from comment #9)
> We decided to recommend DS users not to include fprintd in the pam
> configuration, for now.  So, we could lower the priority and severity. 
> Could you please talk to our manager, Nathan Kinder, if it is needed? 
> Thanks!

This is correct.  There's no reason that Directory Server needs to use pam_fprintd.so, but it would be nice to not leak descriptors in case someone just points DS at the system-auth PAM service file (which includes pam_frintd.so).  We will update our documentation saying "don't do that", but there is still potential that someone misses this note and doesn't find out that there is a problem until their production Directory Server stops working when it exhausts it's file descriptors.

This isn't urgent, but it would be nice if we could do something about it for RHEL 6.6.
Comment 11 Colin Walters 2013-10-18 15:25:38 EDT
(In reply to Nathan Kinder from comment #10)

> This is correct.  There's no reason that Directory Server needs to use
> pam_fprintd.so, but it would be nice to not leak descriptors in case someone
> just points DS at the system-auth PAM service file (which includes
> pam_frintd.so). 

Right.

> We will update our documentation saying "don't do that",
> but there is still potential that someone misses this note and doesn't find
> out that there is a problem until their production Directory Server stops
> working when it exhausts it's file descriptors.

I think we should strive for a state where RHEL works "out of the box", and so I'll take a look at what we can do from the fprintd/glib sides.
Comment 12 Cynthia Davis 2014-08-19 14:58:07 EDT
Hi Bastien,

I have a customer asking me if we are aware of this issue and wondering if so when it will be fixed.  Since 6.6 Beta is out seems it is under consideration for a later release.  Can you please provide me with an update of where we are with re: to updating the documentation?  Thanks very much!

Cynthia Baldwin, RHCE
Senior Support Relationship Manager
IRC: cbaldwin
Comment 13 Bastien Nocera 2014-08-21 07:12:15 EDT

*** This bug has been marked as a duplicate of bug 1024825 ***

Note You need to log in before you can comment on or make changes to this bug.