Bug 506998

Summary: [nss_ldap] Second call to getpwnam returns NULL after server idletimeout
Product: [Fedora] Fedora Reporter: Ray Strode [halfline] <rstrode>
Component: nss_ldap Assignee: Nalin Dahyabhai <nalin>
Status: CLOSED WONTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: low    
Version: 11 CC: alexandre.magaz, dpal, jmccann, nalin, omoris, rstrode
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 506734 Environment:
Last Closed: 2010-06-28 13:09:59 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 517000    

Description Ray Strode [halfline] 2009-06-19 18:05:43 UTC
+++ This bug was initially created as a clone of Bug #506734 +++

+++ This bug was initially created as a clone of Bug #499489 +++
--- Additional comment from alexm.cat on 2009-06-17 08:22:04 EDT ---

It works for me, but not completely. GDM no longer crashes after 10 minutes of inactivity, but it does crash whenever I try to log in after 10 minutes have passed. This is the traceback I get:

Jun 17 14:05:32 PL-REC019 gdm-binary[21771]: nss_ldap: could not search LDAP server - Server is unavailable
Jun 17 14:05:32 PL-REC019 gdm-simple-slave[23053]: WARNING: Failed to add user authorization: Message did not receive a reply (timeout by message bus)
Jun 17 14:05:32 PL-REC019 gdm[23230]: ******************* START **********************************
Jun 17 14:05:33 PL-REC019 gdm[23230]: [Thread debugging using libthread_db enabled]
Jun 17 14:05:33 PL-REC019 gdm[23230]: 0x008cd424 in __kernel_vsyscall ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: #0  0x008cd424 in __kernel_vsyscall ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: #1  0x0045ed43 in __waitpid_nocancel () from /lib/libc.so.6
Jun 17 14:05:33 PL-REC019 gdm[23230]: #2  0x080666a1 in gdm_signal_handler_backtrace ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: #3  0x08066791 in signal_handler ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: #4  <signal handler called>
Jun 17 14:05:33 PL-REC019 gdm[23230]: #5  0x008cd424 in __kernel_vsyscall ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: #6  0x003ec7c1 in raise () from /lib/libc.so.6
Jun 17 14:05:33 PL-REC019 gdm[23230]: #7  0x003ee092 in abort () from /lib/libc.so.6
Jun 17 14:05:33 PL-REC019 gdm[23230]: #8  0x007546fd in g_assertion_message () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: #9  0x00754cbd in g_assertion_message_expr () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: #10 0x08061f84 in start_session_timeout ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: #11 0x0072b341 in ?? () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: #12 0x0072d1e8 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: #13 0x007307f8 in ?? () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: #14 0x00730caf in g_main_loop_run () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: #15 0x0804d2cf in main ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: 
Jun 17 14:05:33 PL-REC019 gdm[23230]: Thread 1 (Thread 0xb8055720 (LWP 23053)):
Jun 17 14:05:33 PL-REC019 gdm[23230]: #0  0x008cd424 in __kernel_vsyscall ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #1  0x0045ed43 in __waitpid_nocancel () from /lib/libc.so.6
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #2  0x080666a1 in gdm_signal_handler_backtrace ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #3  0x08066791 in signal_handler ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #4  <signal handler called>
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #5  0x008cd424 in __kernel_vsyscall ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #6  0x003ec7c1 in raise () from /lib/libc.so.6
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #7  0x003ee092 in abort () from /lib/libc.so.6
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #8  0x007546fd in g_assertion_message () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #9  0x00754cbd in g_assertion_message_expr () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #10 0x08061f84 in start_session_timeout ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #11 0x0072b341 in ?? () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #12 0x0072d1e8 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #13 0x007307f8 in ?? () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #14 0x00730caf in g_main_loop_run () from /lib/libglib-2.0.so.0
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: #15 0x0804d2cf in main ()
Jun 17 14:05:33 PL-REC019 gdm[23230]: No symbol table info available.
Jun 17 14:05:33 PL-REC019 gdm[23230]: The program is running.  Quit anyway (and detach it)? (y or n) [answered Y; input not from terminal]
Jun 17 14:05:33 PL-REC019 gdm[23230]: ******************* END **********************************

Notice the LDAP error; it always appears just before the traceback.

--- Additional comment from alexm.cat on 2009-06-19 08:35:56 EDT ---

And this with UTF-8:

Jun 19 11:53:39 PL-REC019 gdm-binary[1450]: nss_ldap: could not search LDAP server - Server is unavailable
Jun 19 11:53:39 PL-REC019 gdm-simple-slave[2128]: WARNING: Failed to add user authorization: could not find user "t7809263" on system

--- Additional comment from rstrode on 2009-06-19 13:58:28 EDT ---

As you pointed out above, the second log has:

Jun 19 11:53:39 PL-REC019 gdm-simple-slave[2128]: DEBUG(+): GdmSlave: Requesting user authorization
Jun 19 11:53:39 PL-REC019 gdm-binary[1450]: nss_ldap: could not search LDAP server - Server is unavailable
Jun 19 11:53:39 PL-REC019 gdm-simple-slave[2128]: WARNING: Failed to add user authorization: could not find user "t7809263" on system

In this case, the AddUserAuthentication call is successfully invoked and makes it across the bus to the other side, but then it fails in _create_xauth_file_for_user, which is trying to do

password_entry = getpwnam ("t7809263");

and failing.

This failure appears to be a bug in nss_ldap. It should automatically detect when the server goes away and try to reconnect. That appears to be the root problem. All the other problems are poor error handling.

Since I need to fix the bad error handling, too, I'm going to clone this report instead of reassign.
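
For illustration only, here is a rough sketch of what more defensive handling on the caller side could look like. It is not the GDM fix and not nss_ldap code: the function name lookup_user_with_retry, the lookup_status enum, and the retry/delay parameters are all made up for this example. It uses the reentrant getpwnam_r(), which returns an error code separately from the result pointer, so a lookup failure can be told apart from "no such user" and retried instead of being treated as a missing account.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of GDM or nss_ldap. */
enum lookup_status { LOOKUP_FOUND, LOOKUP_NOT_FOUND, LOOKUP_ERROR };

/*
 * Look up a user with getpwnam_r(), retrying when the call reports an
 * error instead of treating every NULL result as "no such user".
 *
 * Assumption: a return value of 0 with a NULL result means the user does
 * not exist. Some NSS backends report "not found" with a nonzero code such
 * as ENOENT; this sketch conservatively treats those as errors and retries.
 */
static enum lookup_status
lookup_user_with_retry(const char *name, int max_attempts,
                       unsigned int delay_seconds,
                       struct passwd *pwd, char *buf, size_t buflen)
{
    assert(name != NULL && pwd != NULL && buf != NULL);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        struct passwd *result = NULL;
        int rc = getpwnam_r(name, pwd, buf, buflen, &result);

        if (result != NULL)
            return LOOKUP_FOUND;      /* *pwd is filled in */
        if (rc == 0)
            return LOOKUP_NOT_FOUND;  /* lookup worked, no such user */
        if (rc == ERANGE)
            return LOOKUP_ERROR;      /* buffer too small; retrying won't help */

        /* Any other error: wait before the next attempt, if one remains. */
        if (attempt + 1 < max_attempts && delay_seconds > 0)
            sleep(delay_seconds);
    }
    return LOOKUP_ERROR;              /* gave up, or max_attempts <= 0 */
}
```

A caller would pass its own struct passwd and a scratch buffer (a few kilobytes is usually enough) and only report "could not find user" on LOOKUP_NOT_FOUND. On LOOKUP_ERROR it can log that the directory service was unreachable instead. This only papers over the problem from the application side; the reconnect logic still belongs in nss_ldap.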

Comment 1 Ondrej Moriš 2010-01-08 10:10:17 UTC
Any progress here? Is it possible to consider this an nss_ldap bug?

Comment 2 Ray Strode [halfline] 2010-01-08 15:07:53 UTC
Yes, it is.  No idea on progress.

Comment 3 Dmitri Pal 2010-01-11 19:00:50 UTC
Is it still reproducible with the latest GDM patches?

Can you please try with SSSD? SSSD is now a preferred method for getting user identities online and offline and handles the LDAP timeout nicely.

Comment 4 Bug Zapper 2010-04-27 15:06:05 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 5 Bug Zapper 2010-06-28 13:09:59 UTC
Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.