Bug 562143 - File descriptor leak in gdm with XDMCP
Summary: File descriptor leak in gdm with XDMCP
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: gdm
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: jmccann
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-02-05 11:47 UTC by Michael Young
Modified: 2015-01-14 23:24 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-04 15:34:13 UTC
Type: ---
Embargoed:


Attachments
Possible patch (1.63 KB, patch), 2010-03-10 11:41 UTC, Michael Young
gdm-2.24.1 patch (1.67 KB, patch), 2010-03-10 15:51 UTC, Alex
An alternate patch (632 bytes, patch), 2010-03-31 15:59 UTC, Michael Young


Links
System | ID | Private | Priority | Status | Summary | Last Updated
GNOME Bugzilla | 606724 | 0 | Normal | RESOLVED | Xdmcp fail to show greeter at second time | 2020-08-13 02:36:32 UTC

Description Michael Young 2010-02-05 11:47:16 UTC
I am starting to get errors like the following with gdm-binary
Feb  5 11:19:38 hostname gdm-binary[16120]: CRITICAL: could not add display to access file: Too many open files
with the result that new users can't connect and just get a black screen. If I examine lsof there are many entries like
...
gdm-binar 16120 root   46u   REG      253,0       54    400308 /var/run/gdm/auth-for-gdm-Tpyyg8/database
...
gdm-binar 16120 root 1021u   REG      253,0       55    417121 /var/run/gdm/auth-for-username-BON1Sx/database
gdm-binar 16120 root 1022u   REG      253,0       55    417126 /var/run/gdm/auth-for-gdm-5lepCn/database

but we don't have anything like 1000 users connected. Hence it looks like gdm is opening these xauth files but (at least in some cases) not closing them, and eventually exhausting its file descriptor allocation.
The host involved receives almost all its connections from remote XDMCP connections.
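
The lsof check above can also be scripted. The following is a minimal sketch, assuming a Linux /proc filesystem and enough privileges to read the daemon's fd directory; the function names are hypothetical, and the path prefix is taken from the lsof output in this report.

```python
import os


def open_fd_targets(pid="self", proc_root="/proc"):
    """Return the symlink target of every open descriptor of a process."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def count_xauth_databases(targets, prefix="/var/run/gdm/auth-for-"):
    """Count descriptors that point at a gdm xauth database file."""
    return sum(
        1 for t in targets
        if t.startswith(prefix) and t.endswith("/database")
    )
```

For example, `count_xauth_databases(open_fd_targets(16120))`, run as root, would count the database entries held by the gdm-binary process from the log line above. A count that keeps growing while the number of connected users stays flat points to a leak.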

Comment 1 Alex 2010-03-05 18:56:25 UTC
We also have problems with users connecting to GDM when the number of open files exceeds 1000:

# cat /var/log/messages |grep database
...
Feb 24 18:25:10 hostname gdm-binary[3759]: CRITICAL: could not create display access file: Unable to open '/var/run/gdm/auth-for-gdm-mz2yRV/database': Too many open files

# lsof -n |grep -c database
...
1016

Increasing max number of open files didn't help:

# ulimit -a
...
open files                      (-n) 32768
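
One thing that may be worth checking here: `ulimit -a` prints the limits of the shell it is run in, and a daemon that is already running keeps the limits it was started with. The descriptor counts seen in this report (1016 to 1022) would be consistent with a limit of 1024 still applying to gdm-binary, but that is an inference, not something confirmed in this bug. A minimal sketch for reading the limit that actually applies to a process, assuming the Linux /proc/<pid>/limits layout (the function name is hypothetical):

```python
def max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Values are returned as strings because a limit may be 'unlimited'.
    Returns None if the row is not present.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None
```

Usage would be along the lines of `max_open_files(open("/proc/3759/limits").read())`, where 3759 is the gdm-binary PID from the log line above.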

A temporary workaround for a few days is:

# killall gdm-binary

Version-Release number of selected component (if applicable):
2.6.27.41-170.2.117.fc10.x86_64

Comment 2 Michael Young 2010-03-10 11:41:13 UTC
Created attachment 399058
Possible patch

I think the problem is that the xdmcp session was exiting without cleaning up properly. Something along the lines of the attached patch is needed. I have tested it (briefly) and the file descriptors are now closed at the end of the session.
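
The failure mode described here, a session that ends without its cleanup being run, can be illustrated with a toy model. This is not the attached patch and does not use any gdm code; `Session`, `finish` and `run_sessions` are hypothetical names chosen for the illustration.

```python
import os
import tempfile


class Session:
    """Toy stand-in for a display session that holds one open file."""

    def __init__(self, directory, name):
        self.path = os.path.join(directory, name)
        self.fd = os.open(self.path, os.O_CREAT | os.O_RDWR, 0o600)

    def finish(self):
        """Cleanup hook: close the descriptor and remove the file (idempotent)."""
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None
            os.unlink(self.path)


def fd_is_open(fd):
    """Return True if fd is still a valid descriptor in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def run_sessions(count, cleanup):
    """Start `count` sessions and call finish() on each only if `cleanup` is True.

    Returns how many descriptors are still open once the sessions have ended.
    """
    with tempfile.TemporaryDirectory() as directory:
        sessions = [Session(directory, f"database-{i}") for i in range(count)]
        fds = [s.fd for s in sessions]
        if cleanup:
            for s in sessions:
                s.finish()
        leaked = sum(1 for fd in fds if fd_is_open(fd))
        # Close anything left over so the demonstration itself does not leak.
        for s in sessions:
            s.finish()
    return leaked
```

Without the hook every finished session leaves one descriptor behind, which is the pattern seen in the lsof output; with the hook the count returns to zero.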

Comment 3 Alex 2010-03-10 15:51:10 UTC
Created attachment 399120
gdm-2.24.1 patch

(In reply to comment #2)

We've adapted your patch for gdm version 2.24.1-4 on our LTSP server. At first glance the database files are now being closed correctly. We will check the number of open files periodically and I'll report the results later.
Thank you.

Comment 4 Ray Strode [halfline] 2010-03-12 22:05:23 UTC
Related upstream bug:

https://bugzilla.gnome.org/show_bug.cgi?id=606724

Comment 5 maksym 2010-03-13 08:25:58 UTC
(In reply to comment #4)
> Related upstream bug:
> 
> https://bugzilla.gnome.org/show_bug.cgi?id=606724    

I do not think so. Bug #606724 is definitely about the "Maximum number of open XDMCP sessions from host" limit; that is not a pure leak of unclosed file descriptors, but a problem at the session management level.

Comment 6 Michael Young 2010-03-13 10:08:36 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > Related upstream bug:
> > 
> > https://bugzilla.gnome.org/show_bug.cgi?id=606724
> 
> I do not think so. Bug #606724 is definitely about the "Maximum number of open
> XDMCP sessions from host" limit; that is not a pure leak of unclosed file
> descriptors, but a problem at the session management level.

I thought that at first, but if you compare the patch for this bug with the patch for the other, they are both trying to do the same sort of thing in a different way. So although the problems seem different, I think they have the same underlying cause: a finished xdmcp session not being cleaned up properly.

Comment 7 maksym 2010-03-13 10:40:31 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > Related upstream bug:
> > > 
> > > https://bugzilla.gnome.org/show_bug.cgi?id=606724
[...]
> I thought that at first, but if you compare the patch for this bug with the
> patch for the other, then they are both trying to do the same sort of thing in
> a different way. So although the problems seem different, I think they have the
> same underlying cause of a finished xdmcp session not being cleaned up
> properly.

As you are the author of the original patch, which works fine for us in an intensive LTSP environment (we have approximately 20 to 40 XDMCP terminal users per day), we can test the patch from https://bugzilla.gnome.org/show_bug.cgi?id=606724 to see whether it behaves the same way, i.e. fixes the descriptor leak when users disconnect.

BTW, this seems to be a regression from the previous version of GDM that we used before upgrading to FC10.

Comment 8 maksym 2010-03-13 11:37:38 UTC
(In reply to comment #7)
[...]
> As you are the author of the original patch, which works fine for us in an
> intensive LTSP environment (we have approximately 20 to 40 XDMCP terminal users
> per day), we can test the patch from
> https://bugzilla.gnome.org/show_bug.cgi?id=606724 to see whether it behaves the
> same way, i.e. fixes the descriptor leak when users disconnect.

I tested the patch, but the situation is worse: it is not even possible to log in. /var/log/messages contains these lines:

Mar 13 13:28:45 elbrus gdm-binary[29052]: WARNING: GdmXdmcpDisplayFactory: Failed to look up session id 1680284739
Mar 13 13:28:45 elbrus gdm-binary[29052]: WARNING: GdmXdmcpDisplayFactory: Failed to look up session id 1680284740
Mar 13 13:28:45 elbrus gdm-binary[29052]: WARNING: GdmXdmcpDisplayFactory: Failed to look up session id 1680284741
Mar 13 13:28:45 elbrus gdm-binary[29052]: WARNING: GdmXdmcpDisplayFactory: Failed to look up session id 1680284742
[...]
Mar 13 13:28:46 elbrus gdm-binary[29052]: WARNING: GdmXdmcpDisplayFactory: Failed to look up session id 1680285750
Mar 13 13:28:46 elbrus gdm-binary[29052]: WARNING: GdmXdmcpDisplayFactory: Failed to look up session id 1680285751
Mar 13 13:28:46 elbrus gdm-binary[29052]: WARNING: GdmXdmcpDisplayFactory: Failed to look up session id 1680285752
Mar 13 13:28:46 elbrus gdm-binary[29052]: CRITICAL: could not add display to access file: Too many open files
Mar 13 13:28:46 elbrus gdm-binary[29052]: WARNING: Unable to set up access control for display 1
Mar 13 13:28:46 elbrus gdm-binary[29052]: WARNING: GdmDisplay: display lasted 0,000427 seconds  
Mar 13 13:28:48 elbrus gdm-binary[29052]: CRITICAL: could not add display to access file: Too many open files
Mar 13 13:28:48 elbrus gdm-binary[29052]: WARNING: Unable to set up access control for display 1
Mar 13 13:28:48 elbrus gdm-binary[29052]: WARNING: GdmDisplay: display lasted 0,000593 seconds  
Mar 13 13:28:49 elbrus gdm-simple-greeter[29136]: WARNING: Could not ask power manager if user can suspend: The name org.free
Mar 13 13:28:49 elbrus gdm-simple-greeter[29136]: WARNING: Could not ask power manager if user can suspend: The name org.free
Mar 13 13:28:49 elbrus gdm-simple-greeter[29136]: WARNING: Unable to run ck-history: Failed to execute child process "ck
Mar 13 13:28:52 elbrus gdm-binary[29052]: CRITICAL: could not add display to access file: Too many open files
Mar 13 13:28:52 elbrus gdm-binary[29052]: WARNING: Unable to set up access control for display 1
Mar 13 13:28:52 elbrus gdm-binary[29052]: WARNING: GdmDisplay: display lasted 0,000371 seconds  
Mar 13 13:29:00 elbrus gdm-binary[29052]: CRITICAL: could not add display to access file: Too many open files
Mar 13 13:29:00 elbrus gdm-binary[29052]: WARNING: Unable to set up access control for display 1
Mar 13 13:29:00 elbrus gdm-binary[29052]: WARNING: GdmDisplay: display lasted 0,000397 seconds  
Mar 13 13:29:04 elbrus init: prefdm main process ended, respawning

Maybe the proposed patch from https://bugzilla.gnome.org/show_bug.cgi?id=606724 misbehaves in combination with the gdm-2.24.1-fix-xdmcp.patch that is already present in the rpm package?
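
When comparing patched and unpatched runs, a log excerpt like the one above can be summarized mechanically. The following is a rough sketch that assumes the syslog line layout shown in this report; the regular expression and function names are hypothetical, and other syslog configurations may need a different pattern.

```python
import re
from collections import Counter

# Matches lines such as:
#   Mar 13 13:28:46 elbrus gdm-binary[29052]: CRITICAL: could not add ...
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<prog>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?:(?P<level>[A-Z]+): )?(?P<msg>.*)$"
)


def summarize(lines):
    """Count log lines per (program, level); level is None when absent."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("prog"), match.group("level"))] += 1
    return counts


def count_fd_exhaustion(lines):
    """Count lines whose message reports 'Too many open files'."""
    total = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match and "Too many open files" in match.group("msg"):
            total += 1
    return total
```

A non-zero result from `count_fd_exhaustion` after applying a patch would suggest that descriptors are still being exhausted.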

Comment 9 Michael Young 2010-03-13 12:42:44 UTC
Of the two patches I like mine better, as I think it is a cleaner way to do it, and I am not convinced the other patch does a full clean-up. Where I am less clear is whether this is the best place to set up this finish hook. I have dipped into the code rather than studying it fully, and the hook might fit better somewhere else, e.g. in gdm-xdmcp-display-factory.c.

Comment 10 Alex 2010-03-13 15:21:42 UTC
(In reply to comment #9)
> Of the two patches I like mine better as I think it is a cleaner way to do it,

Your patch really works in our configuration, everything seems to be OK now.

Comment 11 Michael Young 2010-03-31 15:59:44 UTC
Created attachment 403776
An alternate patch

Here is another possible patch. This moves the code to a lower level, and is more minimal, though a bit of a hack. It does still fix the file leak though.

Comment 12 Michael Young 2010-05-12 09:17:00 UTC
The patch in Comment 2 is now upstream in 2.30.1 onwards
http://git.gnome.org/browse/gdm/commit/?id=0c34aa7949bc24a2a8b3217cefb3c978b892591b
Can we have it back-ported to Fedora 12 please?

Comment 13 Bug Zapper 2010-11-03 22:58:26 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 14 Michael Young 2010-11-04 15:34:13 UTC
This works in Fedora 14 (and I would guess F13 as well from the gdm version).

