Bug 487552 - Leaking drm descriptors on every console switch
Summary: Leaking drm descriptors on every console switch
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-i810
Version: rawhide
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Kristian Høgsberg
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-02-26 17:28 UTC by Zdenek Kabelac
Modified: 2009-03-05 14:59 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2009-03-05 14:18:43 UTC
Type: ---
Embargoed:



Description Zdenek Kabelac 2009-02-26 17:28:11 UTC
Description of problem:

I'm surely not the only one who has noticed that the latest Xorg releases are leaking resources:

http://lists.freedesktop.org/archives/intel-gfx/2009-February/001530.html
http://lists.freedesktop.org/archives/intel-gfx/2009-February/001533.html
http://lists.freedesktop.org/archives/intel-gfx/2009-February/001352.html

In my case, each console switch leaks ~600 descriptors.
(That gives me ~50 switches before the descriptors available to the user are exhausted.)

I've tried UXA and EXA, as well as git versions of drm/intel.

Feel free to move this to the right package. I'm experiencing this issue with the intel driver, so I've filed it against the intel driver.

Version-Release number of selected component (if applicable):
xorg-x11-drv-i810-2.6.0-7.fc11.x86_64
libdrm-2.4.5-0.fc11.x86_64
kernel 2.6.29-rc6

How reproducible:


Steps to Reproduce:
1. As root: lsof | grep "drm mm object" | wc -l
2. Switch to a console and back.
3. Compare the numbers.
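
The steps above can be wrapped in a small script. This is only a sketch: the helper names count_drm_objects and report_growth are made up for illustration, and the console switch itself still has to be done by hand.

```shell
#!/bin/sh
# Count lines mentioning "drm mm object" in lsof-style output read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps callers going.
count_drm_objects() {
    grep -c "drm mm object" || true
}

# Hypothetical helper: print the change between two counts.
report_growth() {
    before=$1
    after=$2
    echo "before=$before after=$after growth=$((after - before))"
}

# Usage (as root), shown as comments because step 2 is manual:
#   before=$(lsof | count_drm_objects)
#   ... switch to a text console and back ...
#   after=$(lsof | count_drm_objects)
#   report_growth "$before" "$after"
```

A growth value that stays positive across repeated switches is the symptom reported here.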
  
Actual results:


Expected results:


Additional info:

Comment 1 Zdenek Kabelac 2009-03-02 22:11:33 UTC
This is the upstream bugzilla entry: http://bugs.freedesktop.org/show_bug.cgi?id=20404

Also note: the only 'steady' state is the permanently growing size of Xorg and number of allocated descriptors.

Comment 2 Zdenek Kabelac 2009-03-03 10:51:42 UTC
I think I should add a new comment here: it looks like something changed when I made the latest upgrade of kernel & xorg.

Currently the number of descriptors for drm objects oscillates around 5400 and is not growing by hundreds with each console switch.

I'm running the latest vanilla kernel:
2.6.29-rc6  commit:2450cf51a1bdba7037e91b1bcc494b01c58aaf66
xorg-x11-server-Xorg-1.6.0-3.fc11.x86_64
xorg-x11-drv-intel-2.6.0-11.fc11.x86_64

Using EXA acceleration.

When I have some time I'll do more tests.

Comment 3 Zdenek Kabelac 2009-03-03 11:10:23 UTC
OK, I've just noticed the problem is still there: Xorg's RESident memory in `top` grows with each console switch, but lsof no longer shows such rapidly growing numbers. So I've switched to a different command to count the leaking descriptors:

cat /proc/`pidof X`/smaps | grep "drm mm" | wc -l

Here the number still goes up with each console switch, like this:

7838
8267
8856
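
To make the growth per switch explicit, the three readings above can be turned into differences (429 and 589). A minimal sketch, assuming one count per line on stdin; count_drm_maps and print_deltas are illustrative names, and `pidof X` is taken from the command above (it may print more than one PID if several servers are running).

```shell
#!/bin/sh
# Count "drm mm" lines in a smaps file. With no argument, fall back to the
# smaps file of the process found by `pidof X`, as in the command above.
count_drm_maps() {
    grep -c "drm mm" "${1:-/proc/$(pidof X)/smaps}" || true
}

# Read one count per line on stdin and print the growth between samples.
print_deltas() {
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Usage with the readings from this comment:
#   printf '7838\n8267\n8856\n' | print_deltas
```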

Comment 4 Zdenek Kabelac 2009-03-03 17:54:40 UTC
OK, I'm not really sure what caused the difference between my 'cat' and lsof. Both now seem to give the same numbers again, currently 10279.

Here is one of those smaps entries:

7f3d6b6cf000-7f3d6b6d0000 rw-s 00000000 00:08 134823    /drm mm object (deleted)
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
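
Since each entry carries its own Rss line, the resident memory held by these mappings can be totalled directly from smaps. A rough sketch, assuming the entry layout shown above (an address-range header line followed by "Rss: N kB"); sum_drm_rss is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# Sum the Rss of every mapping whose header line mentions "drm mm object".
# With no argument, read the smaps file of the process found by `pidof X`.
sum_drm_rss() {
    awk '
        # A mapping header starts with an address range such as
        # 7f3d6b6cf000-7f3d6b6d0000; remember whether it is a drm mapping.
        /^[0-9a-f]+-[0-9a-f]+ / { in_drm = ($0 ~ /drm mm object/) }
        # Inside a drm mapping, add up the Rss value (given in kB).
        in_drm && $1 == "Rss:" { total += $2; n++ }
        END { printf "%d mappings, %d kB resident\n", n, total }
    ' "${1:-/proc/$(pidof X)/smaps}"
}

# Usage (as root or as the owner of the X process):
#   sum_drm_rss
```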

Comment 5 Zdenek Kabelac 2009-03-03 19:27:46 UTC
Here is the output from /proc:


cat /proc/dri/0/gem_objects 

10459 objects
128901120 object bytes
6 pinned
50532352 pin bytes
52359168 gtt bytes
218152960 gtt total



after console switch:

10830 objects
130420736 object bytes
6 pinned
50532352 pin bytes
51458048 gtt bytes
218152960 gtt total


See the difference in the object count.
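
For the record, the two snapshots differ by 371 objects and 1519616 object bytes, which is exactly 4096 bytes per new object and matches the 4 kB smaps entries from comment 4. A small sketch for diffing two saved snapshots, assuming each line has the "<number> <label>" layout shown above; diff_gem_objects is an illustrative name.

```shell
#!/bin/sh
# Print the per-field change between two saved copies of
# /proc/dri/0/gem_objects (first argument: before, second: after).
diff_gem_objects() {
    awk '
        # Skip blank or malformed lines.
        NF < 2 { next }
        # Split each line into its leading number and the remaining label.
        { value = $1; $1 = ""; label = substr($0, 2) }
        # First file: remember the value for each label.
        NR == FNR { before[label] = value; next }
        # Second file: print the signed difference for known labels.
        (label in before) { printf "%s: %+d\n", label, value - before[label] }
    ' "$1" "$2"
}

# Usage:
#   cat /proc/dri/0/gem_objects > before.txt
#   ... switch to a text console and back ...
#   cat /proc/dri/0/gem_objects > after.txt
#   diff_gem_objects before.txt after.txt
```

With the numbers above this reports "objects: +371", "object bytes: +1519616" and "gtt bytes: -901120", with the pinned and gtt total fields unchanged.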

I assume something is not right in drm_gem_object_free() in linux/drivers/gpu/drm/drm_gem.c.

Comment 6 Kristian Høgsberg 2009-03-04 23:01:51 UTC
Please always mention the specific intel part you're using when filing bugs. I've built a new version of the intel driver that fixes this for i965.

Comment 7 Zdenek Kabelac 2009-03-05 10:13:16 UTC
Yes, Lukas and I spent 2 days locating this bug:

http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=d4c64f01b9429a8fb314e43f40d1f02bb8aab30f


BTW - your commit http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=095a001f755201d3c19335b67a84c57b1d080a83

seems to kill switching from the console back to Xorg on my machine; I had to revert this commit.
(my hw: T61, i965, 4GB)

Comment 8 Kristian Høgsberg 2009-03-05 14:18:43 UTC
(In reply to comment #7)
> Yes, Lukas and I spent 2 days locating this bug:
> 
> http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=d4c64f01b9429a8fb314e43f40d1f02bb8aab30f

Great, thanks.
 
> BTW - your commit
> http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=095a001f755201d3c19335b67a84c57b1d080a83
> 
> seems to kill switching from the console back to Xorg on my machine; I had to
> revert this commit.
> (my hw: T61, i965, 4GB)

You need a rawhide kernel for this to not hit a deadlock in the drm code. I'll add a Requires to the xorg-x11-drv-intel rpm.

Comment 9 Zdenek Kabelac 2009-03-05 14:59:56 UTC
Just to add a final note to this bugzilla: I usually run the latest vanilla kernel with my own set of options (the rawhide kernel is way too slow with all those debug options enabled).

Anyway, the deadlock goes away with this kernel commit: 5ad8b7d12605e88d1e532061699102797fdefe08 (drm: fix double lock typo)

I had been using a kernel compiled just before this patch went in.

