Bug 465699

Summary: Wallpaper background yields CPU spike
Product: [Fedora] Fedora
Reporter: Didier <d-bugzilla>
Component: gnome-desktop
Assignee: Ray Strode [halfline] <rstrode>
Status: CLOSED CANTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: bnocera, control-center-maint, d-bugzilla, mclasen, rstrode, sokerlp, xgl-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-10-20 14:54:46 EDT
Bug Depends On:    
Bug Blocks: 457945    
Attachments (description: flags):
Xorg strace (100% CPU spike): none
strace of completely captured X spike: none
dmesg output: none
xorg.conf: none
Xorg.0.log: none

Description Didier 2008-10-05 14:54:21 EDT
Description of problem:

When changing the wallpaper background (with right mouse > 'Change Desktop Background' or via gnome-appearance-properties), the display and keyboard are completely frozen for 5-10 seconds (the mouse pointer can still be moved), and CPU usage spikes to 100%.

After 5-10 seconds, the display is updated, and the system (Core2Duo T7700) is usable again.


Version-Release number of selected component (if applicable):

control-center-2.24.0.1-3.fc10.x86_64
glibc-2.8.90-12.x86_64


How reproducible:

Always


Additional info:

1. No problem in FC8

2. When stracing gnome-appearance-properties, strace itself aborts with this glibc backtrace (glibc-2.8.90-12.x86_64):

*** glibc detected *** strace: malloc(): memory corruption (fast): 0x0000000000f937d0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x321bc79f98]
/lib64/libc.so.6[0x321bc7d621]
/lib64/libc.so.6(__libc_malloc+0x98)[0x321bc7eaf8]
strace[0x408728]
strace[0x40598e]
strace[0x404696]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x321bc1e566]
strace[0x401e69]
======= Memory map: ========
00400000-00447000 r-xp 00000000 fd:00 905609                             /usr/bin/strace
00647000-00648000 rw-p 00047000 fd:00 905609                             /usr/bin/strace
00648000-00656000 rw-p 00648000 00:00 0 
00847000-00848000 rw-p 00047000 fd:00 905609                             /usr/bin/strace
00f93000-00fb4000 rw-p 00f93000 00:00 0                                  [heap]
321a400000-321a420000 r-xp 00000000 fd:00 679938                         /lib64/ld-2.8.90.so
321a61f000-321a620000 r--p 0001f000 fd:00 679938                         /lib64/ld-2.8.90.so
321a620000-321a621000 rw-p 00020000 fd:00 679938                         /lib64/ld-2.8.90.so
321bc00000-321bd6c000 r-xp 00000000 fd:00 680007                         /lib64/libc-2.8.90.so
321bd6c000-321bf6c000 ---p 0016c000 fd:00 680007                         /lib64/libc-2.8.90.so
321bf6c000-321bf70000 r--p 0016c000 fd:00 680007                         /lib64/libc-2.8.90.so
321bf70000-321bf71000 rw-p 00170000 fd:00 680007                         /lib64/libc-2.8.90.so
321bf71000-321bf76000 rw-p 321bf71000 00:00 0 
3229600000-3229616000 r-xp 00000000 fd:00 680345                         /lib64/libgcc_s-4.3.2-20080917.so.1
3229616000-3229815000 ---p 00016000 fd:00 680345                         /lib64/libgcc_s-4.3.2-20080917.so.1
3229815000-3229816000 rw-p 00015000 fd:00 680345                         /lib64/libgcc_s-4.3.2-20080917.so.1
7f80e0000000-7f80e0021000 rw-p 7f80e0000000 00:00 0 
7f80e0021000-7f80e4000000 ---p 7f80e0021000 00:00 0 
7f80e722d000-7f80e722f000 rw-p 7f80e722d000 00:00 0 
7f80e7264000-7f80e7266000 rw-p 7f80e7264000 00:00 0 
7fffef251000-7fffef266000 rw-p 7ffffffea000 00:00 0                      [stack]
7fffef3b0000-7fffef3b1000 r-xp 7fffef3b0000 00:00 0                      [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
select(Aborted
Comment 1 Didier 2008-10-10 03:28:40 EDT
Created attachment 319982 [details]
Xorg strace (100% CPU spike)
Comment 2 Didier 2008-10-10 03:30:10 EDT
Additional info :

The CPU spike (Xorg process = 99.8-100%) and the general temporary (5-10 seconds) lock-up occur not only when manually changing the background (as described in the original bug description), but with all operations involving background manipulation :

1. GDM startup ;
2. user login ;
3. desktop background when executing 'startx' in runlevel 3 ;
4. background manipulating applications such as wp_tray.


This is a fresh F10-Beta install (no F8/F9-upgrade) now mostly following Rawhide (gnome 2.24.0-3.fc10.x86_64), and happens with both an F8-migrated user profile and a fresh F10 user profile.


The attachment in comment #1 is an "Xorg" strace taken while the CPU was spiking at 100% (captured by means of a remote login).
Comment 3 Ray Strode [halfline] 2008-10-10 10:02:35 EDT
Can you try adding

gtk-enable-animations = 0

to /etc/gtk-2.0/gtkrc

running 

pkill nautilus

and then change your background?  If you do those things, does your cpu still spike?
 
Note there shouldn't be a cross fade effect anymore with this change.
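
A minimal sketch of the steps requested above, assuming a Bourne-style shell. The function name disable_gtk_animations and its messages are hypothetical; only the setting, the gtkrc path, and the pkill command come from this comment.

```shell
# Sketch: append the setting to a gtkrc file unless a
# gtk-enable-animations line is already there.
# The function name and its messages are illustrative assumptions.
disable_gtk_animations() {
    rc_file="${1:-/etc/gtk-2.0/gtkrc}"
    # Leave the file alone if the key already appears (check its value by hand).
    if grep -q '^[[:space:]]*gtk-enable-animations' "$rc_file" 2>/dev/null; then
        echo "already present in $rc_file"
    else
        echo 'gtk-enable-animations = 0' >> "$rc_file"
        echo "added to $rc_file"
    fi
}

# Usage (as root for the system-wide file), then restart nautilus:
#   disable_gtk_animations /etc/gtk-2.0/gtkrc
#   pkill nautilus
```

Taking the file path as an argument makes it possible to try the function on a scratch copy before touching the system-wide gtkrc.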
Comment 4 Didier 2008-10-10 11:15:18 EDT
Modified /etc/gtk-2.0/gtkrc , and restarted nautilus.

The CPU still spikes : both immediately after killing nautilus (which probably invokes a background reload), and when manually changing the background.


Please note that entering and leaving the password-protected screensaver (which has a different star-spangled F10 background) does not yield a CPU spike.



I attached a new X strace, which completely captures the spike. Are these X straces useful, or should I refrain from attaching them ?
Comment 5 Didier 2008-10-10 11:16:52 EDT
Created attachment 320027 [details]
strace of completely captured X spike
Comment 6 Ray Strode [halfline] 2008-10-10 11:31:57 EDT
Can you attach dmesg, /var/log/Xorg.0.log and /etc/X11/xorg.conf?

if you set your background to a solid color does the cpu still spike?
Comment 7 Ray Strode [halfline] 2008-10-10 11:34:36 EDT
The straces aren't immediately telling me where the problem is, but I appreciate that you took the time to get the traces and attach them.

They definitely had the potential to be useful, and they might give hints later.
Comment 8 Didier 2008-10-10 14:21:16 EDT
Comment #7 : Ray, I certainly appreciate you taking the time to look into this issue. :)

Comment #6 :

Additional observations :


1. switching between Metacity and Compiz makes no difference ;


2. This is a bit difficult to explain: once I select the "No Wallpaper" background, nothing changes immediately, and each subsequent selection is applied one selection late (another bugzilla entry ?).

Example (with e.g. Aqua background selected) :
a. select 'No Wallpaper' (solid color) -> no spike, nothing changes ;
b. select 'Blast of Red' -> spike, Blast of Red ;
c. 'No Wallpaper' -> no spike, nothing changes ;
d. change Color to 'Horz.Gradient' -> no spike, preview shows Vert. Gradient, background = solid color;
e. change Color to 'Vert.Gradient' -> no spike, preview shows Horz. Gradient, background = vert. gradient;
f. change Color to 'Solid' -> no spike, preview shows Solid, background = horz. gradient;
g. select 'Aqua' -> spike, background = Aqua.


3. As this laptop (Smolt UUID 3097071a-fdf6-451e-96a8-a442d63e5090) has an nVidia card, I'm using an xorg.conf (attached, identical to the one I use with F8 without any problems whatsoever). Removing the xorg.conf renders the desktop quite ugly, but the spikes are gone.

I am aware that nVidia is proprietary (using Livna/RPMfusion xorg-x11-drv-nvidia-177.78-2.fc10.x86_64), but using 'nv' instead of 'nvidia' is not an option (yet).
Comment 9 Didier 2008-10-10 14:25:33 EDT
Created attachment 320042 [details]
dmesg output
Comment 10 Didier 2008-10-10 14:27:24 EDT
Created attachment 320043 [details]
xorg.conf
Comment 11 Didier 2008-10-10 14:28:45 EDT
Created attachment 320044 [details]
Xorg.0.log

Interesting ?  :

[mi] EQ overflowing. The server is probably stuck in an infinite loop.
[mi] mieqEnequeue: out-of-order valuator event; dropping.
[mi] EQ overflowing. The server is probably stuck in an infinite loop.
[mi] mieqEnequeue: out-of-order valuator event; dropping.
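
To gauge how often the server logged this, the overflow messages can be counted. This is only a sketch: the function name count_eq_overflow is hypothetical, and the default path is the /var/log/Xorg.0.log mentioned in comment #6.

```shell
# Sketch: count "EQ overflowing" lines in an X server log.
# The function name is an illustrative assumption.
count_eq_overflow() {
    log_file="${1:-/var/log/Xorg.0.log}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'EQ overflowing' "$log_file"
}

# Usage:
#   count_eq_overflow /var/log/Xorg.0.log
```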
Comment 12 Ray Strode [halfline] 2008-10-10 14:49:25 EDT
Thanks for the logs. 

So for F10 we added a subtle crossfade effect when logging in and switching backgrounds.  This effect stresses the video card driver somewhat, because it's doing fullscreen blending.

The kind of blending being done is the kind that video cards are really good at doing though.  It's the same kind of blending that's used for providing antialiased fonts all over your screen for instance.

It seems like the sudden influx of calls is making the nvidia driver block the X server for a considerable amount of time.  The message you pointed out (which was also partially in your strace) is probably a side effect of the driver being slow to give control back to the server.

Unfortunately, we can't fix the driver since it's not open.  We don't have great options.  Possibilities I can think of:

1) Try to detect if the animation is taking too long and cancel the animation.  I suspect this won't work because we already try to complete the animation after .5 seconds.  If it's taking an order of magnitude more time than that, then cancelling the op isn't helping.

2) Try to do the blend in a different way to keep the driver from getting overwhelmed.  Right now we do gdk_display_flush() after each frame.  We could try changing the flush to an XSync call.  That will add latency to the animation, which may ruin the polish it adds.

3) We may have to CANTFIX the bug.  This obviously would be suboptimal.

None of those options are great.

One little hitch is you said changing the gtkrc *didn't* get rid of the spike.  This is sort of surprising to me.  Can you try making that change one more time just to be sure?  If disabling the fade effect doesn't get rid of the problem, this issue could be something else entirely.

I'm going to move this to the X server just in case any of the X guys have any ideas that are better than 1, 2, or 3.
Comment 13 Matěj Cepl 2008-10-10 17:19:16 EDT
Well, I am afraid that unless the reporter is able to reproduce it with open-source drivers, as far as the Xorg team goes we are talking about option 3 only. Time is just too short to spend it going through the binary stuff with gdb and trying to understand how it works. Sorry, but unless you are able to reproduce the problem after doing the steps described in https://fedoraproject.org/wiki/Xorg/3rd_Party_Video_Drivers and provide us with the corresponding Xorg.*.log (and /etc/X11/xorg.conf, if one is used), this will get a big CANTFIX from us (because we really cannot do anything).

I am sorry that I cannot be more helpful.
Comment 14 Didier 2008-10-13 05:02:26 EDT
Upgraded to gnome-desktop-2.24.0-4.fc10 , xorg-x11-server-Xorg-1.5.2-2.fc10 , kernel-2.6.27-3.fc10 and (proprietary, rebuilt through rpmfusion rpm's) nVidia 177.80 : CPU still spikes when a wallpaper change is invoked.

The issue can be attributed to the proprietary 'nvidia' driver, as it does not occur with 'nv' ; I guess purchasing decisions will shift to Intel (and AMD) again.


I fully agree with the CANTFIX 'solution' (as it is nVidia's choice to be proprietary, the support burden should be on them too) ; however, is there anything that can be done to prevent hordes of nVidia users from being hit by this issue when F10 is released ? Does the log from comment #11 point to a clue ?

To repeat : no issues experienced with F8, and "gtk-enable-animations = 0" (comment #3) does not change the behaviour.

Note1 : thank you Ray, for your extensive explanations, much appreciated ;
Note2 : the erroneous 'No Wallpaper' behaviour (comment #8) looks to be fixed in current Rawhide.
Comment 16 Ray Strode [halfline] 2008-10-20 14:54:46 EDT
So I got ahold of some nvidia hardware today and couldn't really reproduce the problem.

I say "couldn't really" because when I first got the hardware I could reproduce the problem, but after updating it to the latest bits and getting a debuggable test environment set up, the problem disappeared.

I couldn't reproduce the problem after that no matter how hard I tried.

I'm going to CANTFIX this issue, but we may have to revisit this if a lot of users report problems when the release candidate comes out.
Comment 17 Didier 2008-10-21 04:32:14 EDT
Funny, I still experience the issue (clean profile, clean install of F10-Beta with various bits and pieces (mainly gnome & kernel) from Rawhide).

I'll report back after a full resync with F10-Preview.


TIA for the invested time, Ray.
Comment 18 Didier 2008-11-19 04:06:19 EST
For the sake of reference (in case others get bitten by this bug) :

Running F10-Preview, CPU spikes still occur.


Seems to be 'resolved' by executing :
$ nvidia-settings -a InitialPixmapPlacement=2


'InitialPixmapPlacement' :
http://cgit.freedesktop.org/~aplattner/nvidia-settings/tree/src/libXNVCtrl/NVCtrl.h?id=b27db3d10d58b821e87fbe3f46166e02dc589855#n2797

"NV_CTRL_INITIAL_PIXMAP_PLACEMENT_VIDMEM creates pixmaps in video memory when enough resources are available."

(instead of creating them in system memory)


Note 1 : As said in comment #2, this issue only occurs when manipulating desktop backgrounds. As an additional data point, I'd like to note I also experienced it when using the 'fade' slide effect (slides fading into each other) with OOo Impress.

Note 2 : as nvidia-settings is executed as non-root, this 'fix' does not improve the CPU spike with both the initial GDM login screen and subsequent post-login desktop loading.

Note 3 : with InitialPixmapPlacement=2 , cairo-dock becomes *very* slow (I should probably post a bug against cairo-dock).
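
A minimal sketch of wrapping the workaround so it can be run from a per-user login script. The function name apply_pixmap_placement and the NVIDIA_SETTINGS_BIN override variable are hypothetical; only the nvidia-settings invocation itself comes from above. As per Note 2, running it inside the user session does not help the GDM login screen.

```shell
# Sketch: apply the InitialPixmapPlacement workaround if nvidia-settings exists.
# The function name and the NVIDIA_SETTINGS_BIN variable are illustrative assumptions.
apply_pixmap_placement() {
    bin="${NVIDIA_SETTINGS_BIN:-nvidia-settings}"
    if command -v "$bin" >/dev/null 2>&1; then
        # Same invocation as the workaround quoted in this comment.
        "$bin" -a InitialPixmapPlacement=2
    else
        echo "$bin not found; skipping" >&2
        return 1
    fi
}

# Usage (inside the user's X session):
#   apply_pixmap_placement
```

Given Note 3, anyone relying on cairo-dock may prefer to run this by hand rather than at every login.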
Comment 19 Ray Strode [halfline] 2008-11-19 09:26:31 EST
Thank you for the workaround.  It may be useful for others who run into this.
Comment 20 Didier 2008-11-19 11:41:21 EST
FYI : this issue is resolved in the (proprietary and beta) Nvidia 180.08 drivers.
Comment 21 Oscar 2008-12-16 13:50:39 EST
I have the same problem, but with a different version of the driver (I think it is 17X), which is the one in the RPM Fusion repository. Should I install the drivers from nVidia's web page, or is there another way to fix this?
Comment 22 Didier 2008-12-19 03:31:53 EST
Oscar,

I compiled the nVidia beta drivers myself, but apparently the Tigro repository (http://mirror.yandex.ru/fedora/tigro/10/x86_64) is now providing precompiled kmods/akmods (currently 180.16).