Bug 860624 - Gnome-shell hangs when requested activities
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 19
Hardware: i386
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker RejectedNTH
Reported: 2012-09-26 10:36 UTC by Petr Kočandrle
Modified: 2015-02-17 14:28 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 14:28:36 UTC
Type: Bug
Embargoed:


Attachments
/var/log/messages (485.79 KB, text/plain)
2012-09-26 10:36 UTC, Petr Kočandrle
dmesg after hang (58.66 KB, text/plain)
2013-09-17 01:23 UTC, Frederick Grose

Description Petr Kočandrle 2012-09-26 10:36:05 UTC
Created attachment 617495 [details]
/var/log/messages

Description of problem:
The screen froze when the mouse pointer hit the top-left corner.

Version-Release number of selected component (if applicable):
i386 testday iso

How reproducible:
sometimes

Steps to Reproduce:
1. Hit top left corner in gnome-shell.
2. The screen freezes.
  
Actual results:
The screen was still.

Expected results:
Normal screen redrawing.

Additional info:
I was testing nouveau with the test day ISO. I wanted to test Xv, so I ran totem (accidentally as root) and was about to start nautilus to open a test video. No video was playing in totem at that moment. When I hit the top-left corner, the screen froze: I could only move the mouse cursor and switch to a console with Ctrl+Alt+F2 and back with Alt+F1. Only the mouse cursor moved, and the background was corrupted. Even killing X and trying to log in again did not work: gnome-shell did not start and an error message appeared (something has gone wrong...). I've collected /var/log/messages. The problem started at 06:56.

Comment 1 Petr Kočandrle 2012-09-26 11:20:52 UTC
My HW profile is here http://www.smolts.org/client/show/pub_83fff41b-335c-475b-b78c-9605a51585a1 - VGA is ION (GeForce 9400M).

Comment 2 Tim Flink 2012-10-04 17:32:20 UTC
Comment on attachment 617495 [details]
/var/log/messages

changing attached log to text/plain

Comment 3 Adam Williamson 2012-10-04 17:40:27 UTC
Discussed at 2012-10-04 blocker review meeting: http://meetbot.fedoraproject.org/fedora-qa/2012-10-04/f18-beta-blocker-review-2.1.2012-10-04-16.00.log.txt .  This seems kind of a random bug, not necessarily a graphics issue from the logs and may be related to some kind of failure on the live media or simple RAM exhaustion (all the errors at the end of the log are pretty strange). With the lack of information and the random reproduction scenario, this is rejected as a blocker and as NTH; it can be re-proposed if it's reproducible and more information arises that contradicts the above.

Comment 4 Jacobo Cabaleiro 2012-11-18 17:13:00 UTC
I'm possibly experiencing this bug on a F18 installation.

== Description ==
Sometimes, usually when I have a high number of tabs open in Firefox (> 5) and some CPU usage (this is an old computer) from gnash / JavaScript execution, gnome-shell "hangs" when I try to open the Activities view: it consumes all the CPU it can get and the display freezes. The Activities view does not open. This is what top shows:
top - 17:21:06 up  7:48,  5 users,  load average: 3.72, 2.75, 2.27
Tasks: 175 total,   2 running, 173 sleeping,   0 stopped,   0 zombie
%Cpu(s): 15.5 us,  8.4 sy,  0.0 ni, 73.3 id,  1.0 wa,  0.0 hi,  1.7 si,  0.0 st
KiB Mem:   2064272 total,  1944316 used,   119956 free,    27732 buffers
KiB Swap:  4194300 total,    43012 used,  4151288 free,   787604 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 2547 obmun     20   0  684m 139m  19m R  89.8  6.9  57:47.69 gnome-shell
10257 obmun     20   0  810m 300m  26m S  15.8 14.9  30:08.85 firefox
10494 obmun     20   0  232m  37m  11m S   5.3  1.9  16:17.19 transmission-gt
13233 obmun     20   0  5196 1168  916 R   5.3  0.1   0:00.03 top
    1 root      20   0  7680 4596 1896 S   0.0  0.2   0:03.49 systemd
    2 root      20   0     0    0    0 S   0.0  0.0   0:00.04 kthreadd
    3 root      20   0     0    0    0 S   0.0  0.0   0:04.87 ksoftirqd/0
    5 root       0 -20     0    0    0 S   0.0  0.0   0:00.00

Applications keep running without problems (including audio), but the display completely stops refreshing. The mouse cursor keeps moving and still changes shape over the underlying text boxes and buttons.

I can recover if I go to a plain terminal and kill -9 the gnome-shell process. After a new gnome-shell is automatically relaunched, I can continue working normally with the system.

I've been able to attach gdb to the hung gnome-shell process and obtain a backtrace:
#0  0xb7773424 in __kernel_vsyscall ()
#1  0x41532edc in sched_yield () at ../sysdeps/unix/syscall-template.S:81
#2  0xb62b9f0a in nouveau_fence_wait (fence=fence@entry=0xd1c9af0) at nouveau_fence.c:209
#3  0xb62b961c in nouveau_screen_fence_finish (screen=0x9b8db90, pfence=0xd1c9af0, timeout=18446744073709551615) at nouveau_screen.c:76
#4  0xb637d02f in st_finish (st=st@entry=0x9c33e28) at ../../src/mesa/state_tracker/st_cb_flush.c:100
#5  0xb637d089 in st_glFinish (ctx=0x9be08e0) at ../../src/mesa/state_tracker/st_cb_flush.c:135
#6  0xb62d1de6 in _mesa_finish (ctx=0x9be08e0) at ../../src/mesa/main/context.c:1656
#7  0xb62d29b0 in _mesa_Finish () at ../../src/mesa/main/context.c:1687
#8  0x4ac26f72 in shared_dispatch_stub_216 () at ../../../src/mapi/shared-glapi/glapi_mapi_tmp.h:14319
#9  0x4391e5fe in _cogl_winsys_onscreen_swap_region (onscreen=0x9dba4e0, user_rectangles=0xbfa51a64, n_rectangles=1) at winsys/cogl-winsys-glx.c:1253
#10 0x4391539e in cogl_onscreen_swap_region (onscreen=0x9dba4e0, rectangles=rectangles@entry=0xbfa51a64, n_rectangles=n_rectangles@entry=1)
    at ./cogl-onscreen.c:181
#11 0x453ed949 in clutter_stage_cogl_redraw (stage_window=0x9c960c8) at cogl/clutter-stage-cogl.c:482
#12 0x454680a8 in _clutter_stage_window_redraw (window=0x9c960c8) at ./clutter-stage-window.c:236
#13 0x4546425c in clutter_stage_do_redraw (stage=0x9db8fd8) at ./clutter-stage.c:1170
#14 _clutter_stage_do_update (stage=0x9db8fd8) at ./clutter-stage.c:1228
#15 0x45445a5f in master_clock_update_stages (stages=0xb882950, master_clock=0x9d67f38) at ./clutter-master-clock.c:386
#16 clutter_clock_dispatch (source=source@entry=0x9d76978, callback=0x0, user_data=0x0) at ./clutter-master-clock.c:520
#17 0x417a516b in g_main_dispatch (context=0x9a94ff0, context@entry=0x9ac1d08) at gmain.c:2715
#18 g_main_context_dispatch (context=context@entry=0x9a94ff0) at gmain.c:3219
#19 0x417a5510 in g_main_context_iterate (context=0x9a94ff0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3290
#20 0x417a5973 in g_main_loop_run (loop=0x9a95638) at gmain.c:3484
#21 0x43d60238 in meta_run () at core/main.c:545
#22 0x080498f2 in main (argc=1, argv=0xbfa51e74) at main.c:416

I don't know if this bt is the same for all the hangs, as I have just obtained it for the first time 20 minutes ago, when the last gnome-shell hang happened.
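
For anyone who wants to compare backtraces across hangs, the attach-and-bt step can be run non-interactively from a text console. The following is only a minimal sketch: the helper name dump_bt is made up for illustration, and it assumes gdb is installed and that the caller is allowed to attach to the process. Readable frames such as the ones above also depend on the matching debuginfo packages being installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): dump a backtrace of all
# threads of a running process without entering the interactive gdb prompt.
# -batch, -ex and -p are standard gdb command-line options.
dump_bt() {
    gdb -batch -ex 'thread apply all bt' -p "$1"
}

# Example use, commented out because it needs a hung gnome-shell to attach to:
# dump_bt "$(pgrep -u "$(id -un)" -x gnome-shell | head -n 1)" > gnome-shell-bt.txt
```

Saving the output to a file for each hang makes it easy to diff the traces and see whether they all end in nouveau_fence_wait.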

== How reproducible ==
Randomly. If I'm doing a lot of navigation, I can easily trigger this every 15 minutes or less.

== More info ==
My computer profile: http://www.smolts.org/client/show/pub_9d7fe254-01da-40af-ab77-93ed9260d8a4

App versions:
* gnome-shell-3.6.2-2.fc18.i686 (I've been experiencing this since I installed F18 on this computer using the beta TC3 install media, so this has affected previous gnome-shell or nouveau versions)
* xorg-x11-drv-nouveau.i686  1:1.0.4-1.fc18
* Kernel 3.6.6-3.fc18.i686.PAE
* Mesa 9.0-3.fc18
As I said, I'm running an F18 installation on this system with the updates-testing repo enabled. The system should be up to date (the last update was carried out this morning).

Comment 5 Jacobo Cabaleiro 2012-11-18 17:31:09 UTC
Just after sending the previous comment, I experienced an identical hang with an identical backtrace. The trigger was different: closing a GTK file selector dialog window that the YouTube video upload page had opened. This is _the first time_ I have experienced this hang without having tried to open the Activities view of gnome-shell.

Top for this occasion:
top - 18:22:15 up  8:49,  4 users,  load average: 1.03, 0.69, 0.61
Tasks: 172 total,   2 running, 170 sleeping,   0 stopped,   0 zombie
%Cpu(s): 16.1 us,  8.5 sy,  0.0 ni, 72.6 id,  1.2 wa,  0.0 hi,  1.6 si,  0.0 st
KiB Mem:   2064272 total,  1774924 used,   289348 free,    40956 buffers
KiB Swap:  4194300 total,    44148 used,  4150152 free,   769372 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
13401 obmun     20   0  527m 107m  42m R  98.8  5.3   5:51.44 gnome-shell
14359 obmun     20   0  5196 1168  916 R  11.6  0.1   0:00.03 top
14046 obmun     20   0  145m  16m 5784 S   5.8  0.8   0:18.62 mplayer
    1 root      20   0  7680 4612 1912 S   0.0  0.2   0:03.72 systemd
    2 root      20   0     0    0    0 S   0.0  0.0   0:00.06 kthreadd
    3 root      20   0     0    0    0 S   0.0  0.0   0:05.47 ksoftirqd/0
    5 root       0 -20     0    0    0 S   0.0  0.0   0:00.00 kworker/0:0H

gdb backtrace (seems identical):
#0  0xb7778424 in __kernel_vsyscall ()
#1  0x41532edc in sched_yield () at ../sysdeps/unix/syscall-template.S:81
#2  0xb61f2f0a in nouveau_fence_wait (fence=fence@entry=0xc107cd0) at nouveau_fence.c:209
#3  0xb61f261c in nouveau_screen_fence_finish (screen=0xa14ce48, pfence=0xc107cd0, timeout=18446744073709551615) at nouveau_screen.c:76
#4  0xb62b602f in st_finish (st=st@entry=0xa1f23a8) at ../../src/mesa/state_tracker/st_cb_flush.c:100
#5  0xb62b6089 in st_glFinish (ctx=0xa19cb50) at ../../src/mesa/state_tracker/st_cb_flush.c:135
#6  0xb620ade6 in _mesa_finish (ctx=0xa19cb50) at ../../src/mesa/main/context.c:1656
#7  0xb620b9b0 in _mesa_Finish () at ../../src/mesa/main/context.c:1687
#8  0x4ac26f72 in shared_dispatch_stub_216 () at ../../../src/mapi/shared-glapi/glapi_mapi_tmp.h:14319
#9  0x4391e5fe in _cogl_winsys_onscreen_swap_region (onscreen=0xa377a10, user_rectangles=0xbfbf2804, n_rectangles=1) at winsys/cogl-winsys-glx.c:1253
#10 0x4391539e in cogl_onscreen_swap_region (onscreen=0xa377a10, rectangles=rectangles@entry=0xbfbf2804, n_rectangles=n_rectangles@entry=1)
    at ./cogl-onscreen.c:181
#11 0x453ed949 in clutter_stage_cogl_redraw (stage_window=0xa2528c8) at cogl/clutter-stage-cogl.c:482
#12 0x454680a8 in _clutter_stage_window_redraw (window=0xa2528c8) at ./clutter-stage-window.c:236
#13 0x4546425c in clutter_stage_do_redraw (stage=0xa376548) at ./clutter-stage.c:1170
#14 _clutter_stage_do_update (stage=0xa376548) at ./clutter-stage.c:1228
#15 0x45445a5f in master_clock_update_stages (stages=0xb82be58, master_clock=0xa28a878) at ./clutter-master-clock.c:386
#16 clutter_clock_dispatch (source=source@entry=0xa331de0, callback=0x0, user_data=0x0) at ./clutter-master-clock.c:520
#17 0x417a516b in g_main_dispatch (context=0xa03aff0, context@entry=0xa067cc0) at gmain.c:2715
#18 g_main_context_dispatch (context=context@entry=0xa03aff0) at gmain.c:3219
#19 0x417a5510 in g_main_context_iterate (context=0xa03aff0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3290
#20 0x417a5973 in g_main_loop_run (loop=0xa03b638) at gmain.c:3484
#21 0x43d60238 in meta_run () at core/main.c:545
#22 0x080498f2 in main (argc=1, argv=0xbfbf2c14) at main.c:416

Comment 6 Faizal Luthfi 2013-01-25 10:38:04 UTC
This may be caused by SELinux. It may happen because an application is running as root. Try disabling SELinux before running the application as root, but don't forget to re-enable SELinux after closing the application.

Comment 7 Faizal Luthfi 2013-01-25 22:57:52 UTC
Sorry for my previous comment. I tried disabling SELinux, but the problem still happened. This bug is also discussed in https://bugzilla.gnome.org/show_bug.cgi?id=692340

Comment 8 Petr Kočandrle 2013-01-26 13:04:22 UTC
I almost wanted to say that I'm not experiencing the problem on a fully updated F18, but I am... :-) I also have SELinux disabled. But killing gnome-shell still doesn't work for me; the GNOME session cannot be started until I restart the machine.

Comment 9 Frederick Grose 2013-02-03 22:06:42 UTC
With F18 and an instance of beesu nautilus open, I've noticed that opening a new instance of gedit will almost always change the ownership of /run/user/1000/dconf/user to root:root, which then leads to the buggy behavior when the mouse pointer touches the top-left corner, or ALT F1 is pressed, or the startkey is pressed and released.

When opening a new instance of gedit as above doesn't leave the ownership of the /dconf/user file as root, it seems to have changed it back to 1000:1000.  Even so, when such an instance of gedit is closed, the ownership of /run/user/1000/dconf/user changes to root:root.

When gnome-shell hangs, I find that I can switch to a root console,
# chown 1000:1000 /run/user/1000/dconf/user

and switch back to a working gnome session.
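
The ownership flip described above can be checked before touching the Activities corner. This is a minimal sketch, assuming GNU stat (the -c option); the function names are made up for illustration, and the path follows the one used in this comment with the current user's uid in place of 1000.

```shell
#!/bin/sh
# Print the numeric uid that owns a file (GNU coreutils stat syntax).
owner_uid() {
    stat -c '%u' "$1"
}

# Hypothetical check: succeeds when the file is owned by the given uid.
owned_by() {
    [ "$(owner_uid "$1")" = "$2" ]
}

# Report when the per-user dconf file has been taken over by another user.
f="/run/user/$(id -u)/dconf/user"
if [ -e "$f" ] && ! owned_by "$f" "$(id -u)"; then
    echo "$f is owned by uid $(owner_uid "$f"), expected $(id -u)"
fi
```

If the script prints a mismatch, the chown above (run from a root console) should restore the file to the session user.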

Comment 10 Martin 2013-06-11 16:40:46 UTC
Petr and Frederick, are you able to reproduce it on latest Fedora 18 or 19?

Comment 11 Frederick Grose 2013-06-11 18:28:12 UTC
No change in Comment 9 observations with yum update in Fedora 18.

Comment 12 Faizal Luthfi 2013-06-12 02:09:55 UTC
The problem still happens on the latest Fedora 18.

Comment 13 Petr Kočandrle 2013-06-25 22:10:44 UTC
I was able to reproduce this bug on fully updated F19 too.

Comment 14 Bruno Jesus 2013-09-13 19:12:52 UTC
Affects me on Fedora 19 too... I need to completely restart the system to get the GNOME session running again.

Comment 15 Ben Skeggs 2013-09-17 00:54:08 UTC
Can I get dmesg logs from after one of these hangs from F19/F20 please?

Comment 16 Ben Skeggs 2013-09-17 00:54:22 UTC
Can I get dmesg logs from after one of these hangs from F19/F20 please?

Comment 17 Frederick Grose 2013-09-17 01:23:40 UTC
Created attachment 798498 [details]
dmesg after hang

(In reply to Ben Skeggs from comment #15)
> Can I get dmesg logs from after one of these hangs from F19/F20 please?

I triggered the hang as described in Comment #9, and generated dmesg.txt while in a root console.  After restoring the ownership of /run/user/1000/dconf/user and returning to a graphical desktop, the dmesg report was unchanged. (I repeated the test 3 or more times.)

Comment 18 Ben Skeggs 2013-09-17 02:50:51 UTC
(In reply to Frederick Grose from comment #17)
> Created attachment 798498 [details]
> dmesg after hang
> 
> (In reply to Ben Skeggs from comment #15)
> > Can I get dmesg logs from after one of these hangs from F19/F20 please?
> 
> I triggered the hang as described in Comment #9, and generated dmesg.txt
> while in a root console.  After restoring the ownership of
> /run/user/1000/dconf/user and returning to a graphical desktop, the dmesg
> report was unchanged. (I repeated the test 3 or more times.)

I didn't notice until now, but you're using radeon, not nouveau. Please file a new bug against xorg-x11-drv-ati for your issue; it's not related (even if it has the same trigger).

Comment 19 Petr Kočandrle 2013-09-25 13:35:32 UTC
The first time I tried on F19, the following lines appeared in dmesg:
[  301.696952] nouveau E[  PGRAPH][0000:05:00.0] DATA_ERROR INVALID_VALUE
[  301.696972] nouveau E[  PGRAPH][0000:05:00.0]  DATA_ERROR
[  301.696992] nouveau E[  PGRAPH][0000:05:00.0] ch 5 [0x000f814000 totem[2937]] subc 3 class 0x8397 mthd 0x0e04 data 0xffaf0000
[  301.697067] nouveau E[  PGRAPH][0000:05:00.0] DATA_ERROR INVALID_VALUE
[  301.697079] nouveau E[  PGRAPH][0000:05:00.0]  DATA_ERROR
[  301.697093] nouveau E[  PGRAPH][0000:05:00.0] ch 5 [0x000f814000 totem[2937]] subc 3 class 0x8397 mthd 0x0e08 data 0xffd00000

I restarted and reproduced it a second time, but after the hang dmesg contained nothing that wasn't there before the hang, so I doubt the lines above have anything to do with it.
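
To compare the kernel log before and after a hang, it can help to pull out only the nouveau PGRAPH error lines. This is a minimal sketch; the function name pgraph_errors is made up for illustration, and the pattern is based only on the message format quoted in this comment, so other kernel versions may print these errors differently.

```shell
#!/bin/sh
# Keep only nouveau PGRAPH error lines from kernel log text read on stdin.
# The pattern matches the "nouveau E[  PGRAPH]" prefix quoted above.
pgraph_errors() {
    grep -E 'nouveau E\[ *PGRAPH\]'
}

# Typical use after a hang (reading dmesg may require root on some systems):
# dmesg | pgraph_errors
```

Running it once before and once after triggering the hang, then diffing the two outputs, would show whether new DATA_ERROR lines appear only when the freeze happens.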

Comment 20 Petr Kočandrle 2013-09-25 13:42:04 UTC
The third time, they were there again...

I've also noticed that if I move root's totem window before I hit the top-left corner, it works OK. But when I run totem as root, don't touch the window, and just hit the top-left corner, it freezes.

Comment 21 D. Charles Pyle 2014-04-13 08:51:55 UTC
I know this is a bit late but I just saw this today.  I am not in the CC list.  The problem is inherent in Gnome 3 when running certain applications as superuser while also in a session running under a standard user.

Running the command line as follows for such applications as nautilus or gedit, which cause the above problem, will prevent the crashes:

beesu - 'unset XDG_RUNTIME_DIR ; nautilus'

beesu - 'unset XDG_RUNTIME_DIR ; gedit'

To run these in a way that also opens specific files or folders at the same time, put the path in quotes with the above lines, as follows:

beesu - 'unset XDG_RUNTIME_DIR ; nautilus' "/etc/default/"

beesu - 'unset XDG_RUNTIME_DIR ; gedit' "/etc/default/grub"

Doing as above stops these crashes when moving the mouse to the Activities corner. Hope this helps.
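
The effect of the unset in these command lines can be seen without beesu. This is a minimal sketch, assuming an env that supports the -u option (GNU coreutils does); the wrapper name without_runtime_dir is made up for illustration. It only shows that the variable is removed for the child process while the calling session keeps it, which would be consistent with the root application no longer touching files under the user's /run/user directory as described in comment 9.

```shell
#!/bin/sh
# Run a command with XDG_RUNTIME_DIR removed from its environment only;
# the calling shell keeps its own value.
without_runtime_dir() {
    env -u XDG_RUNTIME_DIR "$@"
}

# Show what a child process sees with and without the wrapper.
sh -c 'echo "plain child:   ${XDG_RUNTIME_DIR:-unset}"'
without_runtime_dir sh -c 'echo "wrapped child: ${XDG_RUNTIME_DIR:-unset}"'
```

The wrapped child always reports the variable as unset, whatever the value in the session that started it.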

Comment 22 Martin 2014-04-14 13:35:52 UTC
Are you reproducing this on F19 or F20? Can you confirm, if it's reproducible on F20?

Comment 23 D. Charles Pyle 2014-04-14 14:57:41 UTC
(In reply to Martin Holec from comment #22)
> Are you reproducing this on F19 or F20? Can you confirm, if it's
> reproducible on F20?

Yes, this was reproducible on both F19 and F20, which was why I had to add similar such command lines as I gave in comment 21 directly into code for the beesu scripts in order to prevent the crashes that occurred when people were moving their mouse cursors to the activities corner while working with gedit as superuser.  The same thing happened with nautilus as well as DBus errors with gnome-terminal.

Unfortunately for this situation, I am no longer using either F19 or F20, so I do not know if the situation still is the case.  I have moved to F21 rawhide and no longer am using Gnome because of feature losses, repeatedly-broken extensions, and problems with how it handled my dual monitor setup.

Comment 24 Fedora End Of Life 2015-01-09 17:23:14 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 25 Fedora End Of Life 2015-02-17 14:28:36 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

