Description of problem:
During normal use gnome-shell hangs.

Version-Release number of selected component (if applicable):
3.2.1-2.fc16.x86_64

How reproducible:
Almost always

Steps to Reproduce:
1. Enter a GNOME session and work for about 2-3 hours.
2. The session freezes: the mouse pointer moves and seems to be able to click on menu items, but without actual effect; no keyboard, and no Alt+F2 or reload of gnome-shell is possible.
3. Press Ctrl+Alt+F2 and in the console session run
   kill -SIGHUP <pid_of gnome-shell process>
   then press Ctrl+Alt+F1 to come back, and all is OK again. Sometimes the SIGHUP causes gnome-shell to crash, losing the whole desktop session.

Actual results:
gnome session hangs

Expected results:
To be able to work normally

Additional info:
I think I began to see this behaviour after kernel-3.1.1-1.fc16.x86_64.
I'm using this option at boot (but I was using it with the 3.1.0 kernel too):
    i915.i915_enable_rc6=1

Also, as I have a laptop (Asus U36SD) with so-called Optimus technology, I disable the NVIDIA discrete card in /etc/rc.d/rc.local with these commands:

    #!/bin/bash
    echo "Disabling Nvidia videa adapter..." | tee -a /var/log/nvida_disabled.log
    /sbin/modprobe acpi_call
    echo '\_SB.PCI0.PEG0.GFX0.DOFF' > /proc/acpi/call

I compile the acpi_call kernel module myself at each kernel upgrade, using the source acpi-call_240611.orig.tar.gz.

$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.1.1-2.fc16.x86_64 root=UUID=ce058d6c-d2ed-49e5-9869-965799f246a5 ro rd.md=0 rd.lvm=0 rd.dm=0 KEYTABLE=us quiet SYSFONT=latarcyrheb-sun16 rhgb rd.luks=0 LANG=en_US.UTF-8 i915.i915_enable_rc6=1 elevator=deadline

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
    Subsystem: ASUSTeK Computer Inc. Device 1682
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 48
    Region 0: Memory at dc400000 (64-bit, non-prefetchable) [size=4M]
    Region 2: Memory at b0000000 (64-bit, prefetchable) [size=256M]
    Region 4: I/O ports at e000 [size=64]
    Expansion ROM at <unassigned> [disabled]
    Capabilities: <access denied>
    Kernel driver in use: i915
    Kernel modules: i915

Any way to debug this? The only lines I find in Xorg.0.log are:

[ 10911.891] (II) intel(0): Printing DDC gathered Modelines:
[ 10911.891] (II) intel(0): Modeline "1366x768"x0.0 69.30 1366 1425 1464 1472 768 773 782 785 -hsync -vsync (47.1 kHz)
[ 16044.192] (II) AIGLX: Suspending AIGLX clients for VT switch
[ 16079.534] (II) AIGLX: Resuming AIGLX clients after VT switch
[ 16079.870] (II) intel(0): EDID vendor "COR", prod id 6104
[ 16079.870] (II) intel(0): Printing DDC gathered Modelines:
[ 16079.870] (II) intel(0): Modeline "1366x768"x0.0 69.30 1366 1425 1464 1472 768 773 782 785 -hsync -vsync (47.1 kHz)
[ 16079.973] (**) Option "Device" "/dev/input/event4"
[ 16079.973] (--) synaptics: SynPS/2 Synaptics TouchPad: touchpad found
[ 16088.180] (II) AIGLX: Suspending AIGLX clients for VT switch   <---- when I Ctrl+Alt+F2
[ 16116.682] (II) AIGLX: Resuming AIGLX clients after VT switch   <---- when I come back to X

In messages:

Nov 18 13:28:08 ope46 kernel: [15397.459003] CIFS VFS: Received no data, expecting 4
Nov 18 13:29:08 ope46 kernel: [15457.454946] CIFS VFS: Received no data, expecting 4
Nov 18 13:30:08 ope46 kernel: [15517.450810] CIFS VFS: Received no data, expecting 4
Nov 18 13:40:40 ope46 gnome-session[5954]: WARNING: Application 'gnome-shell.desktop' killed by signal   <---- when I run the kill -SIGHUP command
Nov 18 13:56:09 ope46 kernel: [17077.935596] CIFS VFS: Received no data, expecting 4
Nov 18 13:57:08 ope46 kernel: [17137.338785] CIFS VFS: Received no data, expecting 4
Based on a suggestion by Adam Jackson on the Fedora test mailing list:
"
If you debuginfo-install gnome-shell, attach with gdb instead of sending SIGHUP, and run 'thread apply all backtrace', what do you get?
"
I ran it the next time the problem arose. I'm going to attach the output of the session, saved with the "script" command.
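For anyone who wants to collect the same backtrace non-interactively, here is a minimal sketch that only builds the gdb command line. It assumes gdb and the gnome-shell debuginfo are installed; the helper name gdb_backtrace_argv is made up for illustration, and the PID must be looked up by the caller (for example from the console reached with Ctrl+Alt+F2). The gdb options used (-p, -batch, -ex) are standard ones.

```python
import shlex


def gdb_backtrace_argv(pid):
    """Build a gdb command that attaches to `pid`, dumps all thread
    backtraces and exits. Hypothetical helper, not part of any tool."""
    return [
        "gdb",
        "-p", str(pid),                       # attach to the running process
        "-batch",                             # exit after running the commands
        "-ex", "thread apply all backtrace",  # the command suggested on the list
    ]


if __name__ == "__main__":
    # 1609 is only an example PID; substitute the real gnome-shell PID.
    argv = gdb_backtrace_argv(1609)
    # Print a copy-pasteable shell line instead of running gdb directly.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Redirecting the printed command's output to a file gives roughly what the "script" session captured, without leaving gdb attached longer than necessary.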
Created attachment 534992 [details] output during a gdb backtrace when freeze going through
Comment from Adam after my post:
"
The interesting part seems to be:

Thread 1 (Thread 0x7fbf4aa8c9c0 (LWP 1609)):
#0  0x0000003cb7ee6443 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x0000003cbc208ba2 in ?? () from /usr/lib64/libxcb.so.1
#2  0x0000003cbc2090ff in ?? () from /usr/lib64/libxcb.so.1
#3  0x0000003cbc209184 in xcb_writev () from /usr/lib64/libxcb.so.1
#4  0x0000003cbc6456e7 in _XSend (dpy=0xb47a30, data=<optimized out>, size=<optimized out>) at xcb_io.c:436
#5  0x0000003cbc639d55 in SendZImage (dest_scanline_pad=0, dest_bits_per_pixel=32, req_yoffset=<optimized out>, req_xoffset=0, image=0x7fffe21a7240, req=<optimized out>, dpy=0xb47a30) at PutImage.c:802

This is showing gnome-shell trying to write an image to the X server, but blocking because the socket to the X server does not appear to be ready for writing. So there's (at least) three things that could be going wrong here, from probably most to least likely:

1) the write queue to the X server really might be blocked
2) libxcb could have a logic bug that's getting stuck here
3) the kernel might have a bug in poll()

#1 typically only happens in two cases: either the X server is stuck away from the dispatch loop, or it's explicitly ignoring you because there's a grab in process. In the former case SIGHUP wouldn't help, simply reloading the shell won't un-stick the X server. But in the latter case, it might; if the grab is from one of the shell's other threads, then closing all of the shell's display connections would reset the grab.

So my next intuition would be to gdb the X server and see what's up. If you find it waiting patiently on a call to select(), then the second case is more likely, and 'print AllClients' should show you an fd_set with only one bit set.

- ajax
"
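To make case 1) from the analysis above concrete, the following sketch reproduces the same symptom on a plain Unix socket pair: once the peer stops reading and the send buffer fills up, poll() no longer reports the socket as writable, which is where the gnome-shell thread is sitting in frame #0. This is only an illustration under stated assumptions (Linux, Python's standard socket and select modules); it does not involve libxcb or the X server, and the function names are invented for the example.

```python
import select
import socket


def writer_is_blocked(sock, timeout_ms=100):
    """Return True if poll() does not report `sock` as writable in time."""
    poller = select.poll()
    poller.register(sock, select.POLLOUT)
    return poller.poll(timeout_ms) == []


def fill_send_buffer(sock):
    """Write until the kernel refuses more data; return the bytes queued."""
    sock.setblocking(False)
    chunk = b"\0" * 65536
    sent = 0
    try:
        while True:
            sent += sock.send(chunk)
    except BlockingIOError:
        pass  # send buffer is full: the peer is not reading
    return sent


def drain(sock):
    """Read everything currently queued on `sock`."""
    sock.setblocking(False)
    try:
        while sock.recv(65536):
            pass
    except BlockingIOError:
        pass


def demo():
    client, server = socket.socketpair()  # stands in for shell <-> X server
    try:
        before = writer_is_blocked(client)        # empty queue: writable
        fill_send_buffer(client)                  # the "server" never reads
        when_full = writer_is_blocked(client)     # writer would now sit in poll()
        drain(server)                             # the "server" reads again
        after_drain = writer_is_blocked(client)   # writable once more
        return before, when_full, after_drain
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```

The real hang differs in that the X server side decides whether to read from the client at all (dispatch loop stuck, or a grab in effect), but the writer's view is the same: poll() keeps waiting until the other end consumes data.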
Comment by Alon Levy: " I think I have the same problem here, I've followed it once, gdbing the server, it was in select, so maybe I'll try to do it again and do the 'print AllClients' - for me reproducing is 100% by doing a chvt / suspend and resume. To get back to work (i.e. workaround) I chvt to some console, do "killall -9 gnome-shell; sleep 5; DISPLAY=:0.0 gnome-shell" and quickly change back. Recently gnome-shell started to get unstuck occasionally if I wait about 10-20 seconds, but I'm not always that patient. "
In the meantime, this morning I applied the patch to xorg-x11-drv-intel: xorg-x11-drv-intel-2.17.0-1.fc16.x86_64. The former version was the default as shipped with F16: 2.16.0-2. I will report whether I still have the problem, as it normally happens 1-2 times a day...
After installing

# debuginfo-install xorg-x11-server-Xorg
# debuginfo-install expat libfontenc libgcc libstdc++ xorg-x11-drv-evdev xorg-x11-drv-fbdev xorg-x11-drv-intel xorg-x11-drv-synaptics xorg-x11-drv-vesa zlib

and having the problem again with the 2.17.0 Intel Xorg driver, I got this after attaching gdb to the Xorg process:

...
Loaded symbols for /lib64/libnss_files.so.2
0x00007f5e0f1f8213 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
82	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) print AllClients
$1 = {fds_bits = {140702960320512, 0 <repeats 15 times>}}
(gdb) print AllClients
$2 = {fds_bits = {140702960320512, 0 <repeats 15 times>}}

I'm going to attach the full gdb session.

Gianluca
Created attachment 535024 [details] gdb of Xorg pid during freeze after gdb I run "print AllClients"
Created attachment 535041 [details] message when I switch to console window When I switch to a console window I currently get the message in the image. But this happens always, not only when I'm experiencing the freeze, so I don't know if it is related or not....
Any news on this? It is rather tedious....

I think I have found some sort of correlation between the applications running and the problem arising. It is probably related to remmina and to starting RDP sessions from it.

I say this because until now I had only experienced the problem at the office, where I continuously use an external monitor connected through the laptop's VGA adapter, and never at home, where I only use the laptop display. So I thought the external monitor could interfere or be part of the cause. However, just today I got it twice in 45 minutes while working at home, with only the laptop display in use. The new thing was that for the first time I was using remmina at home, and I use it constantly at work.

Currently installed related packages are:

remmina-plugins-telepathy-0.9.2-2.fc15.x86_64
remmina-plugins-nx-0.9.2-2.fc15.x86_64
remmina-0.9.3-3.fc16.x86_64
remmina-plugins-common-0.9.2-2.fc15.x86_64
remmina-plugins-xdmcp-0.9.2-2.fc15.x86_64
remmina-plugins-rdp-0.9.2-2.fc15.x86_64
remmina-plugins-vnc-0.9.2-2.fc15.x86_64

Any suggestion with this new information?
(gdb) print AllClients
$1 = {fds_bits = {140702960320512, 0 <repeats 15 times>}}

That's showing the server listening to more than one client, so this is not a server grab deadlock.
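For readers who want to check that reading of the gdb output themselves, here is a small sketch that turns the printed fds_bits words into a list of set bit positions. It assumes the usual x86_64 glibc layout, where an fd_set is an array of 16 64-bit words (which matches the one value plus "0 <repeats 15 times>" shown by gdb) and bit N of the array corresponds to file descriptor N. The function name is made up for the example.

```python
def fds_from_fds_bits(words, bits_per_word=64):
    """Return the bit positions (file descriptor numbers) set in an fd_set,
    given its fds_bits words as printed by gdb."""
    fds = []
    for index, word in enumerate(words):
        for bit in range(bits_per_word):
            if (word >> bit) & 1:
                fds.append(index * bits_per_word + bit)
    return fds


# Value copied from the gdb session: one non-zero word followed by 15 zeros.
FDS_BITS = [140702960320512] + [0] * 15

if __name__ == "__main__":
    set_fds = fds_from_fds_bits(FDS_BITS)
    print(len(set_fds), "bits set:", set_fds)
```

A single set bit would point at the grab case described earlier in the thread; more than one set bit means the server is still selecting on several clients.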
*** Bug 822481 has been marked as a duplicate of this bug. ***
(In reply to comment #11)
> *** Bug 822481 has been marked as a duplicate of this bug. ***

The above is from molecule (one of xscreensaver's hacks).
Created attachment 600639 [details]
backtraces of gnome-shell and Xorg

I'm seeing this a lot when running Eclipse on F17. I can reproduce it semi-consistently by trying to open the Debug As sub-menu from a context menu.

gnome-shell-3.4.1-5.fc17
xorg-x11-server-Xorg-1.12.2-4.fc17

I'm attaching the backtraces (with debug symbols) of gnome-shell and Xorg, which seem to be the same as the other reported ones. "print AllClients" reports the same as Comment #10.

Is there anything else I can gather to help find the cause of this?
This message is a reminder that Fedora 16 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '16'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 16's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 16 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
Moving to F17 per comment 13.
I have also seen this bug, both on F16 and now on F18. I have hit this issue a number of times with kernels 3.7.1 right through to 3.8.1 and gnome-shell-3.6.0.2.

root 917 1 0 17:36 ? 00:00:00 /usr/sbin/abrtd -d -s
root 920 1 0 17:36 ? 00:00:00 /usr/bin/abrt-watch-log -F BUG corruption stack overflow protection fault WARNING: at nable to handle ouble fault: RTNL: assertion failed eek! page_mapcount(page) went negative! adness at NETDEV WATCHDOG ysctl table check failed INFO: possible recursive locking detected : nobody cared IRQ handler type mismatch /var/log/messages -- /usr/bin/abrt-dump-oops -xD
root 930 1 0 17:36 ? 00:00:00 /usr/bin/abrt-watch-log -F Backtrace /var/log/Xorg.0.log -- /usr/bin/abrt-dump-xorg -xD
nojar 2421 2054 0 17:36 ? 00:00:00 abrt-applet
Hi,

This is happening 3-4 times a day with:

[root@localhost log]# cat /etc/fedora-release
Fedora release 18 (Spherical Cow)
[root@localhost log]# uname -a
Linux localhost.localdomain 3.9.4-200.fc18.x86_64 #1 SMP Fri May 24 20:10:49 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost log]# rpm -qa xorg-x11`*
[root@localhost log]# rpm -qa xorg-x11\*
xorg-x11-server-Xephyr-1.13.3-3.fc18.x86_64
xorg-x11-drv-synaptics-1.6.3-3.fc18.x86_64
xorg-x11-server-Xorg-1.13.3-3.fc18.x86_64
xorg-x11-fonts-misc-7.5-6.fc18.noarch
xorg-x11-utils-7.5-7.fc18.x86_64
xorg-x11-drv-cirrus-1.5.1-3.fc18.x86_64
xorg-x11-drv-openchrome-0.3.3-1.fc18.x86_64
xorg-x11-drv-vesa-2.3.2-2.fc18.x86_64
xorg-x11-drv-void-1.4.0-12.fc18.x86_64
xorg-x11-drv-intel-2.21.8-1.fc18.x86_64
xorg-x11-xkb-utils-7.7-5.fc18.x86_64
xorg-x11-drv-vmware-12.0.2-3.20120718gite5ac80d8f.fc18.x86_64
xorg-x11-font-utils-7.5-11.fc18.x86_64
xorg-x11-xauth-1.0.7-2.fc18.x86_64
xorg-x11-drv-qxl-0.0.22-5.20120718gitde6620788.fc18.x86_64
xorg-x11-drv-wacom-0.16.1-2.fc18.x86_64
xorg-x11-server-common-1.13.3-3.fc18.x86_64
xorg-x11-drv-fbdev-0.4.3-3.fc18.x86_64
xorg-x11-drv-ati-7.1.0-5.20130408git6e74aacc5.fc18.x86_64
xorg-x11-drv-evdev-2.7.3-5.fc18.x86_64
xorg-x11-xinit-1.3.2-7.fc18.x86_64
xorg-x11-drv-ast-0.97.0-2.fc18.x86_64
xorg-x11-drv-nouveau-1.0.7-1.fc18.x86_64
xorg-x11-drv-mga-1.6.2-6.fc18.x86_64
xorg-x11-server-utils-7.5-16.fc18.x86_64
xorg-x11-fonts-Type1-7.5-6.fc18.noarch
xorg-x11-drv-dummy-0.3.6-2.fc18.x86_64
xorg-x11-glamor-0.5.0-5.20130401git81aadb8.fc18.x86_64
xorg-x11-drv-vmmouse-13.0.0-1.fc18.x86_64

Please let me know if more information is needed.

Thank you
Hi again,

I forgot to say: this started after upgrading from F17 to F18 with fedup.

Thanks.
This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 17's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
I still have this problem. I noticed that in both Fedora 17 and Fedora 18 it happens more often when using remmina to connect via RDP to a Windows server. Using remmina I can reproduce it at least once an hour. Can anyone else confirm whether they are using remmina too?
I use remmina sometimes. Fullscreen mode (unscaled) causes some really weird interactions with gnome-shell, but I don't recall it actually hanging it. Maybe crashing it though. I will try it again and see what happens.
This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life. Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.