| Summary: | gdm hangs on user switch | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Bojan Smojver <bojan> |
| Component: | gdm | Assignee: | Ray Strode [halfline] <rstrode> |
| Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 18 | CC: | benjavalero, collura, jason, jon.dufresne, kelevel+redhat, kevin.russell, kirtis.bakalarczyk, michael_stevens, quentin, rstrode, runekl, scattol, txn2tahx3v, ufospoke |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2014-02-05 11:51:36 UTC | Type: | --- |
Description
Bojan Smojver
2011-11-06 22:33:37 UTC
I have the same problem. Switching users works after a reboot, but after using the system for a little while the dialog showing the available users becomes empty. Pressing Ctrl-Alt-F2 and using the Alt-arrow keys takes me back to the user I tried to switch from.

gdm-3.2.1.1-8.fc16.x86_64

I'm having this problem as well. The login screen shows my user icon but won't allow any input. I can click the power icon in the top right corner and it 'lights up', indicating that the menu should be appearing, but nothing happens.

Same here; Ctrl-Alt-Backspace solves it for me. When hung, GDM shows just one user picture instead of a list and does not accept any input.

One more thing: at times when waking from suspend, the computer asks for the password but does not accept any input (keyboard or mouse). Ctrl-Alt-Backspace solves this too, but it does not seem to be the correct way of handling it.

I'm experiencing the same bug; I have to hit Ctrl-Alt-Backspace to get back in business.

Created attachment 539936 [details]
Screenshot with gdm freezed
This is a screenshot. Ctrl + Alt + Backspace also solves the problem for me.
(In reply to comment #6)
> Created attachment 539936 [details]
> Screenshot with gdm freezed
>
> This is a screenshot. For me, Ctrl + Alt + Backspace also solves the problem
> for me.

Ctrl + Alt + F2 also works sometimes.

Created attachment 541684 [details]
Server log of possibly hung Xorg process
I've noticed that when the switch user hangs, there are two Xorg processes running. The first was running the active user's session, while the second server (actually started a day earlier) seemed to be hung. Killing the second Xorg process allowed the user switch to proceed.

I'm having similar experiences when logging out after running the update GUI (it doesn't seem to happen after a command-line yum):
https://bugzilla.redhat.com/show_bug.cgi?id=804361
It asks for the admin password to log out. When it freezes (except for the mouse), Alt-F2 'restart' to restart the window manager sometimes works, but sometimes I need Ctrl-Alt-Backspace to restart the session.

This bug is annoying. It is occurring multiple times a week on my Fedora server.

This is happening multiple times a day. Running "ps -ef |grep gdm" gives 3 or more pages of processes with up to 30 displays. It occurs frequently, and the restart works, but it eats up the machine like crazy. I can't keep the machine up for a week. The machine is recent, with NVIDIA drivers. It's definitely not too slow.

I recently upgraded to F18 and this is still happening frequently. Bumping version.

It looks like once it starts happening, it happens with every subsequent user switch done through GDM, with an extra Xorg display being created at each switch. Furthermore, each leaked Xorg uses the next higher display number (I am currently at display :45). The TTYs attached to each Xorg process aren't leaking, though; those seem to get recycled, as the highest one I have is tty8.

Currently, killing the process tree starting at the parent of each leaked Xorg process frees the memory and keeps the machine up and running.

It looks like the reason the machine freezes is that it eventually runs out of memory. Each Xorg process tree does take RAM, and when I started characterising this issue I noticed that after the machine had been running a while the swap file was used near capacity and the machine swapped more than it used to.
It's my conjecture that gdm eventually freezes because the OS runs out of virtual memory and something seizes up.

It looks like the extra Xorg processes (the ones that leak) appear when someone switches back into an account they already have running. That behavior starts immediately after reboot, so it's not as if it works flawlessly and then something snaps and it's in error. This condition exists from the start. I don't know what the extra Xorg does, but you are put back into your environment, suggesting that, in the end, you do reconnect with the Xorg you originally logged in to. Furthermore, deleting the extra Xorg process trees does not cause problems for the existing logged-in users, suggesting that these Xorgs are extra.

It looks like running out of RAM or swap might not be the actual issue, as I just had a freeze when the machine was clean thanks to a cron job deleting the Xorg process trees. It looks like it can get locked up of its own accord.

Requesting severity be raised to HIGH, considering how many resources are leaked and how quickly, essentially rendering this feature nearly useless.

Is this related to this bug?
https://bugzilla.redhat.com/show_bug.cgi?id=799671
The steps are different, but in both cases gdm is rendered useless within a matter of days, essentially just by logging in and out.

Created attachment 647274 [details]
gdm logs when gdm looses Xdisplays and eventually freezes
Here are the gdm logs for the last while. The machine froze on Nov 18 2012, so this is the day I packaged the logs. This issue would have been occurring all week before that date.
Created attachment 647275 [details]
Xorg logs related to when gdm looses Xdisplays and eventually freezes
These are the Xorg logs that go with the gdm logs uploaded at the same time. Note that the displays active with real users on them are 0 to 5. Displays above 5 are extraneous and are deleted nightly by a cron job. Presumably the logs reflect this, but that is the result of the workaround, not of the problem.
Repro steps are straightforward:
1. Log in the users (I have 5).
2. Switch between users by using the switch user dropdown menu entry from the name on the top left corner of the menu bar.
3. Log in as another of the users in the list of users.
4. After doing this enough times, the display will eventually stop responding.
Ctrl-Alt-Backspace will kill the display and restart it, but eventually it's going to fail and the machine needs to be hard reset. And by eventually I mean after a few days with probably 5 or 10 user switches per day.

I also have this bug with F17, and it seems to be worse with kernel 3.6.7. It is very annoying; I have to hard reboot very often. My children keep telling me that Linux is full of bugs and that I should use Windows! Please do something, Gromit! If I can help debug this, I would be delighted to.

Created attachment 654660 [details]
script to delete Xorgs (display higher than 5) to alleviate the issue of leaked Xorgs caused by thei bug
This is a script to delete the Xorg processes that have a display number greater than 5, which (in my use case) are duplicates leaked by this bug.
Run this script as a cron job once a night.
The use case is to reboot (or do "init 3" followed by "init 5"), then log all the users in and let this go.
ER: find a way to determine which displays are not attached to a logged-in user.
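
For readers without access to the attachment, the display-number rule described above could be sketched roughly as follows. This is not the attached script: it is a minimal Python illustration that assumes Xorg is started with its display as a bare ":N" argument (for example "/usr/bin/Xorg :7 ... vt8") and that the input is the text printed by "ps -eo pid,ppid,args". The function name leaked_xorg and the max_display parameter are hypothetical. It only reports candidates; it does not kill anything.

```python
import re


def leaked_xorg(ps_text, max_display=5):
    """Return (pid, ppid, display) for Xorg processes above max_display.

    ps_text is expected to be the output of `ps -eo pid,ppid,args`.
    Assumption: the display appears as a bare ":N" argument to Xorg.
    """
    found = []
    for line in ps_text.splitlines():
        parts = line.split(None, 2)
        # Skip the header line and anything that is not "pid ppid args".
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        pid, ppid = int(parts[0]), int(parts[1])
        tokens = parts[2].split()
        # Only consider processes whose executable is named Xorg.
        if not tokens or tokens[0].rsplit("/", 1)[-1] != "Xorg":
            continue
        # The first ":N" argument is taken to be the display number.
        display = next(
            (int(t[1:]) for t in tokens[1:] if re.fullmatch(r":\d+", t)),
            None,
        )
        if display is not None and display > max_display:
            found.append((pid, ppid, display))
    # Sort by display number so the oldest leaks are listed first.
    return sorted(found, key=lambda item: item[2])
```

The parent PID is returned because the workaround described earlier in this report kills the process tree starting at the parent of each leaked Xorg. A fixed threshold only works when the number of real users is known in advance, which is why the ER above asks for a way to tell which displays have no logged-in user.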
@frederic Bron I've uploaded the script that I use to clean up the extra Xorg processes. It delays the problem but does not fix it. It looks like more GNOME processes are leaked and not cleaned up by the script. It relies on the shell script killtree.sh, found on the web.

The above strategy doesn't fix the problem. It just delays it. When the keyboard freezes up, the workaround is to telnet in from another machine and issue an "init 2" command as root and then an "init 5"; the display will be working again.

Is it possible to remove gdm and use another program that works? I have the problem every day after only a few minutes of use, as soon as I switch user without closing the first opened session. I also get a freeze after typing the password, just when the wallpaper appears (this time Ctrl+Alt+Backspace works).

Any idea if this is specific to Fedora? Any idea how Ubuntu behaves?

/var/log/gdm contains log files for all the X displays opened. In my case the high-numbered ones are all associated with this problem, as they never have any users.
In the :32-greeter.log we see the following error:
Fatal IO error 11 (Resource temporarily unavailable) on X server :32.
This appears twice in the file, and it's sometimes followed by:
Window manager warning: Log level 16: gnome-shell: Fatal IO error 0 (Success) on X server :32.
In some of them we also see the following error:
(gnome-settings-daemon:2537): media-keys-plugin-WARNING **: Unable to get default sink
It's unclear whether this is related, but it certainly appears in the logs of the displays that are currently problematic.
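
To check how often these fatal errors show up per display, the greeter logs could be tallied with something like the sketch below. It is only an illustration: the regular expression is derived from the two messages quoted above, the function name fatal_io_errors is hypothetical, and other wordings of the error would not be counted.

```python
import re
from collections import Counter

# Pattern derived from the messages quoted in this report, e.g.
# "Fatal IO error 11 (Resource temporarily unavailable) on X server :32."
FATAL_RE = re.compile(r"Fatal IO error (\d+) \(([^)]*)\) on X server (:\d+)")


def fatal_io_errors(log_text):
    """Count fatal X IO errors, keyed by (display, error number, message)."""
    counts = Counter()
    for number, message, display in FATAL_RE.findall(log_text):
        counts[(display, int(number), message)] += 1
    return counts
```

Reading each file under /var/log/gdm and passing its text to this function would give a per-display count that can be compared against the displays known to have real users.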
Created attachment 655546 [details]
log files for display :32 from /var/log/gdm
These are sample log files that show the problems described in the previous entry, with various X errors in the gdm logs.
Could it be related to the NVIDIA proprietary drivers? I see a lot of NVIDIA in /var/log/gdm. Does anyone have these problems without them?

My recollection is that, on that same machine, using the nouveau drivers that shipped with the OS, the machine was also unusable/unstable. I won't go as far as to swear that it occurred with the nouveau drivers, but I am pretty sure it did, as in my quest to stabilise my machine the first thing I tried was to get the NVIDIA drivers. I had to work hard to install the proprietary drivers (and make them stick), and part of the reason was to have a more stable machine.

Can we increase the severity to high="Problem due to crashes, loss of data, severe memory leak, etc."?

Updated to the NVIDIA 310 drivers. It still looks like it's leaking Xorg processes (which so far seems to be an indicator of the problem).

This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
This issue also exists with Fedora 19.

Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Maybe the version should have been bumped by someone who can edit this bug (and then reopen the bug). It was reported that it happens in F19.