Bug 690873

Summary: gdm hangs and uses 100% CPU
Product: [Fedora] Fedora Reporter: Andrew McNabb <amcnabb>
Component: gdm Assignee: jmccann
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 15 CC: aaron, blakkheim.gw, cschalle, dnortham99, hhorak, jlaska, jmccann, marcus.moeller, mjc, rstrode, sgallagh, sgehwolf, tflink
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: gdm-3.0.4-1.fc15 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-06-07 13:54:13 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 617261    
Attachments:
Description Flags
messages none
backtrace of process in cycle none
gdm none

Description Andrew McNabb 2011-03-25 16:38:44 UTC
Created attachment 487598 [details]
messages

Description of problem:

After booting, the gdm login screen appeared, and I entered my username.  The "password" prompt then appeared, but gdm stopped responding to any keyboard input or mouse clicks (although I was still able to move the mouse pointer).  I sshed in and saw that gdm-simple-greeter was using 100% CPU.

Version-Release number of selected component (if applicable):

gdm-2.91.94-1.fc15.x86_64

How reproducible:

I tried to reproduce a few times (hitting CTRL-ALT-BACKSPACE a few times and retrying), and the problem did not recur.

Additional info:

I am attaching GDM-related output that I found in /var/log/messages.  Hopefully this is helpful, since I have not yet succeeded in reproducing the problem.
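
For anyone who wants to check for the same symptom over ssh, here is a minimal sketch in Python. It assumes the output of `ps -eo pid,pcpu,comm` and filters it for a greeter process near 100% CPU. The function names and the 90% threshold are illustrative assumptions, not part of gdm. Note that `ps` reports CPU use averaged over the process lifetime, and that the kernel truncates the command name to 15 characters, which is why the prefix is `gdm-simple-gree`.

```python
import subprocess


def find_busy_processes(ps_output, name_prefix, min_cpu=90.0):
    """Return (pid, cpu, comm) tuples for processes whose command name
    starts with name_prefix and whose %CPU is at least min_cpu.

    ps_output is expected to be the text printed by `ps -eo pid,pcpu,comm`.
    """
    busy = []
    for line in ps_output.splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, cpu, comm = parts
        try:
            entry = (int(pid), float(cpu), comm.strip())
        except ValueError:
            continue  # skip lines that do not look like process rows
        if entry[2].startswith(name_prefix) and entry[1] >= min_cpu:
            busy.append(entry)
    return busy


def snapshot():
    """Hypothetical helper: capture the current process table with ps."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `find_busy_processes(snapshot(), "gdm-simple-gree")` on the affected machine would list the spinning greeter, if there is one.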

Comment 1 Andrew McNabb 2011-04-06 16:52:34 UTC
This happened to me again today with gdm-3.0.0-1.fc15.x86_64.

Comment 2 Andrew McNabb 2011-04-06 16:54:33 UTC
By the way, it seems like this might be reproducible by:
1) type in the username of a user who has not logged in since the machine was installed
2) select "user script" in the drop-down session selector

It seems to only happen the first time, which would explain why this has been hard to reproduce.

Comment 3 Honza Horak 2011-04-08 09:53:44 UTC
Created attachment 490750 [details]
backtrace of process in cycle

I've noticed this failure, too. If an existing user tries to log in for the first time, the login window starts to cycle (process gdm-simple-gree). I'm attaching a backtrace of this process, HTH.
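
A backtrace like the attached one can be captured from a running process with gdb. The sketch below only builds and runs the gdb command line; the helper names are hypothetical, and it assumes gdb is installed, that the caller has permission to attach (usually root), and that the relevant debuginfo packages are present so the frames are readable.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb invocation that attaches to pid, prints a backtrace
    for every thread, and then exits (batch mode)."""
    return ["gdb", "-p", str(int(pid)), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid):
    """Run gdb against pid and return whatever it printed on stdout.

    Not called automatically: attaching pauses the target briefly.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid), capture_output=True, text=True
    )
    return result.stdout
```

The text returned by `capture_backtrace(pid)` can be saved to a file and attached to the bug.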

Comment 4 Honza Horak 2011-04-08 09:58:04 UTC
I forgot to mention that I'm using gdm-3.0.0-1.fc15.x86_64 as well.

Comment 5 Severin Gehwolf 2011-04-08 14:16:30 UTC
Here is another way to reproduce:

1.) Enable NIS
2.) Boot
3.) Try to log in as some NIS user (note GDM does not remember any NIS users who logged in prior to booting)
4.) GDM freezes

After a <ctrl>+<alt>+Backspace, GDM "remembers" the last NIS user I attempted to log in as (i.e. it shows this user's correct first/last name as provided by NIS), and logging in as that user then works fine.

$ rpm -q gdm
gdm-3.0.0-1.fc15.x86_64

Thanks!

Comment 6 Andrew McNabb 2011-05-02 16:34:36 UTC
I am still seeing this in gdm-3.0.0-2.fc15.x86_64.

I am proposing this as a blocker for Fedora 15.  From the release criteria: "a system... must boot to a working graphical environment without unintended user intervention."  In this case, a user must either ssh and restart gdm or power-cycle the machine in order to get a working graphical environment.

Comment 7 Ray Strode [halfline] 2011-05-02 19:48:47 UTC
what version of accountsservice do you guys have installed?

Comment 8 Andrew McNabb 2011-05-02 19:55:31 UTC
accountsservice-0.6.9-1.fc15.x86_64

Comment 9 Severin Gehwolf 2011-05-02 20:06:46 UTC
$ rpm -q accountsservice
accountsservice-0.6.9-1.fc15.x86_64

Comment 10 Honza Horak 2011-05-03 07:29:34 UTC
$ rpm -q accountsservice
accountsservice-0.6.7-1.fc15.x86_64

Comment 11 Stephen Gallagher 2011-05-03 18:26:46 UTC
I'm also seeing a similar behaviour with SSSD in use for network logins. About two weeks ago, I started having a problem where, when I clicked on my username in the greeter, it would hang and never prompt me for a password. After hitting ctrl-alt-del and choosing log off, I'd come back to the greeter and everything would work.

As of updates yesterday, I don't get to the greeter fully at all. Now it stops at the welcome screen before displaying my users, until I do the ctrl-alt-del and log off. After doing so, all the users are displayed and work again.

I don't see 100% CPU in use with the current behaviour. I haven't tried downgrading accountsservice to see if that changes the behaviour back to the earlier version.

Comment 12 James Laska 2011-05-05 11:49:11 UTC
Created attachment 497042 [details]
gdm

(In reply to comment #11)
> I'm also seeing a similar behaviour with SSSD in use for network logins. About
> two weeks ago, I started having a problem where, when I clicked on my username
> in the greeter, it would hang and never prompt me for a password. After hitting
> ctrl-alt-del and choosing log off, I'd come back to the greeter and everything
> would work.
> 
> As of updates yesterday, I don't get to the greeter fully at all. Now it stops
> at the welcome screen before displaying my users, until I do the ctrl-alt-del
> and log off. After doing so, all the users are displayed and work again.

I'm seeing the same behavior that Stephen reports.  With my SSSD enabled (LDAP ident, krb5 auth) system, I no longer get a login prompt from gdm.

$ rpm -q accountsservice gdm
accountsservice-0.6.9-5.fc15.x86_64
gdm-3.0.0-2.fc15.x86_64

Comment 13 James Laska 2011-05-05 11:55:02 UTC
(In reply to comment #12)
> I'm seeing the same behavior that Stephen reports.  With my SSSD enabled (LDAP
> ident, krb5 auth) system, I no longer get a login prompt from gdm.
> 
> $ rpm -q accountsservice gdm
> accountsservice-0.6.9-5.fc15.x86_64
> gdm-3.0.0-2.fc15.x86_64

Downgrading to accountsservice-0.6.9-1.fc15.x86_64 ... and I no longer see the reported problem.  Perhaps something related to fixing bug#678236?  I had autologin enabled, and that issue was fixed by 0.6.9-5.  However, I just disabled autologin yesterday, and began seeing this bug.

Comment 14 Tim Flink 2011-05-06 17:59:36 UTC
Discussed in the 2011-05-06 blocker review meeting. It isn't clear what the impact of this bug is or exactly how it is being hit. The questions we have are:

* Does this only affect upgrade non-network logins on first login?
* Does this affect network logins every time?
* Does this hit 'first login per boot' or just 'first login ever'?
* Are the network login issues and upgrade issues the same bug?

Deferred decision on blocker/NTH status until more information is available.

Comment 15 Ray Strode [halfline] 2011-05-06 18:09:49 UTC
This definitely seems blocker worthy and is on my near term radar, fwiw

Comment 16 Stephen Gallagher 2011-05-06 18:12:54 UTC
(In reply to comment #14)
> * Does this hit 'first login per boot' or just 'first login ever'?

First login per boot.

Comment 17 Ray Strode [halfline] 2011-05-06 18:43:20 UTC
So I looked into this a little bit today (reading through the code, I haven't tried to reproduce yet).

The trace has this in it:

#20 0x000000331fe37bdb in abort () at abort.c:92
#21 0x000000331fe722c3 in __libc_message (do_abort=2, fmt=
    0x331ff5ca28 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:186
#22 0x000000331fe787ba in malloc_printerr (action=3, str=
    0x331ff5cc18 "double free or corruption (fasttop)", ptr=<optimized out>)
    at malloc.c:6283
#23 0x0000003321e49ac3 in g_free (mem=0x1f02c00) at gmem.c:263
#24 0x0000003322a35993 in g_value_unset (value=0x7fff7d78c370) at gvalue.c:275
#25 0x000000332982e256 in gtk_tree_model_get_valist (tree_model=0x1de71c0, 
    iter=0x7fff7d78c570, var_args=0x7fff7d78c3d8) at gtktreemodel.c:1734

This suggests that either 

1) the tree model is already destroyed at the time of frame 25
or
2) some other, less obvious heap corruption is going on

Let's assume 1, for now, since it's way more likely.  The comments above say the crash happens after the user logs in. It's certainly no surprise that the tree model would be destroyed at that point, since after log in we have no need for a user list.

Higher in the trace we see this:

#38 0x0000003320e03bac in on_get_all_finished (proxy=0x1f0cca0 [DBusGProxy], 
    call=<optimized out>, user=0x7fdb3c00b0c0 [ActUser]) at act-user.c:1052

So some time after login a GetAll call to the accounts service is finishing and we call back into the user list to update it (to reflect the changes returned by the GetAll call).  But since the user list is gone, we crash.

Assuming this is the only issue, the fix should be straightforward.  I'll push an update shortly and would appreciate feedback/confirmation of the fix.
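
The failure mode described above (an asynchronous reply arriving after the object that requested it has been destroyed) can be modelled in a few lines. This is a simplified Python sketch, not gdm's or accountsservice's actual C code. The class names, the `cancel_pending` flag, and the sample property are all assumptions for illustration, and the thread does not say which approach the real fix took.

```python
class AccountsProxy:
    """Stand-in for an asynchronous D-Bus proxy: callbacks are queued
    and only run later, when deliver_replies() is called."""

    def __init__(self):
        self._pending = []

    def get_all_async(self, callback):
        self._pending.append(callback)
        return callback  # returned as a handle so the call can be cancelled

    def cancel(self, handle):
        if handle in self._pending:
            self._pending.remove(handle)

    def deliver_replies(self):
        """Run every queued callback and report how many were run."""
        pending, self._pending = self._pending, []
        for callback in pending:
            callback({"RealName": "Example User"})
        return len(pending)


class UserList:
    """Stand-in for the greeter's user list (the tree model in the trace)."""

    def __init__(self, proxy):
        self._proxy = proxy
        self.rows = {}
        self._destroyed = False
        self._handle = proxy.get_all_async(self._on_get_all_finished)

    def _on_get_all_finished(self, properties):
        if self._destroyed:
            # In C this is where freed memory would be touched.
            raise RuntimeError("callback ran on a destroyed user list")
        self.rows.update(properties)

    def destroy(self, cancel_pending=True):
        if cancel_pending:
            # The guard: drop the outstanding call before tearing down.
            self._proxy.cancel(self._handle)
        self._destroyed = True
        self.rows = None
```

With `cancel_pending=False` the late reply reaches the destroyed list and raises, which mirrors the crash; with the default, the reply is simply never delivered.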

Comment 18 Fedora Update System 2011-05-06 20:33:32 UTC
gdm-3.0.0-3.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/gdm-3.0.0-3.fc15

Comment 19 Fedora Update System 2011-05-07 15:09:05 UTC
Package gdm-3.0.0-3.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing gdm-3.0.0-3.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/gdm-3.0.0-3.fc15
then log in and leave karma (feedback).

Comment 20 Fedora Update System 2011-05-09 04:05:16 UTC
gdm-3.0.0-3.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 21 James Laska 2011-05-10 12:01:37 UTC
I'm still seeing the problem from comment#12 and comment#16 using gdm-3.0.0-3.fc15.x86_64.  Anyone else?

Comment 22 Stephen Gallagher 2011-05-10 12:12:46 UTC
(In reply to comment #21)
> I'm still seeing the problem in comment#12 and comment#16 problem using
> gdm-3.0.0-3.fc15.x86_64.  Anyone else?

Yes, as I noted in my negative karma comment to the bodhi update.

Comment 23 Andrew McNabb 2011-05-12 15:34:21 UTC
Sorry--I was out of town this week, so I wasn't able to keep up on this bug.  I wonder if there might be two separate bugs.  Mine happens after I enter the username but before I have a chance to enter the password.

Comment 24 Ray Strode [halfline] 2011-05-16 19:34:41 UTC
Both issues should be addressed by the errata.

Comment 25 Marcus Moeller 2011-05-20 08:22:30 UTC
*** Bug 705805 has been marked as a duplicate of this bug. ***

Comment 26 Marcus Moeller 2011-05-20 08:23:59 UTC
The problem still exists with accountsservice 0.6.12 and gdm 3.0.0-3

Comment 27 Marcus Moeller 2011-05-24 09:01:16 UTC
Seems to be fixed in gdm 3.0.2

Comment 28 James Laska 2011-05-26 15:19:44 UTC
(In reply to comment #27)
> Seems to be fixed in gdm 3.0.2

Moving back to CLOSED based on your feedback.

Comment 29 Aaron Cohen 2011-05-27 18:28:17 UTC
I'm experiencing this same issue. Is it possible to get gdm-3.0.2 in Fedora 15 somehow? If not, why is this bug closed?

Comment 30 Andrew McNabb 2011-05-27 19:54:52 UTC
It doesn't look like gdm-3.0.2 is in koji at all, so I'll reopen this.

Comment 31 dnortham99 2011-06-01 00:02:43 UTC
I am also still seeing this issue with winbind users. I can reproduce the issue with the following:

1. start fedora
2. only local users are shown in gdm-simple-greeter
3. click on 'other' and enter the winbind username DOMAIN\user
4. password prompt appears and gdm-simple-greeter hangs before I am able to enter the password.
5. press CTRL+ALT+F2 for a shell console
6. kill -9 <pid of gdm-simple-greeter>
7. press CTRL+ALT+F1 to return to the graphical interface
8. now gdm-simple-greeter is displaying my winbind users. Click on my winbind user and now I am able to log in.


Occurs on every reboot.

I am currently running the following package versions:

   gdm-3.0.0-3.fc15.x86_64
   accountsservice-0.6.10-2.fc15.x86_64
   pam-1.1.3-8.fc15.x86_64

Comment 32 Blakkheim.GW 2011-06-06 15:26:13 UTC
*** Bug 708953 has been marked as a duplicate of this bug. ***

Comment 33 Blakkheim.GW 2011-06-06 15:43:50 UTC
I confirm that gdm-3.0.4, found in the "updates-testing" repo, fixes the problem in my case on Fedora 15.

Cheers.

Comment 34 Honza Horak 2011-06-07 07:31:50 UTC
gdm-3.0.4 seems to work for me. Thanks.

Comment 35 James Laska 2011-06-07 13:54:13 UTC
https://admin.fedoraproject.org/updates/gdm-3.0.4-1.fc15 has been pushed to stable.  I'm closing this bug once again.  

The original conditions that led to this bug were slightly confusing and likely involved multiple issues.  If any problems remain beyond gdm-3.0.4-1, please file them as new bugs.
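
To check whether an installed package string from `rpm -q gdm` is at or beyond the fixed version, a rough sketch follows. It is a simplification: it compares only the dotted numeric version, ignores epoch and release, and is not RPM's real version comparison algorithm. The function names are hypothetical.

```python
def parse_nvr(package):
    """Split 'name-version-release[.arch]' into its three dash-separated parts."""
    name, version, release = package.rsplit("-", 2)
    return name, version, release


def version_tuple(version):
    """Turn '3.0.4' into (3, 0, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def has_fix(package, fixed_version="3.0.4"):
    """True if package is a gdm build at or above fixed_version."""
    name, version, _release = parse_nvr(package)
    return name == "gdm" and version_tuple(version) >= version_tuple(fixed_version)
```

For example, `has_fix("gdm-3.0.0-3.fc15.x86_64")` is false, while `has_fix("gdm-3.0.4-1.fc15")` is true.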

Comment 36 Michael J. Chudobiak 2011-06-08 12:32:19 UTC
*** Bug 707985 has been marked as a duplicate of this bug. ***