Bug 1081004 - after remaining locked for a time, GDM login screen fails to authenticate
Summary: after remaining locked for a time, GDM login screen fails to authenticate
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: gdm
Version: 20
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Ray Strode [halfline]
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-03-26 13:26 UTC by Erik M Jacobs
Modified: 2014-10-27 16:45 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-27 16:45:42 UTC


Attachments (Terms of Use)
journalctl -b -a -l --no-pager | grep -E "gdm|gnome" (347.81 KB, text/x-log)
2014-04-06 14:38 UTC, Erik M Jacobs
no flags Details

Description Erik M Jacobs 2014-03-26 13:26:59 UTC
Description of problem:
In the last few days I have noticed that, after returning to my machine when it has been locked for some time (>60 minutes) I am unable to log back in to my session. 

I also notice that, when I hit Enter to bring up the password prompt, the password field is not highlighted.

I will see the spinner after I submit my password, but it never completes. The screen will go black again and if I hit control or move the mouse it will come back and, at some point, I might see "Authentication Failure" but it doesn't kick me back to attempting to enter a password again. I eventually need to hit Cancel.

Restarting GDM by logging in as root and killing the GDM process appears to work, but that kills your current session.

Version-Release number of selected component (if applicable):

I have seen "no space left on device" mentioned in reference to GDM in the journalctl -f. If it happens again I will try to gist it.

How reproducible:
Intermittant, but seems like a high percentage of the time.

Steps to Reproduce:
1. Login to Gnome and do some things (browser, etc).
2. Super+L to lock the screen.
3. Walk away for >30 minutes.
4. Come back and hit Enter to bring up the password prompt for your user.
5. Enter the password and hit "Unlock".

Actual results:
Nothing happens - you never get logged back in / unlocked.

Expected results:
You should be able to unlock.

Additional info:
This seems like a pretty serious issue.

According to 684653 this problem could maybe occur if /bin/su was lacking the s-bit but that is not the case in my installation:

[root@t440-festive-local ccfba3d218eb474bb9933774d4243980]# ll /bin/su
-rwsr-xr-x. 1 root root 32024 Jan 21 04:07 /bin/su

I have not verified whether this happens immediately on lock, but I will test that now.

Comment 1 Erik M Jacobs 2014-03-26 13:27:29 UTC
In reference to the "no space left on device" error I was seeing:

[root@t440-festive-local ccfba3d218eb474bb9933774d4243980]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/dm-2        96G   63G   29G  69% /
devtmpfs        5.8G     0  5.8G   0% /dev
tmpfs           5.8G  148K  5.8G   1% /dev/shm
tmpfs           5.8G  1.2M  5.8G   1% /run
tmpfs           5.8G     0  5.8G   0% /sys/fs/cgroup
tmpfs           5.8G 1012K  5.8G   1% /tmp
/dev/sda1       477M  129M  319M  29% /boot

There is plenty of space left on /

Comment 2 Erik M Jacobs 2014-03-26 13:29:43 UTC
I just attempted a manual lock with immediate unlock and it worked.

It really seems like there is a time factor in play here.

Comment 3 Ray Strode [halfline] 2014-03-26 14:00:36 UTC
when it happens, can you hit ctrl-alt-f6, log in at a VT, and run:

$ loginctl list-sessions

then run

$ loginctl show-session N

(where N is the number of your session).  I'm curious if it says

State=closing

instead of

State=active

There have been systemd bugs in the past causing the session to prematurely go to closing.

Also, what is the output of

$ rpm -q gnome-shell gdm systemd

If you run

$ loginctl unlock-session N

(where N is the number of your session), then hit ctrl-alt-f1 does your session unlock?

Can you add Enable=true to the [debug] section of /etc/gdm/custom.conf reboot, reproduce, then attach the output of

# journalctl -b -a -l


?
?

Comment 4 Erik M Jacobs 2014-03-26 21:43:59 UTC
[root@t440-festive-local ~]# loginctl show-session 1
Id=1
Timestamp=Wed 2014-03-26 09:15:27 EDT
TimestampMonotonic=21149926
VTNr=1
Display=:0
Remote=no
Service=gdm-password
Scope=session-1.scope
Leader=2069
Audit=1
Type=x11
Class=user
Active=no
State=online
IdleHint=yes
IdleSinceHint=1395865780453170
IdleSinceHintMonotonic=26074421634
Name=thoraxe

[root@t440-festive-local ~]# rpm -q gnome-shell gdm systemd
gnome-shell-3.10.4-2.fc20.x86_64
gdm-3.10.0.1-1.fc20.x86_64
systemd-208-15.fc20.x86_64

$ loginctl unlock-session N -> I get in, yes.

... Can you add Enable=true to the [debug] ...

I will do it, yeah.

# journalctl -b -a -l

This puts me into a less-like buffer. I think you may want:

journalctl -b -a -l --no-pager > foo.txt

?

Comment 7 Erik M Jacobs 2014-04-06 14:38:46 UTC
Created attachment 883275 [details]
journalctl -b -a -l --no-pager | grep -E "gdm|gnome"

Comment 8 Ray Strode [halfline] 2014-04-09 17:02:29 UTC
i see this in your log:

Mar 31 15:58:10 t440-festive-local gdm[1093]: AccountsService: Failed to monitor logind session changes: No space left on device

which leads to a chain of failures.

There error code is from inotify_add_watch, which says:

Mar 31 15:58:10 t440-festive-local gdm[1093]: AccountsService: Failed to monitor logind session changes: No space left on device

So my assumption is something on this system is leaking inotify watches.  If you run lsof, is there any process with a large number of open files?  Maybe tracker? or some sort of virus scanner?

Comment 9 Ray Strode [halfline] 2014-04-09 17:03:06 UTC
sorry that second paste from your log, should have been a paste from the inotify_add_watch man page:

       ENOSPC The user limit on  the  total  number  of  inotify  watches  was
              reached or the kernel failed to allocate a needed resource.

Comment 10 Erik M Jacobs 2014-04-09 17:36:55 UTC
I have both spideroak and crashplan (spideroak for work, crashplan for personal) as well as dropbox. These three are huge hogs when it comes to watching files.

I could probably not run these services all the time, but that somewhat defeats the purpose of having backup software like this...

working on a way to produce the output of lsof organized by # of files per process, but it's probably just that I'm running out between those three very watchy programs.

currently:
[root@t440-festive-local proc]# sysctl -a | grep watch
fs.epoll.max_user_watches = 2451333
fs.inotify.max_user_watches = 8192
kernel.nmi_watchdog = 1
kernel.watchdog = 1
kernel.watchdog_thresh = 10

Comment 11 Erik M Jacobs 2014-04-09 17:38:20 UTC
systemd: 27
systemd-j: 24
lvmetad: 17
systemd-u: 27
auditd: 21
audispd: 4
sedispatc: 13
dmeventd: 18
alsactl: 12
bluetooth: 15
firewalld: 68
accounts-: 21
rtkit-dae: 9
ModemMana: 48
cupsd: 45
avahi-dae: 14
avahi-dae: 14
rpcbind: 20
chronyd: 9
rngd: 3
irqbalanc: 9
systemd-l: 18
dbus-daem: 13
mcelog: 4
atd: 14
crond: 13
gdm: 25
abrtd: 49
abrt-watc: 48
abrt-watc: 47
polkitd: 28
gdm-simpl: 27
colord: 39
Xorg: 74
NetworkMa: 49
httpd: 108
libvirtd: 114
systemd: 23
sshd: 37
rpc.statd: 22
(sd-pam): 36
java: 103
wpa_suppl: 24
httpd: 108
httpd: 108
httpd: 108
httpd: 108
httpd: 108
upowerd: 49
gdm-sessi: 57
dnsmasq: 10
systemd: 23
(sd-pam): 36
gnome-key: 29
gnome-ses: 94
dbus-laun: 7
dbus-daem: 13
at-spi-bu: 24
dbus-daem: 13
at-spi2-r: 26
gvfsd: 26
gvfsd-fus: 28
gnome-set: 219
pulseaudi: 86
gvfs-udis: 35
udisksd: 30
gvfs-goa-: 24
goa-daemo: 120
mission-c: 43
gvfs-mtp-: 29
gvfs-afc-: 45
gvfs-gpho: 32
syndaemon: 11
gnome-she: 345
gsd-print: 83
dconf-ser: 15
ibus-daem: 30
ibus-dcon: 32
ibus-x11: 82
gnome-she: 151
evolution: 160
dnsmasq: 10
seapplet: 88
SpiderOak: 164
evolution: 183
tracker-m: 69
abrt-appl: 125
tracker-s: 46
dropbox: 204
evolution: 176
obexd: 12
ibus-engi: 29
SpiderOak: 104
inotify_d: 13
gnome-ter: 154
gnome-pty: 5
bash: 8
sudo: 50
su: 47
bash: 8
gvfsd-tra: 31
tmux: 11
tmux: 12
bash: 8
bash: 8
dhclient: 40
journalct: 42
bash: 8
tmux: 11
tmux: 12
bash: 8
vim: 31
bash: 8
systemd: 23
(sd-pam): 36
pidgin: 314
gconfd-2: 22
thunderbi: 251
gvfsd-mtp: 34
gvfsd-met: 30
firefox: 271
gvfsd-htt: 39
bash: 8
sudo: 50
su: 47
bash: 8
plugin-co: 123
bash: 8
grep: 7

Comment 12 Richard Shaw 2014-04-11 21:16:20 UTC
I can confirm similar behavior on my F20 x86_64 system, however, when it stalls on logging in I think I hit ESC or cancel, and then on the second attempt it lets me in.

Comment 13 baj_be 2014-05-23 12:21:43 UTC
I've also noticed the same behavior here, with dropbox and crashplan running on a FC20 x86_64 box. 

It seems that the problem indeed comes from the value of fs.inotify.max_user_watches, which is probably too low to keep track of all the files that need to be backed up by crashplan, as explained in :

http://support.code42.com/CrashPlan/Latest/Troubleshooting/Real-Time_Backup_For_Network-Attached_Drives

"CrashPlan on Linux depends on the inotify kernel module to know when files are added or edited in real-time.

<cut>

The inotify module is governed by a property called max_user_watches. If you attempt to exceed the max number, you receive the following error in the engine_error.log (but the process lives on).

inotify_add_watch: No space left on device
Any file not covered by a watch does not have real-time backup protection."


They suggest to give a larger value to fs.inotify.max_user_watches in the /etc/sysctl.conf file, such as fs.inotify.max_user_watches=1048576

The system needs to be restarted.

Comment 14 Erik M Jacobs 2014-05-23 12:57:39 UTC
I can concur that increasing the number of watches fixes the issue, but I am unsure why the *system* would need to be restarted? Doesn't editing sysctl.conf and then doing sysctl -p cause all values in /etc/sysctl.conf to be immediately loaded into the kernel?

I think we need to answer a question, though:

Should GDM be able to deal with an exhausted inotify watcher state gracefully?

Depending on the answer to that, this is either still a bug or should be closed.

Comment 15 baj_be 2014-05-23 13:20:02 UTC
(In reply to Erik M Jacobs from comment #14)
> I can concur that increasing the number of watches fixes the issue, but I am
> unsure why the *system* would need to be restarted? Doesn't editing
> sysctl.conf and then doing sysctl -p cause all values in /etc/sysctl.conf to
> be immediately loaded into the kernel?

That could be, but my system crashed doing that. Maybe for another reason...

> 
> I think we need to answer a question, though:
> 
> Should GDM be able to deal with an exhausted inotify watcher state
> gracefully?
> 
> Depending on the answer to that, this is either still a bug or should be
> closed.

I don't have a definitive answer to this question. It would probably be nicer for the end user if gdm would be able to cope with this. But is gdm the only piece of software that is affected by that ? I would say that people who are writing apps that could reach the limit imposed by the number of watches should handle the problem themselves (i.e. crashplan in this case).

Comment 16 Erik M Jacobs 2014-05-23 13:58:12 UTC
(In reply to baj_be from comment #15)
> (In reply to Erik M Jacobs from comment #14)
> > Should GDM be able to deal with an exhausted inotify watcher state
> > gracefully?
> > 
> > Depending on the answer to that, this is either still a bug or should be
> > closed.
> 
> I don't have a definitive answer to this question. It would probably be
> nicer for the end user if gdm would be able to cope with this. But is gdm
> the only piece of software that is affected by that ? I would say that
> people who are writing apps that could reach the limit imposed by the number
> of watches should handle the problem themselves (i.e. crashplan in this
> case).

Crashplan, Dropbox and Spideroak never once complained about reaching the inotify limit. Ironically the only thing that "broke" was GDM and other gnome-related facilities.

So, it sounds to me like these services already handle this gracefully...?

Comment 17 baj_be 2014-05-23 14:48:58 UTC
(In reply to Erik M Jacobs from comment #16)
> 
> Crashplan, Dropbox and Spideroak never once complained about reaching the
> inotify limit. Ironically the only thing that "broke" was GDM and other
> gnome-related facilities.

Well, the URL I've mentioned earlier is from the support section of code42, the editor of crashplan. At least they know :-)


Note You need to log in before you can comment on or make changes to this bug.