Bug 244490

Summary: After upgrading any fc6 system to f7, gdmgreeter will continuously die with a segmentation fault preventing any attempt to login.
Product: [Fedora] Fedora
Reporter: William Kucharski <kucharsk>
Component: gdm
Assignee: Ray Strode [halfline] <rstrode>
Status: CLOSED WONTFIX
Severity: high
Priority: low
Version: 7
CC: bugzilla, kas, lorenzo.fiorini, malex, mattwilkens, m.a.young, triage
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-06-17 01:36:49 UTC

Description William Kucharski 2007-06-16 02:08:20 UTC
Description of problem:
After upgrading any fc6 system to f7, gdmgreeter will continuously die with a
segmentation fault preventing any attempt to login.

Version-Release number of selected component (if applicable):
gdm-2.18.0-14.fc7

How reproducible:
Upgrade an fc6 system to fc7.  Watch the boot proceed to the point where a login
screen would be presented and gdmgreeter will crash continuously, preventing a
login screen from even being presented.
  
Actual results:
Continual crash prevents any user from logging in

Expected results:
A normal login screen post-upgrade

Additional info:

Comment 1 William Kucharski 2007-06-16 02:13:40 UTC
This happens on both x86 and x86_64 systems.

The arrow cursor with blue spinner is displayed until gdmgreeter crashes, and
the whole cycle begins again.

A NULL pointer is obviously being dereferenced; the output of an strace shows:

writev(13, [{"GIOP\1\2\1\0\360\f\0\0", 12},
{"\350\247-F\3\0\0\0\0\0\0\0\34\0\0\0\0\0\0\0\315\340\270"..., 2036},
{"ROLE_TOOL_TIP\0\0\0\n\0\0\0ROLE_TREE\0\0\0"..., 1276}], 3) = 3324
poll([{fd=7, events=POLLIN}, {fd=13, events=POLLIN|POLLPRI, revents=POLLIN},
{fd=14, events=POLLIN|POLLPRI}, {fd=15, events=POLLIN|POLLPRI}], 4, -1) = 1
read(13, "GIOP\1\2\1\1$\0\0\0", 12)     = 12
read(13, "\350\247-F\0\0\0\0\1\0\0\0\1\0\0\0\f\0\0\0\1\1\1\1\1\0"..., 36) = 36
gettimeofday({1181959972, 478992}, NULL) = 0
gettimeofday({1181959972, 479045}, NULL) = 0
gettimeofday({1181959972, 479092}, NULL) = 0
gettimeofday({1181959972, 479139}, NULL) = 0
writev(13, [{"GIOP\1\2\1\0\210\v\0\0", 12},
{"\210\247-F\3\0\0\0\0\0\0\0\34\0\0\0\0\0\0\0\315\340\270"..., 2036},
{"ROLE_TOOL_TIP\0\0\0\n\0\0\0ROLE_TREE\0\0\0"..., 916}], 3) = 2964
poll([{fd=7, events=POLLIN}, {fd=13, events=POLLIN|POLLPRI, revents=POLLIN},
{fd=14, events=POLLIN|POLLPRI}, {fd=15, events=POLLIN|POLLPRI}], 4, -1) = 1
read(13, "GIOP\1\2\1\1$\0\0\0", 12)     = 12
read(13, "\210\247-F\0\0\0\0\1\0\0\0\1\0\0\0\f\0\0\0\1\1\1\1\1\0"..., 36) = 36
--- SIGSEGV (Segmentation fault) @ 0 (0) ---


Comment 2 Karsten Wade 2007-06-21 00:14:09 UTC
Please specify what you think needs to be included in the release notes.  We are
working on an update for F7, so this is a good time to get it included. 
However, if this is a common bug, it should go on that page instead.

If you have edit access:

http://fedoraproject.org/wiki/Bugs/F7Common
http://fedoraproject.org/wiki/Docs/Beats

Comment 3 William Kucharski 2007-06-21 09:10:40 UTC
The workaround that's always worked for me is to log into the machine remotely, run gdmsetup, and
deselect the checkbox that would create a list of users based on /etc/passwd.

However, that obviously doesn't work if you don't have a second machine running X to login from, so I'm 
not sure if you've seen the issue and/or have other workarounds for it.

Comment 4 Paul W. Frields 2007-06-22 02:32:00 UTC
Before we put this in the release notes, it would probably be good to have this
at least CONFIRMED if not ASSIGNED.  Ray, can you shed any light on this one?

Comment 5 William Kucharski 2007-06-22 04:02:09 UTC
Just to clarify, the check box in question is the "Include all users from
/etc/passwd (not for NIS)" in the Users tab of "Login Window Preferences."

The systems in question DO use NIS, and I suspect either that fact alone or the
fact that there are 44,996 users in the NIS passwd map is causing gdmgreeter to
choke.

Login windows on all updated systems appeared properly once the check was
removed from the box.

Comment 6 Ray Strode [halfline] 2007-06-22 15:53:04 UTC
It's a little late for a release notes change, no?

We should debug the problem, fix it, and get it into update.

William, if you run (from a gnome-terminal or xterm):

export DOING_GDM_DEVELOPMENT=1
/usr/libexec/gdmgreeter

does it crash also?

Comment 7 William Kucharski 2007-06-23 03:03:30 UTC
[root@spinup bin]# export DOING_GDM_DEVELOPMENT=1
[root@spinup bin]# /usr/libexec/gdmgreeter
/usr/share/gdm/themes/FedoraFlyingHigh/FedoraFlyingHigh.gtkrc:51: error:
unexpected identifier `stepperstyle', expected character `}'
Segmentation fault


Comment 8 Ray Strode [halfline] 2007-06-25 14:11:13 UTC
do you have a custom or old version of the gtk clearlooks theme engine installed?

what is the output of 
rpm -q gtk2-engines
and
rpm -V gtk2-engines

Comment 9 William Kucharski 2007-06-26 05:41:41 UTC
x86
===
# rpm -q gtk2-engines
gtk2-engines-2.10.2-2.fc7
# rpm -V gtk2-engines
# 

x86_64
=====
# rpm -q gtk2-engines
gtk2-engines-2.10.2-2.fc7
gtk2-engines-2.10.2-2.fc7
# rpm -V gtk2-engines
.......T   /usr/share/gtk-engines/clearlooks.xml
.......T   /usr/share/gtk-engines/crux.xml
.......T   /usr/share/gtk-engines/glide.xml
.......T   /usr/share/gtk-engines/hc.xml
.......T   /usr/share/gtk-engines/industrial.xml
.......T   /usr/share/gtk-engines/mist.xml
.......T   /usr/share/gtk-engines/redmond.xml
.......T   /usr/share/gtk-engines/smooth.xml
.......T   /usr/share/gtk-engines/thinice.xml
.......T   /usr/share/locale/ar/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/bg/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/ca/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/da/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/de/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/dz/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/en_GB/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/es/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/et/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/fr/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/gl/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/gu/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/hu/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/ko/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/lt/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/mk/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/nl/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/oc/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/or/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/pa/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/pt/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/pt_BR/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/sv/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/locale/uk/LC_MESSAGES/gtk-engines.mo
.......T   /usr/share/themes/Clearlooks/gtk-2.0/gtkrc
.......T   /usr/share/themes/Crux/gtk-2.0/gtkrc
.......T   /usr/share/themes/Industrial/gtk-2.0/gtkrc
.......T   /usr/share/themes/Mist/gtk-2.0/gtkrc
.......T   /usr/share/themes/ThinIce/gtk-2.0/gtkrc
#
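
For readers unfamiliar with the rpm -V output above: each leading character position reports one attribute check, and "." means that check passed. Below is a minimal Python sketch of decoding one line. The function name failed_checks is hypothetical, and the table assumes the classic SM5DLUGT column order (newer rpm versions append a ninth P column for capabilities). It also assumes paths without spaces.

```python
# Column order of the rpm -V attribute string; "." means the check passed.
RPM_VERIFY_COLUMNS = [
    ("S", "file size"),
    ("M", "mode (permissions and file type)"),
    ("5", "digest (MD5 in this era)"),
    ("D", "device major/minor number"),
    ("L", "symlink target"),
    ("U", "owner"),
    ("G", "group"),
    ("T", "modification time"),
    ("P", "capabilities"),
]


def failed_checks(line):
    """Return (path, [names of failed checks]) for one line of rpm -V output."""
    fields = line.split()
    flags, path = fields[0], fields[-1]
    # A check failed when its column shows its own letter instead of ".".
    failed = [
        name
        for (char, name), flag in zip(RPM_VERIFY_COLUMNS, flags)
        if flag == char
    ]
    return path, failed


print(failed_checks(".......T   /usr/share/gtk-engines/clearlooks.xml"))
```

Applied to the x86_64 listing, every line decodes to a modification-time mismatch only. The size and digest columns pass, which suggests the file contents still match the package.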

Comment 10 Adam Tkac 2007-07-02 12:29:24 UTC
*** Bug 246399 has been marked as a duplicate of this bug. ***

Comment 11 Adam Tkac 2007-07-02 12:37:41 UTC
I tried
$export DOING_GDM_DEVELOPMENT=1
$/usr/libexec/gdmgreeter

on rawhide and it works fine. But without the DOING_GDM_DEVELOPMENT variable,
gdmgreeter is unusable.

Adam

Comment 12 Ray Strode [halfline] 2007-07-02 15:08:10 UTC
Hi William,

Can you grab the -debuginfo packages for gtk2-engines and gdm and get a
backtrace of the crash from comment 7?

Hi Adam, what makes you think bug 246399 is a duplicate of this one?

Comment 13 Jan "Yenya" Kasprzak 2007-07-04 20:44:03 UTC
I _may_ have a similar problem: X keeps restarting (however, the standalone "X"
command with the same args as in custom.conf works) - it displays (briefly,
maybe half a second or less) the blue screen with balloons and the greeter, and
then restarts X again. The strange thing is that I have a dual-head/dual-seat
setup (two graphics cards, two keyboards, two mice), and the problem occurs only
on one of the heads (:1 on nVidia TNT2 PCI running open source nv driver), while
on the other head (:0 on ATI Radeon 7500 AGP) it works as expected and I am able
to log in.

The /var/log/Xorg.1.log does not show any crash messages, but in
/var/log/messages I am getting the following:

gdm[2789]: failsafe dialog failed (inhibitions: 0 0)
gdm[2789]: failsafe dialog failed (inhibitions: 0 1)
gdm[2789]: failsafe dialog failed (inhibitions: 1 1)
gdm[2789]: The display server has been shut down about 6 times in the last 90
seconds. It is likely that something bad is going on.  Waiting for 2 minutes
before trying again on display :1.

Using xdm instead of gdm fixed the problem (and now I can have both users
working simultaneously).

One possible difference is that the nVidia head is only 15bpp (5 bits per color
plane).

Under FC6 (and earlier versions) I had no such problem with gdm.
I may attach my xorg.conf if needed. The custom.conf is the stock one only with
the servers sections modified to this:
============
[servers]
0=Standard
1=2nd

[server-Standard]
name=Standard server
command=/usr/bin/Xorg -audit 0 vt7 -layout ATI+LGLCD 
flexible=true

[server-2nd]
name=Second server
command=/usr/bin/Xorg -audit 0 vt7 -layout Riva+PhilipsLCD -sharevts -novtswitch -isolateDevice PCI:00:19:0
flexible=true
============

Comment 14 William Kucharski 2007-07-18 14:12:06 UTC
Can you tell me where to get the -debuginfo packages?

Comment 15 Jan "Yenya" Kasprzak 2007-07-18 14:24:12 UTC
Re: comment #14:

yum install yum-utils
debuginfo-install gdm

Comment 16 William Kucharski 2007-07-18 19:05:44 UTC
# /usr/libexec/gdmgreeter
/usr/share/gdm/themes/FedoraFlyingHigh/FedoraFlyingHigh.gtkrc:51: error:
unexpected identifier `stepperstyle', expected character `}'
Segmentation fault (core dumped)

gdb says:

Core was generated by `/usr/libexec/gdmgreeter'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004132d8 in greeter_item_ulist_setup () at greeter_item_ulist.c:186
186                     if (usr->gecos && strcmp (usr->gecos, "") != 0) {

Given the faulting %rip:

(gdb) info registers
rax            0x0      0
rbx            0x160fdb0        23133616
rcx            0xfdffd0 16646096
rdx            0x21     33
rsi            0xfdffe0 16646112
rdi            0x2e2e2e657265   50775881773669
rbp            0xfdffe0 0xfdffe0
rsp            0x7ffff28b79f0   0x7ffff28b79f0
r8             0xfdffe0 16646112
r9             0x1      1
r10            0x2      2
r11            0x3314e73a90     219394030224
r12            0x16f1030        24055856
r13            0x16f0520        24053024
r14            0x0      0
r15            0x658c90 6655120
rip            0x4132d8 0x4132d8 <greeter_item_ulist_setup+1320>
eflags         0x10206  [ PF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0

0x0000000000413299 <greeter_item_ulist_setup+1257>:     je     0x413320 <greeter_item_ulist_setup+1392>
0x000000000041329f <greeter_item_ulist_setup+1263>:     mov    0x0(%r13),%r12
0x00000000004132a3 <greeter_item_ulist_setup+1267>:     movq   $0x0,0x50(%rsp)
0x00000000004132ac <greeter_item_ulist_setup+1276>:     movq   $0x0,0x58(%rsp)
0x00000000004132b5 <greeter_item_ulist_setup+1285>:     movq   $0x0,0x60(%rsp)
0x00000000004132be <greeter_item_ulist_setup+1294>:     movq   $0x0,0x68(%rsp)
0x00000000004132c7 <greeter_item_ulist_setup+1303>:     mov    0x18(%r12),%rdi
0x00000000004132cc <greeter_item_ulist_setup+1308>:     test   %rdi,%rdi
0x00000000004132cf <greeter_item_ulist_setup+1311>:     je     0x4132dc <greeter_item_ulist_setup+1324>
0x00000000004132d1 <greeter_item_ulist_setup+1313>:     movzbl 76928(%rip),%eax        # 0x425f58 <__dso_handle+5360>
0x00000000004132d8 <greeter_item_ulist_setup+1320>:     cmp    (%rdi),%al
0x00000000004132da <greeter_item_ulist_setup+1322>:     jne    0x4132e1 <greeter_item_ulist_setup+1329>
0x00000000004132dc <greeter_item_ulist_setup+1324>:     mov    0x8(%r12),%rdi
0x00000000004132e1 <greeter_item_ulist_setup+1329>:     callq  0x420ff0 <gdm_common_text_to_escaped_utf8>

It looks like %r12 may be scrambled, as it's supposed to be "usr", but printing
the resulting structure returns garbage:

(gdb) print *(GdmUser *)0x16f1030
$1 = {uid = 544173908, login = 0x7420737265737520 <Address 0x7420737265737520 out of bounds>,
  homedir = 0x68207473696c206f <Address 0x68207473696c206f out of bounds>, gecos = 0x2e2e2e657265 <Address 0x2e2e2e657265 out of bounds>,
  picture = 0x240}

It looks like the loop STARTED with good data:

(gdb)  print *(GdmUser *)users->data
$6 = {uid = 110305, login = 0x163c240 "alpha", homedir = 0x163c2b0 "/home/alpha", gecos = 0x163c290 "Alpha Guy", picture = 0xc90ed0}

But the linked list ENDS with bad data; judging from r13, the last GList
entry pointed to is:

(gdb) print *(GList *)0x16f0520
$12 = {data = 0x16f1030, next = 0x0, prev = 0x160f2e0}

The "data" field is the value of r12 that caused the error:

(gdb) print *(GdmUser *)0x16f1030 
$13 = {uid = 544173908, login = 0x7420737265737520 <Address 0x7420737265737520 out of bounds>,
  homedir = 0x68207473696c206f <Address 0x68207473696c206f out of bounds>, gecos = 0x2e2e2e657265 <Address 0x2e2e2e657265 out of bounds>,
  picture = 0x240}

but the data field of the "prev" entry is fine:

(gdb) print *(GList *)0x160f2e0
$14 = {data = 0x160fdd0, next = 0x16f0520, prev = 0xcd7ae0}
(gdb) print *(GdmUser *)0x160fdd0 
$15 = {uid = 36314, login = 0x160fdb0 "zzyxy", homedir = 0x160fe20 "/home/zzyxy", gecos = 0x160fe00 "Zippo User", picture = 0xc90ed0}
(gdb) print *(GdmUser *)(GList *)0x160f2e0->data
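
One way to cross-check the disassembly against the struct is to compute the field offsets. The sketch below uses Python's ctypes and assumes a 64-bit build where uid is a 32-bit integer and the other four members are pointers. The field names come from the gdb output above, but the field types are an assumption and were not taken from the gdm source.

```python
import ctypes


class GdmUser(ctypes.Structure):
    # Field names from the gdb print; the types are assumed (64-bit build).
    _fields_ = [
        ("uid", ctypes.c_uint32),
        ("login", ctypes.c_void_p),
        ("homedir", ctypes.c_void_p),
        ("gecos", ctypes.c_void_p),
        ("picture", ctypes.c_void_p),
    ]


def field_offsets(struct_type):
    """Map each field name to its byte offset inside the structure."""
    return {
        name: getattr(struct_type, name).offset
        for name, _ctype in struct_type._fields_
    }


for name, offset in field_offsets(GdmUser).items():
    print(f"{name:8} offset {offset:#x}")
```

Under those assumptions gecos sits at offset 0x18 and login at 0x8, which matches "mov 0x18(%r12),%rdi" and "mov 0x8(%r12),%rdi" in the listing. The faulting "cmp (%rdi),%al" would then be the read of the first byte of usr->gecos, with %rdi holding the same 0x2e2e2e657265 value that gdb prints for gecos.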

Comment 17 Matthew Wilkens 2007-07-19 17:27:45 UTC
I'm seeing the same symptoms as bug 246399 (gdmgreeter segfaults when run over
vnc/xinetd), which was closed as a dupe of this bug. But gdm works fine locally
- it just crashes over vnc. Anyway, another data point. Let me know if I can
supply more info.

Comment 18 Matthew Wilkens 2007-07-24 18:09:55 UTC
A quick follow-on to my last post (comment 17). I'm on x86_64. The only
suspicious thing I see in /var/log/messages on gdm startup is this:

gdm[5673]: (null): cannot open shared object file: No such file or directory

I don't know *what* object file it's looking for, though. When gdmgreeter
segfaults, the log entry is:

kernel: gdmgreeter[5820]: segfault at 000000000000001c rip 00002aaaaf39bfa7 rsp
00007fffc19eca30 error 4
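
The kernel line above packs some detail into "error 4". Below is a minimal Python sketch of decoding it, assuming the standard x86 page-fault error-code bits (bit 0: the page was present, bit 1: the access was a write, bit 2: the access came from user mode). The function name decode_page_fault is hypothetical.

```python
def decode_page_fault(error_code):
    """Split an x86 page-fault error code into its three low flag bits."""
    return {
        "page_present": bool(error_code & 0x1),  # 0 means no page was mapped
        "write_access": bool(error_code & 0x2),  # 0 means the access was a read
        "user_mode": bool(error_code & 0x4),     # 1 means fault came from user space
    }


print(decode_page_fault(4))
```

For error 4 this describes a user-mode read of an unmapped page. Together with the fault address 0x1c, that is consistent with reading a field at a small offset from a NULL pointer, though the log line alone does not prove it.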

Comment 19 Matthew Wilkens 2007-07-24 18:11:26 UTC
Oh, and the recent gdm-2.18.3-1.fc7 update is applied, but doesn't fix the problem.

Comment 20 Michael Young 2007-08-08 12:13:53 UTC
I am seeing the vnc related issue, and did an strace of the gdm processes, and
the lines leading up to the segfault are
[pid 16228] open("/etc/gdm/modules/AccessDwellMouseEvents", O_RDONLY) = 15
[pid 16228] fstat64(15, {st_mode=S_IFREG|0644, st_size=2473, ...}) = 0
[pid 16228] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0xb7f5f000
[pid 16228] read(15, "# This is the configuration file"..., 4096) = 2473
[pid 16228] read(15, "", 4096)          = 0
[pid 16228] close(15)                   = 0
[pid 16228] munmap(0xb7f5f000, 4096)    = 0
[pid 16228] writev(3, [{"b\2\6\0\17\0\0\0", 8}, {"XInputExtension", 15}, {"\0",
1}], 3) = 24
[pid 16228] read(3,
"\1\225Q\0\0\0\0\0\0\0\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\30\t\215\277", 32) = 32
[pid 16228] write(2, "Xlib:  extension \"XInputExtensio"..., 71) = 71
[pid 16228] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 16228 detached

This led me to a workaround: in gdmsetup, under the Accessibility tab, deselect
the "Enable accessible login" check box (or equivalently add
GtkModuleList=
AddGtkModules=false
to the [daemon] section of /etc/gdm/custom.conf and restart gdm). 
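
For anyone scripting this workaround, here is a minimal Python sketch of setting those two keys. It works on a string only and never touches /etc/gdm/custom.conf itself. The helper name set_ini_key and the sample contents are hypothetical, and the sketch assumes the file uses plain [section] headers with key=value lines.

```python
def set_ini_key(text, section, key, value):
    """Return `text` with `key=value` set inside `[section]`.

    Works on a string only; nothing is read from or written to disk.
    """
    header = f"[{section}]"
    new_line = f"{key}={value}"
    out = []
    in_section = False
    seen_section = False
    written = False

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving the target section without having seen the key: add it.
            if in_section and not written:
                out.append(new_line)
                written = True
            in_section = stripped == header
            seen_section = seen_section or in_section
        elif in_section and stripped.split("=", 1)[0].strip() == key:
            # Replace the first existing assignment, drop later duplicates.
            if not written:
                out.append(new_line)
                written = True
            continue
        out.append(line)

    if not written:
        # Section missing entirely, or it was the last section in the file.
        if not seen_section:
            out.append(header)
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical starting contents, for illustration only.
sample = "[daemon]\nAddGtkModules=true\n\n[greeter]\n"
patched = set_ini_key(sample, "daemon", "GtkModuleList", "")
patched = set_ini_key(patched, "daemon", "AddGtkModules", "false")
print(patched)
```

A real script would still need to write the result back as root and restart gdm, as described above.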

Comment 21 Lorenzo Fiorini 2007-08-08 12:17:56 UTC
(In reply to comment #20)

> This led me to a workaround, in gdmsetup under the Accessibility tag deselect
> the Enable accessible login check box

This has worked for me.


Comment 22 Ray Strode [halfline] 2007-08-08 13:57:14 UTC
Ah interesting.  Thanks for debugging this.

Retitling for clarity.  We should just drop that module; it's broken in so many
ways (see bug 248752 and upstream bug 457998 for other examples).

Comment 23 William Kucharski 2007-08-08 14:41:58 UTC
This is NOT the same bug as the VNC issue, and I'm not sure how they got
collapsed, so I've changed the summary BACK.

Basically there's an issue with a large number of NIS password entries and the
"Include all users from /etc/passwd (not for NIS)" option in the Users tab of
"Login Window Preferences" being checked by default.  The fact that the option
is checked by default and is new to FC7 is what causes newly upgraded systems to
crash and burn on systems with large NIS password files.

See comment #5, above, for my workaround, and if you look at comment #16, you'll
see it's an ENTIRELY different failure mechanism in which the last entry of the
GdmUser linked list ends up pointing off into the weeds (an area of memory
containing strings; if you decode the "weird" pointers as ASCII characters you
get one ending with "o list h" and one ending with "ere...").
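
The ASCII decoding mentioned above can be reproduced with a short Python sketch. The values are copied from the gdb output in comment #16; the helper name pointer_as_ascii is hypothetical, and the byte order is assumed to be little-endian, as on x86 and x86_64.

```python
def pointer_as_ascii(value, width=8):
    """Reinterpret an integer as little-endian bytes and decode them as ASCII."""
    raw = value.to_bytes(width, "little")
    # Trailing NUL bytes are string terminators or unused high bytes.
    return raw.rstrip(b"\x00").decode("ascii")


# (field name, value printed by gdb, field width in bytes)
fields = [
    ("uid", 544173908, 4),
    ("login", 0x7420737265737520, 8),
    ("homedir", 0x68207473696C206F, 8),
    ("gecos", 0x2E2E2E657265, 8),
]

for name, value, width in fields:
    print(f"{name:8} {pointer_as_ascii(value, width)!r}")
```

The four fields decode to "Too ", " users t", "o list h" and "ere...". gdb does not show the four padding bytes between uid and login, so that part of the text is unknown, but the result is consistent with the last list node pointing at a plain string rather than at a GdmUser.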

Comment 24 Ray Strode [halfline] 2007-08-08 15:16:55 UTC
Hi William,

Sorry, I only read the bottom few comments of the bug report this morning.  I've
reopened bug 246399 to address the identified dwellmouselistener issue.

Comment 25 Chris Schanzle 2007-09-18 22:45:01 UTC
This NIS bug has been annoying me too.  Another workaround is to set
DISPLAYMANAGER=KDE in /etc/sysconfig/desktop.

Interestingly, the fix in comment #5, which sets IncludeAll=false under the
[greeter] section in /etc/gdm/custom.conf, does not prevent NIS accounts from
logging in (I now have a patch to apply to kickstart installs).

Our NIS password file is relatively small, but just over 1k:

$ ypcat passwd|wc -l
1030


Comment 26 Bug Zapper 2008-05-14 13:07:47 UTC
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 27 Bug Zapper 2008-06-17 01:36:48 UTC
Fedora 7 changed to end-of-life (EOL) status on June 13, 2008. 
Fedora 7 is no longer maintained, which means that it will not 
receive any further security or bug fix updates. As a result we 
are closing this bug. 

If you can reproduce this bug against a currently maintained version 
of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.