Bug 212303 - dbus segfaults in dbus-launch
dbus segfaults in dbus-launch
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: dbus (Show other bugs)
rawhide
x86_64 Linux
medium Severity medium
: ---
: ---
Assigned To: John (J5) Palmieri
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2006-10-26 01:25 EDT by Michal Jaegermann
Modified: 2013-03-13 00:51 EDT (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-11-04 15:35:05 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Michal Jaegermann 2006-10-26 01:25:16 EDT
Description of problem:

With dbus-0.94-1.fc I am collecting in logs entries like that:

kernel: dbus-launch[17636]: segfault at 0000000000000000 rip 0000003679c75510
rsp 00007fff65d38e88 error 4

Looking with gdb at dropped core I am getting the following:

Core was generated by `/usr/bin/dbus-launch --exit-with-session
/etc/X11/xinit/Xclients'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003679c75510 in strlen () from /lib64/libc.so.6
(gdb) where
#0  0x0000003679c75510 in strlen () from /lib64/libc.so.6
#1  0x00000000004037e6 in x11_init () at dbus-launch-x11.c:250
#2  0x000000000040280e in main (argc=3, argv=0x7fffe001f9c8)
    at dbus-launch.c:384
(gdb) f 1
#1  0x00000000004037e6 in x11_init () at dbus-launch-x11.c:250
250       atom_name = malloc (strlen (machine) + strlen (user_name) + 2 +
(gdb) l
245         }
246       user_name = xstrdup(user->pw_name);
247
248       machine = get_machine_uuid ();
249
250       atom_name = malloc (strlen (machine) + strlen (user_name) + 2 +
251                           MAX (strlen (selection_prefix),
252                                MAX (strlen (address_prefix),
253                                     strlen (pid_prefix))));
254       if (atom_name == NULL)
(gdb)

None of variables is, unfortunately, available to gdb.  Still one
wonders why strlen() is used, instead of sizeof, on selection_prefix,
address_prefix and pid_prefix when all these three are
static const char arrays?  Quite weird.  OTOH do we really now
that xstrdup(user->pw_name) and get_machine_uuid() are guaranteed
to succeed?

Too much is allocated for atom_name anyway but this should not be
an issue.
  
Version-Release number of selected component (if applicable):
dbus-0.94-1.fc7

How reproducible:
apparently at each session start
Comment 1 John (J5) Palmieri 2006-10-26 11:21:09 EDT
strlen works fine and is more consitent.  please run this command as root:

chmod 0644 /etc/dbus-1/machine-id

and run dbus-launch again.  Thanks.
Comment 2 Michal Jaegermann 2006-10-26 11:54:43 EDT
> strlen works fine and is more consitent.

And later Dave Jones writes articles "Why the user space sucks" and,
even worse, he does not imagine things. "Consistent" with what?
Sure, recalculating few constants, known in a compile time, on its own
does not change very much in the general picture.

Making /etc/dbus-1/machine-id generally readable does not, unfortunately,
help at all.  I got immediately after that twice

kernel: dbus-launch[2793]: segfault at 0000000000000000 rip 0000003679c75510 rsp
00007fffb33e46a8 error 4

and two shiny new core files.  Restarting 'messagebus' service and 'gdm'
also does not have any effect.

Later during the day, if still needed, I will try to find some time
to dig deeper into that.
Comment 3 John (J5) Palmieri 2006-10-26 12:27:18 EDT
I could just as well point you to Michael Abrash's Graphical Black Book which
talks about needless optimizations.  Let's spend our time fixing this bug
instead of waxing over useless nitpicking.

The only thing that looks like it can go wrong is the strlen over the machine id
or user_name.  To get a better backtrace I suggest downloading and installing
the srpm and putting export CFLAGS="-O0 -ggdb3" right before the %configure
step.  This looks like an x86_64 bug.  I have a box at home I will try to debug
with.  On my x86 box dbus-launch works fine.
Comment 4 Matthias Clasen 2006-10-26 13:59:23 EDT
the compiler should be well capable of turning strlen(<string literal>) into a
compile-time constant.
Comment 5 Michal Jaegermann 2006-10-26 18:15:04 EDT
Well, after dropping 'static' from 'int init_x_atoms(Display *display)',
so gdb can see variables in this function, I am getting 0x0 as a value
for 'machine' and 'strlen' on it has to bomb out.

But that would mean that 'save_machine_uuid()' from dbus-launch.c
was not running before and I am scratching my head why actually it should.

BTW - CFLAGS="-O0 -ggdb3" is not required.  gdb gets variables all right
once the function is no longer static.  Also to get a core setting ulimit
accordingly is not enough.  These cores are dumped in / and non-root process
will be not able to dump there so one has to log into a session as root.
Comment 6 John (J5) Palmieri 2006-10-26 18:39:03 EDT
there is a fix in dbus cvs.  can you run dbus-uuidgen --ensure as root.  It
should be reading the machine id from that file which should have been created
in the post.  We are moving to creating it in the init script.
Comment 7 Michal Jaegermann 2006-10-26 21:24:22 EDT
> can you run dbus-uuidgen --ensure as root.

I did and this did not change segfault at all.  Not a real surprise.
/etc/dbus-1/machine-id is in place and 'dbus-uuidgen --get' does not
have any issues with printing its content.
Comment 8 Michal Jaegermann 2006-10-29 00:59:37 EDT
This bomb happens in a forked process and I am not finding a method
to get there with 'gdb -batch ...' and I also do not see how to get
there with gdb interactive.  But with 'machine_uuid' "hardwired" in
dbus-launch.c to a value from /etc/dbus-1/machine-id there is no core
and I see a child process running.

With /usr/bin/dbus-lauch replaced with the following script

#!/bin/sh

commands=/var/tmp/gdbcoms

if echo $@ | /bin/grep -q exit-with-ses ; then
    ( DBUS_VERBOSE="Y" \
        /usr/bin/gdb -x $commands -batch \
            --args /usr/bin/dbus-launch.real ${@} ) >/var/tmp/dbus.log 2>&1
else
    exec  /usr/bin/dbus-launch.real ${@}
fi

regardless if I am getting a core or not I see the following when
session terminates:

Applications must not close shared connections - see dbus_connection_close()
docs. This is a bug in the application.
Applications must not close shared connections - see dbus_connection_close()
docs. This is a bug in the application.
Window manager warning: CurrentTime used to choose focus window; focus window
may not be correct.
Window manager warning: Got a request to focus the no_focus_window with a
timestamp of 0.  This shouldn't happen!


Comment 9 Jay Cliburn 2006-10-30 22:11:10 EST
John Palmieri, can you please clarify comment #6:  is an upstream fix headed
this way, and does that fix, insofar as you know, include something more than
just adjusting perms on the uuid file or simply rerunning dbus-uuidgen?

Thanks.
Comment 10 John (J5) Palmieri 2006-10-31 11:24:22 EST
yes, sorry.  I'll be doing a release sometime this week and will promptly put it
into rawhide.  The fix involves checking if the machine ID is NULL along with
changing the file perm and moving the file into /var/lib.  We will also be
generating the file if it does not exist on dbus startup insead of the %post.  
Comment 11 Michal Jaegermann 2006-10-31 11:32:26 EST
> The fix involves checking if the machine ID is NULL along with
> changing the file perm ...

AFAICT an existence of readable file and with a right content does
not change anything; unless an ID string is read with a trailing '\n'
and then it is 33 chars long while somewhere there is a check for
exactly 32.
Comment 12 John (J5) Palmieri 2006-11-03 18:09:45 EST
New D-Bus should show up in the compose tonight.  Make sure you restart the
system bus reboot or run dbus-uuidgen --ensure before running dbus-launch as the
machine-id location has moved.  HAL will have some issues as it contains some
bugs that the new D-Bus exposed.  HAL updates should come soon.
Comment 13 Michal Jaegermann 2006-11-04 15:35:05 EST
It looks like that dbus-0.95-1.fc7 does not coredump in dbus-launch
anymore.  OTOH see bug 214027.

I am closing this bug.  If this is premature for somebody then feel
free to reopen.

Note You need to log in before you can comment on or make changes to this bug.