Bug 216034

Summary: a bunch of my apps just crashed
Product: Red Hat Enterprise Linux 5 Reporter: Ray Strode [halfline] <rstrode>
Component: dbus-glib Assignee: David Zeuthen <davidz>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 5.0 CC: jkubin, johnp, mclasen, otaylor, sgrubb, wtogami
Target Milestone: --- Keywords: Desktop
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RC Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-02-08 00:31:41 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 217766    
Attachments:
Description Flags
possible patch none

Description Ray Strode [halfline] 2006-11-16 20:54:54 UTC
so I'm not exactly sure what I was doing, but I noticed that my autohide panel
wouldn't come down, so I switched to an open terminal and typed ps -ef to see this:

rstrode   5665     1  0 14:03 ?        00:00:00 /usr/libexec/gnome_segv gnome-at-properties 11 2.16.0
rstrode   5666  5078  0 14:03 ?        00:00:00 /usr/libexec/gnome_segv evolution-alarm-notify 11 2.8.0
rstrode   5667  4902  0 14:03 ?        00:00:00 /usr/libexec/gnome_segv gswitchit 11 0
rstrode   5668  4888  0 14:03 ?        00:00:00 /usr/libexec/gnome_segv gnome-power-manager 11 2.16.0
rstrode   5664     1  0 14:03 ?        00:00:00 /usr/libexec/gnome_segv gnome-panel 11 2.16.0
rstrode   5669  4733  0 14:03 ?        00:00:00 /usr/libexec/gnome_segv gnome-session 11 2.16.0
rstrode   5670  5665  0 14:03 ?        00:00:00 /usr/bin/bug-buddy --appname=gnome-at-properties --pid=5470 --package-ver=(null)
rstrode   5671  5668  0 14:03 ?        00:00:00 /usr/bin/bug-buddy --appname=gnome-power-manager --pid=4888 --package-ver=(null)
rstrode   5672  5666  0 14:03 ?        00:00:00 /usr/bin/bug-buddy --appname=evolution-alarm-notify --pid=5078 --package-ver=(null)
rstrode   5673  5664  0 14:03 ?        00:00:00 /usr/bin/bug-buddy --appname=gnome-panel --pid=4834 --package-ver=(null)
rstrode   5674  5667  0 14:03 ?        00:00:00 /usr/bin/bug-buddy --appname=gswitchit --pid=4902 --package-ver=(null)
rstrode   5675  5669  0 14:03 ?        00:00:00 /usr/bin/bug-buddy --appname=gnome-session --pid=4733 --package-ver=(null)


I tried to gdb attach to gnome-session and I got this:

#0  0x0084e402 in __kernel_vsyscall ()
No symbol table info available.
#1  0x4956ac93 in __waitpid_nocancel () from /lib/libpthread.so.0
No symbol table info available.
#2  0x45189cf6 in gnome_gtk_module_info_get () from /usr/lib/libgnomeui-2.so.0
No symbol table info available.
#3  <signal handler called>
No symbol table info available.
#4  0x450adf7c in find_name_in_info (a=0x0, b=0x8ebb84c) at dbus-gproxy.c:496
No locals.
#5  0x4982416e in g_slist_find_custom () from /lib/libglib-2.0.so.0
No symbol table info available.
#6  0x450b16cc in dbus_g_proxy_manager_filter (connection=0x8ebb178, 
    message=0x8ebb6b8, user_data=0x8ebd5e0) at dbus-gproxy.c:716
        name = 0x8ebb84c "org.gnome.YelpService"
        prev_owner = 0x8ebb868 ":1.22"
        new_owner = 0x8ebb874 ""
        derr = {name = 0x0, message = 0x0, dummy1 = 1, dummy2 = 0, dummy3 = 1, 
  dummy4 = 1, dummy5 = 1, padding1 = 0xbfcacc18}
        manager = <value optimized out>
        __PRETTY_FUNCTION__ = "dbus_g_proxy_manager_filter"
#7  0x45050f05 in dbus_connection_dispatch () from /lib/libdbus-1.so.3
No symbol table info available.
#8  0x450a9ddd in message_queue_dispatch (source=0x8ebcb48, callback=0, 
    user_data=0x0) at dbus-gmain.c:113
        connection = (DBusConnection *) 0x8ebb178
#9  0x4980c342 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
No symbol table info available.
#10 0x4980f31f in g_main_context_check () from /lib/libglib-2.0.so.0
No symbol table info available.
#11 0x4980f6c9 in g_main_loop_run () from /lib/libglib-2.0.so.0
No symbol table info available.
#12 0x49c22be4 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
No symbol table info available.
#13 0x08053d33 in g_cclosure_marshal_VOID__ENUM ()
No symbol table info available.
#14 0x08ebeef0 in ?? ()
No symbol table info available.
#15 0x08ebf1e8 in ?? ()
No symbol table info available.
#16 0x00000000 in ?? ()
No symbol table info available.
#0  0x0084e402 in __kernel_vsyscall ()


Looking at the code, I see this:

   │490     static gint                                                        │
   │491     find_name_in_info (gconstpointer a, gconstpointer b)               │
   │492     {                                                                  │
   │493       const DBusGProxyNameOwnerInfo *info = a;                         │
   │494       const char *name = b;                                            │
   │495                                                                        │
  >│496       return strcmp (info->name, name);                                │
   │497     }    

info is NULL, name is "org.gnome.YelpService"

I think some event happened ("NameOwnerChanged"?) which made all apps that use
dbus-glib crash at the same time.
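
For illustration only, here is a minimal sketch of a comparator that tolerates a NULL list element instead of dereferencing it. It is not the attached patch and it only hides the symptom: the real question is how a NULL element got into the list. NameOwnerInfo is a hypothetical stand-in for DBusGProxyNameOwnerInfo, and find_name_in_info_safe and demo_find_name are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DBusGProxyNameOwnerInfo; only the field used here. */
typedef struct {
    const char *name;
} NameOwnerInfo;

/* Same contract as a GCompareFunc: 0 means "match", non-zero means "keep looking".
 * A NULL element or NULL name is treated as a non-match instead of being
 * dereferenced. */
static int find_name_in_info_safe(const void *a, const void *b)
{
    const NameOwnerInfo *info = a;
    const char *name = b;

    if (info == NULL || info->name == NULL || name == NULL)
        return 1;
    return strcmp(info->name, name);
}

/* Usage: the NULL element seen in the backtrace is skipped, a real one matches. */
static void demo_find_name(void)
{
    NameOwnerInfo yelp = { "org.gnome.YelpService" };

    assert(find_name_in_info_safe(NULL, "org.gnome.YelpService") != 0);
    assert(find_name_in_info_safe(&yelp, "org.gnome.YelpService") == 0);
}
```

With the original comparator, the first call in demo_find_name would read info->name through a NULL pointer, which matches frame #4 above.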

Comment 1 Ray Strode [halfline] 2006-11-16 21:05:17 UTC
In fact, it looks like the backtrace is missing the
dbus_g_proxy_manager_replace_name_owner frame.

The crash is triggered by this block of code:

   │707       else                                                             │
   │708         {                                                              │
   │709           DBusGProxyNameOwnerInfo *info;                               │
   │710           GSList *link;                                                │
   │711                                                                        │
   │712           /* Name owner changed or deleted */                          │
   │713                                                                        │
   │714           names = g_hash_table_lookup (manager->owner_names, prev_owner│
   │715                                                                        │
  >│716           link = g_slist_find_custom (names, name, find_name_in_info); │
   │717                                                                        │
where name is "org.gnome.YelpService" and names looks bogus

(gdb) p *(struct _GSList *) 0x8ed8f48
$18 = {data = 0x0, next = 0x0}



Comment 2 Ray Strode [halfline] 2006-11-16 21:19:45 UTC
Also, note this function (dbus_g_proxy_manager_replace_name_owner) has another
code path that could conceivably add an element to the names list with null data:

   │716           link = g_slist_find_custom (names, name, find_name_in_info); │
   │717                                                                        │
   │718           info = NULL;                                                 │
   │719           if (link != NULL)                                            │
   │720             {                                                          │
                  (fill in info here ...)
   │727             }                                                          │
   │728                                                                        │
   │729           if (new_owner[0] == '\0')                                    │
   │730             {                                                          │
                   (...)
   │748             }                                                          │
   │749           else                                                         │
   │750             {                                                          │
   │751               insert_nameinfo (manager, new_owner, info);              │
   │752             }  

and insert_nameinfo does
   │563       names = g_slist_append (names, info);                            │

I have no idea whether it hit this code path earlier; I just noticed it while
snooping around.
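
To make that path concrete, here is a self-contained sketch that uses a tiny hand-rolled list rather than GLib. Node, list_append, list_append_nonnull, list_free and demo_null_append are all hypothetical names. list_append mimics g_slist_append in that it allocates a node even when the data pointer is NULL, which yields exactly the {data = 0x0, next = 0x0} element shown in comment 1. The guarded variant is one possible defence, not necessarily what the eventual fix does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal singly linked list, shaped like a GSList node. */
typedef struct Node {
    void *data;
    struct Node *next;
} Node;

/* Append at the tail; like g_slist_append, a node is created even for NULL data. */
static Node *list_append(Node *head, void *data)
{
    Node *node = malloc(sizeof *node);
    Node *tail;

    if (node == NULL)
        abort();
    node->data = data;
    node->next = NULL;
    if (head == NULL)
        return node;
    for (tail = head; tail->next != NULL; tail = tail->next)
        ;
    tail->next = node;
    return head;
}

/* Guarded variant: never store a NULL element. */
static Node *list_append_nonnull(Node *head, void *data)
{
    return data != NULL ? list_append(head, data) : head;
}

static void list_free(Node *head)
{
    while (head != NULL) {
        Node *next = head->next;
        free(head);
        head = next;
    }
}

/* Usage: appending NULL info to an empty list gives a one-element bogus list. */
static void demo_null_append(void)
{
    Node *names = list_append(NULL, NULL);

    assert(names != NULL && names->data == NULL && names->next == NULL);
    list_free(names);
    assert(list_append_nonnull(NULL, NULL) == NULL);
}
```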


Comment 3 Ray Strode [halfline] 2006-11-16 21:24:42 UTC
*** Bug 215952 has been marked as a duplicate of this bug. ***

Comment 4 RHEL Program Management 2006-11-16 21:40:34 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 5 Matthias Clasen 2006-11-28 05:41:40 UTC
It also looks to me like info might be leaked in that function you cite:


Here we remove info from the owner_names table:


      if (link != NULL)
        {
          info = link->data;

          names = g_slist_delete_link (names, link);

          if (names == NULL)
            g_hash_table_remove (manager->owner_names, prev_owner);
        }


And here we do nothing with it when new_owner is empty:

      if (new_owner[0] == '\0')
        {
          DBusGProxyUnassociateData data;
          GSList *tmp;

          data.name = name;
          data.destroyed = NULL;

          /* A service went away, we need to unassociate proxies */
          g_hash_table_foreach (manager->proxy_lists,
                                unassociate_proxies, &data);

          UNLOCK_MANAGER (manager);

          for (tmp = data.destroyed; tmp; tmp = tmp->next)
            dbus_g_proxy_destroy (tmp->data);
          g_slist_free (data.destroyed);

          LOCK_MANAGER (manager);
        }
 
I think info may need freeing in that case ?
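
A rough sketch of the ownership rule being suggested, assuming the info removed from the old owner's list must either move to the new owner or be freed. This is not the attached patch. NameOwnerInfo, info_new, info_free, settle_removed_info and the live_infos counter are hypothetical and exist only to make the leak observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for DBusGProxyNameOwnerInfo. */
typedef struct {
    char *name;
} NameOwnerInfo;

/* Counts live allocations so a leak shows up as a non-zero value. */
static int live_infos = 0;

static NameOwnerInfo *info_new(const char *name)
{
    size_t len = strlen(name) + 1;
    NameOwnerInfo *info = malloc(sizeof *info);

    if (info == NULL)
        abort();
    info->name = malloc(len);
    if (info->name == NULL)
        abort();
    memcpy(info->name, name, len);
    live_infos++;
    return info;
}

static void info_free(NameOwnerInfo *info)
{
    if (info == NULL)
        return;
    free(info->name);
    free(info);
    live_infos--;
}

/* Call after info has been unlinked from the previous owner's list.
 * Empty new_owner: the name went away, so free info and return NULL.
 * Otherwise: return info so the caller can file it under new_owner. */
static NameOwnerInfo *settle_removed_info(NameOwnerInfo *info, const char *new_owner)
{
    assert(new_owner != NULL);
    if (new_owner[0] == '\0') {
        info_free(info);
        return NULL;
    }
    return info;
}
```

Because a NULL result means there is nothing to re-insert, the same return value also keeps a NULL info from being handed to insert_nameinfo.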

Comment 6 Matthias Clasen 2006-11-28 06:04:51 UTC
Created attachment 142259 [details]
possible patch

Here is an untested patch that addresses both issues.
Does that look reasonable, John ?

Comment 7 Matthias Clasen 2006-11-28 17:26:12 UTC
John says the patch looks good, and there are very similar patches in upstream bugzilla.

Comment 8 David Zeuthen 2006-11-28 23:56:36 UTC
Thanks Matthias, I've included this patch

 * Tue Nov 28 2006 David Zeuthen <davidz> - 0.70-5
 - Add dbus-glib-0.70-fix-info-leak.patch
 - Resolves: #216034

Package is partially built in Brew (waiting on s390x).

Ray, can you verify that this package works? Thanks.

Comment 9 Ray Strode [halfline] 2006-11-29 00:03:03 UTC
I don't have a reliable way to reproduce the problem and it's only happened to
me a few times, unfortunately.

Comment 10 Ray Strode [halfline] 2006-11-29 00:04:16 UTC
I wonder if we could write a test case that just takes control of a bus name and
releases it over and over again.

Comment 11 Owen Taylor 2006-12-06 20:41:50 UTC
You don't want to take control and release it; instead you want to
go from one owner to another owner. Calling:

 mugshot --replace

a few times (it won't quit the old one with mugshot < 1.1.27, but it
still replaces the D-Bus name) should take down your session pretty reliably.

According to a conversation that Havoc and I had, the patch in the
upstream bug report is most likely more correct than the one here ...
the proxy code in D-Bus should simply not care about changes in the
name owner for names it doesn't have a proxy for, so inserting a
dummy info entry doesn't make sense. (We didn't review it line by
line.)
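
As a sketch of that idea only (this is not the upstream patch, which wasn't reviewed here line by line), the handler can look the name up first and return early when no proxy tracks it, so nothing is ever inserted for an unknown name. NameTable, TrackedName, table_find, table_track and table_owner_changed are hypothetical names, and the fixed-size array stands in for the manager's hash tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRACKED 8
#define OWNER_LEN 32

/* Hypothetical record: a well-known name some proxy cares about. */
typedef struct {
    const char *name;        /* caller keeps this string alive */
    char owner[OWNER_LEN];   /* current unique owner, "" if none */
} TrackedName;

typedef struct {
    TrackedName entries[MAX_TRACKED];
    size_t count;
} NameTable;

static TrackedName *table_find(NameTable *table, const char *name)
{
    size_t i;

    for (i = 0; i < table->count; i++)
        if (strcmp(table->entries[i].name, name) == 0)
            return &table->entries[i];
    return NULL;
}

/* Start tracking a name; returns 1 on success, 0 if full or already tracked. */
static int table_track(NameTable *table, const char *name, const char *owner)
{
    assert(name != NULL && owner != NULL);
    if (table->count >= MAX_TRACKED || table_find(table, name) != NULL)
        return 0;
    table->entries[table->count].name = name;
    snprintf(table->entries[table->count].owner, OWNER_LEN, "%s", owner);
    table->count++;
    return 1;
}

/* Owner-change handler: returns 1 if applied, 0 if the name is not tracked.
 * Untracked names are ignored, so no dummy entry is ever created. */
static int table_owner_changed(NameTable *table, const char *name, const char *new_owner)
{
    TrackedName *entry = table_find(table, name);

    if (entry == NULL)
        return 0;
    snprintf(entry->owner, OWNER_LEN, "%s", new_owner);
    return 1;
}
```

In this model an owner change for org.gnome.YelpService in a process with no proxy for it is a no-op, which is the behavior described above.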

Comment 12 RHEL Program Management 2007-02-08 00:31:41 UTC
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.