Bug 224552 - [workaround available] crash on accessing new emails in evo
Summary: [workaround available] crash on accessing new emails in evo
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: evolution
Version: rawhide
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Matthew Barnes
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: FC7Blocker
Reported: 2007-01-26 11:15 UTC by Caolan McNamara
Modified: 2007-11-30 22:11 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2007-04-03 11:27:26 UTC
Type: ---
Embargoed:


Attachments
patch to workaround crash (11.92 KB, patch)
2007-01-30 12:11 UTC, Caolan McNamara


Links
System ID Private Priority Status Summary Last Updated
GNOME Bugzilla 330728 0 None None None Never

Description Caolan McNamara 2007-01-26 11:15:58 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.1) Gecko/20070118 Firefox/2.0.0.1

Description of problem:
Thread 592 (Thread -1254499440 (LWP 32450)):
#0  0x006ff402 in __kernel_vsyscall ()
#1  0x00b9d34c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x0089dfa3 in ?? () from /lib/libgthread-2.0.so.0
#3  0x00cf6511 in ?? () from /lib/libglib-2.0.so.0
#4  0x00d32e63 in ?? () from /lib/libglib-2.0.so.0
#5  0x00d3129f in ?? () from /lib/libglib-2.0.so.0
#6  0x00b992db in start_thread () from /lib/libpthread.so.0
#7  0x009a597e in clone () from /lib/libc.so.6

Thread 49 (Thread -1265390704 (LWP 30023)):
#0  0x006ff402 in __kernel_vsyscall ()
#1  0x0099bd33 in poll () from /lib/libc.so.6
#2  0x00d16453 in ?? () from /lib/libglib-2.0.so.0
#3  0x00d167c9 in g_main_loop_run () from /lib/libglib-2.0.so.0
#4  0x005be4e0 in ?? () from /usr/lib/libORBit-2.so.0
#5  0x00d3129f in ?? () from /lib/libglib-2.0.so.0
#6  0x00b992db in start_thread () from /lib/libpthread.so.0
#7  0x009a597e in clone () from /lib/libc.so.6

Thread 5 (Thread -1244009584 (LWP 27613)):
#0  0x006ff402 in __kernel_vsyscall ()
#1  0x0099bd33 in poll () from /lib/libc.so.6
#2  0x00d16453 in ?? () from /lib/libglib-2.0.so.0
#3  0x00d167c9 in g_main_loop_run () from /lib/libglib-2.0.so.0
#4  0x0010a544 in ?? () from /usr/lib/libnm_glib.so.0
#5  0x00d3129f in ?? () from /lib/libglib-2.0.so.0
#6  0x00b992db in start_thread () from /lib/libpthread.so.0
#7  0x009a597e in clone () from /lib/libc.so.6

Thread 1 (Thread -1208760624 (LWP 27575)):
#0  0x007f6a2e in ect_check (a11y=<value optimized out>) at gal-a11y-e-cell-text.c:69
#1  0x007f7c14 in ect_get_name (a11y=0xa63ca20) at gal-a11y-e-cell-text.c:81
#2  0x00571ab9 in atk_object_get_name () from /usr/lib/libatk-1.0.so.0
#3  0x0038ea0b in ?? () from /usr/lib/gtk-2.0/modules/libatk-bridge.so
#4  0x0038fa1c in ?? () from /usr/lib/gtk-2.0/modules/libatk-bridge.so
#5  0x00dbf20e in ?? () from /lib/libgobject-2.0.so.0
#6  0x00dc0957 in g_signal_emit_valist () from /lib/libgobject-2.0.so.0
#7  0x00dc0b19 in g_signal_emit () from /lib/libgobject-2.0.so.0
#8  0x00570c98 in atk_object_notify_state_change () from /usr/lib/libatk-1.0.so.0
#9  0x007f6146 in gal_a11y_e_cell_add_state (cell=0xa63ca20, state_type=ATK_STATE_EXPANDED, emit_signal=1)
    at gal-a11y-e-cell.c:496
#10 0x007f7f44 in ectr_model_row_changed_cb (etm=0xa34eb10, row=3, a11y=0xa63ca20) at gal-a11y-e-cell-tree.c:46
#11 0x00dbbe49 in g_cclosure_marshal_VOID () from /lib/libgobject-2.0.so.0
#12 0x00daed9b in g_closure_invoke () from /lib/libgobject-2.0.so.0
#13 0x00dbf433 in ?? () from /lib/libgobject-2.0.so.0
#14 0x00dc0957 in g_signal_emit_valist () from /lib/libgobject-2.0.so.0
#15 0x00dc0b19 in g_signal_emit () from /lib/libgobject-2.0.so.0
#16 0x072e9b55 in e_table_model_row_changed (e_table_model=0xa34eb10, row=3) at e-table-model.c:487
#17 0x0730515b in etta_proxy_node_data_changed (etm=0xa0212d8, path=0xa67fea0, etta=0xa34eb10) at e-tree-table-adapter.c:777
#18 0x00dbb6b9 in g_cclosure_marshal_VOID__POINTER () from /lib/libgobject-2.0.so.0
#19 0x00daed9b in g_closure_invoke () from /lib/libgobject-2.0.so.0
#20 0x00dbf433 in ?? () from /lib/libgobject-2.0.so.0
#21 0x00dc0957 in g_signal_emit_valist () from /lib/libgobject-2.0.so.0
#22 0x00dc0b19 in g_signal_emit () from /lib/libgobject-2.0.so.0
#23 0x072ff667 in e_tree_model_node_data_changed (tree_model=0xa0212d8, node=0xa67fea0) at e-tree-model.c:279
#24 0x011b3fc6 in main_folder_changed (o=0xabf9558, event_data=0xa929a58, user_data=0xa345800) at message-list.c:2916
#25 0x011a0603 in do_async_event (mm=0xb0fe4f0) at mail-mt.c:626
#26 0x011a2252 in idle_async_event (mm=0xb0fe4f0) at mail-mt.c:637
#27 0x00d116e1 in ?? () from /lib/libglib-2.0.so.0
#28 0x00d13442 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
#29 0x00d1641f in ?? () from /lib/libglib-2.0.so.0
#30 0x00d167c9 in g_main_loop_run () from /lib/libglib-2.0.so.0
#31 0x0041b7c3 in bonobo_main () from /usr/lib/libbonobo-2.so.0
#32 0x0805d9bc in main (argc=2, argv=0xbfc3a434) at main.c:611
#33 0x008ede90 in __libc_start_main () from /lib/libc.so.6
#34 0x0804f381 in _start ()


Version-Release number of selected component (if applicable):
evolution-2.9.5-2.fc7

How reproducible:
Sometimes


Steps to Reproduce:
1. Enable threaded view (group by thread) with the newest mails at the top.
2. Move between folders and click on the top entries; this often crashes.
3. Unfortunately, after restarting evo, clicking on the same entries works. So it's a frequent crash, but I can't quite determine the common cause.

Actual Results:
 51      static gboolean
 52      ect_check (gpointer a11y)
 53      {
 54              GalA11yECell *gaec = GAL_A11Y_E_CELL (a11y);
 55              ETableItem *item = gaec->item;
 56
 57              g_return_val_if_fail ((gaec->item != NULL), FALSE);
 58              g_return_val_if_fail ((gaec->cell_view != NULL), FALSE);
 59              g_return_val_if_fail ((gaec->cell_view->ecell != NULL), FALSE);
 60
 61              if (atk_state_set_contains_state (gaec->state_set, ATK_STATE_DEFUNCT))
 62                      return FALSE;
 63
 64              if (gaec->row < 0 || gaec->row >= item->rows
 65                      || gaec->view_col <0 || gaec->view_col >= item->cols
 66                      || gaec->model_col <0 || gaec->model_col >= e_table_model_column_count (item->table_model))
 67                      return FALSE;
 68
>69              if (!E_IS_CELL_TEXT (gaec->cell_view->ecell))
 70                      return FALSE;
 71
 72              return TRUE;
 73      }

(gdb) print *a11y 
$3 = {parent = {g_type_instance = {g_class = 0xab4fcd0}, ref_count = 6, qdata = 0xaa8e870}, description = 0x0,
  name = 0xaeabcf0 "Any plans to add support for new features of laptops targeting Vista?", accessible_parent = 0x0,
  role = ATK_ROLE_TABLE_CELL, relation_set = 0xaf88e30, layer = ATK_LAYER_INVALID}

The rest of the variables are optimized out. I'll rebuild evo with less optimization, wait for it to happen again, and try to see what's going on.

Expected Results:


Additional info:

Comment 1 Matthew Barnes 2007-01-26 12:15:13 UTC
Thanks for the bug report.  Evolution seems to be pretty unstable when
accessibility is turned on.  I've had numerous reports of similar crashes.  Any
additional information you can find about the source of the crash would be most
appreciated.

Comment 2 Caolan McNamara 2007-01-26 17:19:06 UTC
Hmm, rebuilt with -O0 and ran in gdb to get those optimized out variables. And
of course no crash all day. So I rebuilt with defaults again and this time
looked at the build log. 

There are plenty of "dereferencing type-punned pointer will break strict-aliasing
rules" warnings, and that warning really does need to be taken seriously. I
strongly suggest something like...

--- evolution.spec.orig 2007-01-26 15:53:54.000000000 +0000
+++ evolution.spec      2007-01-26 15:54:57.000000000 +0000
@@ -412,7 +412,7 @@
   %ldap_flags %pilot_flags %krb5_flags %nntp_flags %ssl_flags %exchange_flags \
   --enable-plugins=all
 export tagname=CC
-make %{?_smp_mflags} LIBTOOL=/usr/bin/libtool CFLAGS="$CFLAGS -UGNOME_DISABLE_DEPRECATED"
+make %{?_smp_mflags} LIBTOOL=/usr/bin/libtool CFLAGS="$CFLAGS -UGNOME_DISABLE_DEPRECATED -fno-strict-aliasing"

 %install
 rm -rf $RPM_BUILD_ROOT

for all the supported evolution versions, and I suspect you can lie back and
watch the good reliability times roll.
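
For readers unfamiliar with the warning, here is a minimal sketch of what "type-punned pointer" means and one way to avoid it. It is not taken from the Evolution sources; the function name float_bits is made up for illustration, and the example assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/*
 * The pattern the warning is about looks like this:
 *
 *     uint32_t bits = *(uint32_t *)&value;
 *
 * It reads a float object through a uint32_t lvalue. With strict
 * aliasing enabled (the default at -O2 in GCC), the compiler may assume
 * the two types never alias and reorder or drop accesses.
 *
 * Copying the bytes with memcpy is well defined regardless of the
 * aliasing mode, and compilers typically turn it into a single move.
 */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Building with -fno-strict-aliasing, as in the spec change above, tells GCC not to make the aliasing assumption at all, so existing punned code keeps the behaviour its authors expected without having to be rewritten.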

Comment 3 Matthew Barnes 2007-01-26 18:08:21 UTC
Sounds good to me!  I could use some relief from the constant crash reports. 
Would adding that flag be enough to close this bug, assuming no further crashes
are observed?

Comment 4 Caolan McNamara 2007-01-30 12:10:39 UTC
a) we should add the flag:
http://www.cellperformance.com/mike_acton/2006/06/understanding_strict_aliasing.html
is a good explanation of it FWIW

b) but it doesn't address the a11y problem. Attached is a workaround for my
problem. There seem to be some lifecycle/ownership problems in the a11y
cell/subcell stuff. I sadly know little about evo, so I can't tell what the
correct fix is, but the following patch adds some debugging scaffolding to alert
when the cell view that one of the a11y structs depends on to function disappears
on it without it knowing, and leaps in to save us from a fatal crash. So the
patch fixes the crash and hopefully might enable someone who knows the evo a11y
code to have a lightbulb moment.
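
The attached patch itself is not reproduced in this thread. As a rough illustration of the general idea only (a dependent struct being told when the object it points at goes away, and checking before dereferencing), here is a self-contained sketch. All names in it (CellView, A11yCell, cell_view_watch, cell_view_teardown, a11y_cell_check) are hypothetical and do not come from the Evolution or GAL sources.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WATCHERS 4

/* Hypothetical stand-in for the cell view an a11y cell depends on. */
typedef struct CellView {
    bool is_text;
    struct CellView **watchers[MAX_WATCHERS]; /* pointers to clear on teardown */
    size_t n_watchers;
} CellView;

/* Hypothetical stand-in for the a11y struct holding a borrowed pointer. */
typedef struct A11yCell {
    CellView *cell_view; /* set to NULL when the view is torn down */
} A11yCell;

/* Register a pointer slot that must be cleared when the view goes away. */
bool cell_view_watch(CellView *view, CellView **slot)
{
    if (view->n_watchers == MAX_WATCHERS)
        return false;
    view->watchers[view->n_watchers++] = slot;
    return true;
}

/* Tear the view down: every registered slot is reset to NULL first. */
void cell_view_teardown(CellView *view)
{
    for (size_t i = 0; i < view->n_watchers; i++)
        *view->watchers[i] = NULL;
    view->n_watchers = 0;
}

/* Defensive check: refuse to touch a view that has already disappeared. */
bool a11y_cell_check(const A11yCell *cell)
{
    if (cell == NULL || cell->cell_view == NULL)
        return false;
    return cell->cell_view->is_text;
}
```

In this sketch the a11y side registers its cell_view field with cell_view_watch, and a later a11y_cell_check returns false instead of crashing once cell_view_teardown has run. For real GObject instances, GLib's g_object_add_weak_pointer provides the same kind of notification; whether that fits the cell view in question is something only someone who knows the a11y code can judge.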

I seem to see the problem when I move between threaded mail folders where I
always have newest mails at the top and "always keep threads together" enabled.
The problem might be when I've deleted new mail at the top of one folder and
move to another one and delete new mail at the top of another folder.

Comment 5 Caolan McNamara 2007-01-30 12:11:38 UTC
Created attachment 146911
patch to workaround crash

Comment 6 Matthew Barnes 2007-04-03 00:34:30 UTC
Caolan, how's this patch working out for you?

It looks like it got committed upstream (finally) so it should be in Rawhide
now.  I'm going to enable accessibility until F7 is out and try to get any
subsequent a11y-related crashes fixed up (which I should've been doing already).

Comment 7 Caolan McNamara 2007-04-03 07:19:23 UTC
Yeah, works like a dream for me. Of course it's still not the correct fix; someone
who knows what evo's a11y is trying to do would have to determine what the
lifecycle should be, but it works fine in avoiding the cruel punishment of
blowing up. So if this is now in rawhide evo it can be closed.

Comment 8 Matthew Barnes 2007-04-03 11:27:26 UTC
Okay, closing this then.  That someone is probably me since I've apparently been
placed in charge of the old GAL library that Evolution assimilated. Probably
because no one else upstream wants to deal with it.  Lucky me.

Anyway, glad to hear the patch is holding things together for now.

