Bug 757478

Summary: segfault in dbus_pending_call_cancel - called from gkr-operation.c:operation_unref()
Product: [Fedora] Fedora
Reporter: Sam Tygier <samtygier>
Component: gnome-shell
Assignee: Owen Taylor <otaylor>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 16
CC: alessandromachado.vip, ameya.gore, andremontes, artemio.silva, browning48ky, bunzon, Fivr, frasirchan, igauravroy, laurent.boualit, marco.capile, maxamillion, midnightsteel, moromario, nocountryman, otaylor, rstrode, samkraju, stefw, stuart, twas6263, walters
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Unspecified   
Whiteboard: abrt_hash:4507e1b7669abd85d5556e96fd9dd3b5f3720bc8
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-13 20:06:49 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description              Flags
File: dso_list           none
File: build_ids          none
File: maps               none
File: smolt_data         none
File: xsession_errors    none
File: var_log_messages   none
File: backtrace          none
File: backtrace          none
File: backtrace          none
File: backtrace          none
File: backtrace          none
File: backtrace          none

Description Sam Tygier 2011-11-27 12:06:59 UTC
libreport version: 2.0.7
abrt_version:   2.0.6
backtrace_rating: 4
cmdline:        /usr/bin/gnome-shell
comment:        I was not at the computer at the time. When I returned there were about 100 NetworkManager dialogs asking for a WEP key (wireless goes a bit screwy sometimes).
crash_function: dbus_pending_call_cancel
executable:     /usr/bin/gnome-shell
kernel:         3.1.1-2.fc16.i686.PAE
pid:            17916
pwd:            /home/sam
reason:         Process /usr/bin/gnome-shell was killed by signal 11 (SIGSEGV)
time:           Sat 26 Nov 2011 10:57:29 PM GMT
uid:            1000
username:       sam

backtrace:      Text file, 24466 bytes
build_ids:      Text file, 6847 bytes
dso_list:       Text file, 20517 bytes
maps:           Text file, 51899 bytes
smolt_data:     Text file, 2815 bytes
var_log_messages: Text file, 18781 bytes
xsession_errors: Text file, 40686 bytes

environ:
:XDG_VTNR=2
:XDG_SESSION_ID=19
:HOSTNAME=hydrogen
:IMSETTINGS_INTEGRATE_DESKTOP=yes
:SHELL=/bin/bash
:TERM=dumb
:HISTSIZE=1000
:XDG_SESSION_COOKIE=0ba19f208431e5ed2f2505570000000b-1322233138.806687-137427225
:GNOME_KEYRING_CONTROL=/tmp/keyring-u5U0Yv
:IMSETTINGS_MODULE=none
:HISTFILESIZE=10000
:USER=sam
:USERNAME=sam
:MAIL=/var/spool/mail/sam
:PATH=/home/sam/zgoubi/install/bin/:/usr/lib/ccache:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/sam/bin:/usr/local/texlive/2011/bin/i386-linux:/home/sam/.local/bin:/home/sam/bin
:DESKTOP_SESSION=gnome
:QT_IM_MODULE=xim
:PWD=/home/sam
:XMODIFIERS=@im=none
:GNOME_KEYRING_PID=16970
:LANG=en_US.utf8
:GDM_LANG=en_US.utf8
:GDMSESSION=gnome
:SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
:HISTCONTROL=ignoredups
:HOME=/home/sam
:XDG_SEAT=seat0
:SHLVL=1
:FLUPRO=/var/tmp/sam/install/FLUKA32
:LOGNAME=sam
:CVS_RSH=ssh
:DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-BYzt1ZDWED,guid=5d748ce397f0939c7c3006690000a99b
:'LESSOPEN=||/usr/bin/lesspipe.sh %s'
:WINDOWPATH=2
:XDG_RUNTIME_DIR=/run/user/sam
:DISPLAY=:0
:XAUTHORITY=/var/run/gdm/auth-for-sam-VcedGW/database
:_=/usr/bin/gnome-session
:GNOME_DESKTOP_SESSION_ID=this-is-deprecated
:SESSION_MANAGER=local/unix:@/tmp/.ICE-unix/16977,unix/unix:/tmp/.ICE-unix/16977
:SSH_AUTH_SOCK=/tmp/keyring-u5U0Yv/ssh
:GPG_AGENT_INFO=/tmp/keyring-u5U0Yv/gpg:0:1
:DESKTOP_AUTOSTART_ID=1037639b78bc9a95f132223314013455200000169770000

Comment 1 Sam Tygier 2011-11-27 12:07:05 UTC
Created attachment 537064 [details]
File: dso_list

Comment 2 Sam Tygier 2011-11-27 12:07:06 UTC
Created attachment 537065 [details]
File: build_ids

Comment 3 Sam Tygier 2011-11-27 12:07:08 UTC
Created attachment 537066 [details]
File: maps

Comment 4 Sam Tygier 2011-11-27 12:07:10 UTC
Created attachment 537067 [details]
File: smolt_data

Comment 5 Sam Tygier 2011-11-27 12:07:12 UTC
Created attachment 537068 [details]
File: xsession_errors

Comment 6 Sam Tygier 2011-11-27 12:07:14 UTC
Created attachment 537069 [details]
File: var_log_messages

Comment 7 Sam Tygier 2011-11-27 12:07:17 UTC
Created attachment 537070 [details]
File: backtrace

Comment 8 Owen Taylor 2012-01-10 21:59:10 UTC
*** Bug 758898 has been marked as a duplicate of this bug. ***

Comment 9 Owen Taylor 2012-01-10 21:59:17 UTC
*** Bug 760701 has been marked as a duplicate of this bug. ***

Comment 10 Owen Taylor 2012-01-10 21:59:23 UTC
*** Bug 760712 has been marked as a duplicate of this bug. ***

Comment 11 Stef Walter 2012-01-11 17:41:17 UTC
One possible cause might be that dbus threading should be initialized in gnome-shell. Since libgnome-keyring does this (if not already done) there may be issues with threading being swapped out while dbus calls are active.

But the more I look at this, it seems related to bug #773269.

It's likely this is the same code path failing, just doing it at a later point (after on_complete_later() is invoked and during cleanup of the GSource, which calls operation_unref()).

Would need to reproduce in order to identify the source of the problem better. Obviously memory corruption is a source of suspicion for both these bugs because the stacks just don't make sense.

Comment 12 Ray Strode [halfline] 2012-01-11 19:41:48 UTC
The backtrace and code make it seem like the operation is already freed.

I notice a lot of the API in gnome-keyring.c ends with gkr_operation_pending_and_unref(), which returns an operation that can be cancelled with gnome_keyring_cancel_request(), which in turn calls gkr_operation_complete_later(). This latter function does the g_idle_add_full() that triggers the on_complete_later() shown in the backtrace. The source id of the g_idle_add_full() call is never recorded, so I believe that if multiple calls to gkr_operation_complete_later() are made on the same request, multiple idles will be dispatched. Each one refs the operation, though. So for this bug to hit, some code would have to 1) cause gkr_operation_complete_later() to get called more than once in quick succession, and 2) there would have to be an unref bug.

A quick grep in the gnome-shell code shows shell-network-agent.c does a gnome_keyring_find_itemsv call that calls get_secrets_keyring_cb on completion.  The bottom of that function does g_hash_table_remove which calls gnome_keyring_cancel_request.  So the shell code cancels the request immediately after getting a response.  That might satisfy 1) (not sure, didn't keep digging), but it's not immediately clear how 2) would be happening regardless.
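
To make conditions 1) and 2) above concrete, here is a small plain-C model of the reference counting. It is only a sketch under assumptions: `Op`, `IdleQueue`, `complete_later()` and `dispatch_idles()` are hypothetical stand-ins, not the libgnome-keyring or GLib code, and the "free" is simulated with a flag so the example stays well defined.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDLES 8

/* Hypothetical stand-in for the operation object. */
typedef struct {
    int refs;     /* reference count */
    bool freed;   /* set instead of really freeing, to keep the model defined */
} Op;

/* Hypothetical stand-in for the main loop's list of pending idle sources. */
typedef struct {
    Op *pending[MAX_IDLES];
    int count;
} IdleQueue;

void op_unref(Op *op)
{
    assert(op->refs > 0);
    if (--op->refs == 0)
        op->freed = true;
}

/* Each scheduled idle takes its own reference. Because no source id is
 * remembered, nothing prevents a second idle for the same operation. */
void complete_later(IdleQueue *q, Op *op)
{
    assert(q->count < MAX_IDLES);
    op->refs++;
    q->pending[q->count++] = op;
}

/* Runs every pending idle and drops its reference. Returns how many idles
 * found the operation already freed (the use-after-free case). */
int dispatch_idles(IdleQueue *q)
{
    int stale = 0;
    for (int i = 0; i < q->count; i++) {
        Op *op = q->pending[i];
        if (op->freed) {
            stale++;
            continue;
        }
        op_unref(op);
    }
    q->count = 0;
    return stale;
}

/* Two idles are queued (condition 1), the caller drops its own reference,
 * and extra_unrefs models an unref bug (condition 2). */
int run_scenario(int extra_unrefs)
{
    Op op = { .refs = 1, .freed = false };
    IdleQueue q = { .count = 0 };

    complete_later(&q, &op);
    complete_later(&q, &op);
    op_unref(&op);                      /* caller's legitimate unref */
    for (int i = 0; i < extra_unrefs; i++)
        op_unref(&op);                  /* the hypothetical bug */
    return dispatch_idles(&q);
}
```

In this model `run_scenario(0)` leaves every idle with a live operation, while `run_scenario(1)` makes the second idle run against an operation that has already been released. That matches the reasoning that double scheduling alone is harmless and an extra unref is also needed.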

Comment 13 Stef Walter 2012-01-11 20:08:35 UTC
(In reply to comment #12)
Your analysis is correct, but I've looked over the unref code and can't see anything wrong that would cause a double free. In addition, we should be seeing malloc errors when freeing memory.

> A quick grep in the gnome-shell code shows shell-network-agent.c does a
> gnome_keyring_find_itemsv call that calls get_secrets_keyring_cb on completion.
>  The bottom of that function does g_hash_table_remove which calls
> gnome_keyring_cancel_request.  So the shell code cancels the request
> immediately after getting a response.  That might satisfy 1) (not sure, didn't
> keep digging), but it's not immediately clear how 2) would be happening
> regardless.

Well, get_secrets_keyring_cb sets closure->keyring_op to NULL right at the top, which would keep gnome_keyring_cancel_request() from being called in cases where the gnome_keyring_find_itemsv() operation completes.
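
The guard described here can be sketched in plain C. This is an illustration only: `Request`, `Closure`, `cancel_request()`, `closure_free()` and `on_find_items_done()` are hypothetical names, and the NULL check in the destroy function is an assumption about how shell-network-agent.c behaves, not a quote from it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the keyring request handle. */
typedef struct {
    int cancelled;   /* counts how often the request was cancelled */
} Request;

/* Hypothetical stand-in for the per-request closure. */
typedef struct {
    Request *keyring_op;
} Closure;

void cancel_request(Request *req)
{
    req->cancelled++;
}

/* Destroy function, as would run when the closure leaves the hash table.
 * Assumption: it only cancels when a request is still outstanding. */
void closure_free(Closure *closure)
{
    if (closure->keyring_op != NULL)
        cancel_request(closure->keyring_op);
    closure->keyring_op = NULL;
}

/* Completion callback: clears the handle first, then tears down, so the
 * destroy function sees NULL and skips the cancel. */
void on_find_items_done(Closure *closure)
{
    closure->keyring_op = NULL;
    closure_free(closure);
}
```

With this shape, a closure torn down from its own completion callback never cancels the request, while a closure removed with the request still in flight cancels it exactly once.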

Even if the request currently in a callback was being cancelled, the gkr_operation_set_result() conditional in gkr_operation_complete() and gkr_operation_complete_later() prevents the request from 'completing' twice.

It's a shame it's using slices for GkrOperation, if we were using malloc memory we could rule out or confirm double free easily.

Comment 14 Ray Strode [halfline] 2012-01-11 20:46:25 UTC
note it doesn't have to be a double free(), just use-after-free().

on_complete_later has

if (!g_queue_is_empty (&op->callbacks)) on_complete (op)

on_complete has

cb = g_queue_pop_tail (&op->callbacks);
assert (cb)

(which is failing)

The assertion can blow if op is freed, since g_queue_is_empty looks at the queue head and g_queue_pop_tail looks at the queue tail, and there's no guarantee the head and tail are consistent with each other after op is freed.

So this problem could be explained by the operation getting unref'd too many times before the idle fires.
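
The head/tail mismatch can be shown with a small model. This is a hedged sketch: `Queue`, `queue_is_empty()` and `queue_pop_tail()` are hypothetical and are not GLib's GQueue. They only mirror the two checks described above, and recycled memory is imitated by setting the fields by hand rather than by a real use-after-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal queue with head, tail and length fields. */
typedef struct Node {
    void *data;
    struct Node *next;
    struct Node *prev;
} Node;

typedef struct {
    Node *head;
    Node *tail;
    unsigned length;
} Queue;

/* Emptiness is judged from the head pointer only. */
bool queue_is_empty(const Queue *q)
{
    return q->head == NULL;
}

/* Popping consults the tail pointer only; returns NULL if there is no tail. */
void *queue_pop_tail(Queue *q)
{
    Node *node = q->tail;
    if (node == NULL)
        return NULL;
    q->tail = node->prev;
    if (q->tail != NULL)
        q->tail->next = NULL;
    else
        q->head = NULL;
    q->length--;
    return node->data;
}

/* Imitates a queue whose memory was reused: head still looks set, tail was
 * overwritten with NULL. Returns true when the "not empty" check passes
 * and the pop still yields NULL, which is the failing-assert situation. */
bool stale_queue_trips_assert(void)
{
    Node leftover = { NULL, NULL, NULL };
    Queue q = { &leftover, NULL, 0 };

    return !queue_is_empty(&q) && queue_pop_tail(&q) == NULL;
}
```

On a consistent queue the two checks always agree, so the non-empty test followed by a pop cannot return NULL. Only inconsistent contents, such as those left behind after the operation's memory is released, let the first check pass and the second fail.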

Comment 15 Alessandro Machado 2012-01-17 22:07:17 UTC
I was setting up the Wireless Network in NetworkManager when the problem occurred unknown to me.

backtrace_rating: 4
Package: gnome-shell-3.2.1-2.fc16
OS Release: Fedora release 16 (Verne)

Comment 16 Alessandro Machado 2012-01-17 22:07:21 UTC
Created attachment 555863 [details]
File: backtrace

Comment 17 Stef Walter 2012-01-22 10:38:46 UTC
(In reply to comment #14)
> note it doesn't have to be a double free(), just use-after-free().

Right, but the use-after-free would result in a double free, as further down in that function we would free the same memory again. In any case, that seems like what's happening. I'm just at a loss as to how it's happening; I have been looking over the code and can't identify the mismatched references (yet).

If this is reproducible for anyone reading this, and you can run gnome-shell with the environment variable "GKR_DEBUG=all" that would be great. libgnome-keyring produces a bunch of output with that environment variable that would be helpful in tracking this down.

Comment 18 Viktor 2012-01-24 06:03:56 UTC
i don't know

backtrace_rating: 4
Package: gnome-shell-3.2.1-2.fc16
OS Release: Fedora release 16 (Verne)

Comment 19 Viktor 2012-01-24 06:04:05 UTC
Created attachment 557136 [details]
File: backtrace

Comment 20 Stef Walter 2012-01-31 12:52:56 UTC
I think this commit might fix the issue:

http://git.gnome.org/browse/libgnome-keyring/commit/?id=627895abba1b34fbd436968f775134cc5f62754c

In particular this bit: 


diff --git a/library/gkr-operation.c b/library/gkr-operation.c
index d3792b3..fc3ecc9 100644
--- a/library/gkr-operation.c
+++ b/library/gkr-operation.c
@@ -218,8 +218,7 @@ gkr_operation_set_result (GkrOperation *op, GnomeKeyringResult res)
 {
 	g_assert (op);
 	g_assert ((int) res != INCOMPLETE);
-	g_atomic_int_compare_and_exchange (&op->result, INCOMPLETE, res);
-	return g_atomic_int_get (&op->result) == res; /* Success when already set to res */
+	return g_atomic_int_compare_and_exchange (&op->result, INCOMPLETE, res);
 }
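
One reading of the hunk above, sketched with C11 atomics instead of GLib's: the old code reported success whenever the stored result equalled `res`, so a second caller passing the same value was also told it had set the result, whereas the new code reports success only to the caller that performs the INCOMPLETE to `res` transition. The `Operation` type, the enum values and `count_accepted()` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values; the real constants live in libgnome-keyring. */
enum { INCOMPLETE = -1, RESULT_OK = 0, RESULT_CANCELLED = 7 };

/* Hypothetical stand-in for GkrOperation, reduced to the result field. */
typedef struct {
    atomic_int result;
} Operation;

/* Old behaviour: true whenever the stored result ends up equal to res,
 * including when an earlier call already stored that same value. */
bool set_result_old(Operation *op, int res)
{
    int expected = INCOMPLETE;
    assert(op != NULL);
    assert(res != INCOMPLETE);
    atomic_compare_exchange_strong(&op->result, &expected, res);
    return atomic_load(&op->result) == res;
}

/* New behaviour: true only for the call that actually swaps
 * INCOMPLETE for res. */
bool set_result_new(Operation *op, int res)
{
    int expected = INCOMPLETE;
    assert(op != NULL);
    assert(res != INCOMPLETE);
    return atomic_compare_exchange_strong(&op->result, &expected, res);
}

/* Applies the same result `calls` times to a fresh operation and counts
 * how many of those calls were told they succeeded. */
int count_accepted(bool (*set_result)(Operation *, int), int calls)
{
    Operation op;
    int accepted = 0;

    atomic_init(&op.result, INCOMPLETE);
    for (int i = 0; i < calls; i++) {
        if (set_result(&op, RESULT_CANCELLED))
            accepted++;
    }
    return accepted;
}
```

Under this model, two identical completions are both accepted by the old variant but only the first is accepted by the new one. If each accepted completion goes on to drop a reference, the old variant would account for one unref too many.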
 
This commit is also worthwhile for tracking this issue if the problem continues:

http://git.gnome.org/browse/libgnome-keyring/commit/?id=767a3a755c10081694c19b62bbcd76d440e27673

Should I close this bug, since the above commit is a likely fix? Or is there anyone who would like to take a moment to check whether it fixes the issue for them? I realize this is hard to reproduce, though.

Comment 21 Owen Taylor 2012-01-31 16:00:07 UTC
*** Bug 773269 has been marked as a duplicate of this bug. ***

Comment 22 Artemio 2012-03-29 23:38:53 UTC
The problem happened while trying to connect to Wi-Fi. This has been happening for some time; I cannot even connect my notebook to Wi-Fi networks, and it only works with a network cable.

backtrace_rating: 4
Package: gnome-shell-3.2.2.1-1.fc16
OS Release: Fedora release 16 (Verne)

Comment 23 Artemio 2012-03-29 23:39:01 UTC
Created attachment 573836 [details]
File: backtrace

Comment 24 Artemio 2012-04-01 12:25:32 UTC
Being unable to connect to Wi-Fi resulted in the error.

backtrace_rating: 4
Package: gnome-shell-3.2.2.1-1.fc16
OS Release: Fedora release 16 (Verne)

Comment 25 Artemio 2012-04-01 12:25:42 UTC
Created attachment 574329 [details]
File: backtrace

Comment 26 Stuart D Gathman 2012-04-01 21:32:05 UTC
Network Manager could not connect to wireless, and kept asking for password.  A stack of about 20 password dialogs built up while I was away.  I dismissed them, connected again, and it crashed.

backtrace_rating: 4
Package: gnome-shell-3.2.2.1-1.fc16
OS Release: Fedora release 16 (Verne)

Comment 27 Stuart D Gathman 2012-04-01 21:32:11 UTC
Created attachment 574371 [details]
File: backtrace

Comment 28 Fedora End Of Life 2013-01-16 16:26:16 UTC
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 29 Fedora End Of Life 2013-02-13 20:06:55 UTC
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.