Bug 739814 - zif breaks rpmdb when killed
Summary: zif breaks rpmdb when killed
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: zif
Version: 16
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Richard Hughes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-09-20 06:57 UTC by Jiri Moskovcak
Modified: 2015-02-01 22:54 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-09-28 12:10:46 UTC


Description Jiri Moskovcak 2011-09-20 06:57:45 UTC
Description of problem:
I killed zif by pressing ctrl-c and it ended up with:

[08:37:04 root@dhcp-25-200 ~]# zif update --skip-broken
error: rpmdb: Thread/process 11506/139835403798528 failed: Thread died in Berkeley DB library
error: db4 error(-30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db4 -  (-30974)
error: cannot open Packages database in /var/lib/rpm
error: rpmdb: Thread/process 11506/139835403798528 failed: Thread died in Berkeley DB library
error: db4 error(-30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages database in /var/lib/rpm

Version-Release number of selected component (if applicable):
zif-0.2.4-1 (actually my build from the latest git)

How reproducible:
tried only once

Steps to Reproduce:
1. zif update
2. ctrl-c

  
Expected results:
zif should be more resistant to ctrl-c and try hard to die gracefully

Comment 1 Richard Hughes 2011-09-20 08:17:14 UTC
(In reply to comment #0)
> zif should be more resistant to ctrl-c and try hard to die gracefully

Hmm, it does: we've got:

static void
zif_main_sigint_cb (int sig)
{
	GCancellable *cancellable;
	g_debug ("Handling SIGINT");

	/* restore default ASAP, as the cancels might hang */
	signal (SIGINT, SIG_DFL);

	/* cancel any tasks still running */
	if (_state != NULL) {
		cancellable = zif_state_get_cancellable (_state);
		g_cancellable_cancel (cancellable);
	}
}

Did you press ctrl-c *twice*?

Comment 2 Jiri Moskovcak 2011-09-20 08:29:53 UTC
(In reply to comment #1)
> (In reply to comment #0)
> Did you press ctrl-c *twice*?

- yes ;)

Comment 3 Richard Hughes 2011-09-20 08:44:41 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > (In reply to comment #0)
> > Did you press ctrl-c *twice*?
> 
> - yes ;)

Would it be sane to remove the signal(SIGINT,SIG_DFL) ?

On the one hand we'll still be waiting for the GCancellable to cancel (which may be a long wait) but we don't allow the user to destroy things. Maybe not removing the signal() and printing a message might be a good idea. Something like:

"Cancellation in progress, please wait..."

Ideas welcome.

Richard.
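
A minimal sketch of one reading of the idea above, in plain POSIX C rather than GLib: the handler stays installed, the first ctrl-c requests cancellation, and later presses only print the "please wait" message. The names sketch_cancel_requested, sketch_sigint_count, sketch_sigint_cb and sketch_install_sigint_handler are hypothetical; the flag stands in for the GCancellable and none of this is the actual zif code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag polled by the main loop; stands in for the GCancellable. */
static volatile sig_atomic_t sketch_cancel_requested = 0;

/* Number of SIGINTs seen so far, kept only for illustration. */
static volatile sig_atomic_t sketch_sigint_count = 0;

static void
sketch_sigint_cb (int sig)
{
	static const char msg[] = "Cancellation in progress, please wait...\n";
	ssize_t rc;

	(void) sig;
	sketch_sigint_count++;

	/* Second and later presses: tell the user, but do not die. */
	if (sketch_cancel_requested) {
		/* write() is async-signal-safe, unlike printf() */
		rc = write (STDERR_FILENO, msg, sizeof (msg) - 1);
		(void) rc;
		return;
	}

	/* First press: ask the running task to stop at a safe point. */
	sketch_cancel_requested = 1;
}

/* Returns 0 on success, -1 on failure (as sigaction() does). */
static int
sketch_install_sigint_handler (void)
{
	struct sigaction sa;

	memset (&sa, 0, sizeof (sa));
	sa.sa_handler = sketch_sigint_cb;
	sigemptyset (&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	return sigaction (SIGINT, &sa, NULL);
}
```

The trade-off is the one described above: a second ctrl-c can no longer kill the process mid-write, so the user has to wait for the cancel to complete (or use another signal such as SIGTERM or SIGKILL).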

Comment 4 Elad Alfassa 2011-09-20 08:51:22 UTC
There was a recent rpm update fixing database breakage on ctrl+c. Might be related (or not, maybe I just don't know what I'm talking about), but you should test anyway IMO.

Comment 5 Richard Hughes 2011-09-28 12:10:46 UTC
(In reply to comment #4)
> There was a recent rpm update fixing database breakage on ctrl+c. Might be
> related (or not, maybe I just don't know what I'm talking about), but you
> should test anyway IMO.

Yes, Elad is correct. You want at least rpm-4.9.1.1-3.fc16.x86_64 for the fixed librpm signal stuff.

That said, because librpm is stealing SIGINT, we need to do this a little more cleverly. I've added this to master:

commit 5a33164fad5058345630ea6e7655444c148825a1
Author: Richard Hughes <richard@hughsie.com>
Date:   Tue Sep 27 15:09:48 2011 +0100

    Try to make the signal handling in libzif somewhat sane
    
    Basically, librpm steals SIGINT and a few other signals whenever it opens or
    closes the rpmdb. To work around this, disconnect SIGINT using rpmsqEnable and
    connect it to a cancel handler after any librpm transaction.
    
    Whilst not ideal, this means that ctrl-c causes the GCancellable to be
    cancelled at a sane point and allows zif to clean up and exit as soon as
    possible. Using the GCancellable also means the GFile stuff is cancelled
    correctly too.
    
    Resolves https://bugzilla.redhat.com/show_bug.cgi?id=739814
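
The "reconnect after the transaction" pattern from the commit message can be sketched without librpm. In the sketch below fake_library_transaction() is a hypothetical stand-in for any library call that installs its own SIGINT handler; the real commit uses rpmsqEnable for the disconnect step, which is not called here. All names are assumptions for illustration only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical flag standing in for the GCancellable. */
static volatile sig_atomic_t reclaim_cancel_requested = 0;

/* Our handler: only records that cancellation was requested. */
static void
reclaim_cancel_cb (int sig)
{
	(void) sig;
	reclaim_cancel_requested = 1;
}

/* Handler a library might install for itself; it ignores our flag. */
static void
fake_library_sigint_cb (int sig)
{
	(void) sig;
}

/* Hypothetical stand-in for a library call that takes over SIGINT. */
static void
fake_library_transaction (void)
{
	signal (SIGINT, fake_library_sigint_cb);
}

/* (Re)connect SIGINT to the cancel handler; returns 0 on success. */
static int
reclaim_connect_cancel_handler (void)
{
	struct sigaction sa;

	memset (&sa, 0, sizeof (sa));
	sa.sa_handler = reclaim_cancel_cb;
	sigemptyset (&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	return sigaction (SIGINT, &sa, NULL);
}

/* Run the library call, then take SIGINT back straight away. */
static int
reclaim_run_transaction (void)
{
	fake_library_transaction ();
	return reclaim_connect_cancel_handler ();
}
```

Wrapping every library call this way keeps the window in which ctrl-c bypasses the cancel handler as short as the call itself.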

