Bug 739814

Summary: zif breaks rpmdb when killed
Product: [Fedora] Fedora
Component: zif
Version: 16
Status: CLOSED NEXTRELEASE
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Jiri Moskovcak <jmoskovc>
Assignee: Richard Hughes <hughsient>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: dfediuck, elad, hughsient
Doc Type: Bug Fix
Last Closed: 2011-09-28 12:10:46 UTC

Description Jiri Moskovcak 2011-09-20 06:57:45 UTC
Description of problem:
I killed zif by pressing ctrl-c and it ended up with:

[08:37:04 root@dhcp-25-200 ~]# zif update --skip-broken
error: rpmdb: Thread/process 11506/139835403798528 failed: Thread died in Berkeley DB library
error: db4 error(-30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db4 -  (-30974)
error: cannot open Packages database in /var/lib/rpm
error: rpmdb: Thread/process 11506/139835403798528 failed: Thread died in Berkeley DB library
error: db4 error(-30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages database in /var/lib/rpm

Version-Release number of selected component (if applicable):
zif-0.2.4-1 (actually my build from the latest git)

How reproducible:
tried only once

Steps to Reproduce:
1. zif update
2. ctrl-c

  
Expected results:
zif should be more resistant to ctrl-c and try hard to die gracefully

Comment 1 Richard Hughes 2011-09-20 08:17:14 UTC
(In reply to comment #0)
> zif should be more resistant to ctrl-c and try hard to die gracefully

Hmm, it does: we've got:

static void
zif_main_sigint_cb (int sig)
{
	GCancellable *cancellable;
	g_debug ("Handling SIGINT");

	/* restore default ASAP, as the cancels might hang */
	signal (SIGINT, SIG_DFL);

	/* cancel any tasks still running */
	if (_state != NULL) {
		cancellable = zif_state_get_cancellable (_state);
		g_cancellable_cancel (cancellable);
	}
}

Did you press ctrl-c *twice*?

Comment 2 Jiri Moskovcak 2011-09-20 08:29:53 UTC
(In reply to comment #1)
> (In reply to comment #0)
> Did you press ctrl-c *twice*?

- yes ;)

Comment 3 Richard Hughes 2011-09-20 08:44:41 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > (In reply to comment #0)
> > Did you press ctrl-c *twice*?
> 
> - yes ;)

Would it be sane to remove the signal (SIGINT, SIG_DFL)?

If we remove it, we'll still be waiting for the GCancellable to cancel (which may be a long wait), but we no longer allow the user to destroy things. Keeping our own handler installed and printing a message on further ctrl-c presses might be a good idea. Something like:

"Cancellation in progress, please wait..."

Ideas welcome.

Richard.
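
The idea floated in comment 3 (never fall back to SIG_DFL, and print a notice on repeated ctrl-c) could be sketched roughly as below. This is only an illustration, not code from zif: every sketch_* name is hypothetical, and the handler sets a flag instead of calling g_cancellable_cancel() directly so that it only uses async-signal-safe calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* All names below are hypothetical; none of them exist in zif. */
static volatile sig_atomic_t sketch_cancel_requested = 0;
static volatile sig_atomic_t sketch_sigint_count = 0;

/* Stays installed: the first ctrl-c requests a cancel, later ones only
 * print a notice. Only async-signal-safe calls are used in here. */
static void
sketch_sigint_cb (int sig)
{
	static const char msg[] = "Cancellation in progress, please wait...\n";
	(void) sig;
	sketch_sigint_count++;
	if (sketch_cancel_requested) {
		ssize_t rc = write (STDERR_FILENO, msg, sizeof (msg) - 1);
		(void) rc;
		return;
	}
	sketch_cancel_requested = 1;
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int
sketch_install_sigint (void)
{
	struct sigaction sa;
	memset (&sa, 0, sizeof (sa));
	sa.sa_handler = sketch_sigint_cb;
	sigemptyset (&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	return sigaction (SIGINT, &sa, NULL);
}

/* Simulates two ctrl-c presses and returns how many were seen. */
int
sketch_demo (void)
{
	int rc = sketch_install_sigint ();
	assert (rc == 0);
	(void) rc;
	raise (SIGINT);	/* first press: sets the cancel flag */
	raise (SIGINT);	/* second press: only prints the notice */
	assert (sketch_cancel_requested == 1);
	return (int) sketch_sigint_count;
}
```

In a real program the main loop would presumably poll sketch_cancel_requested and call g_cancellable_cancel() from normal context; that part is left out here to keep the sketch free of GLib.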

Comment 4 Elad Alfassa 2011-09-20 08:51:22 UTC
There was a recent rpm update fixing database breakage on ctrl+c. Might be related (or not, maybe I just don't know what I'm talking about), but you should test anyway IMO.

Comment 5 Richard Hughes 2011-09-28 12:10:46 UTC
(In reply to comment #4)
> There was a recent rpm update fixing database breakage on ctrl+c. Might be
> related (or not, maybe I just don't know what I'm talking about), but you
> should test anyway IMO.

Yes, Elad is correct. You want at least rpm-4.9.1.1-3.fc16.x86_64 for the fixed librpm signal stuff.

That said, because librpm is stealing SIGINT, we need to do this a little more cleverly. I've added this to master:

commit 5a33164fad5058345630ea6e7655444c148825a1
Author: Richard Hughes <richard>
Date:   Tue Sep 27 15:09:48 2011 +0100

    Try to make the signal handling in libzif somewhat sane
    
    Basically, librpm steals SIGINT and a few other signals whenever it opens or
    closes the rpmdb. To work around this, disconnect SIGINT using rpmsqEnable and
    connect it to a cancel handler after any librpm transaction.
    
    Whilst not ideal, this means that ctrl-c causes the GCancellable to be
    cancelled at a sane point and allows zif to clean up and exit as soon as
    possible. Using the GCancellable also means the GFile stuff is cancelled
    correctly too.
    
    Resolves https://bugzilla.redhat.com/show_bug.cgi?id=739814
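
The general pattern the commit message describes (a library takes over SIGINT as a side effect, so the application reconnects its own cancel handler afterwards) can be illustrated with a standalone sketch. This is not the code from the commit and it deliberately does not call rpmsqEnable or any other librpm function: fake_library_transaction() is a hypothetical stand-in for a librpm call, and all other names are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names throughout; nothing here is zif or librpm API. */
static volatile sig_atomic_t app_cancel_requested = 0;
static volatile sig_atomic_t library_saw_sigint = 0;

static void app_cancel_cb (int sig) { (void) sig; app_cancel_requested = 1; }
static void library_cb (int sig) { (void) sig; library_saw_sigint = 1; }

/* Points SIGINT at the given callback; returns 0 on success. */
static int
install_handler (void (*cb) (int))
{
	struct sigaction sa;
	memset (&sa, 0, sizeof (sa));
	sa.sa_handler = cb;
	sigemptyset (&sa.sa_mask);
	return sigaction (SIGINT, &sa, NULL);
}

/* Stand-in for a library call that takes over SIGINT as a side effect. */
static void
fake_library_transaction (void)
{
	install_handler (library_cb);
}

/* Runs the library call, then reconnects SIGINT to our cancel handler. */
int
sketch_run_transaction (void)
{
	fake_library_transaction ();
	return install_handler (app_cancel_cb);
}

/* Returns 1 if ctrl-c reaches the application handler after the call. */
int
sketch_reconnect_demo (void)
{
	int rc = sketch_run_transaction ();
	assert (rc == 0);
	(void) rc;
	raise (SIGINT);
	return app_cancel_requested == 1 && library_saw_sigint == 0;
}
```

Without the reconnect step in sketch_run_transaction(), the raise() would land in library_cb instead, which is the "stealing" behaviour the commit works around.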