Bug 58665 - RPM locks up in __os_yield_rpmdb
Summary: RPM locks up in __os_yield_rpmdb
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: rpm
Version: 7.2
Hardware: i686 Linux
Priority: medium
Severity: high
Assignee: Jeff Johnson
Reported: 2002-01-22 15:38 UTC by Michael Meeks
Modified: 2008-05-01 15:38 UTC
Last Closed: 2002-01-22 15:38:25 UTC



Description Michael Meeks 2002-01-22 15:38:21 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.7) Gecko/20020104

Description of problem:
Both the 'rpm' command and red-carpet hang partway through any operation
on the RPM database.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
No idea; do some sane package management with RC on top of a clean RH 7.2
install for a month or so.

Actual Results:  Stack trace is like this:

Program received signal SIGTSTP, Stopped (user).
0x40be45be in __select () from /lib/libc.so.6
(gdb) bt
#0  0x40be45be in __select () from /lib/libc.so.6
#1  0x40fa5b74 in __DTOR_END__ () from /usr/lib/librpmdb-4.0.3.so
#2  0x40f89db7 in __os_yield_rpmdb (dbenv=0x0, usecs=1000000) at
../db/dist/../os/os_spin.c:108
#3  0x40f2900f in __db_tas_mutex_lock_rpmdb (dbenv=0x81543c0, mutexp=0x40fc5658)
at ../db/dist/../mutex/mut_tas.c:150
#4  0x40f83f8a in memp_fget_rpmdb (dbmfp=0x814d0b0, pgnoaddr=0xbffff54c,
flags=0, addrp=0xbffff528) at ../db/dist/../mp/mp_fget.c:276
#5  0x40f5a0f6 in __db_goff_rpmdb (dbp=0x8154688, dbt=0xbffff630, tlen=81012,
pgno=1150, bpp=0x8154a64, bpsz=0x8154a6c)
    at ../db/dist/../db/db_overflow.c:139
#6  0x40f5e787 in __db_ret_rpmdb (dbp=0x8154688, h=0x4102b1cc, indx=39,
dbt=0xbffff630, memp=0x8154a64, memsize=0x8154a6c)
    at ../db/dist/../db/db_ret.c:54
#7  0x40f520ce in __db_c_get_rpmdb (dbc_arg=0x8154a18, key=0xbffff650,
data=0xbffff630, flags=21) at ../db/dist/../db/db_cam.c:810
#8  0x40f25fb9 in db3c_get () from /usr/lib/librpmdb-4.0.3.so
#9  0x40f264b8 in db3cget () from /usr/lib/librpmdb-4.0.3.so
#10 0x40f1fc77 in dbiGet () from /usr/lib/librpmdb-4.0.3.so
#11 0x40f22c4a in rpmdbNextIterator () from /usr/lib/librpmdb-4.0.3.so
#12 0x40f22b3b in XrpmdbNextIterator () from /usr/lib/librpmdb-4.0.3.so
#13 0x0808d8cd in rc_rpmman_query_all_v4 (packman=0x814ce28) at rc-rpmman.c:1499
#14 0x0808db6d in rc_rpmman_query_all (packman=0x814ce28) at rc-rpmman.c:1595
#15 0x08089190 in rc_packman_query_all (packman=0x814ce28) at rc-packman.c:361
#16 0x08074da4 in rc_gui_query_system_packages () at gui-util.c:88
#17 0x08066073 in main (argc=1, argv=0xbffff8a4) at gui-init.c:1414
#18 0x40b20306 in __libc_start_main (main=0x8065eec <main>, argc=1,
ubp_av=0xbffff8a4, init=0x8052858 <_init>, fini=0x8091b60 <_fini>, 
    rtld_fini=0x4000d2dc <_dl_fini>, stack_end=0xbffff89c) at
../sysdeps/generic/libc-start.c:129
(gdb) 

Expected Results:  shouldn't hang.

Additional info:

Comment 1 Jeff Johnson 2002-01-22 23:02:55 UTC
Try
	rm -f /var/lib/rpm/__db*

Please reopen if that does not fix.
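
For anyone scripting the workaround above, here is a minimal sketch in
Python. The function name and the dry_run flag are hypothetical and not
part of rpm. It assumes it is run as root and that no rpm or red-carpet
process has the database open. It only touches files matching __db* and
leaves the real database files alone.

```python
import glob
import os


def remove_region_files(db_dir="/var/lib/rpm", dry_run=True):
    """List (dry_run=True) or delete (dry_run=False) the __db* files in db_dir.

    Hypothetical helper: equivalent in effect to `rm -f <db_dir>/__db*`.
    Returns the sorted list of matching paths.
    """
    # glob.escape keeps special characters in db_dir from acting as wildcards.
    pattern = os.path.join(glob.escape(db_dir), "__db*")
    matched = sorted(glob.glob(pattern))
    for path in matched:
        if not dry_run:
            os.remove(path)
    return matched
```

Calling remove_region_files() with the defaults only reports what would
be removed; pass dry_run=False to actually delete.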

Comment 2 Michael Meeks 2002-01-29 10:40:27 UTC
And asking the user to remove some files is a fix? That makes me feel it
isn't worth filing bugs against RPM. _Or_ is this already fixed in some
development version?

Comment 3 Jeff Johnson 2002-01-29 14:55:57 UTC
Yes, removing the files is the fix.

The underlying issue is that rpm is going to permit concurrent access
to the database; using a dbenv (what's in rpm-4.0.3) is the 1st step.
The complete solution is gonna require a setgid helper
and/or new kernel functionality, and that is gonna take more than
coding in rpm.

Comment 4 Michael Meeks 2002-01-30 12:06:48 UTC
Sorry - I'm still unclear. Is your statement this:

The issue is that we have a half-implemented feature shipping, and this can
result in users totally losing the ability to manage packages - and it's not
going to be complete without a setgid helper and/or new kernel functionality and
more hacking in rpm?

And in the meantime, users have to 'just know' that they need to su and remove
files in a strange place on the disk?

Would you recommend that automated tools always remove these files? Or ...

And - I'm surprised that rpm presents any problem that would require new kernel
functionality. Hmm.

Comment 6 Jeff Johnson 2002-01-30 14:30:32 UTC
No, there is a fully implemented use of a dbenv in rpm-4.0.3.
A dbenv is necessary for using other functionality within
Berkeley DB. Read about interprocess and interthread locking
in the db sources. What's missing in Linux is
	pthread_mutexattr_setpshared()
which AFAICT is gonna require kernel changes.
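
To make the locking in frames #2 and #3 of the trace easier to picture,
here is a rough Python sketch of a test-and-set style lock with sleep
backoff. It is not Berkeley DB's code. It assumes the hang comes from a
lock byte left set in a shared region by a process that is gone. The
names tas_lock and tas_unlock, the timeout parameter, and the 1 ms
starting sleep are made up for illustration; the one-second cap mirrors
usecs=1000000 in frame #2.

```python
import time


def tas_lock(region, offset=0, timeout=None, first_sleep=0.001, max_sleep=1.0):
    """Try to take the lock byte at region[offset], sleeping between tries.

    `region` is any mutable byte buffer (a bytearray, or an mmap of a file).
    Returns True once the byte is taken, False if `timeout` seconds elapse.
    With timeout=None the loop never gives up, which is the hang reported here.
    NOTE: the check-then-set below is NOT atomic; real implementations use a
    hardware test-and-set instruction. This only illustrates the control flow.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    sleep = first_sleep
    while True:
        if region[offset] == 0:      # lock looks free
            region[offset] = 1       # take it
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False             # caller can report a stale lock
        time.sleep(sleep)            # yield the CPU before retrying
        sleep = min(sleep * 2, max_sleep)


def tas_unlock(region, offset=0):
    """Release the lock byte."""
    region[offset] = 0
```

If the holder exits without calling tas_unlock and the region lives in
a file that outlives the process, every later caller sees the byte still
set, which is consistent with deleting the __db* files clearing the hang.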

Comment 7 Jeff Johnson 2002-01-30 15:59:01 UTC
No, red-carpet shouldn't remove the __db* files; they
are removed when the database is opened by root.

The problem that cannot be solved w/o a setgid helper
is __db* removal by non-root if created by root.

Comment 8 Michael Meeks 2002-02-02 18:46:10 UTC
In all instances I was opening the RPM database as root - and it persistently
failed to remove these files - and it stuck there and blocked.

Strangely, rpm --rebuilddb did remove the locks, and solved the problem for me.

So again, I did nothing a normal user would not and I was screwed ;-)

Comment 9 Jeff Johnson 2002-02-02 18:52:20 UTC
rpm-4.0.3? Yup, removal fixed in rpm-4.0.4.

Comment 10 Michael Meeks 2002-02-03 01:29:52 UTC
Lovely, thanks; I feel thoroughly reassured.

