Bug 88178
Summary: | db4 CDB mode environment creation fails, probably due to glibc pthreads issue | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Red Hat Raw Hide | Reporter: | matti aarnio <matti.aarnio> | ||||
Component: | db4 | Assignee: | Jeff Johnson <jbj> | ||||
Status: | CLOSED WONTFIX | QA Contact: | David Lawrence <dkl> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 1.0 | CC: | jorton | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i386 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2004-11-13 22:37:55 UTC | Type: | --- | ||||
Description
matti aarnio
2003-04-07 12:40:15 UTC
Created attachment 90948
db4-cdbmode testcase
The header comments may disagree; use this to compile:
gcc -g -o db4-cdb-test db4-cdb-test.c -lpthread -ldb-4.0
This attempts to create a Sleepycat CDB environment in /var/tmp
and join it. It then opens, READ ONLY, a database which (most likely)
does not exist there.
A slightly edited gdb session, with sources in place and a suitable 'directory' command given to gdb:

    (gdb) step
    __db_pthread_mutex_init (dbenv=0x80498c8, mutexp=0x80498c8, flags=0) at ../mutex/mut_pthread.c:68
    68  ret = 0;
    65  {
    69  memset(mutexp, 0, sizeof(*mutexp));
    79  if (LF_ISSET(MUTEX_THREAD) || F_ISSET(dbenv, DB_ENV_PRIVATE)) {
    89  pthread_condattr_t condattr, *condattrp = NULL;
    90  pthread_mutexattr_t mutexattr, *mutexattrp = NULL;
    92  if (!F_ISSET(mutexp, MUTEX_THREAD)) {
    93  ret = pthread_mutexattr_init(&mutexattr);
    94  if (ret == 0)
    95  ret = pthread_mutexattr_setpshared(
    97  mutexattrp = &mutexattr;
    100 if (ret == 0)
    (gdb) print ret
    $4 = 38
    ....

This definitely smells like a bad pthread_mutexattr_setpshared() thing..

Moving to glibc, as the bug seems to be there.

Known to happen on i386 (the i686 version of glibc, actually). Can't say anything about e.g. Alpha et al.

I think this is really a db4 issue: db4 is configured to depend on pthread_mutexattr_setpshared working, but it doesn't when running an earlier kernel. glibc is not doing anything wrong, it just started supporting setpshared recently.

*** This bug has been marked as a duplicate of 86381 ***

The kernel I am running there is: 2.4.18-23.8.0smp
Which is the RH development errata for RH 8.0. That kernel is stable; anything later is prone to hang the box. Running a later kernel (e.g. 2.4.20/21) _is_not_ an option at present.

With up to date rawhide packages of kernel and glibc, along with the previous db4 (hmm.. overnight there are newer versions of the kernel, at least):

    kernel-2.4.20-20.1.2007.nptl.i686
    glibc-2.3.2-48.i686
    db4-4.1.25-2

Apparently db4 was just "rebuilt" in between -2 and -3. POSIX mutexes using nptl are already in -1.

What happens:

    285 err = prv->ZSE->env->open(prv->ZSE->env,
    (gdb) next
    router: unable to join the environment
    289 if (err) prv->ZSE->env->err(prv->ZSE->env, err, "envhome <%s> open failed", prv->ZSE->envhome ? prv->ZSE->envhome : "NULL");
    (gdb) list
    284
    285 err = prv->ZSE->env->open(prv->ZSE->env,
    286                           prv->ZSE->envhome,
    287                           prv->ZSE->envflags,
    288                           prv->ZSE->envmode);
    289 if (err) prv->ZSE->env->err(prv->ZSE->env, err, "envhome <%s> open failed", prv->ZSE->envhome ? prv->ZSE->envhome : "NULL");
    290
    291 if (err) return err; /* Uhh.. */
    292 }
    293
    (gdb) next
    router: envhome </opt/mail/db> open failed: Resource temporarily unavailable
    (gdb) print prv->ZSE->envhome
    $2 = 0x8c09588 "/opt/mail/db"
    (gdb) print/o prv->ZSE->envflags
    $3 = 044001
    (DB_INIT_CDB | DB_INIT_MPOOL | DB_CREATE)
    (gdb) print/o prv->ZSE->envmode
    $4 = 0600

So the diagnostics differ from the original error, but the thing still refuses to function in a presumably good thread environment.

The latest bug appears to be the different behaviour of the O_DIRECT option for open(2) between FreeBSD and Linux. In FreeBSD that flag does not bring in Linux's special requirement that writes and reads be done in page size (or exact multiples of it), with memory areas beginning at a page boundary. The latest RedHat Rawhide db4-4.1.25-3.src.rpm does contain a configure.ac test that checks whether the O_DIRECT flag functions the way FreeBSD expects it to. Now if the compilation machine's kernel happens to IGNORE that flag (e.g. it is old enough!), that configuration test produces FreeBSD-like results at compile time, and the resulting binary (run on a kernel that understands the flag) fails to function! Your compilation environment needs fixing, then the packages need recompiling.

Reassigning to db4 again, since this is a db4 build issue and not a glibc problem.

There are 2 problems here.

The earlier failure is in testing whether POSIX mutexes are shareable. Only a kernel that supports futexes has shared POSIX mutexes. So db4-4.1.25 fails because it is compiled with --enable-posixmutexes. Either run a kernel that supports futexes, or build db4 without the --enable-posixmutexes configure option. Unfortunately, there are no other solutions.
The latter problem has to do with newfangled O_DIRECT semantics in the kernel that break db-4.1.25 builds. There is a fix in later db4-4.1.25 src.rpm's, and db-4.2.52 and later have incorporated the changes into the db4 configure to avoid using O_DIRECT.

So the end result here is basically WONTFIX, in the sense that the db4 problems are symptoms of other development, and there is no known answer other than what I've outlined above.