Description of problem:
With the current rawhide builds of glibc (glibc-2.3.2-11.9), gcc and its libs, binutils, autoconf and friends, etc., the attached db4 test program fails to open its environment.

Version-Release number of selected component (if applicable):
Current rawhide builds of glibc, compiler, tools, db4.

How reproducible:
Consistently

Steps to Reproduce:
1. Use a rawhide system.
2. Compile the attached test program.
3. Run it.

Actual results:
# ./db4-cdb-test
.. db_env_create err= 0
db4-cdb-test: /var/tmp/__db.001: unable to initialize environment lock: Function not implemented
.. env->open() err= 38

Expected results:
# ./db4-cdb-test
.. db_env_create err= 0
.. env->open() err= 0
.. db_create() err= 0
.. db->open() err= 2
(read-only open of a database that does not presently exist...)

Additional info:
With the Red Hat 8.0 update glibc (glibc-2.3.2-4.80.i686.rpm et al.) the result is the same. On a machine with glibc-2.3.1-38, the binaries compiled on the rawhide machine work just fine, as intended.

An attempt to rpmbuild -bb the db4 package without optimizations, to help track the syscalls, failed miserably. Possibly just rawhide indigestion...

Stepping through the calls in gdb raised a strong suspicion that the pthread_mutexattr_setpshared() call fails with "ENOSYS". Reading the glibc source gives no definite indication that this is even possible! Possibly I studied the wrong source...
Created attachment 90948
db4-cdbmode testcase

The header comments may disagree; use this to compile:

gcc -g -o db4-cdb-test db4-cdb-test.c -lpthread -ldb-4.0

This attempts to create a Sleepycat CDB environment in /var/tmp and join it, then open READ ONLY a database which (most likely) does not exist in there.
A slightly edited gdb session, with the sources in place and a suitable 'directory' command given to gdb:

(gdb) step
__db_pthread_mutex_init (dbenv=0x80498c8, mutexp=0x80498c8, flags=0) at ../mutex/mut_pthread.c:68
68 ret = 0;
65 {
69 memset(mutexp, 0, sizeof(*mutexp));
79 if (LF_ISSET(MUTEX_THREAD) || F_ISSET(dbenv, DB_ENV_PRIVATE)) {
89 pthread_condattr_t condattr, *condattrp = NULL;
90 pthread_mutexattr_t mutexattr, *mutexattrp = NULL;
92 if (!F_ISSET(mutexp, MUTEX_THREAD)) {
93 ret = pthread_mutexattr_init(&mutexattr);
94 if (ret == 0)
95 ret = pthread_mutexattr_setpshared(
97 mutexattrp = &mutexattr;
100 if (ret == 0)
(gdb) print ret
$4 = 38
....

This definitely smells of a bad pthread_mutexattr_setpshared(): 38 is the same error number that env->open() returned with "Function not implemented".
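To check the suspicion without db4 in the picture, something like the following minimal sketch could be used. It only repeats the two calls seen in the gdb session (pthread_mutexattr_init, then pthread_mutexattr_setpshared) and then tries to initialize a mutex with the resulting attribute. The helper name probe_pshared_mutex is made up for illustration and is not part of db4 or glibc.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not from db4 or glibc): returns 0 if a
 * process-shared mutex can be set up, otherwise the error number
 * returned by the first call that fails. */
int probe_pshared_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int ret;

    /* pthread functions return the error number directly; they do not set errno. */
    ret = pthread_mutexattr_init(&attr);
    if (ret != 0)
        return ret;

    /* This is the call that returned 38 in the gdb session. */
    ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (ret == 0) {
        ret = pthread_mutex_init(&mutex, &attr);
        if (ret == 0)
            pthread_mutex_destroy(&mutex);
    }

    pthread_mutexattr_destroy(&attr);
    return ret;
}
```

Calling probe_pshared_mutex() from a small main() and printing strerror() of the result should presumably give "Function not implemented" on the affected system and 0 where process-shared mutexes are available. Link with -lpthread, as with the test case.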
Moving to glibc, as the bug seems to be there.
Known to happen on i386 (the i686 build of glibc, actually). I can't say anything about e.g. Alpha et al.
I think this is really a db4 issue: db4 is configured to depend on pthread_mutexattr_setpshared working, but it doesn't work when running on an earlier kernel. glibc is not doing anything wrong; it just started supporting setpshared recently.

*** This bug has been marked as a duplicate of 86381 ***
The kernel I am running there is 2.4.18-23.8.0smp, which is the RH development errata for RH 8.0. That kernel is stable; anything later is prone to hang the box. Running a later kernel (e.g. 2.4.20/21) _is_not_ an option at present.
With up-to-date rawhide packages of the kernel and glibc, along with the previous db4 (hmm.. overnight there are newer versions of the kernel, at least):

kernel-2.4.20-20.1.2007.nptl.i686
glibc-2.3.2-48.i686
db4-4.1.25-2

Apparently db4 was just "rebuilt" between -2 and -3. POSIX mutexes using nptl are already in -1.

What happens:

285 err = prv->ZSE->env->open(prv->ZSE->env,
(gdb) next
router: unable to join the environment
289 if (err) prv->ZSE->env->err(prv->ZSE->env, err, "envhome <%s> open failed", prv->ZSE->envhome ? prv->ZSE->envhome : "NULL");
(gdb) list
284
285 err = prv->ZSE->env->open(prv->ZSE->env,
286 prv->ZSE->envhome,
287 prv->ZSE->envflags,
288 prv->ZSE->envmode);
289 if (err) prv->ZSE->env->err(prv->ZSE->env, err, "envhome <%s> open failed", prv->ZSE->envhome ? prv->ZSE->envhome : "NULL");
290
291 if (err) return err; /* Uhh.. */
292 }
293
(gdb) next
router: envhome </opt/mail/db> open failed: Resource temporarily unavailable
(gdb) print prv->ZSE->envhome
$2 = 0x8c09588 "/opt/mail/db"
(gdb) print/o prv->ZSE->envflags
$3 = 044001 (DB_INIT_CDB | DB_INIT_MPOOL | DB_CREATE)
(gdb) print/o prv->ZSE->envmode
$4 = 0600

So the error differs from the original diagnostics, but the thing still refuses to function in a presumably good thread environment.
The latest bug appears to be the different behaviour of the O_DIRECT option for open(2) between FreeBSD and Linux. On FreeBSD that flag does not bring in Linux's special requirement that reads and writes be done in page-size units (or exact multiples), on memory areas beginning at a page boundary.

The latest Red Hat rawhide db4-4.1.25-3.src.rpm does contain a configure.ac test that checks whether the O_DIRECT flag behaves the way FreeBSD expects it to. Now if the kernel on the compilation machine happens to IGNORE that flag (e.g. it is old enough!), that configure test produces FreeBSD-like results at build time, and the resulting binary fails to function when run on a kernel that does understand the flag!

Your compilation environment needs fixing, then the packages need recompiling.
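For illustration, here is a minimal sketch of what Linux-style O_DIRECT asks of the caller, assuming the page-size rule described above (the exact alignment required may differ by kernel and filesystem). The helper names round_up, is_aligned and write_direct are made up for this example and are not taken from db4.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Nonzero if p starts on an align-byte boundary (align a power of two). */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: write len bytes from src to path, requesting O_DIRECT
 * where the platform defines it. The data is copied into a page-aligned,
 * zero-padded buffer whose size is a multiple of the page size.
 * Returns 0 on success, otherwise an error number. */
int write_direct(const char *path, const void *src, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t align = page > 0 ? (size_t)page : 4096;  /* 4096 is an assumed fallback */
    size_t padded = round_up(len, align);
    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    void *buf = NULL;
    int fd, err;

#ifdef O_DIRECT
    flags |= O_DIRECT;
#endif

    /* posix_memalign returns the error number directly; it does not set errno. */
    err = posix_memalign(&buf, align, padded);
    if (err != 0)
        return err;
    memset(buf, 0, padded);
    memcpy(buf, src, len);

    fd = open(path, flags, 0600);
    if (fd < 0) {
        err = errno;
    } else {
        errno = 0;
        if (write(fd, buf, padded) != (ssize_t)padded)
            err = errno != 0 ? errno : EIO;  /* short write without errno: report EIO */
        close(fd);
    }

    free(buf);
    return err;
}
```

Passing an ordinary, unaligned buffer or an odd length straight to write() on such a descriptor is the kind of access the FreeBSD-style code path would make; on a kernel that enforces the rule it would be expected to fail (typically with EINVAL), whereas a kernel that ignores the flag accepts it.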
Reassigning to db4 again, since this is a db4 build issue and not a glibc problem.
There are 2 problems here.

The earlier failure comes from testing whether POSIX mutexes are shareable. Only a kernel that supports futexes has shared POSIX mutexes, so db4-4.1.25 fails because it is compiled with --enable-posixmutexes. Either run a kernel that supports futexes, or build db4 without the --enable-posixmutexes configure option. Unfortunately, there are no other solutions.

The latter problem has to do with newfangled O_DIRECT semantics in the kernel that break db-4.1.25 builds. There is a fix in later db4-4.1.25 src.rpm's, and db-4.2.52 and later have incorporated the changes into the db4 configure script to avoid using O_DIRECT.

So the end result here is basically WONTFIX, in the sense that the db4 problems are symptoms of other development, and there is no known answer other than what I've outlined above.