Bug 742582
Summary:          ns-slapd hangs in db functions under load
Product:          [Retired] 389
Component:        Directory Server
Version:          1.2.9
Status:           CLOSED WORKSFORME
Severity:         unspecified
Priority:         unspecified
Reporter:         Nathan Kinder <nkinder>
Assignee:         Nathan Kinder <nkinder>
QA Contact:       Chandrasekar Kannan <ckannan>
CC:               andrey.ivanov, benl, nhosoi, rmeggins
Target Milestone: ---
Target Release:   ---
Hardware:         Unspecified
OS:               Unspecified
Doc Type:         Bug Fix
Cloned As:        766321 (view as bug list)
Bug Blocks:       743970, 766321
Last Closed:      2012-03-28 18:07:37 UTC
Description
Nathan Kinder, 2011-09-30 16:01:01 UTC
Created attachment 525797 [details]
db_stat output
In the debugger, I see the checkpoint thread looping over this section of code in __memp_sync_int():

--------------------------------------------------------------------------
	for (i = wrote_cnt = 0, remaining = ar_cnt; remaining > 0; ++i) {
		if (i >= ar_cnt) {
			i = 0;
			__os_yield(env, 1, 0);
		}
		if ((hp = bharray[i].track_hp) == NULL)
			continue;
--------------------------------------------------------------------------

The track_hp pointer is NULL for every slot in the array that is checked, so we go through the entire array and then end up calling __os_yield() to allow other threads to do some work. The array contents never change, though, so this thread loops endlessly.

The writer thread loops constantly over this section of code:

--------------------------------------------------------------------------
	for (nspins = mtxregion->stat.st_mutex_tas_spins; nspins > 0; --nspins) {
		dbenv = env->dbenv;

		/*
		 * Avoid interlocked instructions until they're likely to
		 * succeed by first checking whether it is held
		 */
		if (MUTEXP_IS_BUSY(mutexp) || !MUTEXP_ACQUIRE(mutexp)) {
			if (F_ISSET(dbenv, DB_ENV_FAILCHK) &&
			    ip == NULL && dbenv->is_alive(dbenv,
			    mutexp->pid, mutexp->tid, 0) == 0) {
				ret = __env_set_state(env, &ip, THREAD_VERIFY);
				if (ret != 0 ||
				    ip->dbth_state == THREAD_FAILCHK)
					return (DB_RUNRECOVERY);
			}
			if (nowait)
				return (DB_LOCK_NOTGRANTED);

			/*
			 * Some systems (notably those with newer Intel CPUs)
			 * need a small pause here. [#6975]
			 */
			MUTEX_PAUSE
			continue;
--------------------------------------------------------------------------

MUTEXP_IS_BUSY() is always true, which causes us to hit the continue statement, jumping back to the top of the for loop. When nspins reaches 0, an attempt is made to acquire the mutex, which fails, causing us to go back to the for loop again.

All of this seems to point to a mutex that is leaked somewhere.
(In reply to comment #1)
> Created attachment 525797 [details]
> db_stat output

A list of the status strings which could appear in the db_stat -C output:

    status = "ABORT";
    status = "EXPIRED";
    status = "FREE";
    status = "HELD";
    status = "PENDING";
    status = "WAIT";
    status = "UNKNOWN";

Interestingly, there's neither "WAIT" nor "PENDING" in the attached output...

Next time this happens, could you please get the page number __memp_fget is trying to fetch?

    (gdb) thread 14   (of course, please replace the number with your latest...)
    (gdb) frame 4     (ditto)
    (gdb) p *pgnoaddr

Thread 14 (Thread 0x7f042d7f2700 (LWP 3013)):
#0  0x0000003dd00c7b57 in sched_yield () from /lib64/libc.so.6
#1  0x00007f044afea5d0 in __os_yield (env=0xec71e0, secs=0, usecs=0) at ../../os/os_yield.c:54
#2  0x00007f044ae6bf10 in __db_tas_mutex_lock_int (env=0xec71e0, mutex=2156, nowait=0) at ../../mutex/mut_tas.c:215
#3  0x00007f044ae6bfee in __db_tas_mutex_lock (env=0xec71e0, mutex=2156) at ../../mutex/mut_tas.c:248
#4  0x00007f044afcff2f in __memp_fget (dbmfp=0x7f041c011130, pgnoaddr=0x7f042d7e45b8, ip=0x0,

Also, could you get the output from db_stat? I'm curious which page the memberof modop is waiting for...

Created attachment 526951 [details]
Test Scripts
This is a set of test scripts that can be used to reproduce the issue with a standalone 389 DS instance.
Here is what the pages look like in the memberOf index db file:

1 (btree internal)
-->2 (btree leaf)
-->3 (btree leaf)
-->4 (btree leaf)
----->7 (btree internal)
-------->8 (duplicate)
-------->9 (duplicate)
-->5 (btree leaf)
----->6 (btree internal)
-------->10 (duplicate)
-------->11 (duplicate)

From what I can see in the debugger, the write thread holds the following page locks:

read locks  = 4
write locks = 4, 5, 7, 9

The read thread holds the following lock:

read lock = 11

The deadlock is happening because the read thread is requesting a read lock on page 5, which is blocked since the write thread already holds a write lock on page 5. Meanwhile, the write thread is attempting to get a write lock on page 11, which is blocked since the read thread already holds a read lock on page 11.

What is interesting is that db_stat does not show the read lock that the read thread holds on page 11. It also does not list the write locks held by the write thread on pages 7 and 9 (though it does list that transaction as holding 3 write locks on page 4). It seems that no deadlock detection is occurring within Berkeley DB for duplicate pages.

Created attachment 526956 [details]
db_dump output of memberOf.db4
This is the db_dump output of the memberOf.db4 index file. It shows the page layout.
I have posted a question about this issue on the Oracle Berkeley DB forums, as it increasingly looks like a problem inside of libdb:

https://forums.oracle.com/forums/thread.jspa?threadID=2292442

Upstream ticket: https://fedorahosted.org/389/ticket/25

Marking as screened because it has been cloned upstream.

Created attachment 562317 [details]
Updated test scripts
This is an update to the previous test scripts that can be used to reproduce the issue.
Since the issue is timing-related, I've found that it sometimes takes multiple test runs to reproduce it. This new set of scripts automatically runs the test multiple times to attempt to reproduce the issue. The DS instance creation and cleanup is now done automatically by the scripts at the beginning and end of each test run. See the README file included in the test script tarball for details.
Changing the status to WORKSFORME. See also: https://fedorahosted.org/389/ticket/25