Bug 515805 - Stop "initialize Database" crashes the server
Summary: Stop "initialize Database" crashes the server
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: 389
Classification: Retired
Component: Database - Import/Export
Version: 1.2.6
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Noriko Hosoi
QA Contact: Viktor Ashirov
URL:
Whiteboard:
Depends On:
Blocks: 434914 389_1.2.6
 
Reported: 2009-08-05 19:18 UTC by Noriko Hosoi
Modified: 2015-12-07 16:31 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-07 16:31:15 UTC


Attachments
valgrind output (155.13 KB, text/plain), 2009-08-05 19:33 UTC, Noriko Hosoi
git patch file (4.00 KB, patch), 2010-03-18 19:00 UTC, Noriko Hosoi (nkinder: review+)

Description Noriko Hosoi 2009-08-05 19:18:56 UTC
Description of problem:
Clicking Stop during "initialize Database" on the Console crashes the server.

How reproducible:
Every time...

Steps to Reproduce:
1. On the DS Console, open the Configuration tab, expand Data, and choose a backend icon in a suffix
2. Right click and choose "initialize database", enter an LDIF file name and click OK
3. Once the import has started, click Stop on the "initialize Database <backend>" window. The server crashes.

Comment 1 Noriko Hosoi 2009-08-05 19:32:50 UTC
Stack trace
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f6095120910 (LWP 17004)]
strlen () at ../sysdeps/x86_64/strlen.S:31
31      movdqa  (%rdi), %xmm0
Current language:  auto; currently asm
(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:31
#1  0x000000359647f166 in *__GI___strdup (s=0xb0 <Address 0xb0 out of bounds>)
    at strdup.c:42
#2  0x00007f609d5bdb9a in slapi_ch_strdup (
    s1=0xb1 <Address 0xb1 out of bounds>) at ldap/servers/slapd/ch_malloc.c:277
#3  0x00007f609d5c4d83 in slapi_sdn_get_ndn (sdn=0x7f609511d3e0)
    at ldap/servers/slapd/dn.c:1229
#4  0x00007f609d5f7e6b in op_shared_modify (pb=0x7f609511f590, pw_change=0,
    old_pw=0x0) at ldap/servers/slapd/modify.c:576
#5  0x00007f609d5f7c97 in modify_internal_pb (pb=0x7f609511f590)
    at ldap/servers/slapd/modify.c:520
#6  0x00007f609d5f7967 in slapi_modify_internal_pb (pb=0x7f609511f590)
    at ldap/servers/slapd/modify.c:410
#7  0x00007f609d62b75b in modify_internal_entry (
    dn=0xb1 <Address 0xb1 out of bounds>, mods=0x7f609511f890)
    at ldap/servers/slapd/task.c:626
#8  0x00007f609d62acde in slapi_task_status_changed (task=0x7f6074010120)
    at ldap/servers/slapd/task.c:299
#9  0x00007f609d62a64b in slapi_task_log_notice (task=0x7f6074010120,
    format=0x7f60996a2896 "%s") at ldap/servers/slapd/task.c:264
#10 0x00007f6099663486 in import_log_notice (job=0x7f607400bea0,
    format=0x7f60996a29a7 "Import threads aborted.")
    at ldap/servers/slapd/back-ldbm/import.c:190
#11 0x00007f609966544e in import_main_offline (arg=0x7f607400bea0)
    at ldap/servers/slapd/back-ldbm/import.c:1192
#12 0x00007f60996659d4 in import_main (arg=0x7f607400bea0)
    at ldap/servers/slapd/back-ldbm/import.c:1376
#13 0x00000035a6229473 in ?? () from /usr/lib64/libnspr4.so
#14 0x000000359700686a in start_thread (arg=<value optimized out>)
    at pthread_create.c:297
#15 0x00000035964de25d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#16 0x0000000000000000 in ?? ()

Most likely, when the task is stopped, the parent task is freed before the worker thread (in this case the import thread) exits, and the worker then tries to log its status using the already-freed "task". Valgrind also backs up the theory.  I'm attaching the valgrind output next.

(gdb) frame 8
#8  0x00007f609d62acde in slapi_task_status_changed (task=0x7f6074010120)
    at ldap/servers/slapd/task.c:299
299     modify_internal_entry(task->task_dn, mod);
(gdb) p task
$7 = (Slapi_Task *) 0x7f6074010120
(gdb) p *task
$8 = {next = 0x7f6074010230, task_dn = 0xb1 <Address 0xb1 out of bounds>,
  task_exitcode = 1946205840, task_state = 32608, task_progress = 1946157176,
  task_work = 32608, task_flags = -1, task_status = 0x0,
  task_log = 0x7f6068004020 "Import threads aborted.", task_private = 0x0,
  cancel = 0, destructor = 0, task_refcount = 0}

Comment 2 Noriko Hosoi 2009-08-05 19:33:33 UTC
Created attachment 356416
valgrind output

Comment 3 Noriko Hosoi 2010-03-18 18:22:07 UTC
A task can be cancelled by sending a modify request on the task entry that replaces the nsTaskCancel value with TRUE, as long as the task has cancel handling code.  Currently, only import does.

Once nsTaskCancel is set to TRUE, the callback registered beforehand with slapi_task_set_cancel_fn (import_task_abort in this case) is called and sets the ABORT flag, which is monitored by import_monitor_threads.  After import_monitor_threads returns, the main import thread calls slapi_task_log_status.  This function calls slapi_task_status_changed, which, because the task is cancelled, adds destroy_task to the event queue.  That is, at any time after the first slapi_task_log_status/slapi_task_status_changed call, the task may be destroyed.  The task application (import), however, still tries to log after that:
[..] - import userRoot: Aborting all import threads... <== first log after cancel
[..] - import userRoot: Import threads aborted.
[..] - import userRoot: Closing files...
!!! slapi_task_finish is called !!!
[..] - import userRoot: Import failed.

In this scenario, any logging or slapi_task_finish after the first log could crash the server.
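
The cancel path described above can be sketched in a few lines of C. This is a simplified, single-threaded model, not the 389 Directory Server code: cancel_demo_task, cancel_demo_job and every function below are hypothetical names made up for illustration. It only shows the shape of the mechanism: the application registers a cancel callback, the callback raises an abort flag, and the monitor loop returns when it sees the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the import job; not a real 389 DS type. */
typedef struct {
    bool abort_requested; /* plays the role of the ABORT flag; real
                             multi-threaded code needs a lock or atomics */
    int passes_done;      /* work completed so far */
} cancel_demo_job;

typedef void (*cancel_demo_fn)(void *arg);

/* Hypothetical stand-in for the task object. */
typedef struct {
    cancel_demo_fn cancel; /* callback registered by the task application */
    void *cancel_arg;
    bool cancelled;
} cancel_demo_task;

/* Application side: register the cancel handler before the work starts. */
void cancel_demo_set_fn(cancel_demo_task *task, cancel_demo_fn fn, void *arg)
{
    assert(task != NULL);
    task->cancel = fn;
    task->cancel_arg = arg;
}

/* Cancel handler: only raises the flag, it frees nothing. */
void cancel_demo_job_abort(void *arg)
{
    ((cancel_demo_job *)arg)->abort_requested = true;
}

/* Server side: models what happens when nsTaskCancel is replaced with TRUE. */
void cancel_demo_request(cancel_demo_task *task)
{
    assert(task != NULL);
    task->cancelled = true;
    if (task->cancel != NULL) {
        task->cancel(task->cancel_arg);
    }
}

/* Monitor loop: returns true if it stopped because of an abort request,
 * false if all max_passes passes completed. */
bool cancel_demo_monitor(cancel_demo_job *job, int max_passes)
{
    assert(job != NULL);
    while (job->passes_done < max_passes) {
        if (job->abort_requested) {
            return true;
        }
        job->passes_done++;
    }
    return false;
}
```

In this model the job and task objects are still in use after the monitor returns, because the caller goes on to log and clean up. That is the window in which the real task was being destroyed.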

We should not go into the task cleanup code when the state is merely cancelled.  Rather, we should let the task application finish the task, which changes the state to finished, and only then destroy the task.

Comment 4 Noriko Hosoi 2010-03-18 19:00:08 UTC
Created attachment 401106
git patch file

Files:
 ldap/servers/slapd/back-ldbm/import.c
 ldap/servers/slapd/task.c

Fix Description:
SLAPI_TASK_CANCELLED can be set in task_modify at any time by a
user changing the nsTaskCancel value to TRUE.  The next
slapi_task_status_changed call then destroys the task, and that
call is made even by a simple logging call such as
slapi_task_log_status.  After the task is destroyed, any task
related call, such as another slapi_task_log_status or
slapi_task_finish, crashes the server.

This fix changes the behaviour to destroy the task only when
task_state is SLAPI_TASK_FINISHED.  Once task_state is set to
SLAPI_TASK_CANCELLED, changing the state to SLAPI_TASK_FINISHED
by calling slapi_task_finish is the responsibility of the task
application (e.g., import).  Until then, it is guaranteed that
the task is available.
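
The rule in the fix description can be illustrated with a minimal model. This is a sketch under stated assumptions, not the patch itself: life_task, the LIFE_* states and the life_* functions are hypothetical names, and the destroy_scheduled flag merely stands in for queueing the real destroy_task on the event queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the real SLAPI_TASK_* values. */
typedef enum {
    LIFE_RUNNING,
    LIFE_CANCELLED,
    LIFE_FINISHED
} life_state;

typedef struct {
    life_state state;
    bool destroy_scheduled; /* stands in for queueing the task's destruction */
    int log_calls;          /* number of status log calls made so far */
} life_task;

/* Called after every status update. Only a finished task is torn down;
 * a cancelled task may still be used by its worker thread. */
void life_status_changed(life_task *task)
{
    assert(task != NULL);
    if (task->state == LIFE_FINISHED) {
        task->destroy_scheduled = true;
    }
}

/* Logging goes through the status-changed path, as in the report. */
void life_log_status(life_task *task)
{
    assert(task != NULL);
    assert(!task->destroy_scheduled); /* would be a use-after-free */
    task->log_calls++;
    life_status_changed(task);
}

/* The task application calls this exactly once, as its last use of the task. */
void life_finish(life_task *task)
{
    assert(task != NULL);
    task->state = LIFE_FINISHED;
    life_status_changed(task);
}
```

With this rule, a cancelled task can log as many times as it needs to; destruction is scheduled only by the final life_finish call. Under the old behaviour described above, the equivalent of the first life_log_status call after cancellation would already have scheduled the destruction.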

Comment 5 Noriko Hosoi 2010-03-19 17:26:32 UTC
Reviewed by Nathan (Thank you!!!)

Pushed to master.

$ git merge fix
Updating d06cce8..6236bb3
Fast forward
 ldap/servers/slapd/back-ldbm/import.c |   22 ++++++++++++++++++++++
 ldap/servers/slapd/task.c             |    8 ++++++--
 2 files changed, 28 insertions(+), 2 deletions(-)

$ git push
Counting objects: 15, done.
Delta compression using 4 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 1.54 KiB, done.
Total 8 (delta 6), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   d06cce8..6236bb3  master -> master

Pushed to Directory_Server_8_2_Branch.

$ git push origin ds82-local:Directory_Server_8_2_Branch
Counting objects: 15, done.
Delta compression using 4 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 13.07 KiB, done.
Total 8 (delta 5), reused 1 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   f4cce0c..02514d6  ds82-local -> Directory_Server_8_2_Branch

Comment 6 Jenny Severance 2010-06-01 19:27:20 UTC
verified - RHEL 4

version:

redhat-ds-base-8.2.0-2010053104.el4dsrv
redhat-admin-console-8.2.0-1.el4dsrv
redhat-idm-console-1.0.0-26.el4idm
redhat-ds-console-8.2.0-3.el4dsrv
redhat-ds-8.2.0-1.el4dsrv
redhat-ds-admin-8.2.0-3.el4dsrv

1. From DS Console, configuration tab, selected backend db and Initialize database ...
2. Selected a large LDIF and confirmed the import.
3. After import started, selected "stop".
4. Import stopped and no server crash.

