Bug 838081 - ocaml/t/guestfs_500_parallel_mount_local crashes in caml_thread_reinitialize
Status: CLOSED DEFERRED
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libguestfs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Richard W.M. Jones
Reported: 2012-07-06 12:45 UTC by Richard W.M. Jones
Modified: 2013-03-06 11:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-03-06 11:39:07 UTC

Description Richard W.M. Jones 2012-07-06 12:45:38 UTC
Description of problem:

In the test suite, ocaml/t/guestfs_500_parallel_mount_local (both
bytecode and native code) crashes occasionally.  With core dumps
enabled, you will see a segfault in caml_thread_reinitialize, which is
called in the child process after the program forks:

#0  0x00000000004370c8 in caml_thread_reinitialize ()
#1  0x00000030e12bd7c6 in __libc_fork ()
    at ../nptl/sysdeps/unix/sysv/linux/fork.c:188
#2  0x00000030e1a108c5 in __fork ()
    at ../nptl/sysdeps/unix/sysv/linux/pt-fork.c:25
#3  0x00000030ee619e86 in fuse_mount_fusermount (
    mountpoint=mountpoint@entry=0x7f5e70000d00 "mp7", 
    opts=0x7f5e70000f30 "rw,nosuid,nodev", quiet=quiet@entry=0) at mount.c:338
#4  0x00000030ee61aa18 in fuse_kern_mount (
    mountpoint=mountpoint@entry=0x7f5e70000d00 "mp7", 
    args=args@entry=0x7f5e7fffe9d0) at mount.c:581
#5  0x00000030ee616d75 in fuse_mount_compat25 (
    mountpoint=mountpoint@entry=0x7f5e70000d00 "mp7", 
    args=args@entry=0x7f5e7fffe9d0) at helper.c:447
#6  0x00000030ee616dc6 in fuse_mount_common (
    mountpoint=mountpoint@entry=0x7f5e70000d00 "mp7", 
    args=args@entry=0x7f5e7fffe9d0) at helper.c:210
#7  0x00000030ee617115 in fuse_mount (
    mountpoint=mountpoint@entry=0x7f5e70000d00 "mp7", 
    args=args@entry=0x7f5e7fffe9d0) at helper.c:223
#8  0x00007f5e98e374be in guestfs__mount_local (g=g@entry=0x7f5e700008c0, 
    localmountpoint=localmountpoint@entry=0x7f5e70000d00 "mp7", 
    optargs=optargs@entry=0x7f5e7fffeab0) at fuse.c:965
#9  0x00007f5e98dda7c6 in guestfs_mount_local_argv (g=g@entry=0x7f5e700008c0, 
    localmountpoint=localmountpoint@entry=0x7f5e70000d00 "mp7", 
    optargs=optargs@entry=0x7f5e7fffeab0) at actions.c:4191
#10 0x0000000000426e61 in ocaml_guestfs_mount_local (gv=-4485090715960753727, 
    readonlyv=5, optionsv=209939302224, cachetimeoutv=39024112, 
    debugcallsv=96, localmountpointv=72340172838076673)
    at guestfs_c_actions.c:9334
#11 0x0000000000451aa9 in caml_interprete ()
#12 0x000000000044db67 in caml_callbackN_exn ()
#13 0x000000000044dbc5 in caml_callback_exn ()
#14 0x0000000000436ec9 in caml_thread_start ()
#15 0x00000030e1a07ef5 in start_thread (arg=0x7f5e7ffff700)
    at pthread_create.c:308
#16 0x00000030e12f4ead in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

What happens here is that g#mount_local calls fuse_mount,
which eventually forks the fusermount subprocess:

http://fuse.git.sourceforge.net/git/gitweb.cgi?p=fuse/fuse;a=blob;f=lib/mount.c;h=6a9da9eefd5641fc2b790738a1e739525016e5b5;hb=HEAD#l359

*After* the fork (in the new subprocess), some OCaml runtime
code (caml_thread_reinitialize) runs to delete the stacks of
the other threads, which no longer exist in the child:

http://caml.inria.fr/mantis/view.php?id=4577

This runtime code segfaults when traversing its list of threads.
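
To illustrate the mechanism (not the actual OCaml runtime code), here is
a minimal C sketch of a child-side fork hook registered with
pthread_atfork.  The names thread_desc, reinitialize_in_child and
fork_and_count_freed are hypothetical, and the circular list is only an
assumed stand-in for the runtime's thread list.  The hook runs inside
fork() in the child, before fork() returns, which matches frame #0
sitting directly above __libc_fork in the backtrace:

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-thread descriptor kept by a runtime. */
struct thread_desc {
    struct thread_desc *next;   /* circular, singly linked */
    int id;
};

static struct thread_desc *current;  /* descriptor of the forking thread */
static int freed_in_child;           /* how many descriptors the hook freed */

/* Child-side hook: only the forking thread survives fork(), so every
   other descriptor is dropped.  If the list were corrupted or being
   modified at fork time, this walk is where a crash would show up. */
static void reinitialize_in_child(void)
{
    if (current == NULL)
        return;
    struct thread_desc *t = current->next;
    while (t != current) {
        struct thread_desc *next = t->next;
        free(t);
        freed_in_child++;
        t = next;
    }
    current->next = current;
}

/* Build a list of nthreads descriptors, fork, and return the number of
   descriptors the child hook freed (passed back via the exit status,
   so this sketch assumes nthreads <= 256).  Returns -1 on error. */
int fork_and_count_freed(int nthreads)
{
    static int registered = 0;
    if (nthreads <= 0)
        return -1;
    if (!registered) {
        if (pthread_atfork(NULL, NULL, reinitialize_in_child) != 0)
            return -1;
        registered = 1;
    }

    struct thread_desc *first = NULL, *last = NULL;
    for (int i = 0; i < nthreads; i++) {
        struct thread_desc *t = malloc(sizeof *t);
        if (t == NULL)
            abort();
        t->id = i;
        if (first == NULL)
            first = t;
        else
            last->next = t;
        last = t;
        t->next = first;        /* keep the list circular */
    }
    current = first;
    freed_in_child = 0;

    pid_t pid = fork();         /* the hook runs in the child here */
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(freed_in_child);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    /* The hook did not run in the parent, so free the list here. */
    struct thread_desc *t = first->next;
    while (t != first) {
        struct thread_desc *next = t->next;
        free(t);
        t = next;
    }
    free(first);
    current = NULL;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the hook is attached to fork() itself, any library code that
forks on a thread of the process (here, libfuse starting fusermount)
triggers it, even though that library knows nothing about OCaml.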

Version-Release number of selected component (if applicable):

1.19.16

How reproducible:

Frequent

Steps to Reproduce:
1. ulimit -Hc unlimited
2. ulimit -Sc unlimited
3. Repeatedly call: make -C ocaml check

Actual results:

You'll see core dumps in the ocaml/ directory.

Comment 1 Richard W.M. Jones 2012-07-06 13:12:30 UTC
I pushed this hack which appears to work around the
problem:
https://github.com/libguestfs/libguestfs/commit/ad7c4498f66f37c4219242c6df04d28e9ee7877f

Comment 2 Richard W.M. Jones 2012-07-16 12:15:51 UTC
The workaround doesn't cure the problem, so I have reverted it.

Comment 3 Richard W.M. Jones 2012-07-16 12:22:29 UTC
(In reply to comment #2)
> The workaround doesn't cure the problem, so I have reverted it.

Ignore that.

I saw what seems to be the same bug, affecting a different
piece of code.  Perhaps adding Gc.compact could work around
it in this place too?

#0  0x00000000004abcc8 in caml_thread_reinitialize ()
#1  0x0000003fc26bab9e in __libc_fork ()
    at ../nptl/sysdeps/unix/sysv/linux/fork.c:188
#2  0x0000003fc266cf34 in _IO_new_proc_open (fp=fp@entry=0x7f7868000f30,
    command=command@entry=0x7f787a7fa2c0 "LC_ALL=C '/bin/qemu-kvm' -nographic -version 2>/dev/null",
    mode=<optimized out>, mode@entry=0x7f7881ecd090 "r")
    at iopopen.c:187
#3  0x0000003fc266d1c7 in _IO_new_popen (
    command=0x7f787a7fa2c0 "LC_ALL=C '/bin/qemu-kvm' -nographic -version 2>/dev/null",
    mode=0x7f7881ecd090 "r") at iopopen.c:308
#4  0x00007f7881eab88d in test_qemu_cmd (g=g@entry=0x7f78680008c0,
    cmd=cmd@entry=0x7f787a7fa2c0 "LC_ALL=C '/bin/qemu-kvm' -nographic -version 2>/dev/null",
    ret=ret@entry=0x7f7868000910) at launch.c:1428
#5  0x00007f7881eaba04 in test_qemu (g=0x7f78680008c0) at launch.c:1410
#6  0x00007f7881eabaea in qemu_supports (g=g@entry=0x7f78680008c0,
    option=option@entry=0x0) at launch.c:1479
#7  0x00007f7881eac921 in launch_appliance (g=g@entry=0x7f78680008c0)
    at launch.c:586
#8  0x00007f7881ead991 in guestfs__launch (g=g@entry=0x7f78680008c0)
    at launch.c:530
#9  0x00007f7881e4fda8 in guestfs_launch (g=g@entry=0x7f78680008c0)
    at actions.c:1119
#10 0x0000000000496ad8 in ocaml_guestfs_launch (gv=578721382704613384)
    at guestfs_c_actions.c:7544
#11 0x00000000004c3f9a in caml_c_call ()
#12 0x0000000000000001 in ?? ()

Comment 4 Richard W.M. Jones 2013-03-06 11:39:07 UTC
I suspect this is a bug in OCaml itself, specifically in
the rather hairy fork handling (see
http://caml.inria.fr/mantis/view.php?id=4577).

In any case, we no longer do a multithreaded test of
mount-local in OCaml.  It was rewritten in C.  So this
bug doesn't apply to libguestfs any longer.

