Bug 1401759 - Unison SIGSEGV when synchronizing large replica.
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: unison240
Version: 25
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Richard W.M. Jones
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-12-06 02:19 UTC by Alex Markley
Modified: 2017-12-12 10:53 UTC (History)

Last Closed: 2017-12-12 10:53:51 UTC
Type: Bug

Description Alex Markley 2016-12-06 02:19:42 UTC
Description of problem:

Ever since upgrading to Fedora 25, when synchronizing a large replica (> 1TB) and/or running for over 2 hours, the Unison client crashes with a segmentation fault.


Version-Release number of selected component (if applicable):

unison240-2.40.128-4.fc24.x86_64
unison240-debuginfo-2.40.128-4.fc24.x86_64
unison240-text-2.40.128-4.fc24.x86_64

NOTE: Other upstream versions of Unison, including 2.48.4 (latest stable), exhibit the same behavior.


How reproducible:

Completely reproducible, as long as the client is running Fedora 25. (Might require the server to also be running Fedora 25, but I'm not sure.)

The Unison server, also running on Fedora 25, does not appear to be affected by this issue. Only the client.

Fedora 24 did not (apparently) exhibit this issue.


Steps to Reproduce:
1. Create a large, complicated dataset on the server for Unison to synchronize. Ideally this will be over 1TB in size and require over 2 hours to transfer.
2. Perform a synchronization between the client and the server, requiring the majority of the data to be transferred from the server to the client. (This mimics initial synchronization of a new hub/spoke node.)
3. Observe that the client fails to synchronize the entire dataset. The client is terminated with SIGSEGV.


Actual results:

Client Unison gets terminated with SIGSEGV.


Expected results:

Client Unison should finish working and transfer entire dataset.


Additional info:

Upstream bug report opened at https://github.com/bcpierce00/unison/issues/48

Comment 1 Alex Markley 2016-12-06 02:24:07 UTC
ABRT refused to report, due to an "unusable" backtrace. However, I thought I should include the backtrace anyway:

warning: core file may not match specified executable file.
[New LWP 3558]
Core was generated by `unison -batch -prefer ssh://alex@elbmin//home/alex alexHome'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000045e0ff in camlLwt__connect_1025 ()

Thread 1 (LWP 3558):
#0  0x000000000045e0ff in camlLwt__connect_1025 ()
No symbol table info available.
#1  0x000000000048a111 in camlList__iter_1061 ()
No symbol table info available.
#2  0x000000000045e012 in camlLwt__restart_1020 ()
No symbol table info available.
#3  0x000000000045aaa2 in camlLwt_unix_impl__restart_threads_1051 ()
No symbol table info available.
#4  0x000000000045b18e in camlLwt_unix_impl__run_1081 ()
No symbol table info available.
#5  0x00000000004100d4 in camlUitext__doTransport_1157 ()
No symbol table info available.
#6  0x000000000040e906 in camlUitext__doit_1214 ()
No symbol table info available.
#7  0x00000000004109f0 in camlUitext__synchronizeOnce_1228 ()
No symbol table info available.
#8  0x000000000040f0dd in camlUitext__loop_1273 ()
No symbol table info available.
#9  0x0000000000410dec in camlUitext__synchronizeUntilDone_1277 ()
No symbol table info available.
#10 0x00000000004110bf in camlUitext__start_1280 ()
No symbol table info available.
#11 0x000000000040885a in camlMain__Body_1088 ()
No symbol table info available.
#12 0x0000000000407d73 in camlLinktext__entry ()
No symbol table info available.
#13 0x0000000000404649 in caml_program ()
No symbol table info available.
#14 0x00000000004ddd6e in caml_start_program ()
No symbol table info available.
#15 0x0000000000000000 in ?? ()
No symbol table info available.
From                To                  Syms Read   Shared Object Library
0x00007fc0b51cde60  0x00007fc0b51ce8b5  Yes         /lib64/libutil.so.1
0x00007fc0b4ec9720  0x00007fc0b4f3ac0a  Yes         /lib64/libm.so.6
0x00007fc0b4cc0da0  0x00007fc0b4cc19ae  Yes         /lib64/libdl.so.2
0x00007fc0b49199d0  0x00007fc0b4a69a83  Yes         /lib64/libc.so.6
0x00007fc0b53d0ad0  0x00007fc0b53ee2c0  Yes         /lib64/ld-linux-x86-64.so.2
$1 = 0x0
No symbol "__glib_assert_msg" in current context.
rax            0xaf3fd0	11485136
rbx            0x7fc068c81f08	140464368393992
rcx            0x7fc04e3c5038	140463923023928
rdx            0xaf3fd8	11485144
rsi            0x7fc0b48cacc0	140465639566528
rdi            0x1	1
rbp            0x4e0c7c	0x4e0c7c
rsp            0x7ffd35966370	0x7ffd35966370
r8             0x1	1
r9             0x7fc0b48d0048	140465639587912
r10            0x5bb7fb0	96174000
r11            0x7f8340	8356672
r12            0xb00	2816
r13            0xd3d	3389
r14            0x7ffd35966530	140725502502192
r15            0x7fc0b48cac78	140465639566456
rip            0x45e0ff	0x45e0ff <camlLwt__connect_1025+191>
eflags         0x10216	[ PF AF IF RF ]
cs             0x33	51
ss             0x2b	43
ds             0x0	0
es             0x0	0
fs             0x0	0
gs             0x0	0
st0            0	(raw 0x00000000000000000000)
st1            0	(raw 0x00000000000000000000)
st2            0	(raw 0x00000000000000000000)
st3            0	(raw 0x00000000000000000000)
st4            0	(raw 0x00000000000000000000)
st5            0	(raw 0x00000000000000000000)
st6            0	(raw 0x00000000000000000000)
st7            0	(raw 0x00000000000000000000)
fctrl          0x37f	895
fstat          0x0	0
ftag           0xffff	65535
fiseg          0x0	0
fioff          0x0	0
foseg          0x0	0
fooff          0x0	0
fop            0x0	0
mxcsr          0x1fa0	[ PE IM DM ZM OM UM PM ]
bndcfgu        {raw = 0x0, config = {base = 0x0, reserved = 0x0, preserved = 0x0, enabled = 0x0}}	{raw = 0x0, config = {base = 0, reserved = 0, preserved = 0, enabled = 0}}
bndstatus      {raw = 0x0, status = {bde = 0x0, error = 0x0}}	{raw = 0x0, status = {bde = 0, error = 0}}
ymm0           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x8000000000000000, 0x0, 0x0, 0x0}, v32_int8 = {0x30, 0x30, 0x33, 0x33, 0x20, 0x76, 0x69, 0x62, 0x72, 0x61, 0x74, 0x6f, 0x34, 0x20, 0x43, 0x35, 0x0 <repeats 16 times>}, v16_int16 = {0x3030, 0x3333, 0x7620, 0x6269, 0x6172, 0x6f74, 0x2034, 0x3543, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x33333030, 0x62697620, 0x6f746172, 0x35432034, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x6269762033333030, 0x354320346f746172, 0x0, 0x0}, v2_int128 = {0x354320346f7461726269762033333030, 0x00000000000000000000000000000000}}
ymm1           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x8000000000000000, 0x8000000000000000, 0x0, 0x0}, v32_int8 = {0x20, 0x76, 0x69, 0x62, 0x72, 0x61, 0x74, 0x6f, 0x34, 0x20, 0x43, 0x35, 0x2e, 0x77, 0x61, 0x76, 0x0 <repeats 16 times>}, v16_int16 = {0x7620, 0x6269, 0x6172, 0x6f74, 0x2034, 0x3543, 0x772e, 0x7661, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x62697620, 0x6f746172, 0x35432034, 0x7661772e, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x6f74617262697620, 0x7661772e35432034, 0x0, 0x0}, v2_int128 = {0x7661772e354320346f74617262697620, 0x00000000000000000000000000000000}}
ymm2           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x8000000000000000, 0x8000000000000000, 0x0, 0x0}, v32_int8 = {0x2f, 0x43, 0x68, 0x61, 0x6d, 0x62, 0x65, 0x72, 0x20, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67, 0x73, 0x0 <repeats 16 times>}, v16_int16 = {0x432f, 0x6168, 0x626d, 0x7265, 0x5320, 0x7274, 0x6e69, 0x7367, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x6168432f, 0x7265626d, 0x72745320, 0x73676e69, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x7265626d6168432f, 0x73676e6972745320, 0x0, 0x0}, v2_int128 = {0x73676e69727453207265626d6168432f, 0x00000000000000000000000000000000}}
ymm3           {v8_float = {0x2bd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x8000000000000000, 0x8000000000000000, 0x0, 0x0}, v32_int8 = {0x74, 0x61, 0x2f, 0x44, 0x69, 0x72, 0x65, 0x63, 0x74, 0x57, 0x61, 0x76, 0x65, 0x2f, 0x56, 0x61, 0x0 <repeats 16 times>}, v16_int16 = {0x6174, 0x442f, 0x7269, 0x6365, 0x5774, 0x7661, 0x2f65, 0x6156, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x442f6174, 0x63657269, 0x76615774, 0x61562f65, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x63657269442f6174, 0x61562f6576615774, 0x0, 0x0}, v2_int128 = {0x61562f657661577463657269442f6174, 0x00000000000000000000000000000000}}
ymm4           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x8000000000000000, 0x8000000000000000, 0x0, 0x0}, v32_int8 = {0x2f, 0x43, 0x68, 0x61, 0x6d, 0x62, 0x65, 0x72, 0x20, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67, 0x73, 0x0 <repeats 16 times>}, v16_int16 = {0x432f, 0x6168, 0x626d, 0x7265, 0x5320, 0x7274, 0x6e69, 0x7367, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x6168432f, 0x7265626d, 0x72745320, 0x73676e69, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x7265626d6168432f, 0x73676e6972745320, 0x0, 0x0}, v2_int128 = {0x73676e69727453207265626d6168432f, 0x00000000000000000000000000000000}}
ymm5           {v8_float = {0x2bd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x8000000000000000, 0x8000000000000000, 0x0, 0x0}, v32_int8 = {0x74, 0x61, 0x2f, 0x44, 0x69, 0x72, 0x65, 0x63, 0x74, 0x57, 0x61, 0x76, 0x65, 0x2f, 0x56, 0x61, 0x0 <repeats 16 times>}, v16_int16 = {0x6174, 0x442f, 0x7269, 0x6365, 0x5774, 0x7661, 0x2f65, 0x6156, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x442f6174, 0x63657269, 0x76615774, 0x61562f65, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x63657269442f6174, 0x61562f6576615774, 0x0, 0x0}, v2_int128 = {0x61562f657661577463657269442f6174, 0x00000000000000000000000000000000}}
ymm6           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x8000000000000000, 0x8000000000000000, 0x0, 0x0}, v32_int8 = {0x61, 0x36, 0x38, 0x63, 0x63, 0x36, 0x37, 0x61, 0x37, 0x31, 0x64, 0x64, 0x32, 0x62, 0x64, 0x65, 0x0 <repeats 16 times>}, v16_int16 = {0x3661, 0x6338, 0x3663, 0x6137, 0x3137, 0x6464, 0x6232, 0x6564, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x63383661, 0x61373663, 0x64643137, 0x65646232, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x6137366363383661, 0x6564623264643137, 0x0, 0x0}, v2_int128 = {0x65646232646431376137366363383661, 0x00000000000000000000000000000000}}
ymm7           {v8_float = {0x0, 0x0, 0x4c460000, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x8000000000000000, 0x0, 0x0}, v32_int8 = {0x2e, 0x75, 0x6e, 0x69, 0x73, 0x6f, 0x6e, 0x2e, 0x46, 0x4c, 0x20, 0x53, 0x74, 0x75, 0x64, 0x69, 0x0 <repeats 16 times>}, v16_int16 = {0x752e, 0x696e, 0x6f73, 0x2e6e, 0x4c46, 0x5320, 0x7574, 0x6964, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x696e752e, 0x2e6e6f73, 0x53204c46, 0x69647574, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x2e6e6f73696e752e, 0x6964757453204c46, 0x0, 0x0}, v2_int128 = {0x6964757453204c462e6e6f73696e752e, 0x00000000000000000000000000000000}}
ymm8           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x3f, 0x0, 0x0, 0x0, 0x0, 0x6e, 0x0, 0x0, 0x0, 0xd, 0x0 <repeats 16 times>}, v16_int16 = {0x0, 0x0, 0x8000, 0x3f, 0x0, 0x6e00, 0x0, 0xd00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v8_int32 = {0x0, 0x3f8000, 0x6e000000, 0xd000000, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x3f800000000000, 0xd0000006e000000, 0x0, 0x0}, v2_int128 = {0x0d0000006e000000003f800000000000, 0x00000000000000000000000000000000}}
ymm9           {v8_float = {0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0xc6, 0xc7, 0xb8, 0xf7, 0xb0, 0xbd, 0xe6, 0x3f, 0x0 <repeats 24 times>}, v16_int16 = {0xc7c6, 0xf7b8, 0xbdb0, 0x3fe6, 0x0 <repeats 12 times>}, v8_int32 = {0xf7b8c7c6, 0x3fe6bdb0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x3fe6bdb0f7b8c7c6, 0x0, 0x0, 0x0}, v2_int128 = {0x00000000000000003fe6bdb0f7b8c7c6, 0x00000000000000000000000000000000}}
ymm10          {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0x0, 0x0, 0x0, 0x0, 0x68, 0xc8, 0xbc, 0x3b, 0x0 <repeats 24 times>}, v16_int16 = {0x0, 0x0, 0xc868, 0x3bbc, 0x0 <repeats 12 times>}, v8_int32 = {0x0, 0x3bbcc868, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x3bbcc86800000000, 0x0, 0x0, 0x0}, v2_int128 = {0x00000000000000003bbcc86800000000, 0x00000000000000000000000000000000}}
ymm11          {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xca, 0xbc, 0x0 <repeats 24 times>}, v16_int16 = {0x0, 0x0, 0x0, 0xbcca, 0x0 <repeats 12 times>}, v8_int32 = {0x0, 0xbcca0000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0xbcca000000000000, 0x0, 0x0, 0x0}, v2_int128 = {0x0000000000000000bcca000000000000, 0x00000000000000000000000000000000}}
ymm12          {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x60, 0x73, 0xbc, 0x0 <repeats 24 times>}, v16_int16 = {0x0, 0x0, 0x6000, 0xbc73, 0x0 <repeats 12 times>}, v8_int32 = {0x0, 0xbc736000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0xbc73600000000000, 0x0, 0x0, 0x0}, v2_int128 = {0x0000000000000000bc73600000000000, 0x00000000000000000000000000000000}}
ymm13          {v8_float = {0x0, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0xd, 0x0, 0x0, 0x0}, v32_int8 = {0x80, 0x22, 0xc9, 0x8e, 0xef, 0x56, 0x2a, 0x40, 0x0 <repeats 24 times>}, v16_int16 = {0x2280, 0x8ec9, 0x56ef, 0x402a, 0x0 <repeats 12 times>}, v8_int32 = {0x8ec92280, 0x402a56ef, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x402a56ef8ec92280, 0x0, 0x0, 0x0}, v2_int128 = {0x0000000000000000402a56ef8ec92280, 0x00000000000000000000000000000000}}
ymm14          {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0xfd, 0xf2, 0x93, 0x7e, 0xd7, 0x0, 0xe4, 0x3b, 0x0 <repeats 24 times>}, v16_int16 = {0xf2fd, 0x7e93, 0xd7, 0x3be4, 0x0 <repeats 12 times>}, v8_int32 = {0x7e93f2fd, 0x3be400d7, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x3be400d77e93f2fd, 0x0, 0x0, 0x0}, v2_int128 = {0x00000000000000003be400d77e93f2fd, 0x00000000000000000000000000000000}}
ymm15          {v8_float = {0x0, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0xd, 0x0, 0x0, 0x0}, v32_int8 = {0x80, 0x22, 0xc9, 0x8e, 0xef, 0x56, 0x2a, 0x40, 0x0 <repeats 24 times>}, v16_int16 = {0x2280, 0x8ec9, 0x56ef, 0x402a, 0x0 <repeats 12 times>}, v8_int32 = {0x8ec92280, 0x402a56ef, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0x402a56ef8ec92280, 0x0, 0x0, 0x0}, v2_int128 = {0x0000000000000000402a56ef8ec92280, 0x00000000000000000000000000000000}}
Python Exception <class 'NameError'> name 'long' is not defined: 
Python Exception <class 'NameError'> name 'long' is not defined: 
Python Exception <class 'NameError'> name 'long' is not defined: 
Python Exception <class 'NameError'> name 'long' is not defined: 
Python Exception <class 'NameError'> name 'long' is not defined: 
Python Exception <class 'NameError'> name 'long' is not defined: 
Python Exception <class 'NameError'> name 'long' is not defined: 
Python Exception <class 'NameError'> name 'long' is not defined: 

bnd0           	
bnd1           	
bnd2           	
bnd3           	
Dump of assembler code for function camlLwt__connect_1025:
   0x000000000045e040 <+0>:	sub    $0x18,%rsp
   0x000000000045e044 <+4>:	mov    %rax,0x10(%rsp)
   0x000000000045e049 <+9>:	mov    %rbx,(%rsp)
   0x000000000045e04d <+13>:	mov    %rdi,0x8(%rsp)
   0x000000000045e052 <+18>:	mov    (%rax),%rax
   0x000000000045e055 <+21>:	cmp    $0x1,%rax
   0x000000000045e059 <+25>:	je     0x45e067 <camlLwt__connect_1025+39>
   0x000000000045e05b <+27>:	lea    0x34cbc6(%rip),%rax        # 0x7aac28
   0x000000000045e062 <+34>:	callq  0x483ee0 <camlPervasives__invalid_arg_1007>
   0x000000000045e067 <+39>:	mov    (%rsp),%rdi
   0x000000000045e06b <+43>:	mov    (%rdi),%rsi
   0x000000000045e06e <+46>:	cmp    $0x1,%rsi
   0x000000000045e072 <+50>:	jne    0x45e0e8 <camlLwt__connect_1025+168>
   0x000000000045e074 <+52>:	sub    $0x48,%r15
   0x000000000045e078 <+56>:	lea    0x39a2b9(%rip),%rax        # 0x7f8338 <caml_young_limit>
   0x000000000045e07f <+63>:	cmp    (%rax),%r15
   0x000000000045e082 <+66>:	jb     0x45e172 <camlLwt__connect_1025+306>
   0x000000000045e088 <+72>:	lea    0x8(%r15),%rax
   0x000000000045e08c <+76>:	movq   $0x14f7,-0x8(%rax)
   0x000000000045e094 <+84>:	lea    -0x40b(%rip),%rbx        # 0x45dc90 <camlLwt__fun_1129>
   0x000000000045e09b <+91>:	mov    %rbx,(%rax)
   0x000000000045e09e <+94>:	movq   $0x3,0x8(%rax)
   0x000000000045e0a6 <+102>:	mov    0x8(%rsp),%rbx
   0x000000000045e0ab <+107>:	mov    %rbx,0x10(%rax)
   0x000000000045e0af <+111>:	mov    0x10(%rsp),%rbx
   0x000000000045e0b4 <+116>:	mov    %rbx,0x18(%rax)
   0x000000000045e0b8 <+120>:	mov    %rdi,0x20(%rax)
   0x000000000045e0bc <+124>:	lea    0x30(%rax),%rsi
   0x000000000045e0c0 <+128>:	movq   $0x800,-0x8(%rsi)
   0x000000000045e0c8 <+136>:	mov    %rax,(%rsi)
   0x000000000045e0cb <+139>:	mov    0x8(%rdi),%rax
   0x000000000045e0cf <+143>:	mov    %rax,0x8(%rsi)
   0x000000000045e0d3 <+147>:	add    $0x8,%rdi
   0x000000000045e0d7 <+151>:	callq  0x4ce020 <caml_modify>
   0x000000000045e0dc <+156>:	mov    $0x1,%rax
   0x000000000045e0e3 <+163>:	add    $0x18,%rsp
   0x000000000045e0e7 <+167>:	retq   
   0x000000000045e0e8 <+168>:	mov    0x10(%rsp),%rbx
   0x000000000045e0ed <+173>:	mov    %rbx,%rdi
   0x000000000045e0f0 <+176>:	callq  0x4ce020 <caml_modify>
   0x000000000045e0f5 <+181>:	mov    0x8(%rbx),%r12
   0x000000000045e0f9 <+185>:	cmp    $0x1,%r12
   0x000000000045e0fd <+189>:	je     0x45e13c <camlLwt__connect_1025+252>
=> 0x000000000045e0ff <+191>:	mov    0x8(%r12),%rax
   0x000000000045e104 <+196>:	cmp    $0x1,%rax
   0x000000000045e108 <+200>:	je     0x45e114 <camlLwt__connect_1025+212>
   0x000000000045e10a <+202>:	mov    %rbx,0x10(%rsp)
   0x000000000045e10f <+207>:	jmp    0x45e141 <camlLwt__connect_1025+257>
   0x000000000045e111 <+209>:	nopl   (%rax)
   0x000000000045e114 <+212>:	mov    $0x1,%rsi
   0x000000000045e11b <+219>:	add    $0x8,%rbx
   0x000000000045e11f <+223>:	mov    %rbx,%rdi
   0x000000000045e122 <+226>:	callq  0x4ce020 <caml_modify>
   0x000000000045e127 <+231>:	mov    (%r12),%rbx
   0x000000000045e12b <+235>:	mov    $0x1,%rax
   0x000000000045e132 <+242>:	mov    (%rbx),%rdi
   0x000000000045e135 <+245>:	add    $0x18,%rsp
   0x000000000045e139 <+249>:	jmpq   *%rdi
   0x000000000045e13b <+251>:	nop
   0x000000000045e13c <+252>:	mov    %rbx,0x10(%rsp)
   0x000000000045e141 <+257>:	mov    0x8(%rbx),%rbx
   0x000000000045e145 <+261>:	lea    0x34ca4c(%rip),%rax        # 0x7aab98
   0x000000000045e14c <+268>:	callq  0x48a0f0 <camlList__iter_1061>
   0x000000000045e151 <+273>:	mov    $0x1,%rsi
   0x000000000045e158 <+280>:	mov    0x10(%rsp),%rdi
   0x000000000045e15d <+285>:	add    $0x8,%rdi
   0x000000000045e161 <+289>:	callq  0x4ce020 <caml_modify>
   0x000000000045e166 <+294>:	mov    $0x1,%rax
   0x000000000045e16d <+301>:	add    $0x18,%rsp
   0x000000000045e171 <+305>:	retq   
   0x000000000045e172 <+306>:	callq  0x4ddac8 <caml_call_gc>
   0x000000000045e177 <+311>:	jmpq   0x45e074 <camlLwt__connect_1025+52>
End of assembler dump.

Comment 2 Alex Markley 2016-12-09 23:06:53 UTC
I have built the latest version of OCaml (4.04.0) and rebuilt the latest Unison with it. Not only is the segfault still occurring, but I captured a much better backtrace:

===SNIP===
/home/alex/Temp/galculator-2.1.3/intltool-extract.in has already been transferred
/home/alex/Temp/galculator-2.1.3/intltool-merge.in has already been transferred
/home/alex/Temp/galculator-2.1.3/intltool-update.in has already been transferred
 33%  100:25 ETA
Program received signal SIGSEGV, Segmentation fault.
0x00000000004ec76b in invert_pointer_at (p=p@entry=0x7fffd38c7b28) at compact.c:90
90      compact.c: No such file or directory.
(gdb) thread apply all bt full

Thread 1 (process 19298):
#0  0x00000000004ec76b in invert_pointer_at (p=p@entry=0x7fffd38c7b28) at compact.c:90
        val = 140736742586384
        hp = 0x7461705f77617264
        q = 140736742586416
#1  0x00000000004ec90c in do_compaction () at compact.c:228
        q = <optimized out>
        i = <optimized out>
        sz = 6
        t = <optimized out>
        infixes = <optimized out>
        p = 0x7fffd38c7b10
        ch = 0x7fffbf2fc000 "\363\273M"
        chend = 0x7ffff09f1000 ""
#2  0x00000000004ecdea in caml_compact_heap () at compact.c:426
        target_wsz = <optimized out>
        live = <optimized out>
#3  0x00000000004ed24a in caml_compact_heap_maybe () at compact.c:547
        fw = <optimized out>
        fp = 170.748871
#4  0x00000000004daf4a in caml_major_collection_slice (howmuch=howmuch@entry=-1) at major_gc.c:785
        p = 0.0043600637275738388
        dp = <optimized out>
        filt_p = 0.0043600637275738388
        spend = <optimized out>
        computed_work = 1522479
        i = <optimized out>
#5  0x00000000004dbedf in caml_gc_dispatch () at minor_gc.c:463
        trigger = <optimized out>
#6  0x00000000004dbf77 in caml_check_urgent_gc (extra_root=<optimized out>) at minor_gc.c:482
        caml__frame = 0x0
        caml__roots_extra_root = {next = 0x0, ntables = 1, nitems = 1, tables = {0x7fffffffd758, 0x7fffffffd870, 0x4dc96a <caml_alloc_shr+170>, 0x22, 0x7fff9d02f6b0}}
#7  0x00000000004dcfe5 in caml_alloc_string (len=65497) at alloc.c:103
        result = <optimized out>
        offset_index = <optimized out>
        wosize = 8188
#8  0x000000000047205c in camlBytearray__sub_1422 () at /root/unison-git/src/bytearray.ml:63
No locals.
#9  0x0000000000447812 in camlTransfer__receiveRec_1568 () at /root/unison-git/src/transfer.ml:295
No locals.
#10 0x0000000000427cef in camlCopy__decompr_2936 () at /root/unison-git/src/transfer.ml:304
No locals.
#11 0x0000000000426bca in camlCopy__fun_3367 () at /root/unison-git/src/copy.ml:401
No locals.
#12 0x000000000046cc11 in camlUtil__convertUnixErrorsToExn_1955 () at /root/unison-git/src/ubase/util.ml:170
No locals.
#13 0x000000000043f46a in camlRemote__processStream_2291 () at /root/unison-git/src/remote.ml:664
No locals.
#14 0x000000000043fe26 in camlRemote__fun_4468 () at /root/unison-git/src/remote.ml:732
No locals.
#15 0x0000000000464e4d in camlLwt__apply_1225 () at /root/unison-git/src/lwt/lwt.ml:75
No locals.
#16 0x000000000046510e in camlLwt__fun_1451 () at /root/unison-git/src/lwt/lwt.ml:94
No locals.
#17 0x000000000048d101 in camlList__iter_1252 () at list.ml:77
No locals.
#18 0x0000000000464b2e in camlLwt__restart_1211 () at /root/unison-git/src/lwt/lwt.ml:31
No locals.
#19 0x000000000046182e in camlLwt_unix_impl__fun_2430 () at /root/unison-git/src/lwt/generic/lwt_unix_impl.ml:153
No locals.
#20 0x000000000048d101 in camlList__iter_1252 () at list.ml:77
No locals.
#21 0x0000000000461671 in camlLwt_unix_impl__run_1579 () at /root/unison-git/src/lwt/generic/lwt_unix_impl.ml:148
No locals.
#22 0x000000000040e80a in camlUitext__doTransport_1863 () at /root/unison-git/src/uitext.ml:490
No locals.
#23 0x000000000040f84e in camlUitext__doit_1922 () at /root/unison-git/src/uitext.ml:556
No locals.
#24 0x0000000000410034 in camlUitext__synchronizeOnce_1968 () at /root/unison-git/src/uitext.ml:718
No locals.
#25 0x000000000041094a in camlUitext__loop_2237 () at /root/unison-git/src/uitext.ml:788
No locals.
#26 0x0000000000410b4d in camlUitext__synchronizeUntilDone_2242 () at /root/unison-git/src/uitext.ml:810
No locals.
#27 0x0000000000410df7 in camlUitext__start_2249 () at /root/unison-git/src/uitext.ml:870
No locals.
#28 0x00000000004085fa in camlMain__Body_1550 () at /root/unison-git/src/main.ml:241
No locals.
#29 0x0000000000407a93 in camlLinktext__entry () at /root/unison-git/src/linktext.ml:19
No locals.
#30 0x0000000000404369 in caml_program ()
No symbol table info available.
#31 0x00000000004ef12e in caml_start_program ()
No symbol table info available.
#32 0x00000000004ef475 in caml_main (argv=0x7fffffffdca8) at startup.c:145
        exe_name = <optimized out>
        proc_self_exe = "/usr/local/bin/unison", '\000' <repeats 234 times>
        res = <optimized out>
        tos = 0 '\000'
#33 0x0000000000403c5c in main (argc=<optimized out>, argv=<optimized out>) at main.c:37
No locals.
(gdb)
===SNIP===

It looks like the process died in the OCaml garbage collector, so maybe this is a bug in OCaml? On the other hand, since the problem started suddenly and only recently, maybe it is a compiler regression?
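
One detail in frame #0 may be worth noting: the local `hp` is 0x7461705f77617264, which does not look like a valid pointer. The sketch below (the function name `decode_le_ascii` is made up for illustration) reads the value as eight little-endian bytes. Those bytes turn out to be printable ASCII. That would be consistent with string data having overwritten part of the heap before the collector ran, but it is an interpretation of this one value, not a confirmed diagnosis.

```ocaml
(* Hypothetical helper, not part of Unison or the OCaml runtime:
   interpret a 64-bit value as eight little-endian bytes and render
   each printable ASCII byte as a character, anything else as '.'. *)
let decode_le_ascii (v : int64) : string =
  String.init 8 (fun i ->
    let byte =
      Int64.to_int
        (Int64.logand (Int64.shift_right_logical v (8 * i)) 0xFFL)
    in
    if byte >= 0x20 && byte < 0x7f then Char.chr byte else '.')

(* The value of hp reported by gdb in frame #0 of the backtrace. *)
let () = print_endline (decode_le_ascii 0x7461705f77617264L)
```

Run on the `hp` value from the backtrace, this prints `draw_pat`.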

I am really interested in getting feedback from somebody on this; debugging this is slow going.

Comment 3 Alex Markley 2016-12-10 16:01:20 UTC
I've opened a bug report at OCaml's bug tracker to see if I can get their feedback on this: https://caml.inria.fr/mantis/view.php?id=7431

Comment 4 Richard W.M. Jones 2016-12-13 18:18:31 UTC
I'll just say as a general comment that bugs that hit during
garbage collection usually are nothing to do with GC or the
runtime at all, they're caused by faulty C bindings corrupting
memory.

You may get a better idea of where the problem occurs by adding
calls to ``Gc.compact ()'' throughout the code (it should be safe
to call this function at any time), and working backwards from
the first compact which fails.
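
A minimal sketch of the approach described above. The helper `gc_checkpoint` and the stand-in function `suspect_step` are hypothetical and not part of Unison; only `Gc.compact`, `prerr_endline`, and the other standard library calls are real. The idea is to print a label before each forced compaction, so that the last label printed before a crash brackets the code that corrupted the heap.

```ocaml
(* Hypothetical helper, not part of Unison: force a full heap compaction
   at a labelled point so heap corruption shows up close to its cause. *)
let checkpoints_passed = ref 0

let gc_checkpoint (label : string) : unit =
  (* prerr_endline flushes stderr, so the label is visible even if the
     compaction that follows crashes the process. *)
  prerr_endline ("gc checkpoint: " ^ label);
  Gc.compact ();
  incr checkpoints_passed

(* Stand-in for a suspect piece of code; in a real investigation this
   would be an existing call, for example one that goes through C bindings. *)
let suspect_step (n : int) : string list =
  List.init n (fun i -> String.make 16 (Char.chr (97 + (i mod 26))))

let () =
  gc_checkpoint "before suspect_step";
  let data = suspect_step 1000 in
  gc_checkpoint "after suspect_step";
  Printf.printf "%d strings built, %d checkpoints passed\n"
    (List.length data) !checkpoints_passed
```

If the process dies inside the second checkpoint but not the first, the corruption most likely happened in the code between them, and further checkpoints can be added inside that region to narrow it down.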

Comment 5 Alex Markley 2016-12-13 18:48:15 UTC
Richard,

Thanks for responding to this issue!

I've had a ton of feedback from the upstream developers (particularly the OCaml developers) on this issue. Everything is contained within the linked issues.

I am very much in agreement that the garbage collection code is unlikely to be the culprit here. In fact, various backtraces show the code segfaulting in entirely different places, so I agree that early corruption is leading to a crash later on.

Right now the most promising candidate I have is a GCC compiler regression. (I may have eliminated the problem by rebuilding OCaml with GCC 5.4.0, but I won't be sure until later tonight or tomorrow.)

The other, less heartwarming possibility is that my laptop is having issues.

I am keeping the issues I've opened up-to-date with my findings.

Comment 6 Alex Markley 2016-12-24 22:07:04 UTC
I wanted to update everyone on this issue... It has been tracked down to a bug in the Unison codebase:

https://github.com/bcpierce00/unison/issues/48

As you can see in the upstream issue queue, patches have been accepted into the upstream tree which resolve this issue.

I am waiting on feedback from the developer as to when we can expect a stable release which includes this fix.

Comment 7 Richard W.M. Jones 2016-12-25 11:17:56 UTC
I don't know who "the developer" is, but please apply for
comaintainership of the package so you can add the fixes yourself.

Comment 8 Alex Markley 2016-12-25 14:00:23 UTC
Who was previously building and maintaining the RPM packages for unison? Are they no longer actively maintaining these packages?

As for me, I am open to taking some ownership here, but I have never been a package maintainer before. Is there some documentation you could link me to regarding:

- Applying for maintainership
- Creating and/or modifying RPM packages
- Submitting updates to build servers and/or repositories

Thanks!

Comment 9 Richard W.M. Jones 2016-12-25 14:36:53 UTC
I don't know.  Please read: https://fedoraproject.org/wiki/Join_the_package_collection_maintainers

Comment 10 Fedora End Of Life 2017-11-16 19:23:09 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 11 Fedora End Of Life 2017-12-12 10:53:51 UTC
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
