Red Hat Bugzilla – Bug 441685
Fatal error : Uncaught exception Out of memory
Last modified: 2012-11-15 13:47:43 EST
Description of problem:
When I try to synchronise two machines (one rawhide and the other F8, both with unison
2.27.57) I get error message "Fatal error Uncaught exception Out of memory"
shortly after typing my password. The main window is already opened at this
moment. It works when the rawhide machine is rebooted back to F8.
Version-Release number of selected component (if applicable):
unison227-2.27.57-8.fc9.x86_64 in rawhide
Additional info: maybe this can be useful: if I mount the F8 partition to
/mnt/tmp and I run in rawhide /mnt/tmp/usr/bin/unison I get the same error, even
if this binary works without problem in FC8. So maybe there is some bad
interaction between several components.
Stephen, do you know why the unison??? bug reports are not assigned to you?
Gerard, for some reason, there's no bugzilla component for unison213/unison227
yet (I think because the new packages haven't been pushed to stable yet; I just
requested that yesterday)
So, the bugs are simply filed against unison right now, which you own. Perhaps
if you "release ownership" of the old unison package, I can own it too, which
should fix the issue (assuming I can remember how to "take ownership"...)
Jiri, a few questions:
* Are both the F8 and devel machines x86_64, or just one?
* How big is the file tree being synchronized? i.e. How many megabytes, how many
There's a Unison FAQ entry that might be relevant too (see below). Can you try
adjusting the stack size limit to see if it fixes the issue?
Finally, can you tell me the exact unison command you're running?
Unison crashes with an "out of memory" error when used to synchronize really
huge directories (e.g., with hundreds of thousands of files).
You may need to increase your maximum stack size. On Linux and Solaris systems,
for example, you can do this using the ulimit command (see the bash
documentation for details).
You are right. I should have read the FAQ first. The error goes away when I do ulimit
-s unlimited. The trees I synchronize are quite large, around 100k files.
I must be somewhere on the border, because in F8 both trees synchronize without
problem and in rawhide they do not. Thanks for your help and feel free to close the bug.
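The FAQ workaround can be tried without changing the limit for the whole login session. The following is a minimal sketch, assuming a bash-like shell; the helper name `with_max_stack` and the profile name `myprofile` are hypothetical and do not come from this report. It raises the soft stack limit to the hard limit, which has the same effect as `ulimit -s unlimited` on systems where the hard limit is unlimited.

```shell
# Run a command with the soft stack limit raised as far as the hard limit
# allows. The subshell keeps the change from leaking into the calling shell.
# with_max_stack is a hypothetical helper name, not part of unison.
with_max_stack() {
    (
        ulimit -S -s "$(ulimit -H -s)" && "$@"
    )
}

# Example (not run here; "myprofile" is a placeholder unison profile):
#   with_max_stack unison myprofile
```

As later comments show, this only helps when the tree is genuinely large; it does not address the x86_64 failures on small syncs.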
Excellent. Thanks very much for checking this.
*** Bug 446316 has been marked as a duplicate of this bug. ***
*** Bug 443304 has been marked as a duplicate of this bug. ***
(Directed here from Bug 446316...)
I'm synchronizing a serious amount of data
$ du -hs SYNC/
$ find SYNC/ -type f -o -type l | wc -l
unison227-2.27.57-8.fc9.x86_64 gives up, even with ulimit as discussed above;
however unison227-2.27.57-8.fc9.i386 works.
This isn't just "it uses lots of memory when you sync lots of files". I'm
getting the error even on a very small sync.
From googling around a little, it looks like it's caused by a bug in the ocaml
runtime on x86_64 (http://caml.inria.fr/mantis/view.php?id=4448) which causes it
to occasionally try to allocate ridiculous amounts of memory that it doesn't
actually need. A workaround for that bug has been committed
and is apparently in ocaml 3.10.2. So we probably want to upgrade ocaml and
rebuild any packages that use it on x86_64.
OK. In that case, you should file a bug against OCAML itself, re-open your
original unison bug report, and mark the unison bug report as depending on the
OCAML bug report. I'll rebuild unison once OCAML is fixed.
Adding dependency on bug 445545, since that appears to be what is described in
Dan, can you take a look at the patch in bug 445545? It seems pretty
different from the patch you linked to (although they do both affect the same
function). Do the patches simply fix the same bug in different ways, or are
there 2 different bugs?
I've only just seen this (I'm the Fedora OCaml maintainer!)
Do you think someone could 'strace' the process when it fails? It's very easy
to tell if the failure is related to bug 445545 from the strace. Also do the
OCAMLRUNPARAM=v=0x1ff thing as described here:
The good news is that if it is this bug, a simple rebuild with the latest OCaml
compiler package will fix it.
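A rough sketch of how such a trace might be collected, assuming strace is installed. The helper names `capture_trace` and `failed_allocs` and the output file `unison.strace` are hypothetical. The comment above does not say what to look for in the trace, so filtering for memory-mapping calls that failed with ENOMEM is only an assumption about what is relevant.

```shell
# Capture a system-call trace of unison with verbose OCaml runtime messages.
# OCAMLRUNPARAM=v=0x1ff is the setting quoted in the comment above;
# "unison.strace" is a placeholder output file name.
capture_trace() {
    OCAMLRUNPARAM=v=0x1ff strace -f -o unison.strace unison "$@"
}

# List memory-related system calls in a trace file that failed with ENOMEM.
# Assumption: these are the lines of interest for an "Out of memory" error.
failed_allocs() {
    grep -E '(mmap|mremap|brk)\(.*ENOMEM' "$1"
}

# Example (not run here; "myprofile" is a placeholder unison profile):
#   capture_trace myprofile
#   failed_allocs unison.strace
```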
Looking back a little I see that people have mentioned the upstream bug 4448
(http://caml.inria.fr/mantis/view.php?id=4448). This bug is related to the
problem, but the solutions mentioned there & the patch that went into the
compiler _do_not_ fix the problem on Fedora. So any suggestions you read in
the 4448 thread (eg. turning off VA randomization, etc.) _will_not_ help.
You need the patch which has gone into the OCaml compiler, see bug 445545.
Based on information supplied to me separately, this is an instance of bug 445545.
I'll rebuild unison against the fixed OCaml module which should resolve this issue.
Koji is down at the moment, but I've just checked in an updated unison227 which
should fix the issue.
I'll rebuild it when the above outage is over.
I see you bumped the devel version of unison227. I assume the F9 branch also
needs a rebuild? Does unison213 also need a rebuild?
> I see you bumped the devel version of unison227.
Yes, I bumped it but didn't rebuild the devel package til just now. Koji was
down all of yesterday.
> I assume the F9 branch also
> needs a rebuild? Does unison213 also need a rebuild?
Yes to both. If you want the fix in F-9, you need to rebuild the program with the
fixed compiler. (The reason is that the runtime containing the buggy GC
is statically linked into programs. A rebuild of the program statically links the
new fixed runtime into the program.)
The minimum compiler versions which have the fix are:
F-9: ocaml >= 3.10.1-3
devel: ocaml >= 3.10.2-2
There is no fix in F-8's compiler. If you want it, please file a bug, but I have
never seen the problem occur in F-8 myself, and the problem may be caused
by some change in the mmap(2) call in the kernel between F-8 and F-9.
I'll leave any decision to rebuild F-9 and other versions of unison up to you
because this bug is marked against unison227 in Rawhide only.
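Before rebuilding, the compiler on the build machine can be checked against the minimum versions listed above. This is a minimal sketch: `version_at_least` is a hypothetical helper, and it uses GNU `sort -V`, which only approximates RPM's own version ordering.

```shell
# Succeeds when version $1 is the same as or newer than version $2.
# Uses GNU "sort -V" as an approximation of RPM's version comparison.
# version_at_least is a hypothetical helper name.
version_at_least() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example for F-9 (not run here):
#   version_at_least "$(rpm -q --qf '%{VERSION}-%{RELEASE}' ocaml)" 3.10.1-3 \
#       && echo "compiler has the fix"
```

Because the runtime is statically linked, this check says only what a rebuild would pick up. It says nothing about a unison binary that was built earlier with an older compiler.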
(In reply to comment #17)
> I'll leave any decision to rebuild F-9 and other versions of unison up to you
> because this bug is marked against unison227 in Rawhide only.
F-9 rebuild makes sense, since bug 446316 is for the F-9 version of this
Installing the package unison227-2.27.57-9.fc10.x86_64.rpm fixes the problem in
unison227-2.27.57-8.fc9.1, unison213-2.13.16-10.fc9.1 has been submitted as an update for Fedora 9
I've rebuilt unison227 for F-9, and unison213 for F-9 & devel too, just in case.
Richard, are EL-4/EL-5 affected by the OCaml issue? Thanks.
Actually I don't know. I've not seen this on Fedora < 9.
In theory it _could_ happen -- it's a stupid assumption made by the OCaml runtime
about how mmap allocations happen. (Fixed permanently and properly in OCaml >= 3.11).
But for some reason it doesn't seem to happen in Fedora < 9 (or in any Debian).
Is that because the kernel mmap(2) implementation is different, or is it just
luck? I really have no idea.
unison213-2.13.16-10.fc9.1, unison227-2.27.57-8.fc9.1 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update unison213 unison227'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-4618
unison227-2.27.57-8.fc9.1 resolves the "out of memory" issue for me.
unison213-2.13.16-10.fc9.1, unison227-2.27.57-8.fc9.1 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.
I still encounter the problem when synchronizing between a 32bit fedora-9 client
(my laptop) and an x86_64 fedora-8 server, using unison227-2.27.57-8.fc9.1 on
the client and unison227-2.27.57-7.fc8.2 on the server.
This is the same problem as described above, except that the out of memory
happens on the fedora-8 side.
Presumably we need an update for fedora-8 as well.
You need to rebuild on every branch where this happens.
The patch which fixes bug 445545 hasn't been backported as far as F-8,
although it ought to apply fairly straightforwardly because the two
versions of OCaml are very similar to each other.
Adding dependency on bug 454384; a request for a backport of the fix for 445545
to F-8. Once OCaml is fixed, I'll rebuild unison for F-8.
If this bug really is triggered by kernel mmap differences as previously
suggested, this might have only recently been exposed on F-8, due to the kernel
version having been recently bumped. It's odd that an "old" release like F-8
tracks the latest kernel so closely...
OK. I've modified the spec files and tagged them. Just have to wait until the
new ocaml build is tagged and available for Koji to build against, then I'll
rebuild the F-8 unisons too.
unison213-2.13.16-9.fc8.3, unison227-2.27.57-7.fc8.3 has been submitted as an update for Fedora 8
unison213-2.13.16-9.fc8.3, unison227-2.27.57-7.fc8.3 has been pushed to the Fedora 8 stable repository. If problems still persist, please make note of it in this bug report.