Description of Problem:
Dump (0.4b21) sometimes hangs while attempting to dump (large?)
filesystems. The hang manifests as a cessation of output; if one
examines the process tree of dump processes, every process is blocked
waiting on something (the grandparent on the parent's death, the parent
on socket input, the children in pause()).
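
For anyone trying to confirm the same state, the process-tree check described above can be sketched roughly as follows. This is only an illustration, not part of dump: it assumes a procps-style `ps` that accepts `-eo pid,ppid,stat,wchan,comm`, the helper names (`parse_ps`, `dump_tree`, `live_snapshot`) are made up for this example, and the wait-channel names you see will depend on the kernel.

```python
import subprocess
from collections import defaultdict


def parse_ps(text):
    """Parse `ps -eo pid,ppid,stat,wchan,comm` output into a list of dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # first line is the header
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # skip anything that does not have all five columns
        pid, ppid, stat, wchan, comm = parts
        rows.append({"pid": int(pid), "ppid": int(ppid), "stat": stat,
                     "wchan": wchan, "comm": comm.strip()})
    return rows


def dump_tree(rows, name="dump"):
    """Return one line per process called `name`, indented by tree depth."""
    procs = {r["pid"]: r for r in rows if r["comm"] == name}
    children = defaultdict(list)
    roots = []
    for pid in sorted(procs):
        ppid = procs[pid]["ppid"]
        if ppid in procs:
            children[ppid].append(pid)
        else:
            roots.append(pid)  # parent is not a dump process (e.g. a shell)

    lines = []

    def walk(pid, depth):
        r = procs[pid]
        lines.append("%s%d %s %s" % ("  " * depth, pid, r["stat"], r["wchan"]))
        for child in children[pid]:
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


def live_snapshot():
    """Run ps on the local machine (assumes a procps-style ps is installed)."""
    result = subprocess.run(["ps", "-eo", "pid,ppid,stat,wchan,comm"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Running `print("\n".join(dump_tree(parse_ps(live_snapshot()))))` while dump is stuck should list each dump process with its state and wait channel; if every one of them is sleeping and none of the values change between runs, that matches the hang described here.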
Since people are not hammering down the doors with this one, I assume
that it is somewhat of a Heisenbug, depending on the exact situation
and filesystem in question. At the moment I can reproduce it reliably
on three of my filesystems; tomorrow I may not be able to do so. (Thus
I cannot 100% guarantee that I will be able to test a fix for this.)
This appears to be a bug already known to the dump developers; on the
SourceForge dump project page it is bug id 223582; the full URL is
The conclusion reached there is that there is a compiler bug in the
gcc shipped with Red Hat 7.0 and 7.1 that dump is tripping over. If
compiled with kgcc, dump 0.4b22 works fine; I have verified this.
Jakub, can you look at this please? If it is a compiler bug, it is way
over my head. It could be just another case of "Red Hat's compiler won't
build it so their compiler is broken because my code couldn't possibly
have bugs in it such as standards compliance issues" - which is the case
99% of the time from my experience.
I should clarify something:
The problem happens with *Red Hat's* 7.1 dump-0.4b21 RPM, as
installed normally on a Red Hat 7.1 system.
It also happens with the latest 'master' version of dump, from the dump
people; they are tracking that as a bug, and assert that it is a compiler
problem. (I don't know if their assertion is right or not; I stand mute
on the issue.)
Regardless of whether the compiler or the dump code itself is the buggy
party, Red Hat's dump RPM has a problem. The simplest short term fix may
or may not be 'rebuild with kgcc'. (My preferred long term fix is 'either
fix the compiler or fix dump's code problems, whichever is the actual
cause', but finding the core problem is probably going to take more than
a bit of work.)
Jakub, after looking into this problem a bit, it does indeed seem to point
to a compiler bug. Could you please look into this and see what is
triggering it? It appears to work fine if built without optimizations,
but fails if built normally.
If you are able to reproduce this with current GCC and dump, please