Red Hat Bugzilla – Bug 90036
race/deadlock in fork() with signal handler.
Last modified: 2016-11-24 07:34:22 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003
Description of problem:
Random hangs in smbmount. roland says:
It looks like the loser case is the parent doing fork, and getting the SIGTERM
as it returns from the syscall (because the child is scheduled first). Then the
signal handler calling exit deadlocks with a lock that fork holds.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Add an entry to /etc/fstab like:
   //192.168.48.120/samba1 /mnt/smb/rhl-8-0/samba1 smbfs uid=3616,gid=3616,password=samba1,username=samba1 0 0
2. mount -a -t smbfs
3. If the mount succeeds, do umount -a -t smbfs and repeat steps 2 and 3 until
Actual Results: Sometimes the mount hangs. On some boxes it hangs often (like
rhl-9.lab.boston.redhat.com). On others it (almost?) never does.
Expected Results: The mount should never hang.
*** Bug 89643 has been marked as a duplicate of this bug. ***
*** Bug 88841 has been marked as a duplicate of this bug. ***
*** Bug 82820 has been marked as a duplicate of this bug. ***
*** Bug 89197 has been marked as a duplicate of this bug. ***
The fork function is not signal-safe, which is a bug.
In the smbmount case what happens is that the fork child runs
before the parent and sends its parent a signal. The parent's signal
handler calls exit, which deadlocks with an internal lock held by
the interrupted fork. I'm attaching a trivial test program.
When that's linked with a library that has a destructor, it hangs.
e.g. "gcc -o forkloser -g forkloser.c -lanl"
Created attachment 91467
test case for glibc/nptl fork bug.
Link with some library that has destructors to demonstrate the bug.
e.g. "gcc -o forkloser -g forkloser.c -lm" is what I tried.
Hang may depend on child-runs-first, but I saw it on an smp kernel as well.
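For readers without access to the attachment, here is a minimal sketch of the sequence described above. It is not the attached forkloser.c, and the names on_term, victim and fork_race_once are hypothetical. It assumes the pattern from the report: the parent installs a SIGTERM handler that calls exit, forks, and the child signals the parent right away. The attempt runs in a separate process so that a hang can be detected and cleaned up.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Handler that calls exit(), as in the report. exit() runs atexit
 * handlers and library destructors, so it is not async-signal-safe. */
static void on_term(int sig)
{
    (void)sig;
    exit(0);
}

/* Hypothetical "victim": forks, and the child signals it immediately.
 * If the child runs first, SIGTERM can be delivered just as fork()
 * returns in the parent. */
static void victim(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGTERM, &sa, NULL);

    pid_t child = fork();
    if (child < 0)
        _exit(3);                    /* fork failed, nothing to test */
    if (child == 0) {
        kill(getppid(), SIGTERM);    /* signal the parent right away */
        _exit(0);
    }
    for (;;)
        pause();                     /* wait for SIGTERM if not yet delivered */
}

/* Runs one attempt in its own process. Returns 0 if the victim exited
 * within timeout_ms, 1 if it appeared to hang (it is then killed),
 * and -1 on error. */
int fork_race_once(int timeout_ms)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        victim();
        _exit(2);                    /* not reached: victim() never returns */
    }
    for (int waited = 0; waited < timeout_ms; waited += 10) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return 0;                /* victim terminated */
        if (r < 0)
            return -1;
        struct timespec ts = {0, 10 * 1000 * 1000};  /* 10 ms */
        nanosleep(&ts, NULL);
    }
    kill(pid, SIGKILL);              /* looks hung: clean up */
    waitpid(pid, NULL, 0);
    return 1;
}
```

Calling fork_race_once(2000) in a loop and counting the 1 results gives a rough hang rate. On a glibc without this bug it should keep returning 0. Per the report, the hang only showed up when the program was linked against a library that has destructors.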
I've checked into the nptl cvs archive a patch which removes the lock for
calling the registered handlers in fork. It'll be in the next binary RPMs we'll
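For context, a small sketch of what "registered handlers in fork" can look like from the application side. This assumes the handlers meant here are the ones registered through pthread_atfork; it does not show the nptl patch itself, and the function names are hypothetical. fork() runs the prepare handler before creating the child, then the parent and child handlers in the respective processes.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counters showing which handlers ran in the current process. */
static int prepare_calls, parent_calls, child_calls;

static void on_prepare(void) { prepare_calls++; }  /* before fork, in the parent */
static void on_parent(void)  { parent_calls++; }   /* after fork, in the parent  */
static void on_child(void)   { child_calls++; }    /* after fork, in the child   */

/* Registers the three handlers; returns 0 on success, else an error number. */
int register_fork_handlers(void)
{
    return pthread_atfork(on_prepare, on_parent, on_child);
}

/* Forks once. The child exits with its own child_calls count as status;
 * the parent returns that status, or -1 on failure. */
int fork_and_report(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_calls);
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

After register_fork_handlers() and one fork_and_report() call, the parent should see prepare_calls and parent_calls at 1 and child_calls still at 0, while the child reports 1. Older glibc versions may need -pthread at link time.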
Sorry - trying to add myself to cc list :~)
This is serious problem for me. Any ideas on when the fix will be released?
We are experiencing smbmount hangs about 80% of the time using RH9. This is a
serious issue for us. Is there an estimated timeframe for a fix on this?
Can we recompile using LD_ASSUME_KERNEL=2.2.5 to avoid this problem? Which
specific packages should be recompiled? This problem is killing me.
I have had good luck rolling back to kernel 2.4.18 from RedHat 8.0, so I would
recommend trying that out. The RPM is easy to find (
Changing the kernel version may change the scheduler behavior (and thus the
chance that the child process will run before the parent), but will not address
the actual bug. Only fixing glibc will 100% prevent this hang.
*** Bug 97325 has been marked as a duplicate of this bug. ***
I've used the following on my RH9:
mv /usr/bin/smbmount /usr/bin/_smbmount
more <<EOT > /usr/bin/smbmount
#!/bin/sh
/usr/bin/_smbmount \$1 \$2 \$3 \$4 \$5 \$6 \$7 \$8 \$9 &
kill -QUIT \$! > /dev/null
EOT
chmod 0755 /usr/bin/smbmount
The following commands apparently fix the problem for me:
mv /usr/bin/smbmount /usr/bin/smbmount.orig
cat <<'EOF' >/usr/bin/smbmount
#!/bin/sh
exec /usr/bin/smbmount.orig "$@"
EOF
chmod 755 /usr/bin/smbmount
Is this related to Bug 88599?
What is the ETA on the new binary RPMs? (per comment #7) None of the posted
workarounds work for me, and this is becoming a serious problem.
HA! Forget it. This bug is so old I have had time to install Gentoo. Wait for
Seriously though, roll back to the 2.4.18 kernel from 8.0. It doesn't fix the
problem (as noted above), just stops you from EVER seeing it again (on the 3
boxen I have tried it on).
You could try a custom kernel too I guess.
*** Bug 97743 has been marked as a duplicate of this bug. ***
The workaround from KAS, comment #16, worked in our environment. RH9, Shrike,
as repackaged by KRUD, Sept. 2003 edition.
*** Bug 103202 has been marked as a duplicate of this bug. ***
Is this bug un-fixable or something? It's obviously not obscure since so many
other bugs have been marked as duplicates, so lots of folks are running into
problems with it. (none of the fixes work for me so I'm ranting a bit)
Seriously though, what gives?
100% reproducible for me on Dell Inspiron 4550 RH 9.0. I mount from
the /etc/fstab, so that means that my system doesn't boot
Bug #89589 is also a dupe of this one. For those who are still
suffering under this: the workaround in comment #11 does work.
Sorry, I was terribly unclear. I meant "the workaround in comment #11
of bug #89589 does work".
Give the code at
a try. It should have the fix for this bug (among others).
Beautiful - works great so far! Thanks for the pointer to the new
rpms. (comment #27)
Closing as fixed in current version.
What exactly is the current version? And which package are you
speaking of? Samba, glibc, or the kernel?
I just installed RH9 over the weekend (12/20/03) and upgraded all
packages RHN suggested. I am having oplock issues with my shares, not
with mounting, but with file locking it would seem from smbd.log.
Trying to get the fix so I can leave oplocks on hopefully.
Looks like these aren't the drones...
My issue matches bug 98861 better.
The link is no longer available for:
The closest ones are:
I presume it is now: