We have found that when running Red Hat Linux 6.x as a NIS client talking to
a Solaris NIS server, there appears to be a file descriptor leak in
yp_order, which causes am-utils (amd) to slowly leak file descriptors and
eventually stop working. amd ends up in a munged state and must be restarted.
The problem does not manifest itself until after a number of hours of
continuous use/uptime (e.g. 20-24 hours). Currently our workaround is just to
restart amd every 24 hours. A "hacked" patch has been written by a fellow
am-utils user; it appears to slow down the leakage, but it does not fix the
problem, since the problem does not appear to be in am-utils itself.
Attached below is this person's e-mail detailing more of the problem.
Thanks very much!
Yes, I have, and it has finally (I think) been nailed down thanks to Johann
Pfefferl. The problem is a file descriptor leak in yp_order, which is
called by amd to verify NIS maps. Note that the problem is in libc, rather
than in amd, but you'll probably only get bitten by it from amd. (If you
aren't using NIS, then I am mistaken, and you suffer from a different
problem.)
If you are using NIS, you should be able to determine if this is the
trouble by starting a fresh amd, use lsof to look at the number of open
file descriptors. Then issue an amq -f to get amd to flush its maps, and
run lsof again. The number of open file descriptors will probably
increase. If not, then the trouble is probably something else.
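For anyone who would rather script the lsof / amq -f / lsof check than eyeball it, here is a rough sketch in Python. It assumes a Linux /proc filesystem, that you know amd's PID, and that you have permission to read /proc/<pid>/fd (normally root). The function names and the settle delay are made up for illustration; only `amq -f` comes from the description above.

```python
import os
import subprocess
import time


def count_open_fds(pid):
    """Count the entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_growth_after_flush(amd_pid, settle_seconds=5.0):
    """Return how many descriptors amd gained across one `amq -f` map flush."""
    before = count_open_fds(amd_pid)
    subprocess.run(["amq", "-f"], check=True)  # ask amd to flush its maps
    time.sleep(settle_seconds)  # assumed delay; tune for your site
    return count_open_fds(amd_pid) - before
```

Calling `fd_growth_after_flush(pid_of_amd)` a few times in a row and seeing a positive number each time would point at this leak. A result of zero suggests the trouble is something else.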
I believe that you may be bitten by this bug even if you don't use NIS to
distribute your amd maps (I do), if your maps require passwd or group
lookups. If the latter is the case, Johann reports that you can stop the
leakage by using nscd. He also said that he does not see the leak if he
uses a Linux machine as his NIS server. (Johann has been having this
problem on SUSE Linux 6.2.)
If you use NIS maps for amd, you might be able to work around the problem
by switching to file maps and running nscd. Or you can put in a cron job
to restart amd every 12 hours or so.... (In my case, if I wait for amd to
fail, restarting is not sufficient because it leaves funny stuff in the
mtab. I must also then umount the offending filesystems.)
I can mail you a patch (derived from the patch Johann mailed me) to the
am-utils-6.0.3s1 sources which seems to work for me. Note that this patch
is a bit of a hack (basically, it just closes stray sockets created after
calling yp_order), but without getting the Linux yp_order fixed I don't see
another way. I suspect Erez will release another snapshot shortly after he
returns from LISA to address this problem.
Institute for Mathematical Sciences
State University of New York at Stony Brook
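To illustrate the idea behind the hack described above (close whatever stray sockets show up after the leaky call), here is a minimal sketch in Python. This is not the actual am-utils patch, which is C and is not reproduced in this report. The helper names are hypothetical, it assumes a Linux /proc filesystem, and it would also close any socket the wrapped call legitimately meant to keep open.

```python
import os
import stat


def open_fds():
    """Return the set of descriptors currently open in this process (Linux /proc)."""
    fds = set()
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        try:
            os.fstat(fd)
        except OSError:
            continue  # the descriptor listdir used for the scan is already closed
        fds.add(fd)
    return fds


def call_and_close_stray_sockets(func, *args):
    """Call func, then close any socket descriptors that appeared during the call."""
    before = open_fds()
    try:
        return func(*args)
    finally:
        for fd in open_fds() - before:
            try:
                if stat.S_ISSOCK(os.fstat(fd).st_mode):
                    os.close(fd)  # stray socket left behind by the call
            except OSError:
                pass  # descriptor vanished in the meantime; nothing to do
```

Only sockets are closed, so files or pipes opened by the call are left alone. As with the real patch, this only hides the symptom; the leak itself would still need fixing in the library.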
Does it help if you disable the nisplus maps/lookups from the
assign to jakub
Is this bug still present in Red Hat Linux 7.3 or 8.0? I was unable to duplicate
some of the problems seen but may not have the exact settings in my hasty
Closing out due to bit-rot.