On an up-to-date Red Hat 6.2 SMP machine, the new ypbind-mt (ypbind-1.7-0.6.x.i386.rpm) failed after only 3 hours of operation, with a message in the logs about an svc_run failure. On the other, uniprocessor machines it keeps running, but the cron job rmmod -as constantly produces this error, which did not happen with ypbind-3.3:

    yp_all: clnt_call: RPC: Timed out

The error also shows up on other occasions: since the new ypbind failed on me once, I made a cron job to test it on all the other machines, and that job very often produces the same error. Again, none of these problems happened with ypbind-3.3. Thanks.
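For anyone who wants to run a similar check, here is a minimal sketch of a probe that such a test cron job could call. It is not the reporter's actual script. The function name `probe` and the 10-second default timeout are assumptions made for illustration; the function runs whatever command it is given and reports whether it finished cleanly within the timeout.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Run cmd and return (ok, detail).

    ok is False if the command times out, cannot be started,
    or exits with a non-zero status. Hypothetical helper, not part of ypbind.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out after %.1fs" % timeout_s
    except OSError as exc:
        # Covers a missing or non-executable binary.
        return False, "could not run: %s" % exc
    if result.returncode != 0:
        stderr_text = result.stderr.decode(errors="replace").strip()
        return False, stderr_text or "exit %d" % result.returncode
    return True, "ok"
```

From cron one might call `probe(["ypwhich"])` and log the returned detail on failure. Which NIS client command is the right one to probe depends on the site.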
Upgrading ypserv to the latest version, 1.3.11, helped a great deal (though it is still not perfect). Before the upgrade, ypserv would occasionally complain of having too many children; it no longer does.
jakub: do you think this is glibc related?
Hard to say. Maybe it is the way glibc reacts when it doesn't get a reply quickly (would a longer timeout be better?), but then again, why were there no timeouts with the old ypbind? My guess is that ypbind-mt queries ypserv (overly?) aggressively, and the older version of ypserv simply couldn't keep up with 70 or so clients that were now also checking every 15 minutes which server is fastest. The newer version of ypserv handles this much better. I'm not sure why: the Changelog explicitly mentions only RPC protocol fixes and better handling of fork calls (the latter could explain why ypserv no longer complains about too many children). RPC timeouts still occur occasionally, but at a rate that really doesn't matter (about once a day?).
Forgot to add a comment about ypbind-mt dying, which is pretty serious for us. It has happened once more since the first time, but we had a monitoring process restart it automatically. This is on our busiest machine, so I cannot tell whether it is due to load or to the fact that this machine is dual-CPU (but ypbind-mt should already be threaded and safe, hm?); it has never happened on workstations with exactly the same package. It could be a problem in glibc threads: is svc_run thread-safe?
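A monitoring process of the kind mentioned above can be sketched roughly as follows. This is not the reporter's actual monitor. The names `watch`, `is_alive` and `restart` are hypothetical, and how liveness is checked or the daemon restarted (a pid file, the init script, and so on) is site-specific and left to the caller.

```python
import time


def watch(is_alive, restart, checks, interval_s=60.0, sleep=time.sleep):
    """Poll is_alive() `checks` times and call restart() whenever it is False.

    Returns the number of restarts performed. `sleep` is injectable so the
    loop can be exercised without real delays.
    """
    restarts = 0
    for i in range(checks):
        if not is_alive():
            restart()
            restarts += 1
        # No need to wait after the final check.
        if i + 1 < checks:
            sleep(interval_s)
    return restarts
```

Counting restarts is a deliberate choice: a daemon that needs restarting many times a day is a symptom worth reporting, not just papering over.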
Do you still see this? Can you try our later packages?
I haven't seen this at all in recent versions, and the reporter mentioned that it was basically "fixed" for him, so I'm closing this bug.