One of our machines complained about a broken NIS configuration, so I had a look. ypbind was not running, and 'ypwhich' reported 'ypwhich: Can't communicate with ypbind'. Checking /var/log/messages, I found this message:

Nov 22 14:25:53 ankaa kernel: ypbind[29297]: segfault at 0000000000000008 rip 000000000040627f rsp 0000007fbffff390 error 4

I have ypbind version 1.17.2-3 installed on the machine, and uname reports "Linux ankaa.uio.no 2.6.9-22.ELsmp #1 SMP Mon Sep 19 18:00:54 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux". The machine has SELinux enabled with type set to 'targeted'; I'm not sure if that is relevant.

I do not know how to reproduce this problem, nor how often it occurs, but thought it best to report it since ypbind is a network server. I'll keep an eye out for it, and I can provide syslog messages from around that period, if relevant.
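For anyone triaging the kernel message quoted above, here is a minimal sketch of how the segfault line could be picked apart. The function name parse_segfault is hypothetical, the regular expression assumes the exact 2.6-era x86_64 format shown in the report, and the flag decoding assumes the usual x86 page-fault error-code bits (bit 0 = protection fault, bit 1 = write, bit 2 = user mode) with the error value printed in hex.

```python
import re

# Matches kernel segfault lines in the format quoted in this report
# (assumed: 2.6-era x86_64 kernels; other kernels format this differently).
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def parse_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: error code is printed in hex
    return {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "address": int(m.group("addr"), 16),
        "rip": int(m.group("rip"), 16),
        "rsp": int(m.group("rsp"), 16),
        "error": err,
        # x86 page-fault error-code bits
        "protection_fault": bool(err & 1),  # clear means page not present
        "write": bool(err & 2),             # clear means the access was a read
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    line = ("Nov 22 14:25:53 ankaa kernel: ypbind[29297]: segfault at "
            "0000000000000008 rip 000000000040627f rsp 0000007fbffff390 error 4")
    print(parse_segfault(line))
```

Under those assumptions, "error 4" decodes to a user-mode read of a page that is not mapped, and a faulting address of 0x8 is what one would typically see when a field at a small offset is read through a NULL pointer. That is only a hint about where to look, not a diagnosis.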
We have a 300-node computing cluster, and starting 5 days ago, for no apparent reason, ypbind began dying on the nodes at random, at a rate of a couple of nodes an hour. We have a mix of 64-bit and 32-bit nodes. The segfault message only ends up in the log on the 64-bit nodes. I imagine this is some kind of debug option present only in the x86_64 kernel, because the failures occur equally randomly on both 64-bit and 32-bit nodes. It seems that something recently added to our maps is tickling a bug, perhaps pushing some memory allocation over the edge.
Can you post the output of 'rpm -q ypbind' on your x86 & x86_64 nodes?
Would it also be possible to provide your maps so I can more easily replicate the issue on our systems?
Closing due to inactivity.