Description of problem:
Customer has a shell script wrapper around /bin/kill. It is called about once a second with the arguments:

    killwrapper -0 <pid>

It is used to test whether a given process is still running.

We used a SystemTap script to capture the command lines. In two cases, the call to exec for the kill wrapper had the expected arguments, but when exec was called for the real kill command, the command line for the kill wrapper showed that one of the arguments was NULL instead of the original value. The kill wrapper does not modify its command line arguments. When its first argument is not the expected argument, it writes data to a file. In the observed cases, it did not write data.

Version-Release number of selected component (if applicable):
kernel-2.6.18-194.3.1.el5.s390

How reproducible:
Difficult to reproduce. Reproduces at the customer site every 2 weeks to 2 months using Oracle clustering.

Steps to Reproduce:
1. Run Oracle RAC clustering between several nodes
2. Beat on it for several weeks

Actual results:
Cluster nodes evicted

Expected results:
Cluster remains running

Additional info:
There is plenty of memory available.
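
For readers unfamiliar with this kind of wrapper, here is a minimal sketch of what a script like the one in the description might look like. The customer's actual script is not included in this report. The function name, the REAL_KILL and KILLWRAPPER_LOG variables, the log path, and the log format are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical reconstruction of the kill wrapper; not the customer's script.
# REAL_KILL and KILLWRAPPER_LOG are assumed names, not taken from the report.
REAL_KILL="${REAL_KILL:-/bin/kill}"
KILLWRAPPER_LOG="${KILLWRAPPER_LOG:-/tmp/killwrapper.log}"

killwrapper() {
    # The expected call is "killwrapper -0 <pid>". If the first argument is
    # anything else (including empty or missing), record what was received.
    if [ "${1-}" != "-0" ]; then
        printf '%s unexpected args (count=%d): %s\n' \
            "$(date)" "$#" "$*" >> "$KILLWRAPPER_LOG"
    fi
    # Pass the arguments through unmodified to the real kill command.
    "$REAL_KILL" "$@"
}
```

A caller would use it as `killwrapper -0 "$pid" && echo "still running"`. Logging the argument count alongside the arguments is a design choice in this sketch: it distinguishes a missing argument from an empty one, which a plain dump of `$*` would not.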
Supposedly the machines in question didn't have a problem with the -194 kernel, but have a problem with the -194.3.1 kernel. This may be a regression caused by the fix for BZ 545527.

This appears to have been fixed in the -238 kernel with BZ 627298.

*** This bug has been marked as a duplicate of bug 627298 ***