Red Hat Bugzilla – Bug 154969
netdump fails with incorrectly configured eth device
Last modified: 2007-11-30 17:07:17 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050225 Firefox/1.0.1 Red Hat/1.0.1-1.4.3
Description of problem:
On a multihomed system, if DEV= is not set in /etc/sysconfig/netdump, /etc/init.d/netdump defaults it to eth0. If eth0 is not up, or is the wrong interface for running arping against the netdump server, the netdump service dies ungracefully.
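The defaulting behaviour described above can be sketched as follows. This is a minimal illustration, not the shipped init script: resolve_netdump_dev is a hypothetical helper name, and the exact way the real script applies the default is an assumption.

```shell
# Sketch only: resolve_netdump_dev is a hypothetical helper, not part of
# the shipped /etc/init.d/netdump.
resolve_netdump_dev() {
    # Use DEV when it is set and non-empty; otherwise fall back to eth0,
    # which is the default the report describes.
    printf '%s\n' "${DEV:-eth0}"
}
```

With DEV=eth1 set in /etc/sysconfig/netdump the helper prints eth1; with DEV unset or empty it prints eth0, which is the case that goes wrong when eth0 is down.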
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Ensure that eth0 is not active and that another interface is.
2. Start the netdump service: /etc/init.d/netdump start
3. Watch the errors that are printed to the terminal.
Actual Results: netdump fails to start, which is expected because the DEV= parameter is set incorrectly. However, the error output is ambiguous, because the service keeps trying to start even after it cannot arping the netdump server:
netdump: cannot arp <ip address>
bash: line 1: /var/crash/magic/: Is a directory
netdump: could not ssh to server <ip address>
netdump server ssh key exchange [FAILED]
Expected Results: the netdump service should fail at the point where it cannot arping the netdump server, with a more descriptive error message.
arping -c 1 -I $DEV $host &> /dev/null
[ $? -ne 0 ] && echo "$prog: cannot arp $host" 1>&2
In this case the script needs to exit if $? -ne 0, rather than continuing to try to start the service and producing the ambiguous error messages.
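The change requested above could look roughly like the sketch below. It is an illustration under stated assumptions, not a patch against the real script: check_arp is a hypothetical function name, and the ARPING variable is an assumption added only so the probe command can be swapped out when trying the function without a network.

```shell
# Sketch only: check_arp and the ARPING override are hypothetical.
check_arp() {
    local dev=$1 host=$2
    # Same probe as the quoted script: one ARP request on the given interface.
    if ! ${ARPING:-arping} -c 1 -I "$dev" "$host" > /dev/null 2>&1; then
        echo "${prog:-netdump}: cannot arp $host" 1>&2
        # Report failure so the caller can stop instead of falling through.
        return 1
    fi
    return 0
}
```

A caller would then bail out early, for example with `check_arp "$DEV" "$host" || exit 1`, so that the ssh key exchange errors never appear.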
Looks like this was supposed to be addressed in Bugzilla 106546.
Consider the case where the netdump server is not online at the time you run
service netdump start. If NETDUMPKEYEXCHANGE=none is set, then the server does
not have to be online to start the service. Thus, you can fail to ping the
server, and still end up with a working netdump setup.
The init script is fragile enough as it is. I'll add in the device name tried
to the error message, but I will not keep the script from falling through.
Created attachment 114848
Include the device name in the failed arp ping error message
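The attachment itself is not reproduced in this report. As a rough sketch of the approach the maintainer describes (name the device in the message, but keep falling through), something like the following would fit. The function name warn_if_arp_fails, the ARPING override, and the exact wording of the message are all assumptions, not the contents of the attachment.

```shell
# Sketch only: warn_if_arp_fails, the ARPING override and the message
# wording are hypothetical; the real attachment may differ.
warn_if_arp_fails() {
    local dev=$1 host=$2
    if ! ${ARPING:-arping} -c 1 -I "$dev" "$host" > /dev/null 2>&1; then
        # Name the interface so a wrong DEV= setting is easy to spot.
        echo "${prog:-netdump}: cannot arp $host on $dev" 1>&2
    fi
    # Always succeed: the server may legitimately be offline at start time
    # (for example with NETDUMPKEYEXCHANGE=none), so startup continues.
    return 0
}
```

The trade-off is the one stated in the comment above: the message becomes more useful, but a failed arping is still only a warning.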
This should be addressed in the latest packages available in RHEL 4 U3. Any
package versioned 0.7.14 or later should contain this fix.