From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
According to /etc/sysconfig/netdump,

# Alternatively, to merely syslog all messages without doing network
# crash dumps, you can set only SYSLOGADDR and leave NETDUMPADDR unset.
# You can also set both.

However, if this configuration is used, the crash signature is successfully sent to the netdump-server, but netdump then still appears to attempt a dump of memory contents. The crashed system hangs on the message "< netdump activated - performing handshake with the server . >" and does not reboot.

Version-Release number of selected component (if applicable):
netdump-0.7.7-2
kernel-2.4.21-37.EL

How reproducible:
Always

Steps to Reproduce:
On the server:
1. Reconfigure syslog to accept remote messages (set SYSLOGD_OPTIONS="-m 0 -r" in /etc/sysconfig/syslog).
2. Restart syslog.
3. Start netdump-server (or not; this seems to make no difference in client behavior).

On the client system to be crashed:
1. Set SYSLOGADDR=<netdump-server-IP> in /etc/sysconfig/netdump. Do not change anything else from the defaults.
2. Start netdump.
3. Crash the client by running (echo "1" > /proc/sys/kernel/sysrq; echo "c" > /proc/sysrq-trigger)

Actual Results:
Crashed system sends the crash signature to netdump-server and then hangs with the message "< netdump activated - performing handshake with the server . >" on the console.

Expected Results:
Crashed system should send the crash signature to netdump-server and then reboot.

Additional info:

Additional information from Dave Anderson:

It's a RHEL3 kernel bug. The netdump operation is kicked off only if the netconsole module has pre-registered its "netconsole_netdump" function in the kernel's "netdump_func" pointer when the module was loaded. Looking at the module's init_netconsole() function, it does the registration regardless of the NETDUMPADDR or SYSLOGADDR arguments passed in.
Here's the end of init_netconsole(), where, if the platform supports it, it does the registration regardless of the arguments passed in:

        if (platform_supports_netdump) {
                if (netdump_register_hooks(netconsole_rx,
                                           netconsole_receive_skb,
                                           netconsole_netdump)) {
                        printk("netdump: failed to register hooks.\n");
                }
        }
        netconsole_dev = ndev;
#define STARTUP_MSG "[...network console startup...]\n"
        write_netconsole_msg(NULL, STARTUP_MSG, strlen(STARTUP_MSG));

        register_console(&netconsole);
        printk(KERN_INFO "netlog: network logging started up successfully!\n");
        return 0;
}

It should pass a NULL as the 3rd argument if no netdump target addresses were passed in.

For example, note that the kernel's netdump_func starts out life as a NULL:

# crash
...
crash> p netdump_func
netdump_func = $1 = (void (*)(struct pt_regs *)) 0
crash>

Here's a netdump session after setting only the SYSLOGADDR in /etc/netdump.config:

# service netdump start
initializing netdump [ OK ]
# tail -10 /var/log/messages
Jan 17 09:10:59 crash netdump:: inserting netconsole module with arguments magic1=0x11111111 magic2=0x11111111 dev=eth0 source_port=6666 syslog_target_ip=0xAC105012 syslog_target_port=514 syslog_target_eth_byte0=0x00 syslog_target_eth_byte1=0x30 syslog_target_eth_byte2=0x6E syslog_target_eth_byte3=0x1E syslog_target_eth_byte4=0xFE syslog_target_eth_byte5=0x40
Jan 17 09:10:59 crash kernel: netlog: using network device <eth0>
Jan 17 09:10:59 crash kernel: netlog: using source IP 172.16.80.17
Jan 17 09:10:59 crash kernel: netlog: using source UDP port: 6666
Jan 17 09:10:59 crash kernel: netlog: using syslog target IP 172.16.80.18, port: 514
Jan 17 09:10:59 crash kernel: netlog: using broadcast ethernet frames to send netdump packets.
Jan 17 09:10:59 crash kernel: netlog: using broadcast ethernet frames to send netdump packets.
Jan 17 09:10:59 crash kernel: netlog: using syslog target ethernet address 00:30:6e:1e:fe:40.
Jan 17 09:10:59 crash kernel: netlog: network logging started up successfully!
Jan 17 09:10:59 crash netdump: initializing netdump succeeded
[root@crash root]#

If NETDUMPADDR had been set, you'd see a set of "netdump_target_eth_byte*" values in the "inserting" message above. None are present here, yet the kernel's netdump_func pointer has still been set:

# crash
...
crash> p netdump_func
netdump_func = $1 = (void (*)(struct pt_regs *)) 0xe2d8e710
crash> sym 0xe2d8e710
e2d8e710 (t) netconsole_netdump
crash>
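The suggested change (pass NULL as the 3rd argument when no netdump target was given) can be illustrated with a small userspace model. This is only a sketch, not the RHEL3 kernel source or the patch that was eventually committed: register_dump_hook(), init_netconsole_model(), and the two IP parameters are hypothetical names introduced here, and only the third (dump) argument of netdump_register_hooks() is modeled.

```c
/* Userspace model only: NOT the RHEL3 kernel code or the shipped fix. */
#include <stddef.h>
#include <stdint.h>

struct pt_regs;                                /* opaque, as in the crash output */
typedef void (*netdump_fn)(struct pt_regs *);

/* Stand-in for the kernel's global hook pointer; starts out NULL. */
netdump_fn netdump_func = NULL;

/* Stand-in for the module's dump routine; does nothing in this model. */
void netconsole_netdump(struct pt_regs *regs)
{
    (void)regs;
}

/* Hypothetical, reduced form of netdump_register_hooks(): only the
 * third (dump) argument is modeled. Returns 0 on success. */
int register_dump_hook(netdump_fn dump)
{
    netdump_func = dump;
    return 0;
}

/* Hypothetical init routine: install the dump hook only when a netdump
 * target address was supplied. A syslog-only setup registers NULL, so
 * the hook pointer stays unset. An IP of 0 means "not configured". */
int init_netconsole_model(uint32_t netdump_target_ip,
                          uint32_t syslog_target_ip)
{
    (void)syslog_target_ip;    /* syslog logging is set up either way */
    return register_dump_hook(netdump_target_ip ? netconsole_netdump : NULL);
}
```

In this model, init_netconsole_model(0, 0xAC105012) corresponds to the SYSLOGADDR-only session above and leaves netdump_func at NULL, so a crash path that checks the pointer before calling it would skip the dump handshake. Passing a nonzero first argument installs netconsole_netdump as before.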
Changed component to "kernel" from "netdump", as it's a bug with the netconsole kernel module. Linda -- can you link this to RHEL3-U8 and give it a devel_ack?
Please give a qa_ack+ to this BZ.

QA procedure:
1. Set up only SYSLOGADDR in /etc/sysconfig/netdump, pointing to a remote system whose /etc/sysconfig/syslog file has the "-r" flag turned on.
2. Do a "service netdump start".
3. Crash the system with alt-sysrq-c or "echo c > /proc/sysrq-trigger".
4. Verify:
   - the oops message made it to the remote /var/log/messages.
   - the client did not attempt to do a netdump operation.
A fix for this problem was committed to the RHEL3 U8 patch pool this evening (in kernel version 2.4.21-40.2.EL).
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0437.html