From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

Description of problem:
I've tuned the filesystem to panic on error instead of remounting read-only. After pulling out the FC cables so that all I/O to the device is suspended, it tries to panic in line 196 of super.c, but the kernel doesn't panic. Isn't it supposed to use ext3_panic, which makes sure that panic doesn't try to sync the filesystem?

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
0. echo "10" >/proc/sys/kernel/panic
1. Mount ext3 filesystem
2. Disable I/O to the device (pull out Fibre Channel cables)
3. Create I/O to the filesystem
4. ext3 says it will panic but the panic does not work

Actual Results:
Everything continues normally except the filesystem on the borked device

Expected Results:
The kernel should have panicked

Additional info:
The functionality of /proc/sys/kernel/panic is described in the proc(5) manual page as follows: "The file panic gives read/write access to the kernel variable panic_timeout. If this is zero, the kernel will loop on a panic; if nonzero it indicates that the kernel should autoreboot after this number of seconds." Perhaps you want to use the "errors=panic" mount option.
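As a rough illustration of the proc(5) text quoted above, the following Python sketch reads the current value and describes it. The helper names are made up for this example, and it only covers the two cases the quoted text mentions (zero and a positive number of seconds); anything else is reported as not covered.

```python
from pathlib import Path

# Path taken from the report; the helper names below are hypothetical.
PANIC_TIMEOUT_PATH = Path("/proc/sys/kernel/panic")


def describe_panic_timeout(value: int) -> str:
    """Describe a panic_timeout value using only the quoted proc(5) wording."""
    if value == 0:
        return "loop on panic (no automatic reboot)"
    if value > 0:
        return f"autoreboot {value} seconds after a panic"
    # Negative values are not described in the quoted text.
    return "not covered by the quoted proc(5) text"


def read_panic_timeout(path: Path = PANIC_TIMEOUT_PATH) -> int:
    """Read the integer stored in the panic sysctl file."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    if PANIC_TIMEOUT_PATH.exists():
        print(describe_panic_timeout(read_panic_timeout()))
    else:
        print("no /proc/sys/kernel/panic on this system")
```

Note that this setting only controls what happens after a panic has completed. It does not decide whether ext3 panics on an error; that is what the "errors=panic" behaviour is for.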
I think I have understood the meaning of this variable. I set it to 10 to get a 10-second timeout before autoreboot. I did not mount with errors=panic; I used tune2fs to set the flag in the superblock.

As you can see from the following log excerpt, the ext3 filesystem wanted to panic, but nothing happens and everything continues to run, because the kernel is hung in sys_sync trying to sync the very filesystem that invoked panic in the first place. This possibly happens because super.c calls panic directly instead of its own ext3_panic, which disables sync for itself.

Dec 9 10:12:14 aasdcm04a kernel: SCSI disk error : host 2 channel 0 id 1 lun 0 return code = 10000
Dec 9 10:12:14 aasdcm04a kernel: I/O error: dev 08:31, sector 32
Dec 9 10:12:14 aasdcm04a kernel: raid1: sdd1: rescheduling block 32
Dec 9 10:12:14 aasdcm04a kernel: raid1: sdd1: unrecoverable I/O read error for block 32
Dec 9 10:12:14 aasdcm04a kernel: EXT3-fs error (device md(9,1)): ext3_get_inode_loc: unable to read inode block - inode=2, block=4
Dec 9 10:12:14 aasdcm04a kernel: Aborting journal on device md(9,1).
Dec 9 10:12:14 aasdcm04a kernel: SCSI disk error : host 2 channel 0 id 1 lun 0 return code = 10000
Dec 9 10:12:14 aasdcm04a kernel: I/O error: dev 08:31, sector 3936
Dec 9 10:12:14 aasdcm04a kernel: Kernel panic: EXT3-fs (device md(9,1)): panic forced after error
Dec 9 10:12:14 aasdcm04a kernel:
Dec 9 10:12:16 aasdcm04a heartbeat[5226]: info: Resetting node (null) with [external STONITH device]
Dec 9 10:12:16 aasdcm04a heartbeat[5226]: info: Host (null) external-reset initiating
Dec 9 10:12:16 aasdcm04a heartbeat[5226]: ERROR: command '/usr/local/lib/xseries.sh aasdcm09a /etc/ha.d/stonith.nodes' failed
Dec 9 10:12:16 aasdcm04a heartbeat[5226]: ERROR: Host (null) not reset!
Dec 9 10:12:16 aasdcm04a heartbeat[3310]: ERROR: Exiting STONITH (null) process 5226 killed by signal 11.
Dec 9 10:12:16 aasdcm04a heartbeat[3310]: ERROR: STONITH of (null) failed. Retrying...
Dec 9 10:12:20 aasdcm04a watchdog[3718]: cannot stat /shared/AASTEST-1/lost+found (errno = 2 = 'No such file or directory')
Dec 9 10:12:21 aasdcm04a heartbeat[5229]: info: Resetting node (null) with [external STONITH device]
Dec 9 10:12:21 aasdcm04a heartbeat[5229]: info: Host (null) external-reset initiating
Dec 9 10:12:21 aasdcm04a heartbeat[5229]: ERROR: command '/usr/local/lib/xseries.sh aasdcm09a /etc/ha.d/stonith.nodes' failed
Dec 9 10:12:21 aasdcm04a heartbeat[5229]: ERROR: Host (null) not reset!
Dec 9 10:12:26 aasdcm04a heartbeat[5231]: info: Resetting node (null) with [external STONITH device]
Dec 9 10:12:26 aasdcm04a heartbeat[5231]: info: Host (null) external-reset initiating
Dec 9 10:12:26 aasdcm04a heartbeat[5231]: ERROR: command '/usr/local/lib/xseries.sh aasdcm09a /etc/ha.d/stonith.nodes' failed
Dec 9 10:12:26 aasdcm04a heartbeat[5231]: ERROR: Host (null) not reset!
Dec 9 10:12:26 aasdcm04a heartbeat[3310]: ERROR: Exiting STONITH (null) process 5231 killed by signal 11.
Dec 9 10:12:26 aasdcm04a heartbeat[3310]: ERROR: STONITH of (null) failed. Retrying...
Dec 9 10:12:31 aasdcm04a heartbeat[5233]: info: Resetting node (null) with [external STONITH device]
Dec 9 10:12:31 aasdcm04a heartbeat[5233]: info: Host (null) external-reset initiating
Dec 9 10:12:31 aasdcm04a heartbeat[5233]: ERROR: command '/usr/local/lib/xseries.sh aasdcm09a /etc/ha.d/stonith.nodes' failed
Dec 9 10:12:31 aasdcm04a heartbeat[5233]: ERROR: Host (null) not reset!
Dec 9 10:12:31 aasdcm04a heartbeat[3310]: ERROR: Exiting STONITH (null) process 5233 killed by signal 11.
Dec 9 10:12:31 aasdcm04a heartbeat[3310]: ERROR: STONITH of (null) failed. Retrying...
Dec 9 10:12:35 aasdcm04a watchdog[3718]: cannot stat /shared/AASTEST-1/lost+found (errno = 2 = 'No such file or directory')
Dec 9 10:12:36 aasdcm04a heartbeat[5236]: info: Resetting node (null) with [external STONITH device]
Dec 9 10:12:36 aasdcm04a heartbeat[5236]: info: Host (null) external-reset initiating
Dec 9 10:12:36 aasdcm04a heartbeat[5236]: ERROR: command '/usr/local/lib/xseries.sh aasdcm09a /etc/ha.d/stonith.nodes' failed
Dec 9 10:12:36 aasdcm04a heartbeat[5236]: ERROR: Host (null) not reset!
Dec 9 10:12:36 aasdcm04a heartbeat[3310]: ERROR: Exiting STONITH (null) process 5236 killed by signal 11.
Dec 9 10:12:36 aasdcm04a heartbeat[3310]: ERROR: STONITH of (null) failed. Retrying...
Dec 9 10:12:41 aasdcm04a heartbeat[5238]: info: Resetting node (null) with [external STONITH device]
Dec 9 10:12:41 aasdcm04a heartbeat[5238]: info: Host (null) external-reset initiating
Dec 9 10:12:41 aasdcm04a heartbeat[5238]: ERROR: command '/usr/local/lib/xseries.sh aasdcm09a /etc/ha.d/stonith.nodes' failed
Dec 9 10:12:41 aasdcm04a heartbeat[5238]: ERROR: Host (null) not reset!
Dec 9 10:12:41 aasdcm04a heartbeat[3310]: ERROR: Exiting STONITH (null) process 5238 killed by signal 11.
Dec 9 10:12:41 aasdcm04a heartbeat[3310]: ERROR: STONITH of (null) failed. Retrying...
Dec 9 10:12:46 aasdcm04a login(pam_unix)[3583]: session opened for user root by LOGIN(uid=0)
Dec 9 10:12:46 aasdcm04a -- root[3583]: ROOT LOGIN ON tty2
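The point of the log excerpt is that syslog entries keep arriving after the "Kernel panic:" message, so the machine evidently stayed up. A minimal Python sketch of that check follows. The function names are hypothetical, the three sample lines are copied from the log above, and it assumes the classic syslog timestamp format with English month abbreviations.

```python
import re
from datetime import datetime

# Classic syslog prefix, e.g. "Dec 9 10:12:14 " (assumed format).
TIMESTAMP = re.compile(r"^([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s")


def parse_timestamp(line):
    """Return the line's timestamp, or None if it has no syslog prefix."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    # syslog omits the year; 2000 is an arbitrary placeholder.
    return datetime.strptime("2000 " + match.group(1), "%Y %b %d %H:%M:%S")


def seconds_logged_after_panic(lines):
    """Seconds between the first 'Kernel panic:' line and the last entry.

    Returns None when no timestamped panic line is present.
    """
    panic_time = None
    last_time = None
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue
        if panic_time is None:
            if "Kernel panic:" in line:
                panic_time = stamp
                last_time = stamp
            continue
        last_time = stamp
    if panic_time is None:
        return None
    return int((last_time - panic_time).total_seconds())


SAMPLE = [
    "Dec 9 10:12:14 aasdcm04a kernel: Kernel panic: EXT3-fs (device md(9,1)): panic forced after error",
    "Dec 9 10:12:16 aasdcm04a heartbeat[5226]: info: Resetting node (null) with [external STONITH device]",
    "Dec 9 10:12:46 aasdcm04a -- root[3583]: ROOT LOGIN ON tty2",
]

if __name__ == "__main__":
    print(seconds_logged_after_panic(SAMPLE))
```

With panic_timeout set to 10, entries logged more than 10 seconds after the panic message, including a root login on tty2, indicate that the panic never completed.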
Okay, Morton. Thanks for the clarification. Bug assigned to Stephen.
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission-critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission-critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.