From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050323 Firefox/1.0.2 Fedora/1.0.2-1.3.1

Description of problem:
This is the RHEL4 version of bugzilla 116900.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Run a test program which uses synchronous I/O (O_SYNC).
2. During run time, the storage array purposely injects errors by:
   * throwing away data and
   * returning CHECK CONDITION (ABORTED COMMAND) or CHECK CONDITION (HARDWARE ERROR) to the OS.
3. The user mode application can't detect the error condition and keeps going.

Additional info:

The RedHat support team has recreated this problem using a simpler user mode write test case combining with a kernel scsi error injection debug patch, as follows:

1. Add a new ioctl to allow user mode programs to signal the kernel to start and stop the experiment.

2. Add a kernel debug patch that places trap code within scsi_softirq_handler() after the device interrupts the OS for command completion. If the kernel is signalled (via ioctl) and the target is our experiment device, the trap code replaces the SCSI status code as follows:

    if (SCpnt->target == OurExperimentDevice) {
        SCpnt->result = 0x02;          /* CHECK_CONDITION */
        SCpnt->sense_buffer[0] = 0x70; /* sense valid */
        SCpnt->sense_buffer[2] = 0xeb; /* ABORTED_COMMAND */
    }

3. The replacement is always performed until the kernel is signalled to stop (via ioctl).

4. The expectation is that during this interval, any read/write to this particular device would either be blocked or return with an error if the file is opened with the O_SYNC option.

5. We then wrote a simple test program and expected the write to either block or return with an error. Unfortunately, that doesn't happen: write returns with success.
From the kernel log (/var/log/messages), it can be seen that the driver retries 4 more times, then EXT2 does log the error condition, but the error never gets propagated back to the user mode application.

Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: target=2,lun=0,channel=0
Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: cmd->result=0,sb[0]=0,sb[2]=0 --- replace status and sense data here
Feb 19 22:15:47 perf82 kernel: SCSI disk error : host 0 channel 0 id 2 lun 0 return code = 8000002 -- first error
Feb 19 22:15:47 perf82 kernel: FMK EOM ILI Current sd08:22: sense key Aborted Command
Feb 19 22:15:47 perf82 kernel: I/O error: dev 08:22, sector 4152
Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: target=2,lun=0,channel=0
Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: cmd->result=0,sb[0]=0,sb[2]=0 --- replace status and sense data here
Feb 19 22:15:47 perf82 kernel: SCSI disk error : host 0 channel 0 id 2 lun 0 return code = 8000002 -- 2nd error
Feb 19 22:15:47 perf82 kernel: FMK EOM ILI Current sd08:22: sense key Aborted Command
Feb 19 22:15:47 perf82 kernel: I/O error: dev 08:22, sector 32
Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: target=2,lun=0,channel=0
Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: cmd->result=0,sb[0]=0,sb[2]=0 --- replace status and sense data here
Feb 19 22:15:47 perf82 kernel: SCSI disk error : host 0 channel 0 id 2 lun 0 return code = 8000002 -- 3rd error
Feb 19 22:15:47 perf82 kernel: FMK EOM ILI Current sd08:22: sense key Aborted Command
Feb 19 22:15:47 perf82 kernel: I/O error: dev 08:22, sector 0
Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: target=2,lun=0,channel=0
Feb 19 22:15:47 perf82 kernel: GSS_DEBUG: cmd->result=0,sb[0]=0,sb[2]=0 --- replace status and sense data here
Feb 19 22:15:47 perf82 kernel: SCSI disk error : host 0 channel 0 id 2 lun 0 return code = 8000002 -- 4th error
Feb 19 22:15:47 perf82 kernel: FMK EOM ILI Current sd08:22: sense key Aborted Command
Feb 19 22:15:47 perf82 kernel: I/O error: dev 08:22, sector 0
Feb 19 22:15:47 perf82 kernel: EXT2-fs error (device sd(8,34)): ext2_write_inode: unable to read inode block - inode=13, block=4
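
The "simple test program" from step 5 is not attached to this report. As a rough idea of what such a test might look like, here is a minimal sketch in C. The helper name osync_write_check is hypothetical and this is not the reporter's program. It opens a file with O_SYNC, writes a buffer, and returns the first error that open(), write() or close() reports, which is where an application would expect to see an injected I/O error.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the report: write len bytes to path
 * through an O_SYNC descriptor.
 * Returns 0 on success or a negative errno value on the first failure.
 */
int osync_write_check(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
    if (fd < 0)
        return -errno;

    const char *p = buf;
    size_t left = len;
    int err = 0;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            err = -errno;       /* this is where EIO would be expected */
            break;
        }
        if (n == 0) {
            err = -EIO;         /* no progress: treat as an I/O failure */
            break;
        }
        p += n;
        left -= (size_t)n;
    }

    /* close() can also report a deferred error; keep the first one seen. */
    if (close(fd) < 0 && err == 0)
        err = -errno;
    return err;
}
```

With the behaviour described in this report, a call such as osync_write_check("/mnt/test/file", data, size) (path assumed for illustration) would return 0 even while the kernel log shows I/O errors.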
> This is the RHEL4 version of bugzilla 116900.

I don't see anything in bugzilla 116900 that says it will occur in RHEL 4. Maybe that should be tested though.

Stephen, I'll assign this to you, since you own 116900. Assign it back to me if it is something I should handle.
> The RedHat support team has recreated this problem using a simpler
> user mode write test case combining with a kernel scsi error injection
> debug patch

Hmmmm 2.5 years later I don't suppose that patch is still around anywhere..? :)
This is actually expected. Well, intended. Perhaps not expected. :)

If we look in generic_file_buffered_write() in mm/filemap.c:

    /*
     * For now, when the user asks for O_SYNC, we'll actually give O_DSYNC
     */
    if (likely(status >= 0)) {
        if (unlikely((file->f_flags & O_SYNC) || IS_SYNC(inode))) {
            if (!a_ops->writepage || !is_sync_kiocb(iocb))
                status = generic_osync_inode(inode, mapping,
                                OSYNC_METADATA|OSYNC_DATA);
        }
    }

"For now" extends to current kernels as well, FWIW.

Also those 2 flags, OSYNC_METADATA|OSYNC_DATA, do *not* sync the inode. So the inode gets written out only in writeback, long after the write call has returned to the application.

Although I'm not really fond of it, and I'm not sure of the historical reasons for it, I'm tempted to mark this NOTABUG because it's actually working as designed...

-Eric
Oh, and for what it's worth, the data write itself probably *was* successful, but your inode writeout was not.
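
Given that split (data flushed by the O_SYNC write, inode written later in writeback), one thing an application could try is an explicit fsync() after the write, since fsync() asks the kernel to flush the file's metadata as well as its data. The C sketch below only illustrates that idea. The helper name write_then_fsync is hypothetical, nothing like it was tested in this bug, and whether fsync() would actually have reported the injected error on the kernels discussed here is an assumption, not something this thread establishes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes through an O_SYNC descriptor, then
 * call fsync() so that a metadata (inode) writeout failure has a chance to
 * be reported before the function returns.
 * Returns 0 on success or a negative errno value on the first failure.
 */
int write_then_fsync(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
    if (fd < 0)
        return -errno;

    int err = 0;
    ssize_t n = write(fd, buf, len);
    if (n < 0)
        err = -errno;
    else if ((size_t)n != len)
        err = -EIO;             /* short write treated as a failure here */

    /* Assumption: fsync() is where an inode writeout error could surface. */
    if (err == 0 && fsync(fd) < 0)
        err = -errno;

    if (close(fd) < 0 && err == 0)
        err = -errno;
    return err;
}
```

A caller would treat any non-zero return as "the data or its metadata may not be on disk", rather than relying on the return value of write() alone.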
I almost hate to do this, but because this is how Linux has been - intentionally, it seems - for at least 5 or 6 years, I'm going to close this as NOTABUG, because things are in fact working as designed and as intended. -Eric