Description of problem:
When a system with iSCSI devices experiences a network glitch and is then rebooted, rc.sysinit does not force an fsck for the iSCSI devices as it normally would for locally attached disks. As a result, the user must manually run fsck after the system boots into read-only mode.

Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080225.2 (initscripts-8.45.18.EL-1)

How reproducible:
Only tested once

Steps to Reproduce:
[Feb 28 14:50:30] < pjones> | 1) i/o error happens for whatever reason (broken cable, act of god, etc.)
[Feb 28 14:50:38] < pjones> | 2) reboot happens
[Feb 28 14:50:50] < pjones> | 3) initrd mounts fs read-only
[Feb 28 14:51:06] < pjones> | 4) kernel sees that it's in an error state, marks it as needing fsck, fixes the journal
[Feb 28 14:51:28] < pjones> | 5) initscripts runs "fsck -A", which doesn't touch it because of _netdev in the options (which we can't take out for other reasons)
[Feb 28 14:51:53] < pjones> | 6) initscripts remounts it read-write
[Feb 28 14:52:10] < pjones> | 7) kernel sees the error still and forces it back to read-only
[Feb 28 14:52:37] < pjones> | 8) netfs doesn't fsck it because it can't touch the lockfile in /var/run so it never actually runs correctly
[Feb 28 14:55:51] < pjones> | (also lots of other stuff fails between 7 and 8)
[Feb 28 14:56:41] < pjones> | / *absolutely must* get fscked in rc.sysinit, not a separate initscript.

Actual results:
- filesystem mounted in read-only mode, manual fsck required

Expected results:
- rc.sysinit should trigger an fsck upon boot.

Additional info:
This isn't really anything new - this would be the case even in 5.0 GA if you have a network root block device (iSCSI, GFS2, NBD) - they would all run into this issue.

Possible solutions, of varying quality:
- Remove _netdev from fstab for /. This would, however, break shutdown. (Obviously bad.)
- Remove the case from rc.sysinit so that _netdev devices are fscked. This would, however, break booting with *any* non-root network block devices, as they wouldn't be found when fsck runs, including existing installations. (Also, obviously bad.)
- Run fsck on / from the initrd. *ducks*
- Introduce Yet Another Magic Flag, honored by shutdown as 'root is a network device', but different from _netdev so it would be fscked. Would be a rather ridiculous hack, but may work.
As for introducing Yet Another Magic flag - _netdev is handled specifically by mount(8) - so if we went that route to fix, it would require changes to (at a minimum) initscripts, anaconda, and util-linux.
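To make the trade-off behind the extra-flag idea concrete, here is a minimal sketch in Python of how the two decisions (fsck early in rc.sysinit, treat as network-backed at shutdown) could be separated. The flag name `_example_rootnet` and the function `plan` are made up for illustration; this is not how initscripts, anaconda, or util-linux implement anything, just a model of the logic discussed above.

```python
NETDEV = "_netdev"
# Hypothetical flag name, invented for this sketch only.
ROOT_NET_FLAG = "_example_rootnet"


def plan(entry_options):
    """Model the two decisions for one fstab options field.

    entry_options is the comma-separated fourth field of an fstab line.
    """
    opts = set(entry_options.split(","))
    return {
        # The early fsck pass skips anything carrying _netdev.
        "fsck_in_rc_sysinit": NETDEV not in opts,
        # Shutdown must know the device lives on the network; under the
        # proposal either flag would signal that.
        "network_backed_at_shutdown": NETDEV in opts or ROOT_NET_FLAG in opts,
    }


if __name__ == "__main__":
    for options in ("defaults", "defaults,_netdev", "defaults,_example_rootnet"):
        print(options, plan(options))
```

With only _netdev available, a network root cannot be both fscked early and handled correctly at shutdown; the second flag is what would make both true at once.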
Another alternative is root-causing why fsck from netfs fails for /, and fixing that issue.
notting: I'm not sure how that'll help -- the fs is already failed to RO mode by the kernel at that point, and can't ever be put back in RW mode. At any rate, I'm pretty sure the reason netfs doesn't work is that /var/lock/subsys/netfs is still present.
(In reply to comment #6)
> At any rate, I'm pretty sure the reason netfs doesn't work is that
> /var/lock/subsys/netfs is still present.

Oh, right. And at that point you can't get to the FS to fix it and behave reliably. So we're back to the add-another-flag hack.
Just to document some of the stuff pjones & I looked at....

For the -t option to fsck, "opts=" and "noopts=" options are specified as comma-delimited, including the "opts/noopts" part, i.e.

-t opts=foo,noopts=bar

And they are cumulative; a filesystem must match each option (or no-option) specification to be checked. Above, only filesystems with option foo *and* without option bar will be checked.

... although at this point I guess it looks like we won't need to combine the option specifiers in this way... but just in case.

-Eric
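The cumulative matching described above can be modelled in a few lines of Python. This is only a sketch of the rule as stated in this comment, using the `opts=`/`noopts=` spelling from fsck(8); it ignores filesystem-type entries in the list, and the helper names `parse_fslist` and `would_check` are invented here, not taken from fsck.

```python
def parse_fslist(fslist):
    """Split a -t style list into required and forbidden mount options.

    Only option specifiers are handled; plain filesystem types are ignored.
    """
    required, forbidden = set(), set()
    for item in fslist.split(","):
        # Test the negated form first: "noopts=" and "opts=" differ only by prefix.
        if item.startswith("noopts="):
            forbidden.add(item[len("noopts="):])
        elif item.startswith("opts="):
            required.add(item[len("opts="):])
    return required, forbidden


def would_check(mount_options, fslist):
    """Return True if every specifier in fslist is satisfied by mount_options."""
    required, forbidden = parse_fslist(fslist)
    options = set(mount_options.split(","))
    return required <= options and not (forbidden & options)


if __name__ == "__main__":
    spec = "opts=foo,noopts=bar"
    for mount_options in ("defaults,foo", "defaults,foo,bar", "defaults"):
        print(mount_options, would_check(mount_options, spec))
```

Only the first sample line passes: it has foo and lacks bar, so both specifiers are satisfied.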
8.45.19.EL-1 built.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0443.html
cmake-2.4.8-2.fc8 has been submitted as an update for Fedora 8