The automatic filesystem repair facility for the ext2 filesystem still
operates on filesystems mounted read-only. This negates much of the value
and spirit of read-only mounts, which are used to preclude any modification
to a disk.
In my case, I mounted a single filesystem on a multi-ported external drive
system onto two Linux 6.2 servers. On one system, I mounted the filesystem
read-write, on the other system read-only. The read-write system naturally
would leave the filesystem in inconsistent states during normal operation.
However, the read-only system would detect the inconsistencies and attempt
to fix the filesystem by writing changes to the disk. This conflict between
the two hosts caused severe filesystem corruption and data loss.
I believe the automatic filesystem repair should be disabled on read-only
mounts.
In re-reading my initial bug report, I realize that I was not clear in my
description. I was describing a problem with the ext2 driver in the kernel, not
the fsck that occurs at boot time.
There is no inter-host locking on the filesystem. The Linux kernel keeps its
own in-memory information about the filesystem, so you must access the disk in
RO mode from both machines. If you wish to do RW between the machines, you need
to have the device simply opened as a raw block device with synchronous I/O
turned on; then an application can access it in real time.
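
As a rough illustration of the raw-device approach described above, here is a
minimal sketch in Python. It assumes a Unix system; the function names and the
path passed in are hypothetical, and in practice the path would be a block
device node rather than a regular file. O_SYNC makes each write return only
after the data has been pushed to the device, but it does nothing about stale
cached data on the reading host, so the application still has to coordinate
access itself.

```python
import os

def write_block_sync(path, offset, data):
    """Write data at the given byte offset with O_SYNC, so the call returns
    only after the write has been pushed to the underlying device."""
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        os.pwrite(fd, data, offset)
    finally:
        os.close(fd)

def read_block(path, offset, length):
    """Read length bytes at the given byte offset (ordinary buffered read)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)
```

No filesystem is involved here at all: the application defines its own on-disk
layout and decides which host may write which region.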
I have experienced what many people have: sharing a disk between two machines
does not work when updates are enabled.
FWIW, Network Appliance filers have a similar architecture. They have a
dual-ported FC-AL shelf that is connected to both filers. The way updates are
arbitrated is via a tandem NUMA setup with NVRAM. The filers do not write to
the drives simultaneously, but they can take over an array in real time with no
loss. The approach is that writes are committed over the NUMA card to the NVRAM
of the other filer before they are committed to disk. Once the disk write is
committed, the transaction is updated in the companion filer.
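
The ordering described above (log to the partner first, then write to disk,
then mark the entry done) can be sketched as a toy model in Python. This is
only an illustration of the sequence as I described it, not the vendor's actual
implementation; the class, the function names, and the crash_before_disk flag
are all made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Filer:
    name: str
    # Stand-in for the NVRAM that holds the partner's in-flight writes.
    partner_log: dict = field(default_factory=dict)

def mirrored_write(partner, disk, txid, block, data, crash_before_disk=False):
    """Perform one write in the order: partner NVRAM, then disk, then
    mark the partner's log entry as committed."""
    # Step 1: record the write in the partner's NVRAM as pending.
    partner.partner_log[txid] = {"block": block, "data": data,
                                 "state": "pending"}
    if crash_before_disk:
        return  # simulate the writing filer dying before it reaches the disk
    # Step 2: commit the write to the shared disk.
    disk[block] = data
    # Step 3: update the transaction in the partner.
    partner.partner_log[txid]["state"] = "committed"

def take_over(survivor, disk):
    """Replay writes the failed partner logged but never committed."""
    for txid in sorted(survivor.partner_log):
        entry = survivor.partner_log[txid]
        if entry["state"] == "pending":
            disk[entry["block"]] = entry["data"]
            entry["state"] = "committed"
```

Because the pending entry reaches the partner before the disk is touched, the
surviving filer always knows about any write that might be half done, which is
what lets it take over without loss.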
I understand that there is no intent for inter-host locking. I also accept that
the in-memory caching could cause unpredictable results on the RO machine.
Nevertheless, I think it is incorrect for an RO mount to modify a filesystem.
The dual mount case only revealed the issue.
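
One way to check whether an RO mount really leaves the disk untouched would be
to hash the device (or an image of it) before and after the mount, with nothing
else writing to it in between. A minimal sketch in Python follows; the function
names are hypothetical, and the mount and unmount steps are left to the caller
since they need root.

```python
import hashlib

def device_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the whole device or image file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()

def was_modified(path, action):
    """Run action() (e.g. mount RO, then unmount) and report whether the
    contents at path changed as a result."""
    before = device_digest(path)
    action()
    return device_digest(path) != before
```

If was_modified() reports True for an action that only mounts the filesystem
read-only and unmounts it again, something wrote to the disk.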