Red Hat Bugzilla – Bug 771285
mount fails with 2 XFS filesystems
Last modified: 2013-01-08 14:46:06 EST
Description of problem:
After a boot, the system drops to the repair prompt due to a failure to mount.
Version-Release number of selected component (if applicable):
Unknown... Seems 100% reproducible now, but somehow worked before
Steps to Reproduce:
1. configure 2 xfs filesystems
Actual results:
Stuck at "Give root password for maintenance"
Expected results:
Normal boot as usual
No idea what I broke. This definitely worked before Christmas vacation.
I shut down the VMs, turned the box off, and turned it on today.
Please see attached console capture.
In it, the "first" filesystem (vdb) fails, but the second filesystem (vdc)
mounts just fine. When I log in through the maintenance prompt, it's mounted.
Exactly the same parameters; the filesystems are completely identical!
Created attachment 550363 [details]
console capture 1
The problem may have something to do with XFS and an unclean shutdown.
I ran xfs_check on both filesystems, and the VM now boots normally.
There were no messages about any filesystem errors, but presumably
xfs_check sets a superblock flag.
Could you attach your /etc/fstab?
systemd spawned "/bin/mount /src/node/vdb", but the mount failed with an error:
mount: unknown filesystem type 'xfs'
I don't see what systemd did wrong here. Reassigning to util-linux.
Created attachment 550439 [details]
* check dmesg output
* try "strace -o ~/log mount /src/node/vdb" and send me the ~/log file
You do realize that mount under strace is going to succeed, don't you?
I suppose I could create a wrapper that traces _all_ mount invocations.
Created attachment 550445 [details]
This dmesg is captured at the maintenance prompt after failure.
(In reply to comment #6)
> You do realize that mount under strace is going to succeed, don't you?
> I suppose I could create a wrapper that traces _all_ mount invocations.
I thought you'd be able to call mount(8) manually from the command line. It seems that you can disable (comment out) the /src/node/* entries in your fstab to boot successfully.
(In reply to comment #8)
> you can disable (comment out) the /src/node/* entries
or add "noauto" there
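For reference, the two fstab entries in question would look something like this (mount points are taken from the error messages above; the device names and options are an assumption), with "noauto" added to skip mounting at boot:

```
/dev/vdb  /src/node/vdb  xfs  defaults,noauto  0 0
/dev/vdc  /src/node/vdc  xfs  defaults,noauto  0 0
```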
The bug only occurs when two mounts are run simultaneously by systemd.
If they run sequentially, or only one is run, they succeed. It's something
about the way mount detects the presence of the module before mounting.
(In reply to comment #10)
> The bug only occurs when two mounts are run simultaneously by systemd.
> If they run sequentially, or only one is run, they succeed. It's something
> about the way mount detects the presence of the module before mounting.
It sounds like a kernel problem; mount(8) does not care about modules, that's the kernel's job...
mount(8) prints the "unknown filesystem type" message only if mount(2) syscall returns ENODEV and the FS type is not found in /proc/filesystems.
udevd: segfault at 24 ip 00007f13dbd01992 sp 00007fff6dc53fa0 error 6 in udevd[7f13dbcfd000+21000]
*** Bug 790238 has been marked as a duplicate of this bug. ***
Usually the mount() syscall triggers the in-kernel modprobe loader to insert
the module for an unknown, not-yet-loaded filesystem. This call blocks
until the module is properly linked into the kernel.
One possible explanation could be that two competing mount() syscalls
for the same filesystem module race against each other and one of them does
not block for some reason.
The problem might be new, before systemd, we certainly did almost everything
fully serialized in userspace.
It could be that the modprobe binary returns too early, or that the kernel does
not call the second modprobe at all.
Could someone who can reproduce the problem add some printk() debugs
to get a clue here? Thanks!
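The debugging asked for here can be as simple as bracketing get_fs_type() in fs/filesystems.c with printk() calls; the sketch below is an assumption reconstructed from the captured output in the next comment, not the actual patch used:

```c
struct file_system_type *get_fs_type(const char *name)
{
	struct file_system_type *fs;

	printk("#####-----> get_fs_type() entered with name=%s\n", name);
	/* ... existing lookup and request_module() logic ... */
	printk("#####-----> get_fs_type() for name=%s returned with %p\n",
	       name, fs);
	return fs;
}
```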
I added printk() into get_fs_type() as suggested and here's what I saw:
[ 18.947397] #####-----> get_fs_type() entered with name=xfs
[ 18.965933] #####-----> get_fs_type() entered with name=xfs
[ 19.214892] #####-----> get_fs_type() for name=xfs returned with (null)
[ 19.216575] SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
[ 19.218279] systemd: mnt-whatever.mount mount process exited, code=exited status=32
[ 19.219521] mount: mount: unknown filesystem type 'xfs'
[ 19.222075] SGI XFS Quota Management subsystem
[ 19.223593] #####-----> get_fs_type() for name=xfs returned with f7ff57e0
[ 19.225218] XFS (sdb2): Mounting Filesystem
[ 19.230243] systemd: Job fedora-autorelabel-mark.service/start failed with result 'dependency'.
[ 19.232221] systemd: Job fedora-autorelabel.service/start failed with result 'dependency'.
[ 19.233151] systemd: Job local-fs.target/start failed with result 'dependency'.
[ 19.233985] systemd: Triggering OnFailure= dependencies of local-fs.target.
[ 19.234828] systemd: Unit mnt-whatever.mount entered failed state.
[ 19.365065] XFS (sdb2): Ending clean mount
You can see from the above that one of the invocations of get_fs_type() returns with (null) while the other succeeds later.
Hope this helps!
For now, I worked around this by doing this (pre-loading the xfs module at boot):
cat <<EOF >/etc/rc.modules
modprobe xfs
EOF
chmod 755 /etc/rc.modules
A possible explanation is that two modprobe calls are issued by the kernel:
the first one links the module into the kernel, and the second one bails out
too early because it finds the module in /sys/module/ but it is not fully
initialized at that moment, so the second call does not block long enough.
Taking over the bug until we find out if that's the case. I'm trying to fix this in kmod.
New kmod package on the way, which might block the second modprobe for a sufficient amount of time.
kmod-5-8.fc17 has been submitted as an update for Fedora 17.
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kmod-5-8.fc17'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
Will this fix be back-ported to Fedora 16?
kmod-6-1.fc17 has been submitted as an update for Fedora 17.
kmod-7-1.fc17 has been submitted as an update for Fedora 17.
kmod-7-1.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.
Still a problem on F17 with kmod-7-2.fc17 but whatever. The workaround
is still effective.
I think I am also seeing this problem on F16 with kernel-3.4.6-1 and module-init-tools-3.16-5. Mounting the two xfs systems worked when I originally configured the system with F15, but stopped after my upgrade to F16. Hoping to replace this with a new F17 install in the near future, but have copied Pete's pre-loading of the xfs module as a fix for now. (It works). Just a side observation, but it seems there are a few cases where the systemd init seems more susceptible to race conditions than the old one.
Seems we are still missing the loop in kmod that blocks the second modprobe
until the first modprobe returns and the module state has turned from
loading to ready.
https://bugs.freedesktop.org/show_bug.cgi?id=53665 [mount fails when fstab has more than one entry for unloaded fs module]
Rusty has submitted a patch to the kernel module loader to fix this issue:
That should resolve things as soon as it gets into Fedora.
Fixed in kernel-3.7.0-6.fc19 (turned out to require a kernel fix after all,
the workarounds in kmod were insufficient). Closing.