Bug 771285
Summary: | mount fails with 2 XFS filesystems | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Pete Zaitcev <zaitcev>
Component: | kmod | Assignee: | kmod development team <kmod-maint>
Status: | CLOSED RAWHIDE | QA Contact: | Kay Sievers <kay>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 17 | CC: | awalkersg, circular, colin, gansalmon, itamar, johannbg, jonathan, kernel-maint, kzak, lemenkov, lpoetter, madhu.chinakonda, marcosfrm, metherid, mschmidt, msivak, notting, plautrba, systemd-maint, vmlinuz386
Target Milestone: | --- | Keywords: | Reopened
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | kmod-7-1.fc17 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2013-01-08 19:46:06 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Pete Zaitcev
2012-01-03 05:48:08 UTC
Created attachment 550363 [details]
console capture 1
The problem may have something to do with XFS and an unclean shutdown. I ran xfs_check on both filesystems, and the VM now boots normally. There were no messages about any filesystem errors, but presumably xfs_check sets a superblock flag.

Could you attach your /etc/fstab?

systemd spawned "/bin/mount /src/node/vdb", but the mount failed with an error:
mount: unknown filesystem type 'xfs'
I don't see what systemd did wrong here. Reassigning to util-linux.

Created attachment 550439 [details]
/etc/fstab
Please:
* check dmesg output
* try "strace -o ~/log mount /src/node/vdb" and send me the ~/log file

You do realize that mount under strace is going to succeed, don't you? I suppose I could create a wrapper that traces _all_ mount invocations.

Created attachment 550445 [details]
dmesg
This dmesg is captured at the maintenance prompt after failure.
(In reply to comment #6)
> You do realize that mount under strace is going to succeed, don't you?
> I suppose I could create a wrapper that traces _all_ mount invocations.
I thought that you were able to call mount(8) manually from the command line. It seems that you can disable (comment out) the /src/node/* entries in your fstab to boot successfully.

(In reply to comment #8)
> you can disable (comment out) the /src/node/* entries
or add "noauto" there

The bug only occurs when two mounts are run simultaneously by systemd. If they run consecutively, or only one is run, they succeed. It's something about the way mount detects the presence of the module before mounting.

(In reply to comment #10)
> The bug only occurs when two mounts are run simultaneously by systemd.
> If they run consecutively, or only one is run, they succeed. It's something
> about the way mount detects the presence of the module before mounting.
It sounds like a kernel problem. mount(8) does not care about modules; that is the kernel's job. mount(8) prints the "unknown filesystem type" message only if the mount(2) syscall returns ENODEV and the FS type is not found in /proc/filesystems.

BTW,
udevd[293]: segfault at 24 ip 00007f13dbd01992 sp 00007fff6dc53fa0 error 6 in udevd[7f13dbcfd000+21000]
looks strange.

*** Bug 790238 has been marked as a duplicate of this bug. ***

Usually the mount() syscall triggers the in-kernel modprobe loader to insert the module for an unknown, not already loaded filesystem. This call blocks until the module is properly linked into the kernel. One possible explanation could be that two competing mount() syscalls for the same filesystem module race against each other and one of them does not block for some reason. The problem might be new; before systemd, we certainly did almost everything fully serialized in userspace. It can be that the modprobe binary returns too early, or that the kernel does not call the second modprobe at all.
Can someone who can reproduce the problem possibly add some printk() debugs to get_fs_type() in fs/filesystems.c to get a clue here? Thanks!

I added printk() into get_fs_type() as suggested and here's what I saw:
[ 18.947397] #####-----> get_fs_type() entered with name=xfs
[ 18.965933] #####-----> get_fs_type() entered with name=xfs
<snip>
[ 19.214892] #####-----> get_fs_type() for name=xfs returned with (null)
[ 19.216575] SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
[ 19.218279] systemd[1]: mnt-whatever.mount mount process exited, code=exited status=32
[ 19.219521] mount[472]: mount: unknown filesystem type 'xfs'
[ 19.222075] SGI XFS Quota Management subsystem
[ 19.223593] #####-----> get_fs_type() for name=xfs returned with f7ff57e0
[ 19.225218] XFS (sdb2): Mounting Filesystem
[ 19.230243] systemd[1]: Job fedora-autorelabel-mark.service/start failed with result 'dependency'.
[ 19.232221] systemd[1]: Job fedora-autorelabel.service/start failed with result 'dependency'.
[ 19.233151] systemd[1]: Job local-fs.target/start failed with result 'dependency'.
[ 19.233985] systemd[1]: Triggering OnFailure= dependencies of local-fs.target.
[ 19.234828] systemd[1]: Unit mnt-whatever.mount entered failed state.
[ 19.365065] XFS (sdb2): Ending clean mount
You can see from the above that one of the invocations of get_fs_type() returns with (null) while the other succeeds later. Hope this helps!

For now, I worked around this by doing this:
cat <<EOF >/etc/rc.modules
#!/bin/sh
modprobe xfs
EOF
chmod 755 /etc/rc.modules

A possible explanation is that two modprobe calls are issued by the kernel. The first one links the module into the kernel, and the second one bails out too early because it finds the module in /sys/module/ while it is not yet fully initialized, so the second call does not block long enough and fails. Taking over the bug until we find out if that's the case. I'm trying to fix modprobe now.
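The condition described earlier for mount(8) (ENODEV from mount(2) plus the type missing from /proc/filesystems) can be checked by hand from the maintenance prompt. The following is only a minimal sketch of the second half of that check, not code taken from util-linux; the function name fs_registered and its optional file argument are made up for illustration.

```sh
#!/bin/sh
# fs_registered NAME [FILE]   (hypothetical helper, not part of util-linux)
# Succeeds if NAME is listed as a filesystem type in FILE, which defaults
# to /proc/filesystems. Each line there is an optional "nodev" marker
# followed by the type name, so the name is always the last field.
fs_registered() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}
```

Running `fs_registered xfs || echo "xfs is not registered yet"` right after a failed boot shows whether the module had finished registering by then. In the trace above it had not at the moment the first mount gave up.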
New kmod package on the way, which might block the second modprobe for a longer time:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3814472

kmod-5-8.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kmod-5-8.fc17

Package kmod-5-8.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kmod-5-8.fc17'
as soon as you are able to. Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-2335/kmod-5-8.fc17
then log in and leave karma (feedback).

Will this fix be back-ported to Fedora 16?

kmod-6-1.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kmod-6-1.fc17

kmod-7-1.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kmod-7-1.fc17

kmod-7-1.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.

Still a problem on F17 with kmod-7-2.fc17 but whatever. The workaround is still effective.

I think I am also seeing this problem on F16 with kernel-3.4.6-1 and module-init-tools-3.16-5. Mounting the two xfs systems worked when I originally configured the system with F15, but stopped after my upgrade to F16. Hoping to replace this with a new F17 install in the near future, but have copied Pete's pre-loading of the xfs module as a fix for now. (It works). Just a side observation, but it seems there are a few cases where the systemd init seems more susceptible to race conditions than the old one.

It seems we are still missing the loop in kmod that blocks the second modprobe until the first modprobe returns and the module state has turned from loading to ready.
https://bugs.freedesktop.org/show_bug.cgi?id=53665
[mount fails when fstab has more than one entry for unloaded fs module]

Rusty has submitted a patch in the kernel module loading to fix this issue:
http://thread.gmane.org/gmane.linux.kernel/1358707/focus=1358709
That should resolve things as soon as it gets into Fedora.

Fixed in kernel-3.7.0-6.fc19 (turned out to require a kernel fix after all, the workarounds in kmod were insufficient). Closing.
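For illustration, the blocking loop discussed in the comments above (wait until the module state has turned from loading to ready) could be approximated in userspace roughly as follows. This is a sketch under stated assumptions, not what kmod or the kernel fix actually does: the function name wait_module_live, the retry count, and the one-second poll interval are all made up here. It relies on loadable modules exposing their state in /sys/module/NAME/initstate, which reads "live" once initialization has finished.

```sh
#!/bin/sh
# wait_module_live NAME [TRIES] [SYSFS_ROOT]   (hypothetical helper)
# Polls SYSFS_ROOT/NAME/initstate (default root: /sys/module) once per
# second until it reads "live". Returns 0 on success and 1 if the state
# never becomes "live" within TRIES attempts (default: 10).
# Assumption: the module is loadable; a built-in driver may not expose
# an initstate file at all, in which case this reports failure.
wait_module_live() {
    name=$1
    tries=${2:-10}
    root=${3:-/sys/module}
    while [ "$tries" -gt 0 ]; do
        state=$(cat "$root/$name/initstate" 2>/dev/null)
        [ "$state" = "live" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

A boot script could call `modprobe xfs; wait_module_live xfs` before mounting, though the pre-loading workaround from earlier in this report already sidesteps the race by loading the module before any mount runs.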