Description of problem: The output from systemd's filesystem "mount" procedure is extremely sub-optimal for at least XFS if the mount fails. (can lead to data loss) If (for example), partition numbering on a machine has changed, such that systemd tries to mount a non-XFS filesystem as XFS, the message displayed to the user is: XFS (sda7): Corruption detected. Unmount and run xfs_repair Data recovery procedures to fix this misidentified "corruption" can then cause data loss. (just happened) *** The problem appears to be that systemd is showing the last few lines of the system log (misleading), instead of the output from the mount command (useful, and accurate). Systemd should show the output from the failed mount command, instead of or in addition to the output from the system log. Current real world system log output ************************************ Oct 18 16:46:30 f19laptop kernel: [ 1801.893919] XFS (sda7): bad magic number Oct 18 16:46:30 f19laptop kernel: [ 1801.893924] ffff8801e1bc7000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893926] ffff8801e1bc7010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893927] ffff8801e1bc7020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893928] ffff8801e1bc7030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893929] XFS (sda7): Internal error xfs_sb_read_verify at line 780 of file fs/xfs/xfs_mount.c. Caller 0xffffffffa060ea25 Oct 18 16:46:30 f19laptop kernel: [ 1801.893929] Oct 18 16:46:30 f19laptop kernel: [ 1801.894037] XFS (sda7): Corruption detected. Unmount and run xfs_repair Output from corresponding mount command *************************************** $ sudo mount -t xfs /dev/sda7 /example mount: wrong fs type, bad option, bad superblock on /dev/sda7, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so. Version-Release number of selected component (if applicable): $ rpm -qa|grep -i 'systemd\|mount'|sort libmount-2.23.2-4.fc19.x86_64 systemd-204-16.fc19.x86_64 systemd-devel-204-16.fc19.x86_64 systemd-libs-204-16.fc19.i686 systemd-libs-204-16.fc19.x86_64 systemd-python-204-16.fc19.x86_64 systemd-sysv-204-16.fc19.x86_64 How reproducible: Every time
[Adding Eric Sandeen to CC to have a look at the reported XFS error. Thanks.] (In reply to Justin Clift from comment #0) > The problem appears to be that systemd is showing the last few lines > of the system log (misleading), instead of the output from the mount > command (useful, and accurate). Where exactly is it showing that? Is it in the systemd journal? > Systemd should show the output from the failed mount command, instead of > or in addition to the output from the system log. > > Current real world system log output > ************************************ > > Oct 18 16:46:30 f19laptop kernel: [ 1801.893919] XFS (sda7): bad magic > number Looks like here xfs correctly saw that this is NOT an XFS filesystem. Why did it continue to mount it regardless? > Oct 18 16:46:30 f19laptop kernel: [ 1801.893924] ffff8801e1bc7000: 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > Oct 18 16:46:30 f19laptop kernel: [ 1801.893926] ffff8801e1bc7010: 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > Oct 18 16:46:30 f19laptop kernel: [ 1801.893927] ffff8801e1bc7020: 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > Oct 18 16:46:30 f19laptop kernel: [ 1801.893928] ffff8801e1bc7030: 00 00 > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > Oct 18 16:46:30 f19laptop kernel: [ 1801.893929] XFS (sda7): Internal > error xfs_sb_read_verify at line 780 of file fs/xfs/xfs_mount.c. Caller > 0xffffffffa060ea25 "Internal" errors should never happen. This is a bug in xfs kernel code. > Oct 18 16:46:30 f19laptop kernel: [ 1801.893929] > Oct 18 16:46:30 f19laptop kernel: [ 1801.894037] XFS (sda7): Corruption > detected. Unmount and run xfs_repair Even if you saw the error output from the mount command, it still points you to check dmesg where you'd find this misleading advice. So I don't see how better off you'd be. > Output from corresponding mount command > *************************************** > > $ sudo mount -t xfs /dev/sda7 /example > mount: wrong fs type, bad option, bad superblock on /dev/sda7, > missing codepage or helper program, or other error > > In some cases useful info is found in syslog - try > dmesg | tail or so. Exactly! I would think "Bad superblock!", check dmesg and get the bad advice.
Created attachment 813855 [details] Screenshot of the error message in systemd output
(In reply to Justin Clift from comment #0) > Description of problem: > > The output from systemd's filesystem "mount" procedure is extremely > sub-optimal for at least XFS if the mount fails. (can lead to data loss) > > If (for example), partition numbering on a machine has changed, such that > systemd tries to mount a non-XFS filesystem as XFS, the message displayed > to the user is: > > XFS (sda7): Corruption detected. Unmount and run xfs_repair http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs.git;a=commit;h=31625f28ad7be67701dc4cefcf52087addd88af4 "xfs: don't emit corruption noise on fs probes If we get EWRONGFS due to probing of non-xfs filesystems, there's no need to issue the scary corruption error and backtrace." Courtesy of yours truly. ;) > Data recovery procedures to fix this misidentified "corruption" can then > cause data loss. (just happened) > > *** > > The problem appears to be that systemd is showing the last few lines > of the system log (misleading), instead of the output from the mount > command (useful, and accurate). I disagree on that, mount output is almost always generic, dmesg shows the real problems. But xfs was being too noisy, and it should eventually be fixed. That could be a stable candidate if people want it. -Eric
Thanks Michal. Just attached screenshot showing where in the systemd output it shows up. Hopefully that helps. Also useful might be the full text of the system log output (more in depth than my summary version): ***************************************************************** Oct 18 16:46:30 f19laptop kernel: [ 1801.893919] XFS (sda7): bad magic number Oct 18 16:46:30 f19laptop kernel: [ 1801.893924] ffff8801e1bc7000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893926] ffff8801e1bc7010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893927] ffff8801e1bc7020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893928] ffff8801e1bc7030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Oct 18 16:46:30 f19laptop kernel: [ 1801.893929] XFS (sda7): Internal error xfs_sb_read_verify at line 780 of file fs/xfs/xfs_mount.c. Caller 0xffffffffa060ea25 Oct 18 16:46:30 f19laptop kernel: [ 1801.893929] Oct 18 16:46:30 f19laptop kernel: [ 1801.893932] CPU: 7 PID: 181 Comm: kworker/7:1H Not tainted 3.11.4-201.0.bz953211.fc19.x86_64 #1 Oct 18 16:46:30 f19laptop kernel: [ 1801.893933] Hardware name: Dell Inc. Dell System Inspiron 7720/04M3YM, BIOS A15 05/17/2013 Oct 18 16:46:30 f19laptop kernel: [ 1801.893950] Workqueue: xfslogd xfs_buf_iodone_work [xfs] Oct 18 16:46:30 f19laptop kernel: [ 1801.893952] 0000000000000001 ffff88023063fd68 ffffffff81638c2f ffff880231c27800 Oct 18 16:46:30 f19laptop kernel: [ 1801.893954] ffff88023063fd80 ffffffffa06110db ffffffffa060ea25 ffff88023063fdb8 Oct 18 16:46:30 f19laptop kernel: [ 1801.893956] ffffffffa0611135 0000030c00000000 ffff8801e8df7980 0000000000000016 Oct 18 16:46:30 f19laptop kernel: [ 1801.893958] Call Trace: Oct 18 16:46:30 f19laptop kernel: [ 1801.893963] [<ffffffff81638c2f>] dump_stack+0x45/0x56 Oct 18 16:46:30 f19laptop kernel: [ 1801.893973] [<ffffffffa06110db>] xfs_error_report+0x3b/0x40 [xfs] Oct 18 16:46:30 f19laptop kernel: [ 1801.893981] [<ffffffffa060ea25>] ? xfs_buf_iodone_work+0x85/0xf0 [xfs] Oct 18 16:46:30 f19laptop kernel: [ 1801.893989] [<ffffffffa0611135>] xfs_corruption_error+0x55/0x80 [xfs] Oct 18 16:46:30 f19laptop kernel: [ 1801.894003] [<ffffffffa06622b4>] xfs_sb_read_verify+0x114/0x130 [xfs] Oct 18 16:46:30 f19laptop kernel: [ 1801.894011] [<ffffffffa060ea25>] ? xfs_buf_iodone_work+0x85/0xf0 [xfs] Oct 18 16:46:30 f19laptop kernel: [ 1801.894018] [<ffffffffa060ea25>] xfs_buf_iodone_work+0x85/0xf0 [xfs] Oct 18 16:46:30 f19laptop kernel: [ 1801.894023] [<ffffffff810810d5>] process_one_work+0x175/0x430 Oct 18 16:46:30 f19laptop kernel: [ 1801.894025] [<ffffffff81081cfb>] worker_thread+0x11b/0x3a0 Oct 18 16:46:30 f19laptop kernel: [ 1801.894027] [<ffffffff81081be0>] ? rescuer_thread+0x340/0x340 Oct 18 16:46:30 f19laptop kernel: [ 1801.894029] [<ffffffff81088640>] kthread+0xc0/0xd0 Oct 18 16:46:30 f19laptop kernel: [ 1801.894031] [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40 Oct 18 16:46:30 f19laptop kernel: [ 1801.894034] [<ffffffff81647d2c>] ret_from_fork+0x7c/0xb0 Oct 18 16:46:30 f19laptop kernel: [ 1801.894036] [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40 Oct 18 16:46:30 f19laptop kernel: [ 1801.894037] XFS (sda7): Corruption detected. Unmount and run xfs_repair Oct 18 16:46:30 f19laptop kernel: [ 1801.894043] XFS (sda7): SB validate failed with error 22 ***************************************************************** > Even if you saw the error output from the mount command, it still > points you to check dmesg where you'd find this misleading advice. > So I don't see how better off you'd be. > > > Output from corresponding mount command > > *************************************** > > > > $ sudo mount -t xfs /dev/sda7 /example > > mount: wrong fs type, bad option, bad superblock on /dev/sda7, > > missing codepage or helper program, or other error > > > > In some cases useful info is found in syslog - try > > dmesg | tail or so. > > Exactly! I would think "Bad superblock!", check dmesg and get the bad advice. Good point. * the system log output was directly misleading * the mount command output was a bit better, but not great I kind of wonder if there's an easy way to add a sanity check before attempting the mount. Something similar to running fstyp (a Solaris command to return the filesystem type on a block device), would be useful here, but I don't know of a Linux equivalent off the top of my head. :(
(In reply to Justin Clift from comment #4) > I kind of wonder if there's an easy way to add a sanity check before > attempting the mount. Something similar to running fstyp (a Solaris command > to return the filesystem type on a block device), would be useful here, but > I don't know of a Linux equivalent off the top of my head. :( There should be no need for that... this was bad xfs behavior, fixed upstream. No need to engineer more complex userspace workarounds...
No worries, thanks Eric. :)
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs. Fedora 19 has now been rebased to 3.12.6-200.fc19. Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel. If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20. If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE ************** This bug has been in a needinfo state for more than 1 month and is being closed with insufficient data due to inactivity. If this is still an issue with Fedora 19, please feel free to reopen the bug and provide the additional information requested.
Closing seems fine, this was fixed upstream by: http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs.git;a=commit;h=31625f28ad7be67701dc4cefcf52087addd88af4
Changing to ERRATA. Thanks Eric.