Bug 173350
Summary: jbd I/O errors after ext3 orphan processing on readonly device

| Product: | Red Hat Enterprise Linux 3 | Reporter: | Scott D. Anderson <scott.anderson> |
|---|---|---|---|
| Component: | kernel | Assignee: | Eric Sandeen <esandeen> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.0 | CC: | agk, dwysocha, esandeen, heinzm, jbacik, jtillots, mazurekd, mbroz, petrides, sct |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | i686 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | RHSA-2007-0436 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2007-06-11 17:53:05 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 174649, 217926, 217928 | | |
Description (Scott D. Anderson, 2005-11-16 15:22:12 UTC)

Created attachment 121131 [details]: nightly cron job

Runs at 4:02 am daily. Creating the new snapshots may be what triggers the kernel panic.

Created attachment 121132 [details]: the shell functions that do the real work of creating the snapshots

This file is sourced by the main cron job.
Are there any other errors in the logs just prior to this ext3 journal error?

(In reply to comment #3)
> Are there any other errors in the logs just prior to this ext3 journal error?

Yes. Actually, this error doesn't appear in the logs; it seems to go only to the console, which is odd. Anyhow, below are the relevant log entries. I have filtered out the "raid5: switching cache buffer size" entries, which appear several times a second continuously.

```
Nov 14 04:02:14 puma kernel: Can't write to read-only device 3a:07
Nov 14 04:02:15 puma kernel: Can't write to read-only device 3a:08
Nov 14 04:02:16 puma modprobe: modprobe: Can't locate module block-major-43
Nov 14 04:02:17 puma kernel: Can't write to read-only device 3a:07
Nov 14 04:02:17 puma kernel: kjournald starting. Commit interval 5 seconds
Nov 14 04:02:17 puma kernel: Can't write to read-only device 3a:07
Nov 14 04:02:17 puma kernel: EXT3-fs: mounted filesystem with ordered data mode.
Nov 14 04:02:17 puma modprobe: modprobe: Can't locate module block-major-43
Nov 14 04:02:19 puma kernel: Can't write to read-only device 3a:08
Nov 14 04:02:19 puma kernel: kjournald starting. Commit interval 5 seconds
Nov 14 04:02:19 puma kernel: Can't write to read-only device 3a:08
Nov 14 04:02:19 puma kernel: EXT3-fs: mounted filesystem with ordered data mode.
```
```
Nov 14 04:02:19 puma kernel: sym53c1010-33-1-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25.0 ns, offset 31)
Nov 14 04:02:19 puma kernel: sym53c1010-33-1-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25.0 ns, offset 31)
Nov 14 05:20:03 puma kernel: sym53c1010-33-1-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25.0 ns, offset 31)
Nov 14 05:20:05 puma kernel: sym53c1010-33-1-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25.0 ns, offset 31)
```

In order to interpret the message about being unable to write to a read-only device, you'll need the following information:

```
# ls -ln /dev/vghome
total 0
crw-r-----  1 0  6  109, 0  Oct 29 13:17 group
brw-rw----  1 0  6   58, 7  Nov 15 13:32 homesnap
brw-rw----  1 0  6   58, 1  Oct 29 13:17 lvhome
brw-rw----  1 0  6   58, 3  Oct 29 13:17 lvstud
brw-rw----  1 0  6   58, 8  Nov 15 04:02 studsnap
```

Just FYI, /dev/vghome/lvhome is the home filesystem and /dev/vghome/homesnap is the snapshot of it. Similarly for lvstud (a real filesystem holding student home directories) and studsnap (a snapshot of lvstud). Thanks for your effort. Let me know if I can supply any more information.

OK, the errors are occurring on the snapshot filesystems themselves. It looks like there are a number of things happening here: first, ext3 is not being completely flushed out and quiesced when the snapshot is created, leaving behind a snapshot that needs recovery when mounted; second, the snapshot device is not being marked as a true readonly device by LVM, so ext3 tries to recover it and fails; finally, ext3 is getting confused enough by that failure that it panics. The second problem appears to be an LVM bug. The final problem is ext3's: no matter how badly the underlying storage is behaving, ext3 shouldn't be panicking. These are being triggered by the first problem, though, and it's not clear whether it's LVM or ext3 failing to do the quiescing properly in this case. I'll have to have a think about how to narrow that down.

Reassigning to "lvm" component based on comment #7.

We are seeing the same thing.
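The two listings above can be correlated directly: the kernel prints device numbers in hexadecimal ("3a:07"), while `ls -ln` shows them in decimal (58, 7). A quick conversion with plain `printf` (nothing system-specific here) shows the failing devices are exactly the two snapshot volumes, homesnap (58,7) and studsnap (58,8):

```shell
# The kernel logs device numbers in hex; ls -ln shows them in decimal.
printf '%d\n' 0x3a          # major 0x3a = 58, the LVM block major in this report
printf '%d %d\n' 0x07 0x08  # minors 7 and 8: homesnap and studsnap above
```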
We are using the 2.4.21-37 kernel with lvm version 1.0.8-14. I'll attach our log messages that occur when this happens.

Created attachment 122947 [details]: Entries in the messages log associated with this bug

These are the error messages we're seeing associated with this problem.
Created attachment 140190 [details]: kernel panic

See "kernel panic" attachment. Any news on this? We are seeing this happen often, say, once a week or so. Running the 2.4.21-47.ELsmp kernel and lvm-1.0.8-14.

Created attachment 140191 [details]: patch that resolves the problem

This is the patch that appears to resolve the problem. If anybody out there wants to test this and verify it, that would be helpful, as the person I had testing it has not come back with a definite "yes" yet.
> See "kernel panic" attachment. Any news on this? We are seeing this happen often, say, once a week or so. Running the 2.4.21-47.ELsmp kernel and lvm-1.0.8-14.

Test my patch out and see if it helps. I need to get confirmation that it works.

I'll give it a whirl & let you know. Thanks.

I applied the patch on Tuesday, Nov 7. The problem re-occurred on Saturday, Nov 11. I rebooted the machine and removed the offending snapshots on Monday, Nov 13. It has been OK since then: 8 days. The patch was applied to the latest kernel, 2.4.21-47.0.1. I'm still seeing messages in the logs as above:

```
kjournald starting. Commit interval 5 seconds
Can't write to read-only device 3a:02
EXT3-fs: mounted filesystem with ordered data mode.
Can't write to read-only device 3a:03
kjournald starting. Commit interval 5 seconds
Can't write to read-only device 3a:03
EXT3-fs: mounted filesystem with ordered data mode.
lv_access for /dev/ug_vg/ug0 is 5
Can't write to read-only device 3a:01
kjournald starting. Commit interval 5 seconds
Can't write to read-only device 3a:01
EXT3-fs: mounted filesystem with ordered data mode.
```

/dev/ug_vg/ug0 is the snapshot from which I take a backup. It is (generally) this snapshot that corrupts and sends ext3 into a panic. Not sure if this info helps you or not. I'll keep running the patched kernel on this machine (I have others without the patch too) and advise if anything is different. The unpatched machines, by the way, have continued the panic behaviour as before.

I can confirm that the patch does not fix this error.

deb

Grep through your logs; there should be a message like this:

```
lv_access for <device> is <some number>
```

Please put those messages in here, or attach your logs; either one works for me.

```
root@nsb<8>grep -i lv_access /var/log/messages
Nov 26 23:55:05 nsb kernel: lv_access for /dev/ug_vg/ug0 is 5
Nov 27 23:55:04 nsb kernel: lv_access for /dev/ug_vg/ug0 is 5
```

Sorry, may I have the panic text as well, in its entirety?
In the meantime I will re-evaluate this fix; it's been a few months since I've looked at this issue.

Sure, here it is:

```
Nov 27 23:55:03 nsb kernel: kjournald starting. Commit interval 5 seconds
Nov 27 23:55:03 nsb kernel: Can't write to read-only device 3a:03
Nov 27 23:55:03 nsb kernel: EXT3-fs: mounted filesystem with ordered data mode.
Nov 27 23:55:03 nsb modprobe: modprobe: Can't locate module block-major-43
Nov 27 23:55:03 nsb last message repeated 31 times
Nov 27 23:55:04 nsb kernel: lv_access for /dev/ug_vg/ug0 is 5
Nov 27 23:55:04 nsb kernel: Can't write to read-only device 3a:01
Nov 27 23:55:04 nsb kernel: kjournald starting. Commit interval 5 seconds
Nov 27 23:55:04 nsb kernel: EXT3-fs: lvm(58,1): orphan cleanup on readonly fs
Nov 27 23:55:04 nsb kernel: EXT3-fs: lvm(58,1): 1 orphan inode deleted
Nov 27 23:55:04 nsb kernel: EXT3-fs: recovery complete.
Nov 27 23:55:04 nsb kernel: lvm - lvm_map: ll_rw_blk write for readonly LV /dev/ug_vg/ug0
Nov 27 23:55:04 nsb last message repeated 5 times
Nov 27 23:55:04 nsb kernel: Error (-5) on journal on device 3a:01
Nov 27 23:55:04 nsb kernel: Aborting journal on device lvm(58,1).
Nov 27 23:55:05 nsb kernel: journal commit I/O error
Nov 27 23:55:05 nsb kernel: Can't write to read-only device 3a:01
Nov 27 23:55:07 nsb kernel: Assertion failure in journal_flush_Rsmp_e2f189ce() at journal.c:1356: "!journal->j_checkpoint_transactions"
Nov 27 23:55:07 nsb kernel: ------------[ cut here ]------------
Nov 27 23:55:07 nsb kernel: kernel BUG at journal.c:1356!
Nov 27 23:55:07 nsb kernel: invalid operand: 0000
Nov 27 23:55:07 nsb kernel: parport_pc lp parport floppy sr_mod ide-cd cdrom sg nfsd lockd sunrpc dcdipm dcdbas autofs4 iptable_filter ip_tables e1000 microcode st loop keybdev mousedev
Nov 27 23:55:07 nsb kernel: CPU:    2
Nov 27 23:55:07 nsb kernel: EIP:    0060:[<f8884d5c>]    Tainted: PF
Nov 27 23:55:07 nsb kernel: EFLAGS: 00010282
Nov 27 23:55:07 nsb kernel:
Nov 27 23:55:07 nsb kernel: EIP is at journal_flush_Rsmp_e2f189ce [jbd] 0x1bc (2.4.21-47.0.1.ELcustom-lvm-patch/i686)
Nov 27 23:55:07 nsb kernel: eax: 0000006f   ebx: f61f4680   ecx: 00000001   edx: c0388e98
Nov 27 23:55:07 nsb kernel: esi: f5d39400   edi: 00000000   ebp: fffffffb   esp: e7eede18
Nov 27 23:55:07 nsb kernel: ds: 0068   es: 0068   ss: 0068
Nov 27 23:55:07 nsb kernel: Process mount (pid: 21518, stackpage=e7eed000)
Nov 27 23:55:07 nsb kernel: Stack: f88872b0 f88861e5 f88860ee 0000054c f88875c8 e2632400 e2632400 c97ca400
Nov 27 23:55:07 nsb kernel:        00000001 f8898f29 f5d39400 e2632448 e2632400 e2632448 f889867c e2632400
Nov 27 23:55:07 nsb kernel:        c97ca400 00000001 00000000 e7eede9c 00000000 0000000d 00000000 00000000
Nov 27 23:55:07 nsb kernel: Call Trace: [<f88872b0>] .rodata.str1.4 [jbd] 0xf70 (0xe7eede18)
Nov 27 23:55:07 nsb kernel: [<f88861e5>] .rodata.str1.1 [jbd] 0x675 (0xe7eede1c)
Nov 27 23:55:07 nsb kernel: [<f88860ee>] .rodata.str1.1 [jbd] 0x57e (0xe7eede20)
Nov 27 23:55:07 nsb kernel: [<f88875c8>] .rodata.str1.4 [jbd] 0x1288 (0xe7eede28)
Nov 27 23:55:07 nsb kernel: [<f8898f29>] ext3_mark_recovery_complete [ext3] 0x19 (0xe7eede3c)
Nov 27 23:55:07 nsb kernel: [<f889867c>] ext3_read_super [ext3] 0x73c (0xe7eede50)
Nov 27 23:55:07 nsb kernel: [<c016ce62>] get_sb_bdev [kernel] 0x1b2 (0xe7eedeb0)
Nov 27 23:55:07 nsb kernel: [<f88a1a34>] ext3_fs_type [ext3] 0x0 (0xe7eedef4)
Nov 27 23:55:07 nsb kernel: [<c016d211>] do_kern_mount [kernel] 0x121 (0xe7eedefc)
Nov 27 23:55:07 nsb kernel: [<f88a1a34>] ext3_fs_type [ext3] 0x0 (0xe7eedf00)
Nov 27 23:55:07 nsb kernel: [<c0185b23>] do_add_mount [kernel] 0x93 (0xe7eedf20)
Nov 27 23:55:07 nsb kernel: [<c0185e50>] do_mount [kernel] 0x160 (0xe7eedf40)
Nov 27 23:55:07 nsb kernel: [<c0185ca1>] copy_mount_options [kernel] 0x81 (0xe7eedf70)
Nov 27 23:55:07 nsb kernel: [<c018633f>] sys_mount [kernel] 0xdf (0xe7eedf90)
Nov 27 23:55:07 nsb kernel: [<c02b006f>] no_timing [kernel] 0x7 (0xe7eedfc0)
Nov 27 23:55:07 nsb kernel:
Nov 27 23:55:07 nsb kernel: Code: 0f 0b 4c 05 ee 60 88 f8 e9 0d ff ff ff 8d b4 26 00 00 00 00
Nov 27 23:55:07 nsb kernel:
Nov 27 23:55:07 nsb kernel: Kernel panic: Fatal exception
Nov 27 23:55:07 nsb kernel: lvm - lvm_map: ll_rw_blk write for readonly LV /dev/ug_vg/ug0
Nov 27 23:55:07 nsb last message repeated 3 times
```

Let me know if you need anything else.

deb

Well, that patch I wrote was just straight up wrong. I don't believe this to be a problem with the LVM package itself, at least not in the sense that the problem will never occur with the previous package, as nothing changed between the two packages that would cause different behavior in this case. I'm going to try to reproduce again and hopefully get something viable today or tomorrow.

OK, so here's the sequence of events as far as I understand it:

1) The main FS deletes a file, and that inode gets added to the orphan list, but the file is not yet deleted. (The inode is added to the orphan list on the disk, so in case of a crash the orphan recovery code will delete the inode before completely mounting the FS.)

2) Mount the snapshot LV.

3) The FS is ext3, so we see the inode on the orphan list on the disk and go to do recovery.

4) You see the "EXT3-fs: lvm(58,1): orphan cleanup on readonly fs" message; at this point we automatically set the superblock to RW and do the orphan cleanup.

5) The inode deletion is a journaled operation, so before mounting completely we do a journal_recover_completed or something like that, which in turn does a journal_flush(), which of course fails because the disk is RO.
6) We assert that the journal has been completed, which of course it wasn't, and we panic.

Now, this is a series of events that are completely separate from each other, so I'm still a little puzzled about how to fix it. In the meantime, I believe a workaround for this problem would be to mount the snapshot as ext2, which takes journalling completely out of the equation, so we should avoid this problem. I have yet to reproduce, but I think I've got a good enough idea of what's going on to try to come up with a fix.

Debbie, will you change your scripts to mount the snapshot volume as ext2 to make sure the problem doesn't reproduce? It shouldn't, so at least you will be more stable until a solution is found.

What happens if you open a file, delete it while holding it open, then create a snapshot, and try to mount that snapshot? Something like:

```
# touch foo
# tail -f foo &
# rm foo
# <snapshot>
# <mount snapshot>
```

Does that recreate the problem?

-Eric

Eric, you rule: reproduced first shot. I'll be mounting the snapshot as ext2 tonight on 3 servers; I'll let you know how it goes. Thanks, guys.

OK, it looks like the correct behavior is to check whether the underlying block device is read-only when we go to do the orphan recovery during mount, and if it is, just print a message and return without doing the orphan recovery. I tested your patch this morning, Eric, and it worked well; foo was not visible when I mounted the snapshot. I'll get the patch to fix this submitted to the relevant places.

Thanks for the analysis and testing, Josef! This probably needs to be fixed in RHEL4 and 5 too. For what it's worth, this does not seem to be a problem on RHEL4/5, but it is possibly a different problem, because something is allowing ext3 orphan inode recovery to proceed on the readonly LVM snapshot, which writes to it, which is probably not good.

Correcting component (from "lvm" to "kernel"). Updated description to something more descriptive. QE ack for RHEL3.9.
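Eric's recipe needs root plus an LVM volume for the snapshot and mount steps, but the unprivileged part, which creates the trigger condition (an inode on ext3's on-disk orphan list), can be exercised on any filesystem. A minimal sketch with illustrative file names; the privileged steps are left as comments, and the `lvcreate` invocation is only a placeholder for whatever snapshot command the reader's setup uses:

```shell
# Create a file, hold it open, then unlink it while it is still open.
# On ext3 the still-open inode goes onto the on-disk orphan list,
# which is exactly the state a snapshot then freezes in place.
cd "$(mktemp -d)"
echo hello > foo
tail -f foo > /dev/null 2>&1 &   # hold foo open
TAILPID=$!
sleep 1                          # let tail open the file before we unlink it
rm foo                           # unlink while open: inode becomes an orphan
ls foo 2> /dev/null || echo "foo unlinked but still held open"
# With the file still in this state, the recipe above continues (root + LVM):
#   lvcreate -s ...    # <snapshot>     (placeholder for the snapshot step)
#   mount ...          # <mount snapshot>; mounting with -t ext2 was the workaround
kill $TAILPID
```

On the affected RHEL3 kernels, mounting such a snapshot as ext3 at this point triggered the orphan cleanup and the journal_flush() assertion described above; mounting it as ext2 sidesteps the journal entirely, which is why it was suggested as the interim workaround.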
Update: the first component of this (skip orphan processing if the underlying device is readonly) has made it into the stable upstream kernel tree. The second part, processing the orphan list if the underlying device and the filesystem both transition to readonly via remount, is still languishing upstream. Pinged Andrew on that today.

Taking out of POST; will re-post when the second part that Stephen asked for has been approved for upstream inclusion.

A fix for this problem has just been committed to the RHEL3 U9 patch pool this evening (in kernel version 2.4.21-47.5.EL).

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2007-0436.html