Bug 664376
| Summary: | temporary loss of path to SAN results in corrupted filesystem | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Lachlan McIlroy <lmcilroy> |
| Component: | kernel | Assignee: | Red Hat Kernel Manager <kernel-mgr> |
| Status: | CLOSED WONTFIX | QA Contact: | Gris Ge <fge> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 4.6 | CC: | fge, jwest, rwheeler, vgaikwad |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-06-14 18:45:00 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Lachlan McIlroy
2010-12-20 08:22:01 UTC
While running the test for this bug I got these errors:

EXT2-fs error (device dm-10): read_inode_bitmap: Cannot read inode bitmap - block_group = 6, inode_bitmap = 196609
EXT2-fs error (device dm-10): read_inode_bitmap: Cannot read inode bitmap - block_group = 6, inode_bitmap = 196609
EXT2-fs error (device dm-10): read_inode_bitmap: Cannot read inode bitmap - block_group = 6, inode_bitmap = 196609
...

And when I tried to remove the files generated by the test I got these errors:

EXT2-fs error (device dm-10): ext2_free_inode: bit already cleared for inode 96002
EXT2-fs error (device dm-10): ext2_free_inode: bit already cleared for inode 96011
EXT2-fs error (device dm-10): ext2_free_inode: bit already cleared for inode 96101
...

We've seen many cases of the "bit already cleared" errors on ext3, so maybe now we have an explanation: failed metadata updates.

To work around this bug I use dm-multipath (even for just one path) with no_path_retry set to queue. This causes all failed I/Os to be queued and retried indefinitely, so no metadata updates are lost. With this setup I cannot reproduce the bug.
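
For reference, the workaround setting would look roughly like the fragment below in /etc/multipath.conf. This is only a sketch: putting it in the defaults section is an assumption (it could also be set per device or per multipath), and any blacklist entries that exclude the SAN device would have to be adjusted separately.

```
# /etc/multipath.conf (sketch; section placement is an assumption)
defaults {
        # Queue I/O instead of failing it when no path is available,
        # and keep retrying until a path comes back.
        no_path_retry queue
}
```

Note that with queueing enabled, I/O to the device blocks for as long as every path is down, so processes using the filesystem will hang until a path is restored.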