Red Hat Bugzilla – Bug 549397
I/O errors while accessing loop devices or file-based Xen images from GFS volume after Update from RHEL 5.3 to 5.4
Last modified: 2010-10-23 09:53:09 EDT
Description of problem:
A Xen virtualization cluster was updated from RHEL 5.3 to RHEL 5.4 (yum update).
The cluster has a GFS volume where the image files for the VMs are stored.
After the update, the machines fail to boot with filesystem errors; console output: (console.log)
To trigger the problem, create a loop device from a file stored on the GFS volume and try to mount or fsck it
Steps to Reproduce:
1. Create an image file (ex. using dd) on a local ext3 volume
2. Create a loop device from the file (kpartx)
3. Create a partition table and a partition on the loop device (fdisk)
4. Create an ext3 filesystem on the partition (mkfs -t ext3)
5. Mount the filesystem, check dmesg output
6. Unmount fs, destroy the loop device
7. Copy the image file to a GFS volume
8. Create a loop device from the file on a GFS volume
9. Mount the filesystem, check dmesg output
The filesystem from the ext3 volume can be mounted; the one from the GFS volume cannot.
Both filesystems should be mountable, as they are exact clones.
It did not happen to all of the virtual machines, but to roughly 30% of them. Newly created image files are also affected. xend.log reports: (see xend.log)
If one tries to mount the image files as loop devices, filesystem errors are reported on all partitions in the file; read-only access is possible, but no read-write access.
Once copied to and repaired (or newly created) on a local ext3 filesystem, the image files (or files used as loop devices generally) are healthy and usable. As soon as they are copied back onto the GFS volume, the same errors occur.
Error after trying to mount a loop device: (mount.log)
dmesg reports: (dmesg.log)
Falling back to the .128.el5xen kernel (with updated packages) did not solve the issue. The issue affects both the 164.6.1.el5xen and 164.9.1.el5xen kernels.
RPM list: (rpm.list)
We were able to reproduce the problem on a different, fresh installation of a single-node cluster - RHEL 5.4 clone; 2.6.18-164.6.1.el5 kernel.
A similar bug has been reported before: https://bugzilla.redhat.com/show_bug.cgi?id=164499
Created attachment 379634 [details]
Xen console output after starting a VM with disk on a gfs volume
Created attachment 379635 [details]
dmesg output after trying to mount a loop device from gfs volume
Created attachment 379636 [details]
console output after trying to mount a loop device from a gfs volume
Created attachment 379637 [details]
xen log after starting a VM with disk on a gfs volume
Created attachment 379638 [details]
List of rpms installed on the system
I can see what's going on there. The loop device is (not unreasonably) calling gfs_prepare_write whenever gfs needs to write to a block. The issue is that this is not allowed unless the caller has already locked the glock.
Unfortunately this is a consequence of the level at which gfs does its locking. GFS2 locks at the page cache level, so it will not have this issue. Also, it only affects writes, so read-only mounts via loop should be unaffected.
The problem seems to be due to this bit of the new aops patches:
@@ -791,7 +770,7 @@ static int loop_set_fd(struct loop_device *lo, struct file *
- if (aops->prepare_write && aops->commit_write)
+ if (aops->prepare_write || aops->write_begin)
lo_flags |= LO_FLAGS_USE_AOPS;
if (!(lo_flags & LO_FLAGS_USE_AOPS) && !file->f_op->write)
lo_flags |= LO_FLAGS_READ_ONLY;
We were relying on the lack of a commit_write aop to avoid setting the LO_FLAGS_USE_AOPS flag.
Before I forget where it is, here is the gfs end of this issue:
That patch was added to fix this issue the first time it cropped up.
Created attachment 380060 [details]
Please try this patch and verify it fixes your problem. I've built it and tested it to make sure it doesn't blow up, but I didn't test it with gfs.
Changing bug from RHEL4.x to 5.x.
In my opinion, this needs to get fixed in 5.5. Setting and/or
requesting flags appropriately.
I currently have some problems building a kernel RPM from SRPM. As soon as that works, I will check out the patch and give you feedback on this matter.
Happy New Year!
I've built a new kernel rpm, installed it, and tested it. It seems to work fine now.
When can I expect an official bugfix, so that I can go into production with a supported kernel?
Trying to join cluster "lock_nolock", "clurhel5:root"
Joined cluster. Now mounting FS...
GFS: fsid=clurhel5:root.0: jid=0: Trying to acquire journal lock...
GFS: fsid=clurhel5:root.0: jid=0: Looking at journal...
GFS: fsid=clurhel5:root.0: jid=3: Done
GFS: fsid=clurhel5:root.0: Scanning for log elements...
GFS: fsid=clurhel5:root.0: Found 0 unlinked inodes
GFS: fsid=clurhel5:root.0: Found quota changes for 0 IDs
GFS: fsid=clurhel5:root.0: Done
[create loop device from a file on the gfs volume]
loop: loaded (max 8 devices)
[create ext3 filesystem on loop device, mount it]
kjournald starting. Commit interval 5 seconds
EXT3 FS on dm-0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
[no gfs errors! good job!]
I'm reassigning this to Josef, since he wrote the patch.
We're investigating whether we can still push this into 5.5 or not.
It's very late in the process but we might be able to go through
the exception process. I'll bump the priority toward that end,
although we might need to bump it higher.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update release.
Waiting for RHEL 5.5 is not an option for this problem. I have a production system with a failed RHEL 5.4 update; starting the production virtual machines may lead to data corruption.
I need a supported kernel patch ASAP.
This bug has to go into 5.5 first. If it is also required in 5.4.z, then this bug needs to be cloned in order for that to happen. That is normally done via GSS, so if you have a contact there, please ask them to do that. If not, let us know and I'll try to kick off the process directly.
*** Bug 566184 has been marked as a duplicate of this bug. ***
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.