Bug 224077

Summary: Release note: Xen guests with an LVM backend on top of md RAID10 do not work in 5.0
Product: Red Hat Enterprise Linux 5
Reporter: Daniel Riek <riek>
Component: redhat-release-notes
Assignee: Daniel Riek <riek>
Status: CLOSED NOTABUG
QA Contact: Michael Hideo <mhideo>
Severity: high
Docs Contact:
Priority: high
Version: 5.0
CC: ask, clalance, ddomingo, jzhenyon, sputhenp, tis, xen-maint
Target Milestone: ---
Keywords: Documentation
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-07-13 17:53:51 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 197865

Description Daniel Riek 2007-01-23 22:19:52 UTC
Cloning this for release-noting in 5.0.

+++ This bug was initially created as a clone of Bug #223947 +++

Seeing this only in Xen so assigning to kernel-xen.

On the nightly snapshot from 20070122 I am trying to use an LVM backend for a
new paravirt Xen domain. Underneath LVM I am running md on SATA disks.
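
The failing setup is, roughly, a logical volume in a volume group whose physical
volume is the md RAID10 array, handed to the guest as a phy: device. A minimal
sketch of such a guest config follows (xm config files are Python syntax; every
name and path below is hypothetical, not taken from this report):

# /etc/xen/rhel5-guest -- illustrative xm config; all names and paths hypothetical
name = "rhel5-guest"
memory = 512
vcpus = 1

# Paravirt guests on RHEL 5 typically boot via pygrub.
bootloader = "/usr/bin/pygrub"

# /dev/vg_guests/rhel5-guest is an LV in a VG whose PV is the md RAID10
# array (e.g. /dev/md0).  Exporting it with phy: gives blkback the raw
# block device -- the combination that fails here.
disk = [ "phy:/dev/vg_guests/rhel5-guest,xvda,w" ]

vif = [ "bridge=xenbr0" ]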

The installation fails with the default partition layout in the guest when
anaconda tries to install grub to xvda1, reporting that it is a read-only
device.

When I then booted the DomU into rescue mode using the install image and tried to
rsync the /boot partition over from a different DomU, the filesystem went into
read-only mode.

Messages on dom0 show a lot of lines like these:
Jan 23 01:17:01 myhost kernel: raid10_make_request bug: can't convert block
across chunks or bigger than 64k 998425343 3
Jan 23 01:17:01 myhost kernel: raid10_make_request bug: can't convert block
across chunks or bigger than 64k 998425467 4
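
That raid10 message is printed when a request either spans two 64 KiB chunks or
is larger than a chunk. The boundary arithmetic, applied to the first logged
request above, looks roughly like this (an illustration of the condition, not
the driver source):

CHUNK_SECTORS = 64 * 1024 // 512  # a 64 KiB chunk is 128 sectors of 512 bytes

def crosses_chunk(start_sector, nr_sectors):
    """True if [start_sector, start_sector + nr_sectors) does not fit
    inside a single 64 KiB chunk of the RAID10 device."""
    return start_sector % CHUNK_SECTORS + nr_sectors > CHUNK_SECTORS

# First logged request: start sector 998425343, 3 sectors long.
# 998425343 % 128 == 127, so only one sector remains in that chunk and the
# request spills into the next one.
print(crosses_chunk(998425343, 3))  # True -> raid10 rejects the request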

On the DomU it shows lines like:
<3>Buffer I/O error on device xvda1, logical block 4903
<4>lost page write due to I/O error on xvda1
and then:
<4>end_request: I/O error, dev xvda, sector 9465
<4>end_request: I/O error, dev xvda, sector 9579
...
Later:
<3>Aborting journal on device xvda1.
<4>__journal_remove_journal_head: freeing b_committed_data
<4>__journal_remove_journal_head: freeing b_committed_data
<4>__journal_remove_journal_head: freeing b_committed_data
<4>__journal_remove_journal_head: freeing b_committed_data
<2>ext3_abort called.
<2>EXT3-fs error (device xvda1): ext3_journal_start_sb: Detected aborted journal
<2>Remounting filesystem read-only

I did some more testing and the problem seems to exist only when the partition
xvda1 in the guest is accessed, but NOT when the guest accesses the logical
volume on xvda2 (as I said, it is a standard layout created by anaconda).
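
One purely illustrative way such an xvda1-vs-LV difference could arise is
alignment: if the partition starts part-way into a 64 KiB chunk, some of the
guest's 4 KiB filesystem blocks straddle chunk boundaries, while I/O through a
chunk-aligned device never does. The sketch below assumes the traditional msdos
start at sector 63; the report itself does not establish the root cause:

CHUNK_SECTORS = 128      # 64 KiB chunks, as in the dom0 messages
BLOCK_SECTORS = 8        # a 4 KiB ext3 block is 8 sectors

def crosses_chunk(start_sector, nr_sectors):
    return start_sector % CHUNK_SECTORS + nr_sectors > CHUNK_SECTORS

def straddling_blocks(partition_start, nr_blocks):
    """Count 4 KiB blocks among the first nr_blocks of a partition whose
    I/O would straddle a chunk boundary on the underlying device."""
    return sum(crosses_chunk(partition_start + i * BLOCK_SECTORS, BLOCK_SECTORS)
               for i in range(nr_blocks))

# Hypothetical offsets: sector 63 is misaligned, sector 128 is chunk-aligned.
print(straddling_blocks(63, 1024))   # > 0: some blocks cross a chunk boundary
print(straddling_blocks(128, 1024))  # 0: every block fits inside one chunk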

The md device seems to run fine in Dom0 and for guests on file images.

System details:
kernel-2.6.18-4.el5xen
xen-3.0.3-21.el5
lvm2-2.02.16-3.el5
about 3G of memory

The md device is four 500 GB SATA disks in a RAID 1/0 configuration, so the
resulting device is 1 TB.

-- Additional comment from pm-rhel on 2007-01-23 10:20 EST --
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

-- Additional comment from riek on 2007-01-23 11:30 EST --
The problem also does not exist when tap:aio is used instead of phy to access
the LVM2 volume on the backend.

virt-install, however, does not currently support installation in that mode.
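
Relative to the phy: line sketched in the description, the workaround only
changes the disk specification in the guest config (again a hedged sketch with
a hypothetical path; tap:aio routes the device through the blktap userspace
driver and, per the comment above, has seen little testing with LVM-backed
storage):

# Failing path from the description (raw LV via blkback):
#   disk = [ "phy:/dev/vg_guests/rhel5-guest,xvda,w" ]
# Workaround: the same LV via the blktap aio driver instead:
disk = [ "tap:aio:/dev/vg_guests/rhel5-guest,xvda,w" ]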

-- Additional comment from riek on 2007-01-23 16:54 EST --
At this point we are not sure whether we can have a non-intrusive fix for this.
The problem might be in one or more of several layers, since ext3, block-frontend,
block-backend, LVM and md are stacked on top of each other. The tap:aio
workaround has not seen a whole lot of testing for LVM storage (it is generally
only used for files) and is not suited as a default setting.

So my recommendations are:
- Have QE verify whether the problem exists only for LVM on top of md RAID10
(vs. RAID5, etc.).
- Release-note the issue in 5.0 as known not to work.
- Fix it in 5.1.

-- Additional comment from riek on 2007-01-23 17:13 EST --
Moving to 5.1 and cloning the release-note bug for 5.0.

Comment 2 Ask Bjørn Hansen 2008-08-17 10:59:33 UTC
This is still an issue in 5.2.

Comment 5 RHEL Program Management 2010-08-09 19:49:16 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 6 Chris Lalancette 2011-07-13 17:53:51 UTC
I'm going to close this out.  I doubt this is a problem anymore, and even if it is, we should just open a new bug about it and track it there.