Red Hat Bugzilla – Bug 490148
Xen domU, RAID1, LVM, iscsi target export with blockio bug
Last modified: 2011-02-16 10:58:25 EST
+++ This bug was initially created as a clone of Bug #460693 +++
Description of problem:
My goal was to export parts (logical volumes) of a software RAID1 device created inside a domU over iSCSI.
The RAID components are two dom0 logical volumes, passed to the domU as block devices. The RAID1 device, /dev/md0, is created inside the domU; a PV, a VG and LVs are then created on top of /dev/md0. Individual logical volumes from /dev/md0 are then exported through iSCSI target software in "blockio" mode.
After starting the iSCSI target software, connecting to the targets from another computer succeeded, but creating a file system triggers a bug in the domU's blkfront.c, as does writing a larger amount of data (~128MB) to the target with dd. If the target already holds a file system, mounting it succeeds, but trying to write a large amount of data to it with
dd if=/dev/zero of=dummy-file-1 bs=1024 count=$[1024*512]
brings up the same bug.
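For reference, the size of that dd run can be worked out as below. This is only a small sketch of the arithmetic: `$[...]` is the older bash form of arithmetic expansion, and `$((...))` is the POSIX equivalent used here. The variable names are illustrative.

```shell
# Block count requested by the dd command in the report: 1024 * 512
blocks=$((1024 * 512))

# dd was run with bs=1024, so each block is 1024 bytes
bytes=$((blocks * 1024))

echo "$blocks blocks, $bytes bytes, $((bytes / 1024 / 1024)) MiB"
```

So the file-system test writes 512 MiB, well above the ~128MB that was already enough to trigger the bug on the raw target.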
If the iSCSI target software uses "fileio" mode, everything works fine.
Exporting the whole /dev/md0 as an iSCSI target also works.
The same thing happens with both iSCSI Enterprise Target (IET) and scst-iscsi.
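For IET, the two modes differ only in the Type= parameter on the Lun line of ietd.conf. The helper below is a minimal sketch that prints such a stanza; the function name, the IQN and the device path are hypothetical placeholders, not values from this report, and scst-iscsi is configured differently.

```shell
# Print an ietd.conf target stanza for one LUN.
# Arguments: target IQN, backing device path, I/O type (blockio or fileio).
# The IQN and path passed below are hypothetical examples.
iet_target_stanza() {
    iqn=$1
    path=$2
    io_type=$3
    printf 'Target %s\n' "$iqn"
    printf '    Lun 0 Path=%s,Type=%s\n' "$path" "$io_type"
}

# Stanza matching the failing case described above
iet_target_stanza iqn.2008-08.example.domu:lun0 /dev/vg_export/lun0 blockio

# Stanza matching the working case
iet_target_stanza iqn.2008-08.example.domu:lun0 /dev/vg_export/lun0 fileio
```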
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. make 2 LVs in dom0 and push them to domU
2. inside domU, make RAID1 /dev/md0 consisting of these two devices
3. create logical volumes in /dev/md0
4. export logical volumes as separate iSCSI targets, with "blockio" mode
5. connect to iscsi target(s) from other computer
6. try to write a large amount of data to the iSCSI target(s), with either mkfs or dd
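The steps above can be sketched as a command plan. The script below only prints the commands, tagged with the host they would run on; it executes nothing. Every name in it (the LV names and sizes, the domain name container1, the device names sdb1/sdc1/sdX, the volume group vg_export, the IQN and the 192.0.2.10 address) is an assumption made for illustration, and step 4 is left as a comment because it depends on the target software in use.

```shell
#!/bin/sh
# Print one planned command, prefixed with the host it belongs to.
# Nothing is executed; this is a plan, not an installer.
step() {
    host=$1
    shift
    printf '[%s] %s\n' "$host" "$*"
}

repro_steps() {
    # 1. dom0: create two LVs and hand them to the domU (names hypothetical)
    step dom0 lvcreate -L 10G -n raid-a system
    step dom0 lvcreate -L 10G -n raid-b system
    step dom0 xm block-attach container1 phy:/dev/system/raid-a sdb1 w
    step dom0 xm block-attach container1 phy:/dev/system/raid-b sdc1 w

    # 2. domU: build RAID1 from the two devices
    step domU mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

    # 3. domU: LVM on top of /dev/md0
    step domU pvcreate /dev/md0
    step domU vgcreate vg_export /dev/md0
    step domU lvcreate -L 1G -n lun0 vg_export

    # 4. domU: export /dev/vg_export/lun0 in "blockio" mode through the
    #    iSCSI target configuration (IET or scst-iscsi), then restart it

    # 5. initiator: discover and log in to the target
    step initiator iscsiadm -m discovery -t sendtargets -p 192.0.2.10
    step initiator iscsiadm -m node -T iqn.2008-08.example.domu:lun0 -p 192.0.2.10 --login

    # 6. initiator: write ~128MB to the imported disk (sdX is a placeholder)
    step initiator dd if=/dev/zero of=/dev/sdX bs=1M count=128
}

repro_steps
```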
The bug shows up in the domU that is running the iSCSI target software, and the domU reboots:
------------[ cut here ]------------
kernel BUG at drivers/xen/blkfront/blkfront.c:567!
invalid opcode: 0000 [#1]
last sysfs file: /block/ram0/dev
Modules linked in: iscsi_scst(FU) scst_disk(U) scst_vdisk(U) scst(U) iscsi_tcp(U) libiscsi(U) scsi_transport_iscsi(U) scsi_mod lock_dlm gfs2(U) dlm configfs ipv6 xfrm_nalgo crypto_api dm_multipath raid1 parport_pc lp parport pcspkr xenblk xennet dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
EIP: 0061:[<da0b8704>] Tainted: GF VLI
EFLAGS: 00010046 (2.6.18-92.1.6.el5xen #1)
EIP is at do_blkif_request+0x182/0x37b [xenblk]
eax: 0000000c ebx: c0dd17e0 ecx: 00000008 edx: 0000000b
esi: 00000000 edi: 0000bc26 ebp: d8f97628 esp: c0c52dec
ds: 007b es: 007b ss: 0069
Process md0_raid1 (pid: 2344, ti=c0c52000 task=d4fbc000 task.ti=c0c52000)
Stack: d8f6e468 c08ec000 d8f6abe4 00000003 c08ec000 00000001 00000177 00000000
d261fec4 c0dd17e0 00000008 00000000 0000000b ffffffff d8f6e468 d8f6e468
00000000 00000060 c04d5418 d8f6abe4 c04d7530 00000000 00001000 c0660000
[<da0bfa0b>] raid1d+0xec/0xc44 [raid1]
Code: 0f b7 5b 1a 6b c3 0c 89 5c 24 2c 89 44 24 20 8b 52 30 c7 44 24 30 00 00 00 00 01 d0
Expected results:
No bug :)
domU disk configuration is:
disk = [ 'phy:/dev/system/root.container1,sda1,w',
# Users, who can access this target. The same rules as for discovery
# users apply here.
# Leave them alone if you don't want to use authentication.
#IncomingUser joe secret
#OutgoingUser jim 12charpasswd
# Alias name for this target
# Alias Test
# various iSCSI parameters
# (not all are used right now, see also iSCSI spec for details)
# various target parameters
--- Additional comment from firstname.lastname@example.org on 2009-01-06 18:31:39 EDT ---
In IT234267, the customer is experiencing occasional crashes while
installing a DomU. All of the crashes go through the following code
[<ed1ef2b4>] unplug_slaves+0x4f/0x83 [raid1]
[<ed1ef300>] raid1_unplug+0xe/0x1a [raid1]
[<ed247840>] dm_table_unplug_all+0x22/0x2e [dm_mod]
[<ed245c79>] dm_unplug_all+0x17/0x21 [dm_mod]
I feel the underlying cause in IT234267 is the same as experienced in this
Issue escalated to RHEL 5 Kernel by: bbraswel.
Internal Status set to 'Waiting on Engineering'
This event sent from IssueTracker by bbraswel
--- Additional comment from email@example.com on 2009-02-04 16:49:59 EDT ---
FYI; this problem *may* be solved by the upstream patch posted here:
--- Additional comment from firstname.lastname@example.org on 2009-02-05 08:35:19 EDT ---
I've done a quick port of that upstream change to the RHEL-5 kernel, and did a quick test here. Could someone who can reproduce the error (I wasn't able to) download the kernel at:
And see if it fixes the issue for them?
--- Additional comment from email@example.com on 2009-02-05 08:56:42 EDT ---
I will test the kernel later today or tomorrow morning.
--- Additional comment from firstname.lastname@example.org on 2009-02-06 09:20:25 EDT ---
Oops, I was not able to reproduce the error either. It looks like something in my test configuration has changed in the last 6 months. I will try several other tests, but I'm not sure that this will lead to anything particularly useful. :(
--- Additional comment from email@example.com on 2009-02-06 09:29:57 EDT ---
Ah, OK. Thanks for trying; I appreciate the effort. If you *do* get some result, please be sure to report it here.
In the meantime, there were a couple of other people who had reported problems in this area, so I'm hoping one of them can reproduce the error and try this test patch out.
--- Additional comment from firstname.lastname@example.org on 2009-02-12 08:59:33 EDT ---
For anyone else (hint, hint) who was having problems with this bug, I've folded this patch into the main virttest kernels, since the patch referenced in Comment #2 is headed upstream. You can get that kernel at:
Please give it a test to ensure we get it into the next RHEL release!
--- Additional comment from email@example.com on 2009-02-20 17:43:45 EDT ---
I've tripped over this same bug in 5.2: the kernel panics at the same line in blkfront.c. This happens mostly when doing a kickstart. I'll be giving 5.3 a test pretty soon.
Unfortunately, because I'm seeing this in the kickstart, I need the right kickstart initrds to get it going. I've tried rolling the new modules into the existing initrd.img I have for 5.2 but there's a problem.
--- Additional comment from firstname.lastname@example.org on 2009-02-21 05:13:18 EDT ---
OK. Well, my guess is that 5.3 won't change the issue for you; we didn't do anything in 5.3 to address this. There is a patch in the virttest kernels that may address this problem, although I haven't been able to confirm it since I can't reproduce the issue at all. Do you happen to have a reproduction scenario so I can try to reproduce?
--- Additional comment from email@example.com on 2009-02-21 05:15:10 EDT ---
Oh, I should also mention that the kernels have now moved to:
--- Additional comment from firstname.lastname@example.org on 2009-03-01 13:05:18 EDT ---
Created an attachment (id=333658)
Backport of upstream Linux 9e973e64ac6dc504e6447d52193d4fff1a670156
The current patch that we are carrying in the virttest kernels. It still needs verification that it fixes the problem.
--- Additional comment from email@example.com on 2009-03-11 13:22:00 EDT ---
My problem is I haven't had the time to make a working initrd for kickstart from these test kernels. I have them running in my Xen DomU guests (already installed and running md raid1), and they're just fine.
I'm a little confused. Isn't this a RHEL 5 problem? Why is it assigned to 4.8? I haven't noticed this behavior on my 4.7 DomUs...
This is a clone of the RHEL 5 issue, as a fix is also needed for the RHEL 4 guest kernel.
(In reply to comment #2)
> I'm a little confused. Isn't this a RHEL 5 problem? Why is it assigned to 4.8?
> I haven't noticed this behavior on my 4.7 DomUs...
Right, as Bill mentions, this bug is for RHEL-4, but we cloned it out of a RHEL-5 bug. Since this is a guest-side issue, it's theoretically possible for a RHEL-4 guest to run into it.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
Committed in 89.43.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.