Bug 1177094 - endian issue due to ppc64le guest w/ data-plane
Summary: endian issue due to ppc64le guest w/ data-plane
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.1
Hardware: ppc64le
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: David Gibson
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: RHEV3.6PPC
 
Reported: 2014-12-24 07:18 UTC by Xu Han
Modified: 2016-02-21 10:59 UTC

Fixed In Version: qemu 2.3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-04 16:24:10 UTC
Target Upstream Version:


Links
System: Red Hat Product Errata
ID: RHBA-2015:2546
Priority: normal
Status: SHIPPED_LIVE
Summary: qemu-kvm-rhev bug fix and enhancement update
Last Updated: 2015-12-04 21:11:56 UTC

Description Xu Han 2014-12-24 07:18:51 UTC
Description of problem:
A ppc64le guest using virtio-blk data-plane hits an endianness issue.

For example, the guest fails to reboot:
[  144.043553] dracut Warning: Cannot umount /oldroot
...
[  144.072464] dracut Warning: lr-x------. 1 root 0 64 Dec 24 08:10 6 -> /oldroot/usr/lib/modules/3.10.0-201.ael7a.ppc64le/kernel/drivers/block/virtio_blk.ko
...
[  144.079606] device-mapper: ioctl: remove_all left 1 open device(s)
Rebooting.
[  240.602272] INFO: task systemd-udevd:667 blocked for more than 120 seconds.
[  240.602937] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.603488] systemd-udevd   D 00003fffa083fe44     0   667      1 0x00040002
[  240.604038] Call Trace:
[  240.604660] [c0000003f64b69f0] [0000000000000400] 0x400 (unreliable)
[  240.605239] [c0000003f64b6bc0] [c0000000000164d8] __switch_to+0x268/0x450
[  240.605801] [c0000003f64b6c20] [c0000000009000e0] __schedule+0x2f0/0xc00
[  240.606357] [c0000003f64b6e70] [c000000000900f78] io_schedule+0xc8/0x1b0
[  240.607048] [c0000003f64b6ea0] [c00000000021aa88] sleep_on_page+0x18/0x30
[  240.607601] [c0000003f64b6ec0] [c0000000008fd4a8] __wait_on_bit_lock+0x118/0x2e0
[  240.608164] [c0000003f64b6f30] [c000000000220124] do_read_cache_page+0x534/0x590
[  240.608732] [c0000003f64b7000] [c0000000002201c4] read_cache_page+0x24/0x170
[  240.609304] [c0000003f64b7070] [c000000000458c70] read_dev_sector+0x40/0xe0
[  240.609872] [c0000003f64b70a0] [c00000000045d29c] read_lba+0xdc/0x200
[  240.610439] [c0000003f64b7100] [c00000000045da08] find_valid_gpt+0x108/0x6e0
[  240.611011] [c0000003f64b7210] [c00000000045e3dc] efi_partition+0x3fc/0x470
[  240.611585] [c0000003f64b7390] [c00000000045a198] check_partition+0x148/0x2f0
[  240.612176] [c0000003f64b7410] [c000000000459700] rescan_partitions+0x250/0x960
[  240.612766] [c0000003f64b7560] [c00000000033d68c] __blkdev_get+0x48c/0x670
[  240.613349] [c0000003f64b75d0] [c00000000033daa0] blkdev_get+0x230/0x4a0
[  240.613937] [c0000003f64b7680] [c0000000004565ec] add_disk+0x62c/0x6d0
[  240.614535] [c0000003f64b7740] [d000000005121768] virtblk_probe+0x488/0x8b0 [virtio_blk]
[  240.615139] [c0000003f64b77c0] [d0000000050003a8] virtio_dev_probe+0x1c8/0x2e0 [virtio]
[  240.615751] [c0000003f64b7800] [c0000000005b5368] driver_probe_device+0x258/0x500
[  240.616359] [c0000003f64b7890] [c0000000005b57ac] __driver_attach+0x10c/0x110
[  240.616949] [c0000003f64b78d0] [c0000000005b16cc] bus_for_each_dev+0x8c/0xf0
[  240.617540] [c0000003f64b7920] [c0000000005b45bc] driver_attach+0x2c/0x40
[  240.618128] [c0000003f64b7940] [c0000000005b4088] bus_add_driver+0x298/0x3b0
[  240.618711] [c0000003f64b79d0] [c0000000005b6200] driver_register+0xb0/0x1a0
[  240.619290] [c0000003f64b7a40] [d0000000050000cc] register_virtio_driver+0x4c/0x60 [virtio]
[  240.619878] [c0000003f64b7a60] [d000000005121ef8] init+0x90/0xe8 [virtio_blk]
[  240.620456] [c0000003f64b7ae0] [c00000000000c17c] do_one_initcall+0x12c/0x2c0
[  240.621021] [c0000003f64b7b70] [c000000000166528] load_module+0x19a8/0x2120
[  240.621574] [c0000003f64b7d50] [c000000000166ef0] SyS_finit_module+0xd0/0x130
[  240.622127] [c0000003f64b7e30] [c00000000000a0fc] syscall_exit+0x0/0x7c


Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.2-16.el7.ppc64

Guest: RHEL 7.1 ppc64le
kernel-3.10.0-201.ael7a.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot ppc64le guest with data-plane.
# /usr/libexec/qemu-kvm ... \
    -object iothread,id=iothread0 \
    -drive file=/home/mnt/data-virtio-blk.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,iothread=iothread0


Actual results:


Expected results:


Additional info:

Comment 2 Stefan Hajnoczi 2015-01-08 16:52:52 UTC
virtio-blk dataplane does not use hw/virtio/virtio.c for virtqueue accesses.  Instead it uses hw/virtio/dataplane/vring.c, which does not go through the virtio_lduw_phys() family of memory access functions.

I haven't checked the details but I suspect vring.c needs something like virtio.c's endianness support.
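
To illustrate the kind of access being discussed, here is a minimal C sketch contrasting an endian-aware 16-bit load with a host-native one. It is not QEMU code: guest_lduw() and host_native_lduw() are hypothetical names invented for this example, and the real virtio_lduw_phys() family works on guest physical addresses rather than on a plain byte pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a QEMU function): read a 16-bit ring field
 * from guest memory, interpreting the two bytes in the guest's byte
 * order regardless of the host's byte order.
 */
static uint16_t guest_lduw(const uint8_t *p, bool guest_big_endian)
{
    assert(p != NULL);
    if (guest_big_endian) {
        return (uint16_t)((p[0] << 8) | p[1]);
    }
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * What code that assumes guest endian == host endian effectively does:
 * copy the raw bytes into a host integer. The result is only correct
 * when the guest and host byte orders happen to match.
 */
static uint16_t host_native_lduw(const uint8_t *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}
```

With a little-endian guest that stores the value 1 as the bytes 01 00, guest_lduw(p, false) returns 1 on any host, while host_native_lduw(p) returns 0x0100 on a big-endian host.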

Comment 3 David Gibson 2015-01-13 03:49:31 UTC
Ugh, yes, it appears that the dataplane vring stuff assumes guest endian == host endian, which is broken.

It's not immediately obvious to me which structures live in guest-visible memory, so it would take me a while to figure this one out (especially since I've never tried to use dataplane before).

Is there anyone more familiar with the dataplane code who has time to attempt this?
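
For orientation, the guest-visible structures of a split virtqueue are the descriptor table, the available ring, and the used ring. The C sketch below decodes one descriptor from raw guest bytes in an explicit byte order. The struct layout follows the virtio specification; load_guest() and decode_desc() are hypothetical helpers written for this example, not functions from the dataplane code or from the upstream patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Split-ring descriptor layout as defined by the virtio specification. */
struct vring_desc {
    uint64_t addr;   /* guest physical address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* descriptor flags */
    uint16_t next;   /* index of the next descriptor in the chain */
};

/* Hypothetical helper: assemble an n-byte unsigned value from guest bytes. */
static uint64_t load_guest(const uint8_t *p, unsigned n, bool big_endian)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < n; i++) {
        /* Walk from the most significant byte to the least significant. */
        unsigned idx = big_endian ? i : n - 1 - i;
        v = (v << 8) | p[idx];
    }
    return v;
}

/* Hypothetical helper: decode a 16-byte descriptor in guest byte order. */
static struct vring_desc decode_desc(const uint8_t *raw, bool guest_big_endian)
{
    struct vring_desc d;
    assert(raw != NULL);
    d.addr  = load_guest(raw, 8, guest_big_endian);
    d.len   = (uint32_t)load_guest(raw + 8, 4, guest_big_endian);
    d.flags = (uint16_t)load_guest(raw + 12, 2, guest_big_endian);
    d.next  = (uint16_t)load_guest(raw + 14, 2, guest_big_endian);
    return d;
}
```

Decoding every field through one helper keeps the byte-order decision in a single place, so no field can be read in host order by accident.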

Comment 4 Stefan Hajnoczi 2015-01-14 14:10:16 UTC
(In reply to David Gibson from comment #3)
> Is there anyone more familiar with the dataplane code who has time to
> attempt this?

Please email me login details for a ppc host with a ppc64le guest that I can use for testing.

Comment 11 David Gibson 2015-01-22 03:39:17 UTC
I've now tested Cornelia Huck's patches for this (http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg02494.html).

They appear to solve the problem. I'll backport them once they are merged upstream.

Comment 15 Qunfang Zhang 2015-07-24 06:18:02 UTC
Verified the bug on the following package versions; the issue no longer occurs. The original issue was an LE guest on a BE host. Since we will only test LE hosts on 7.2, I tested both BE and LE guests by booting each one and rebooting it repeatedly.

Host:
kernel-3.10.0-292.el7.ppc64le
qemu-kvm-rhev-2.3.0-12.el7.ppc64le

Guest:
kernel-3.10.0-229.ael7b.ppc64le
kernel-3.10.0.195.el7.ppc64
kernel-3.10.0.290.el7.ppc64

Steps:

1. Boot a RHEL 7 LE guest.

# /usr/libexec/qemu-kvm -name test \
    -machine pseries,accel=kvm,usb=off \
    -m 4G -smp 4,sockets=1,cores=4,threads=1 \
    -uuid f8f86c51-7018-4d0b-1212-ed1d513e2f57 \
    -realtime mlock=off -no-user-config -nodefaults \
    -monitor stdio -rtc base=utc -boot strict=on \
    -object iothread,id=iothread0 \
    -drive file=RHEL-Server-7.2-ppc64le-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,iothread=iothread0 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
    -device spapr-vlan,netdev=hostnet0,id=net0,mac=00:54:5f:5d:5c:5a,reg=0x2000 \
    -vnc :20 -msg timestamp=on \
    -usb -device usb-tablet,id=tablet1 \
    -vga std

2. Reboot the guest (tried 3 times)

Result: 
The guest rebooted successfully each time.

Based on the above, I will set the status to VERIFIED. If there are any issues, please comment here. Thanks.

Comment 17 errata-xmlrpc 2015-12-04 16:24:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html

