Bug 1177094

Summary: endian issue due to ppc64le guest w/ data-plane
Product: Red Hat Enterprise Linux 7 Reporter: Xu Han <xuhan>
Component: qemu-kvm-rhev Assignee: David Gibson <dgibson>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.1 CC: areis, hannsj_uhl, knoel, michen, mrezanin, ngu, qzhang, sherold, stefanha, virt-maint, ypu
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: qemu 2.3 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-12-04 16:24:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1201513    

Description Xu Han 2014-12-24 07:18:51 UTC
Description of problem:
Endianness issue when running a ppc64le guest with data-plane.

For example, the guest fails to reboot:
[  144.043553] dracut Warning: Cannot umount /oldroot
...
[  144.072464] dracut Warning: lr-x------. 1 root 0 64 Dec 24 08:10 6 -> /oldroot/usr/lib/modules/3.10.0-201.ael7a.ppc64le/kernel/drivers/block/virtio_blk.ko
...
[  144.079606] device-mapper: ioctl: remove_all left 1 open device(s)
Rebooting.
[  240.602272] INFO: task systemd-udevd:667 blocked for more than 120 seconds.
[  240.602937] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.603488] systemd-udevd   D 00003fffa083fe44     0   667      1 0x00040002
[  240.604038] Call Trace:
[  240.604660] [c0000003f64b69f0] [0000000000000400] 0x400 (unreliable)
[  240.605239] [c0000003f64b6bc0] [c0000000000164d8] __switch_to+0x268/0x450
[  240.605801] [c0000003f64b6c20] [c0000000009000e0] __schedule+0x2f0/0xc00
[  240.606357] [c0000003f64b6e70] [c000000000900f78] io_schedule+0xc8/0x1b0
[  240.607048] [c0000003f64b6ea0] [c00000000021aa88] sleep_on_page+0x18/0x30
[  240.607601] [c0000003f64b6ec0] [c0000000008fd4a8] __wait_on_bit_lock+0x118/0x2e0
[  240.608164] [c0000003f64b6f30] [c000000000220124] do_read_cache_page+0x534/0x590
[  240.608732] [c0000003f64b7000] [c0000000002201c4] read_cache_page+0x24/0x170
[  240.609304] [c0000003f64b7070] [c000000000458c70] read_dev_sector+0x40/0xe0
[  240.609872] [c0000003f64b70a0] [c00000000045d29c] read_lba+0xdc/0x200
[  240.610439] [c0000003f64b7100] [c00000000045da08] find_valid_gpt+0x108/0x6e0
[  240.611011] [c0000003f64b7210] [c00000000045e3dc] efi_partition+0x3fc/0x470
[  240.611585] [c0000003f64b7390] [c00000000045a198] check_partition+0x148/0x2f0
[  240.612176] [c0000003f64b7410] [c000000000459700] rescan_partitions+0x250/0x960
[  240.612766] [c0000003f64b7560] [c00000000033d68c] __blkdev_get+0x48c/0x670
[  240.613349] [c0000003f64b75d0] [c00000000033daa0] blkdev_get+0x230/0x4a0
[  240.613937] [c0000003f64b7680] [c0000000004565ec] add_disk+0x62c/0x6d0
[  240.614535] [c0000003f64b7740] [d000000005121768] virtblk_probe+0x488/0x8b0 [virtio_blk]
[  240.615139] [c0000003f64b77c0] [d0000000050003a8] virtio_dev_probe+0x1c8/0x2e0 [virtio]
[  240.615751] [c0000003f64b7800] [c0000000005b5368] driver_probe_device+0x258/0x500
[  240.616359] [c0000003f64b7890] [c0000000005b57ac] __driver_attach+0x10c/0x110
[  240.616949] [c0000003f64b78d0] [c0000000005b16cc] bus_for_each_dev+0x8c/0xf0
[  240.617540] [c0000003f64b7920] [c0000000005b45bc] driver_attach+0x2c/0x40
[  240.618128] [c0000003f64b7940] [c0000000005b4088] bus_add_driver+0x298/0x3b0
[  240.618711] [c0000003f64b79d0] [c0000000005b6200] driver_register+0xb0/0x1a0
[  240.619290] [c0000003f64b7a40] [d0000000050000cc] register_virtio_driver+0x4c/0x60 [virtio]
[  240.619878] [c0000003f64b7a60] [d000000005121ef8] init+0x90/0xe8 [virtio_blk]
[  240.620456] [c0000003f64b7ae0] [c00000000000c17c] do_one_initcall+0x12c/0x2c0
[  240.621021] [c0000003f64b7b70] [c000000000166528] load_module+0x19a8/0x2120
[  240.621574] [c0000003f64b7d50] [c000000000166ef0] SyS_finit_module+0xd0/0x130
[  240.622127] [c0000003f64b7e30] [c00000000000a0fc] syscall_exit+0x0/0x7c


Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.2-16.el7.ppc64

Guest: RHEL 7.1 ppc64le
kernel-3.10.0-201.ael7a.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Boot ppc64le guest with data-plane.
# /usr/libexec/qemu-kvm ... \
    -object iothread,id=iothread0 \
    -drive file=/home/mnt/data-virtio-blk.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,iothread=iothread0


Actual results:


Expected results:


Additional info:

Comment 2 Stefan Hajnoczi 2015-01-08 16:52:52 UTC
virtio-blk dataplane does not use hw/virtio/virtio.c for virtqueue accesses.  Instead it uses hw/virtio/dataplane/vring.c, which does not go through the virtio_lduw_phys() family of memory access functions.

I haven't checked the details but I suspect vring.c needs something like virtio.c's endianness support.
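
As a rough illustration of what endianness support in an accessor means (a sketch in Python, not QEMU's C code): the helper below loads and stores a 16-bit field using the guest's byte order rather than the host's. The names `lduw`, `stw` and `guest_is_big_endian` are hypothetical and are not QEMU's API.

```python
import struct


def lduw(buf, offset, guest_is_big_endian):
    """Load an unsigned 16-bit value from guest memory in the guest's byte order."""
    fmt = ">H" if guest_is_big_endian else "<H"
    return struct.unpack_from(fmt, buf, offset)[0]


def stw(buf, offset, value, guest_is_big_endian):
    """Store an unsigned 16-bit value into guest memory in the guest's byte order."""
    fmt = ">H" if guest_is_big_endian else "<H"
    struct.pack_into(fmt, buf, offset, value)


# Hypothetical 4-byte region standing in for part of a ring shared with the guest.
ring = bytearray(4)
stw(ring, 2, 0x0102, guest_is_big_endian=False)

# Reading back with the same guest byte order recovers the value on any host.
assert lduw(ring, 2, guest_is_big_endian=False) == 0x0102
```

The point of routing every access through such helpers is that the byte order is decided per device or per guest, never taken implicitly from the host CPU.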

Comment 3 David Gibson 2015-01-13 03:49:31 UTC
Ugh, yes, it appears that the dataplane vring stuff assumes guest endian == host endian, which is broken.
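
To make the failure mode concrete, here is a small Python sketch (not taken from the QEMU source) of what happens when a 16-bit field written by a little-endian guest is read as if it were in big-endian host order. The variable names and the choice of an index value of 1 are illustrative assumptions.

```python
import struct

# A little-endian guest stores the 16-bit value 1 (for example, a ring index).
guest_bytes = struct.pack("<H", 1)

# Code that assumes guest endian == host endian on a big-endian host
# effectively reads the same two bytes in big-endian order.
misread = struct.unpack(">H", guest_bytes)[0]

# Reading in the guest's own byte order recovers the intended value.
correct = struct.unpack("<H", guest_bytes)[0]

print(misread, correct)  # 256 1
```

A value that is off by this much in a ring index or descriptor field is one plausible way such a mismatch could leave guest I/O requests without a completion.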

It's not quickly obvious to me which are the structures lying in guest visible space, so it would take me a while to figure this one out (especially since I've never tried to use dataplane before).

Is there anyone more familiar with the dataplane code who has time to attempt this?

Comment 4 Stefan Hajnoczi 2015-01-14 14:10:16 UTC
(In reply to David Gibson from comment #3)
> Is there anyone more familiar with the dataplane code who has time to
> attempt this?

Please email me login details for a ppc host with a ppc64le guest that I can use for testing.

Comment 11 David Gibson 2015-01-22 03:39:17 UTC
I've now tested Cornelia Huck's patches for this (http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg02494.html).

They appear to solve the problem; I'll backport them once they're merged upstream.

Comment 15 Qunfang Zhang 2015-07-24 06:18:02 UTC
Verified the bug on the following package versions; the issue no longer occurs. The original issue was an LE guest on a BE host. Since we will only test LE hosts on 7.2, I tested both BE and LE guests: boot the guest, then repeat the reboot operation.

Host:
kernel-3.10.0-292.el7.ppc64le
qemu-kvm-rhev-2.3.0-12.el7.ppc64le

Guest:
kernel-3.10.0-229.ael7b.ppc64le
kernel-3.10.0.195.el7.ppc64
kernel-3.10.0.290.el7.ppc64

Steps:

1. Boot up a RHEL 7 LE guest

# /usr/libexec/qemu-kvm -name test \
    -machine pseries,accel=kvm,usb=off \
    -m 4G -smp 4,sockets=1,cores=4,threads=1 \
    -uuid f8f86c51-7018-4d0b-1212-ed1d513e2f57 \
    -realtime mlock=off -no-user-config -nodefaults \
    -monitor stdio -rtc base=utc -boot strict=on \
    -object iothread,id=iothread0 \
    -drive file=RHEL-Server-7.2-ppc64le-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,iothread=iothread0 \
    -netdev tap,id=hostnet0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
    -device spapr-vlan,netdev=hostnet0,id=net0,mac=00:54:5f:5d:5c:5a,reg=0x2000 \
    -vnc :20 -msg timestamp=on \
    -usb -device usb-tablet,id=tablet1 \
    -vga std

2. Reboot the guest (tried 3 times)

Result:
The guest rebooted successfully each time.

Based on the above, I will set the status to VERIFIED. If there are any issues, please comment here. Thanks.

Comment 17 errata-xmlrpc 2015-12-04 16:24:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html