Bug 1972515

Summary: Windows Installation blocked on 4k disk when using blk+raw+iothread
Product: Red Hat Enterprise Linux 8 Reporter: qing.wang <qinwang>
Component: qemu-kvm Assignee: Kevin Wolf <kwolf>
qemu-kvm sub component: virtio-blk,scsi QA Contact: qing.wang <qinwang>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: medium CC: coli, ddepaula, jinzhao, juzhang, kkiwi, kwolf, lijin, menli, qzhang, virt-maint, xuwei
Version: 8.5 Keywords: Regression, Triaged
Target Milestone: rc   
Target Release: 8.6   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: qemu-kvm-6.1.0-2.module+el8.6.0+12861+13975d62 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1972079 Environment:
Last Closed: 2022-05-10 13:18:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1972079, 1997410, 2002631    
Bug Blocks:    

Description qing.wang 2021-06-16 06:11:29 UTC
+++ This bug was initially created as a clone of Bug #1972079 +++

Description of problem:

When installing a Windows guest (for example win10, win2016, or win2019),
the guest reboots automatically after most of the installation steps have finished.

The Windows guest then goes to a black screen.
The installation cannot finish with this specific configuration:
raw image + virtio-blk + iothread on a 4K disk.



Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-310.el8.x86_64
qemu-kvm-common-6.0.0-18.module+el8.5.0+11243+5269aaa1.x86_64


How reproducible:
100% on a specific host (4K disk)

Disk /dev/mapper/rhel_dell--per440--10-home: 455.5 GiB, 489093595136 bytes, 119407616 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
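
The sector sizes reported above can also be checked from a script. The following is a minimal sketch, assuming `fdisk -l` style output like the lines shown; the function names `parse_sector_sizes` and `is_4k_native` are hypothetical, and the code only parses text rather than querying the device.

```python
import re


def parse_sector_sizes(fdisk_output):
    """Return (logical, physical) sector sizes in bytes, or None if not found."""
    match = re.search(
        r"Sector size \(logical/physical\):\s*(\d+) bytes / (\d+) bytes",
        fdisk_output,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_4k_native(fdisk_output):
    """True when both the logical and physical sector sizes are 4096 bytes."""
    return parse_sector_sizes(fdisk_output) == (4096, 4096)


if __name__ == "__main__":
    sample = "Sector size (logical/physical): 4096 bytes / 4096 bytes"
    print(parse_sector_sizes(sample))
    print(is_4k_native(sample))
```

A host whose disk reports 512-byte logical sectors would not count as a 4K disk under this check.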


Other hosts cannot reproduce this issue.

Steps to Reproduce:
1. Create a raw image file
qemu-img create -f raw /home/kvm_autotest_root/images/win10.raw 30g

2. Boot the VM with blk+raw+iothread
/usr/libexec/qemu-kvm \
    \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35,memory-backend=mem-machine_mem \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 12288 \
    -object memory-backend-ram,size=12288M,id=mem-machine_mem  \
    -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2  \
    -cpu 'Cascadelake-Server-noTSX',hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,+kvm_pv_unhalt \
   \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -object iothread,id=iothread0 \
    -object iothread,id=iothread1 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=native,filename=/home/kvm_autotest_root/images/win10.raw,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=raw,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,serial=SYSTEM_DISK0,bus=pcie-root-port-2,addr=0x0,iothread=iothread0 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:a6:6b:c2:93:56,id=idd0M4NV,netdev=idtL9U8k,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idtL9U8k,vhost=on \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=native,filename=/home/kvm_autotest_root/iso/ISO/Win10/en_windows_10_business_editions_version_21h1_x64_dvd_ec5a76c1.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device ide-cd,id=cd1,drive=drive_cd1,bootindex=1,write-cache=on,bus=ide.0,unit=0 \
    -blockdev node-name=file_unattended,driver=file,auto-read-only=on,discard=unmap,aio=native,filename=/home/kvm_autotest_root/iso/windows/virtio-win-prewhql-0.1-201.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_unattended,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_unattended \
    -device ide-cd,id=unattended,drive=drive_unattended,bootindex=3,write-cache=on,bus=ide.2,unit=0  \
    -vnc :5  \
    -rtc base=localtime,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=d,strict=off \
    -enable-kvm -monitor stdio \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5
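
Of the long command line above, only a few options define the disk configuration under test. The sketch below collects them into an argument list, assuming the option values from this report; the helper name `virtio_blk_iothread_args` is hypothetical, and the result is not a complete QEMU command line.

```python
def virtio_blk_iothread_args(image_path, iothread_id="iothread0"):
    """Build the QEMU arguments for a raw virtio-blk disk served by an iothread.

    Option values are copied from the command line in this report; only the
    image path and the iothread id are parameters.
    """
    cache = "cache.direct=on,cache.no-flush=off"
    return [
        "-object",
        f"iothread,id={iothread_id}",
        # Protocol node: the raw file opened with O_DIRECT and native AIO.
        "-blockdev",
        "node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,"
        f"aio=native,filename={image_path},{cache}",
        # Format node: the raw driver on top of the file node.
        "-blockdev",
        f"node-name=drive_image1,driver=raw,read-only=off,{cache},file=file_image1",
        # Guest device: virtio-blk attached to the iothread.
        "-device",
        "virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,"
        f"serial=SYSTEM_DISK0,bus=pcie-root-port-2,addr=0x0,iothread={iothread_id}",
    ]


if __name__ == "__main__":
    for arg in virtio_blk_iothread_args("/home/kvm_autotest_root/images/win10.raw"):
        print(arg)
```

Dropping the `iothread=` property from the `-device` entry gives the plain blk+raw variant.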


3. Start the installation

Actual results:
The Windows guest goes to a black screen and stops responding.

Expected results:
The installation succeeds.

Additional info:
automation:
python ConfigTest.py --testcase=unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads --iothread_scheme=roundrobin --nr_iothreads=2 --platform=x86_64 --guestname=Win10 --driveformat=virtio_blk --nicmodel=virtio_net --imageformat=raw --machines=q35  --customsparams="cd_format=ide\nimage_aio=native"


It may pass with the following combinations:
blk+raw
blk+qcow2+iothread 
scsi+raw+iothread

It may pass if the raw file is placed on a non-4K disk, such as NFS:
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=native,filename=/home/nfs/win10.raw,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=raw,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \



So this issue looks related to blk+raw+iothread+4K disks?
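
The reasoning behind that question can be written down as a small table of observations. This is only a sketch: the dictionary keys and the helper `matches_failing_combination` are hypothetical, and it assumes the three passing combinations listed above were run on the same 4K host, which the report does not state explicitly.

```python
# The single combination reported as failing.
FAILING = {"bus": "blk", "format": "raw", "iothread": True, "disk_4k": True}

# Combinations reported as "may pass"; disk_4k=True is an assumption for the
# first three, while the NFS case is described as a non-4K disk.
MAY_PASS = [
    {"bus": "blk", "format": "raw", "iothread": False, "disk_4k": True},
    {"bus": "blk", "format": "qcow2", "iothread": True, "disk_4k": True},
    {"bus": "scsi", "format": "raw", "iothread": True, "disk_4k": True},
    {"bus": "blk", "format": "raw", "iothread": True, "disk_4k": False},
]


def matches_failing_combination(config):
    """True only when all four factors match the failing configuration."""
    return all(config.get(key) == value for key, value in FAILING.items())


if __name__ == "__main__":
    for config in MAY_PASS:
        changed = [key for key in FAILING if config[key] != FAILING[key]]
        print(config, "differs in:", changed)
```

Each passing case differs from the failing one in exactly one factor, which is why all four factors appear to be needed to trigger the problem.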

--- Additional comment from Klaus Heinrich Kiwi on 2021-06-15 21:27:45 CST ---

Kevin,

 assigning this to you for now. Care to take a look?

I see that the machine type is q35 - that edk2-based image should in theory support 4k disks, doesn't it?

Qing Wang: do you have results for this exact same test on the other versions involved (i.e., RHEL 8)? That would help us clarify whether this is a regression or whether it simply never worked.

 -Klaus

--- Additional comment from qing.wang on 2021-06-16 14:08:45 CST ---

4K disks can be emulated with a targetcli iSCSI server.
Example: targetcli /backstores/fileio/disk set attribute block_size=4096

o- / ..................................................................... [...]
  o- backstores .......................................................... [...]
  | o- block .............................................. [Storage Objects: 0]
  | o- fileio ............................................. [Storage Objects: 1]
  | | o- disk1 ........... [/home/iscsi/onex.img (40.0GiB) write-back activated]
  | |   o- alua ............................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ................... [ALUA state: Active/optimized]
  | o- pscsi .............................................. [Storage Objects: 0]
  | o- ramdisk ............................................ [Storage Objects: 0]
  o- iscsi ........................................................ [Targets: 1]
  | o- iqn.2016-06.one.server:one-a .................................. [TPGs: 1]
  |   o- tpg1 ........................................... [no-gen-acls, no-auth]
  |     o- acls ...................................................... [ACLs: 2]
  |     | o- iqn.1994-05.com.redhat:clienta ................... [Mapped LUNs: 1]
  |     | | o- mapped_lun0 ............................ [lun0 fileio/disk1 (rw)]
  |     | o- iqn.1994-05.com.redhat:clientb ................... [Mapped LUNs: 1]
  |     |   o- mapped_lun0 ............................ [lun0 fileio/disk1 (rw)]
  |     o- luns ...................................................... [LUNs: 1]
  |     | o- lun0 ..... [fileio/disk1 (/home/iscsi/onex.img) (default_tg_pt_gp)]
  |     o- portals ................................................ [Portals: 1]
  |       o- 0.0.0.0:3260 ................................................. [OK]
  o- loopback ..................................................... [Targets: 0]

It looks like a regression. The same issue is hit on:

Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-310.el8.x86_64
qemu-kvm-common-6.0.0-18.module+el8.5.0+11243+5269aaa1.x86_64

But it is not found on:
Red Hat Enterprise Linux release 8.4 (Ootpa)
4.18.0-305.el8.x86_64
qemu-kvm-common-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
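
To compare the builds named above, the upstream QEMU version can be pulled out of the package string. A minimal sketch follows, assuming the `name-X.Y.Z-release` naming seen in this report; the helper `upstream_version` is hypothetical. It only orders the observations and says nothing about which change introduced the problem.

```python
import re


def upstream_version(package):
    """Extract the upstream (major, minor, patch) version from a package string."""
    match = re.search(r"-(\d+)\.(\d+)\.(\d+)-", package)
    if match is None:
        raise ValueError(f"no X.Y.Z version found in {package!r}")
    return tuple(int(part) for part in match.groups())


# Observations copied from this comment: package string and whether the
# issue was hit with it.
OBSERVED = {
    "qemu-kvm-common-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64": False,
    "qemu-kvm-common-6.0.0-18.module+el8.5.0+11243+5269aaa1.x86_64": True,
}

if __name__ == "__main__":
    for package in sorted(OBSERVED, key=upstream_version):
        status = "issue hit" if OBSERVED[package] else "issue not found"
        print(upstream_version(package), status)
```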

Comment 1 John Ferlan 2021-09-09 11:33:06 UTC
Bulk update: Move RHEL-AV bugs to RHEL8 with existing RHEL9 clone.

Comment 2 John Ferlan 2021-09-09 11:38:19 UTC
Update based on what was found in the cloned-from bug 1972079. The noted patch was included in qemu-6.1, which will be used for the initial rebase for 8.6.0, so move directly to POST.

If this requires a RHEL-AV 8.5.0 fix, then please clone this bug in order to resolve.

Comment 4 qing.wang 2021-09-22 10:00:33 UTC
This issue is not found on:


Red Hat Enterprise Linux release 9.0 Beta (Plow)
5.14.0-2.el9.x86_64
qemu-kvm-6.1.0-2.el9.x86_64
seabios-bin-1.14.0-6.el9.noarch
edk2-ovmf-20210527gite1999b264f1f-6.el9.noarch
virtio-win-prewhql-0.1-207.iso

Comment 7 errata-xmlrpc 2022-05-10 13:18:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759