Bug 1994041
| Field | Value |
| --- | --- |
| Summary | qemu-kvm scsi: change default passthrough timeout to non-infinite |
| Product | Red Hat Enterprise Linux 8 |
| Reporter | Frank DeLorey <fdelorey> |
| Component | qemu-kvm |
| qemu-kvm sub component | virtio-blk,scsi |
| Assignee | qing.wang <qinwang> |
| QA Contact | qing.wang <qinwang> |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | urgent |
| CC | areis, chorn, cnagarka, coli, darren.lavender, ddepaula, gveitmic, jinzhao, juzhang, kkiwi, knoel, pbonzini, qinwang, ribarry, rick.beldin, shane.seymour, virt-maint, zhguo |
| Version | 8.4 |
| Keywords | Triaged, ZStream |
| Target Milestone | rc |
| Hardware | x86_64 |
| OS | All |
| Fixed In Version | qemu-kvm-4.2.0-59.module+el8.5.0+12817+cb650d43 |
| Cloned To | 2001587, 2003071 (view as bug list) |
| Last Closed | 2021-11-09 18:02:58 UTC |
| Type | Bug |
| Bug Blocks | 2001587, 2003071, 2004334 |
Description (Frank DeLorey, 2021-08-16 14:51:52 UTC)
Comment #7 (qing.wang):

I can not totally reproduce this issue, but I hit a similar issue. I am not sure they have the same cause; the process lingers for about 2 minutes before the VM is killed.

Environment preparation: iSCSI target server.

1. Build the iscsi server

root@qing /home/vbugs $ targetcli ls
o- / ... [...]
o- backstores ... [...]
| o- block ... [Storage Objects: 0]
| o- fileio ... [Storage Objects: 1]
| | o- one ... [/home/iscsi/onex.img (30.0GiB) write-back activated]
| | o- alua ... [ALUA Groups: 1]
| | o- default_tg_pt_gp ... [ALUA state: Active/optimized]
| o- pscsi ... [Storage Objects: 0]
| o- ramdisk ... [Storage Objects: 0]
o- iscsi ... [Targets: 1]
| o- iqn.2016-06.one.server:one-a ... [TPGs: 1]
| o- tpg1 ... [no-gen-acls, no-auth]
| o- acls ... [ACLs: 2]
| | o- iqn.1994-05.com.redhat:clienta ... [Mapped LUNs: 1]
| | | o- mapped_lun0 ... [lun0 fileio/one (rw)]
| | o- iqn.1994-05.com.redhat:clientb ... [Mapped LUNs: 1]
| | o- mapped_lun0 ... [lun0 fileio/one (rw)]
| o- luns ... [LUNs: 1]
| | o- lun0 ... [fileio/one (/home/iscsi/onex.img) (default_tg_pt_gp)]
| o- portals ... [Portals: 1]
| o- 0.0.0.0:3260 ... [OK]
o- loopback ... [Targets: 0]

2. Connect the iscsi disk on the host

iscsiadm -m discovery -t st -p qing
iscsiadm -m node -T iqn.2016-06.one.server:one-a -p qing:3260 -l

root@dell-per440-07 ~ $ lsblk
...
sdd 8:48 0 30G 0 disk

3. Boot the vm with the lun (/dev/sdd)
/usr/libexec/qemu-kvm \
    -name testvm \
    -machine pc \
    -m 8G \
    -smp 8 \
    -cpu host,+kvm_pv_unhalt \
    -device ich9-usb-ehci1,id=usb1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0xa \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
    -drive file=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    \
    -drive file=/dev/sdd,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none \
    -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
    -vnc :5 \
    -qmp tcp:0:5955,server,nowait \
    -monitor stdio \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:dd:0a:82,bus=pci.0

4. Login to the guest and execute io on the disk

sg_dd if=/dev/zero of=/dev/sda bs=4k count=7000000

5. Stop the iscsi target server

systemctl stop target

6. Find the qemu process and kill it

root@ibm-x3850x5-08 /home/vbugs $ pgrep qemu-kvm
5222
root@ibm-x3850x5-08 /home/vbugs $ kill -9 5222

The process stays in Sl+ status and still exists (it is only truly killed after about 2 minutes):

root@dell-per440-07 ~ $ ps 5222
  PID TTY      STAT   TIME COMMAND
 5222 pts/0    Sl+    1:27 /usr/libexec/qemu-kvm

If I skip step 5, the process is killed quickly. If I restore the target service (systemctl start target), the process is also killed quickly.

(In reply to qing.wang from comment #7)

I have the same result on qemu-kvm-common-6.0.0-28.module+el8.5.0+12271+fffa967b.x86_64.

This bz got cloned to the rhel8.4.z stream with bz2001587 (per default, permissions are not copied to z-stream clones; technical discussion happens mostly in the main bz, i.e. this one in our case).

I also tried the fc backend and simulating a faulty disk, but it could not reproduce this issue. I wonder what operation results in an io error that then waits infinitely.

1. Emulate bad blocks on FC

dmsetup create test << EOF
0 160000 linear /dev/sdb 0
160000 5 error
160005 80000 linear /dev/sdb 40005
EOF

2. Expose it with the target server

root@dell-per440-07 /home/vbugs/feature $ targetcli ls
o- / ... [...]
o- backstores ... [...]
| o- block ... [Storage Objects: 1]
| | o- disk0 ... [/dev/mapper/test (117.2MiB) write-thru activated]
| | o- alua ... [ALUA Groups: 1]
| | o- default_tg_pt_gp ... [ALUA state: Active/optimized]
| o- fileio ... [Storage Objects: 0]
| o- pscsi ... [Storage Objects: 0]
| o- ramdisk ... [Storage Objects: 0]
o- iscsi ... [Targets: 1]
| o- iqn.2016-06.one.server:block ... [TPGs: 1]
| o- tpg1 ... [no-gen-acls, no-auth]
| o- acls ... [ACLs: 1]
| | o- iqn.1994-05.com.redhat:clientb ... [Mapped LUNs: 1]
| | o- mapped_lun0 ... [lun0 block/disk0 (rw)]
| o- luns ... [LUNs: 1]
| | o- lun0 ... [block/disk0 (/dev/mapper/test) (default_tg_pt_gp)]
| o- portals ... [Portals: 1]
| o- 0.0.0.0:3260 ... [OK]
o- loopback ... [Targets: 0]

3. Boot the vm with the attached disk

/usr/libexec/qemu-kvm \
    -name testvm \
    -machine pc \
    -m 8G \
    -smp 8 \
    -cpu host,+kvm_pv_unhalt \
    -device ich9-usb-ehci1,id=usb1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0xa \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
    -drive file=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    \
    -drive file=/dev/sdd,format=raw,if=none,id=drive-scsi0-0-0-0 \
    -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=disk1,bootindex=2 \
    \
    -vnc :5 \
    -qmp tcp:0:5955,server,nowait \
    -monitor stdio \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:dd:0a:82,bus=pci.0

4. Execute io in the guest

sg_dd if=/dev/zero of=/dev/sda bs=1M count=100 oflag=direct

The io operation hits an io error but does not step into a wait status. So the question is how to get a guest io operation blocked in a wait state; then we may verify the kill operation.

FYI for HPE, also iSCSI and nbd (network block devices) as backends were tried. Maybe HPE has further ideas..

(In reply to Christian Horn from comment #24)

Thanks, but I think the backend is not the point. If it is related to specific HW, we need to know what HW error makes the io step into a wait:

-drive file=/dev/sdd,format=raw,if=none,id=drive-scsi0-0-0-0 \
-device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=disk1,bootindex=2 \
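As an aside, when the hang does occur (as in comment #7), one way to see what the stuck qemu-kvm is actually waiting on, without taking a vmcore, is to read the threads' kernel stacks from /proc. A rough sketch, not from this bug (needs root; the PID lookup and field widths are examples):

# find the stuck qemu-kvm and list its threads; D-state threads are the blocked ones
pid=$(pgrep -o qemu-kvm)
ps -L -p "$pid" -o tid,stat,wchan:30,comm
# dump each thread's kernel stack to see where the ioctl is sleeping
for t in /proc/"$pid"/task/*; do
    echo "== thread ${t##*/} =="
    cat "$t/stack" 2>/dev/null
done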
> --- Comment #24 from Christian Horn <chorn> ---
> FYI for HPE, also iSCSI and nbd (network block devices) as backend were tried.
> Maybe HPE has further ideas..
>
I believe there are two key things to reproducing this (somewhat artificially):
a. the initiator needs to be sending periodic SG_IO requests to the target via the kvm execute_command() function below, ideally a TUR as we see in the vmcore. I think the TURs that we see in the vmcore probably originate from the guest's multipathd. I assume the TURs get re-written/modified by kvm to add the infinite timeout, but honestly I never looked into how this originates or how we come through this kvm execute_command() function, so I am somewhat guessing....
b. the target needs to not respond to some of these SG_IO TURs
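For (a), a minimal guest-side sketch for generating those periodic TURs, assuming a RHEL guest with the device-mapper-multipath package installed (the tur path checker is what issues TEST UNIT READY; the checker choice and interval here are illustrative, not taken from this bug):

mpathconf --enable --with_multipathd y   # generate /etc/multipath.conf and start multipathd
# then make sure the defaults section of /etc/multipath.conf uses the TUR checker:
#   defaults {
#       path_checker     tur   # periodic TEST UNIT READY on each path
#       polling_interval 5     # seconds between checks
#   }
systemctl restart multipathd
multipath -ll                            # confirm the passthrough LUN is claimed as a path

Every polling_interval seconds multipathd should then issue a TUR against the scsi-block LUN, which is the kind of request seen stuck in the vmcore.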
Can you try configuring multipathd in the guest? (The sketch above should be enough to generate periodic TURs.) You will need to do some tracing to see what happens to the IO: does it get emitted from kvm with the infinite SG_IO timeout? That is fundamental and key to the behaviour. As mentioned, from the kvm side I don't know how we get to this function, but I suspect this is where the IOs that get stuck originate:
"hw/scsi/scsi-generic.c" 520L, 14606C
static int execute_command(BlockDriverState *bdrv,
                           SCSIGenericReq *r, int direction,
                           BlockDriverCompletionFunc *complete)
{
    r->io_header.interface_id = 'S';
    r->io_header.dxfer_direction = direction;
    r->io_header.dxferp = r->buf;
    r->io_header.dxfer_len = r->buflen;
    r->io_header.cmdp = r->req.cmd.buf;
    r->io_header.cmd_len = r->req.cmd.len;
    r->io_header.mx_sb_len = sizeof(r->req.sense);
    r->io_header.sbp = r->req.sense;
    r->io_header.timeout = MAX_UINT;        <<<<<<------- infinite timeout
    r->io_header.usr_ptr = r;
    r->io_header.flags |= SG_FLAG_DIRECT_IO;

    r->req.aiocb = bdrv_aio_ioctl(bdrv, SG_IO, &r->io_header, complete, r);
    if (r->req.aiocb == NULL) {
        return -EIO;
    }

    return 0;
}
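For scale: the sg driver's userspace header (<scsi/sg.h>) documents the timeout field as being in milliseconds, with MAX_UINT meaning "no timeout", i.e. the kernel never aborts the command on the initiator's behalf. Even one tick less than that would still be 4294967294 ms / 1000 / 86400, roughly 49.7 days, so any command the target swallows is pinned essentially forever.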
You may need to set a gdb bp there to see whether a multipathd TUR goes through this path and has its timeout modified. If you don't see this code exercised, then you are not going to see the problem.
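A hypothetical gdb session along those lines (assumes the matching qemu debuginfo is installed so the execute_command symbol resolves; the PID lookup is an example):

# attach to the running qemu-kvm and stop each time a passthrough
# command is about to be submitted via SG_IO
gdb -p "$(pgrep -o qemu-kvm)" -ex 'break execute_command' -ex 'continue'
# when the breakpoint hits, step past the io_header setup, then inspect
# the CDB opcode (0x00 = TEST UNIT READY) and the outgoing timeout:
#   (gdb) next 10
#   (gdb) print r->req.cmd.buf[0]
#   (gdb) print r->io_header.timeout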
The target part is a bit more involved, since you are going to have to find a way of having the target periodically accept a TUR request but not respond to (complete) it. You don't have to ignore all of them, just every N-th request; they'll soon stack up on the initiator side. Remember that, for whatever reason, storage sometimes behaves like this: f/w bugs, busy-levels, who knows what, it just doesn't respond.... so you need to be able to simulate such behaviour.
At some point the queue on the initiator will get blocked. The IOs never get timed out by the initiator because they have infinite timeouts, so no recovery happens. The kvm process will not shut down/die because some threads go into the UN (uninterruptible sleep) state. That is what we observed in the vmcores...
What about using a dm-delay block device?
https://www.kernel.org/doc/html/latest/admin-guide/device-mapper/delay.html

If the delay is not enough, or cannot be made high enough to test this, we can also suspend/resume the device:

dmsetup suspend <dev>
dmsetup resume <dev>

(In reply to Germano Veit Michel from comment #28)

Hmm, to be clearer: put an iscsi target on top of it, like in comment #7. That was using an LV; here we could delay and even stop the writes to the actual device. If we use no caches it may reproduce...

I emulated a slow-response scenario with a delay of about 10 minutes, and it looks like it reproduces this issue.

Red Hat Enterprise Linux Server release 7.9 (Maipo)
3.10.0-1160.el7.x86_64
qemu-kvm-1.5.3-175.el7_9.3.x86_64

1. Create a 128 MB scsi_debug disk

modprobe scsi_debug dev_size_mb=128

2. Create a mapper device with a 10-minute delay on the disk

dmsetup create test2 << EOF
0 160000 linear /dev/sdb 0
160000 5 delay /dev/sdb 0 0 /dev/sdd 0 600000
160005 80000 linear /dev/sdb 40005
EOF

3. Expose the mapper device with an iscsi target

o- / ... [...]
o- backstores ... [...]
| o- block ... [Storage Objects: 1]
| | o- disk0 ... [/dev/mapper/test2 (117.2MiB) write-thru activated]
| | o- alua ... [ALUA Groups: 1]
| | o- default_tg_pt_gp ... [ALUA state: Active/optimized]
| o- fileio ... [Storage Objects: 0]
| o- pscsi ... [Storage Objects: 0]
| o- ramdisk ... [Storage Objects: 0]
o- iscsi ... [Targets: 1]
| o- iqn.2016-06.one.server:block ... [TPGs: 1]
| o- tpg1 ... [no-gen-acls, no-auth]
| o- acls ... [ACLs: 2]
| | o- iqn.1994-05.com.redhat:clienta ... [Mapped LUNs: 1]
| | | o- mapped_lun0 ... [lun0 block/disk0 (rw)]
| | o- iqn.1994-05.com.redhat:clientb ... [Mapped LUNs: 1]
| | o- mapped_lun0 ... [lun0 block/disk0 (rw)]
| o- luns ... [LUNs: 1]
| | o- lun0 ... [block/disk0 (/dev/mapper/test2) (default_tg_pt_gp)]
| o- portals ... [Portals: 1]
| o- 0.0.0.0:3260 ... [OK]
o- loopback ... [Targets: 0]
4. Attach the disk on the [other] host (sdd -> iscsi)

5. Boot the vm with the passthrough disk (first with the legacy -drive syntax, then the equivalent -blockdev syntax):

/usr/libexec/qemu-kvm \
    -name testvm \
    -machine pc \
    -m 8G \
    -smp 8 \
    -cpu host,+kvm_pv_unhalt \
    -device ich9-usb-ehci1,id=usb1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0xa \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
    -drive file=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    \
    -drive file=/dev/sdd,format=raw,if=none,id=drive-scsi0-0-0-0 \
    -device scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=disk1,bootindex=2 \
    \
    -vnc :5 \
    -qmp tcp:0:5955,server,nowait \
    -monitor stdio \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:dd:0a:82,bus=pci.0

/usr/libexec/qemu-kvm \
    -name testvm \
    -machine pc \
    -m 8G \
    -smp 8 \
    -cpu host,+kvm_pv_unhalt \
    -device ich9-usb-ehci1,id=usb1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0xa \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
    -blockdev driver=qcow2,file.driver=file,file.filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,node-name=os_image1 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=os_image1,id=virtio-disk0,bootindex=1 \
    \
    -blockdev driver=raw,file.driver=host_device,file.filename=/dev/sdd,node-name=data1 \
    -device scsi-block,bus=scsi0.0,drive=data1,id=disk1,bootindex=2 \
    \
    -vnc :5 \
    -qmp tcp:0:5955,server,nowait \
    -monitor stdio \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:dd:0a:82,bus=pci.0

6. Execute io in the guest and step into the slow-write state

dev=sda
if [ "x$1" != "x" ];then
    dev=$1
fi
echo "$dev"
while true;do
    sg_dd if=/dev/zero of=/dev/$dev bs=1M count=100 oflag=direct
    echo "do dd"
done

7. Find the qemu process and kill it

pid=`pgrep qemu-kvm`;echo $pid; kill -9 $pid
time while true;do if ps $pid; then sleep 10;echo "active";else echo "exit";break;fi done

8. The real kill time is related to the delay time set at step 2.

But it cost about 3 minutes on the following version:

Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-339.el8.x86_64
qemu-kvm-6.0.0-30.module+el8.5.0+12586+476da3e1.x86_64

I checked the code of 12586; it has the fix applied in scsi-generic.c. Could someone help confirm it? Thanks.

> But it cost 3 minutes on following version
> Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
> 4.18.0-339.el8.x86_64
> qemu-kvm-6.0.0-30.module+el8.5.0+12586+476da3e1.x86_64
Hi qing.wang, the bug is for RHEL 8, not AV; so you need to test with the 4.2 versions of QEMU.
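As a side note on verification: if the 4.2.0-59 backport follows the upstream change, the passthrough devices should also expose an io_timeout property (in seconds, default 30) that replaces the old infinite SG_IO timeout. That property name is an assumption about the backport, not something confirmed in this bug, but it is cheap to check:

# list scsi-block's properties; if the fix is in, a timeout knob should appear
/usr/libexec/qemu-kvm -device scsi-block,help | grep -i timeout
# if present, the repro command line could pin it explicitly, e.g.:
#   -device scsi-block,bus=scsi0.0,drive=data1,id=disk1,io_timeout=60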
Reproduced this with the steps from comment #30 on:

Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-339.el8.x86_64
qemu-kvm-4.2.0-58.module+el8.5.0+12272+74ace547.x86_64
seabios-bin-1.13.0-2.module+el8.3.0+7353+9de0a3cc.noarch

QE bot (pre-verify): set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Passed the test on:

Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-348.el8.x86_64
qemu-kvm-4.2.0-59.module+el8.5.0+12817+cb650d43.x86_64
seabios-bin-1.13.0-2.module+el8.3.0+7353+9de0a3cc.noarch

The kill time is about 3m30s.

1. Create a 128 MB scsi_debug disk

modprobe scsi_debug dev_size_mb=128

2. Create a mapper device with a 5-minute delay on the disk

disk=/dev/sdb
dmsetup create test << EOF
0 160000 linear ${disk} 0
160000 5 delay ${disk} 160000 0 ${disk} 160000 300000
160005 80000 linear ${disk} 160005
EOF

3. Expose the mapper device with an iscsi target

o- / ... [...]
o- backstores ... [...]
| o- block ... [Storage Objects: 1]
| | o- disk0 ... [/dev/mapper/test (117.2MiB) write-thru activated]
| | o- alua ... [ALUA Groups: 1]
| | o- default_tg_pt_gp ... [ALUA state: Active/optimized]
| o- fileio ... [Storage Objects: 0]
| o- pscsi ... [Storage Objects: 0]
| o- ramdisk ... [Storage Objects: 0]
o- iscsi ... [Targets: 1]
| o- iqn.2016-06.one.server:block ... [TPGs: 1]
| o- tpg1 ... [no-gen-acls, no-auth]
| o- acls ... [ACLs: 1]
| | o- iqn.1994-05.com.redhat:clientb ... [Mapped LUNs: 1]
| | o- mapped_lun0 ... [lun0 block/disk0 (rw)]
| o- luns ... [LUNs: 1]
| | o- lun0 ... [block/disk0 (/dev/mapper/test) (default_tg_pt_gp)]
| o- portals ... [Portals: 1]
| o- 0.0.0.0:3260 ... [OK]
o- loopback ... [Targets: 0]

4. Attach the disk on the [other] host (sdc -> iscsi)

5. Boot the vm with the passthrough disk

/usr/libexec/qemu-kvm \
    -name testvm \
    -machine pc \
    -m 8G \
    -smp 8 \
    -cpu host,+kvm_pv_unhalt \
    -device ich9-usb-ehci1,id=usb1 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0xa \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 \
    -blockdev driver=qcow2,file.driver=file,file.filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,node-name=os_image1 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=os_image1,id=virtio-disk0,bootindex=1 \
    \
    -blockdev driver=raw,file.driver=host_device,file.filename=/dev/sdc,node-name=data1 \
    -device scsi-block,bus=scsi0.0,drive=data1,id=disk1,bootindex=2 \
    \
    -vnc :5 \
    -qmp tcp:0:5955,server,nowait \
    -monitor stdio \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:dd:0a:82,bus=pci.0
6. Execute io in the guest and step into the slow-write state

dev=sda
if [ "x$1" != "x" ];then
    dev=$1
fi
echo "$dev"
while true;do
    sg_dd if=/dev/zero of=/dev/$dev bs=1M count=100 oflag=direct
    echo "do dd"
done

7. Find the qemu process and kill it

pid=`pgrep qemu-kvm`;echo $pid; kill -9 $pid
time while true;do if ps $pid; then sleep 10;echo "active";else echo "exit";break;fi done

The kill time is about 3m30s.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4191