Bug 1989717 - [regression] blk_update_request: I/O error when generating load on iSCSI
Summary: [regression] blk_update_request: I/O error when generating load on iSCSI
Keywords:
Status: CLOSED DUPLICATE of bug 1994494
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 8.5
Assignee: Paolo Bonzini
QA Contact: qing.wang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-08-03 19:11 UTC by Igor Raits
Modified: 2021-10-11 01:33 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-05 08:32:17 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
libvirt.xml (12.76 KB, text/plain), attached 2021-08-03 19:11 UTC by Igor Raits
journal.log (105.12 KB, text/plain), attached 2021-08-04 07:29 UTC by Igor Raits


Links
Red Hat Issue Tracker RHELPLAN-92210 (last updated 2021-08-03 19:17:06 UTC)

Description Igor Raits 2021-08-03 19:11:48 UTC
Created attachment 1810588 [details]
libvirt.xml

Description of problem:
After an upgrade of qemu-kvm, heavy I/O on an iSCSI-backed disk inside a VM leads to I/O errors and the filesystem being remounted read-only.

Our setup is nothing extremely unusual: we run OpenStack + libvirt + qemu-kvm and attach a volume from a NetApp array via iSCSI.

[EU2][CHECK][root@cmp0019 ~]# multipath -ll
3600a098038314357363f506449317759 dm-3 NETAPP,LUN C-Mode
size=1000G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 2:0:0:0 sdc 8:32 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 1:0:0:0 sdb 8:16 active ready running

Version-Release number of selected component (if applicable):
qemu-kvm-6.0.0-26.el8s.x86_64
→ With -19.el8s I can't reproduce this issue.

How reproducible:
Always.

Steps to Reproduce:
1. Run virtual machine
2. Attach a volume (in my case it contains 700G of very small files) via iSCSI
3. Mount it inside a VM and run some loop like
while true; do chown -R root:root /mnt/carbon/whisper/; chown -R nobody:nobody /mnt/carbon/whisper/; done

Actual results:
Within a couple of minutes, the volume is remounted read-only.
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146082984
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146084008
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146085032
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146086056
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146087080
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146088104
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146089128
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146090152
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146091176
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vdc, sector 3146092200
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: Aborting journal on device vdc-8.
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: EXT4-fs error (device vdc): ext4_journal_check_start:56: Detected aborted journal
Aug 03 16:46:52 igorovo.eu2.intgdc.com kernel: EXT4-fs (vdc): Remounting filesystem read-only

Expected results:
No errors.

Additional info:
Relevant qemu command (spawned by libvirt):
/usr/libexec/qemu-kvm -name guest=instance-0000187f,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-3-instance-0000187f/master-key.aes"} -machine pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off -cpu host,migratable=on -m 122880 -overcommit mem-lock=off -smp 12,sockets=1,dies=1,cores=6,threads=2 -object {"qom-type":"iothread","id":"iothread1"} -object {"qom-type":"iothread","id":"iothread2"} -object {"qom-type":"memory-backend-file","id":"ram-node0","mem-path":"/dev/hugepages/libvirt/qemu/3-instance-0000187f","share":true,"prealloc":true,"size":128849018880,"host-nodes":[0],"policy":"bind"} -numa node,nodeid=0,cpus=0-11,memdev=ram-node0 -uuid b399a7d6-170d-4cf7-970d-c9375ea3dfc1 -smbios type=1,manufacturer=RDO,product=OpenStack Compute,version=0.0.0-1.el8+gdc,serial=b399a7d6-170d-4cf7-970d-c9375ea3dfc1,uuid=b399a7d6-170d-4cf7-970d-c9375ea3dfc1,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=38,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 -device pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x2.0x7 -device pcie-root-port,port=0x18,chassis=9,id=pci.9,bus=pcie.0,multifunction=on,addr=0x3 -device pcie-root-port,port=0x19,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x1 -device pcie-root-port,port=0x1a,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x2 -device pcie-root-port,port=0x1b,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x3 -device pcie-root-port,port=0x1c,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x4 -device pcie-root-port,port=0x1d,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x5 -device pcie-root-port,port=0x1e,chassis=15,id=pci.15,bus=pcie.0,addr=0x3.0x6 -device pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7 -device pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,addr=0x4 -device pcie-pci-bridge,id=pci.18,bus=pci.1,addr=0x0 -device piix3-usb-uhci,id=usb,bus=pci.18,addr=0x1 -blockdev {"driver":"host_device","filename":"/dev/volumes/b399a7d6-170d-4cf7-970d-c9375ea3dfc1_disk","aio":"native","node-name":"libvirt-3-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-3-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-3-storage"} -device virtio-blk-pci,iothread=iothread1,bus=pci.4,addr=0x0,drive=libvirt-3-format,id=virtio-disk0,bootindex=1,write-cache=on -blockdev {"driver":"host_device","filename":"/dev/volumes/b399a7d6-170d-4cf7-970d-c9375ea3dfc1_disk.eph0","aio":"native","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-2-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"} -device 
virtio-blk-pci,iothread=iothread1,bus=pci.5,addr=0x0,drive=libvirt-2-format,id=virtio-disk1,write-cache=on -blockdev {"driver":"file","filename":"/var/lib/nova/instances/b399a7d6-170d-4cf7-970d-c9375ea3dfc1/disk.config","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":true,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"} -device ide-cd,bus=ide.0,drive=libvirt-1-format,id=sata0-0-0,write-cache=on -netdev tap,fd=40,id=hostnet0,vhost=on,vhostfd=41 -device virtio-net-pci,host_mtu=9000,netdev=hostnet0,id=net0,mac=fa:16:3e:64:71:2d,bus=pci.2,addr=0x0 -netdev tap,fd=42,id=hostnet1,vhost=on,vhostfd=43 -device virtio-net-pci,host_mtu=9000,netdev=hostnet1,id=net1,mac=fa:16:3e:bc:3e:65,bus=pci.3,addr=0x0 -add-fd set=5,fd=45 -chardev pty,id=charserial0,logfile=/dev/fdset/5,logappend=on -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -audiodev id=audio1,driver=none -vnc 10.8.24.121:0,audiodev=audio1 -device cirrus-vga,id=video0,bus=pcie.0,addr=0x1 -device virtio-balloon-pci,id=balloon0,bus=pci.6,addr=0x0 -object {"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"} -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.7,addr=0x0 -global virtio-blk-pci.config-wce=off -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

XML generated by libvirt in attachment.

Comment 1 Igor Raits 2021-08-03 19:54:25 UTC
It seems the issue was introduced by one of the changes in the -25 build:
* Tue Jul 20 2021 Danilo Cesar Lemes de Paula <ddepaula> - 6.0.0-25.el8
- kvm-s390x-cpumodel-add-3931-and-3932.patch [bz#1976171]
- kvm-file-posix-fix-max_iov-for-dev-sg-devices.patch [bz#1943653]
- kvm-scsi-generic-pass-max_segments-via-max_iov-field-in-.patch [bz#1943653]
- kvm-osdep-provide-ROUND_DOWN-macro.patch [bz#1943653]
- kvm-block-backend-align-max_transfer-to-request-alignmen.patch [bz#1943653]
- kvm-block-add-max_hw_transfer-to-BlockLimits.patch [bz#1943653]
- kvm-file-posix-try-BLKSECTGET-on-block-devices-too-do-no.patch [bz#1943653]
- Resolves: bz#1976171
  ([IBM 8.5 FEAT] CPU Model for new IBM Z Hardware - qemu part)
- Resolves: bz#1943653
  (RHV VM pauses due to 'qemu-kvm' getting EINVAL on i/o to a direct lun with scsi passthrough enabled)

I'm continuing to bisect to figure out which exact patch causes the issue.

Comment 2 Igor Raits 2021-08-03 20:20:44 UTC
# For bz#1943653 - RHV VM pauses due to 'qemu-kvm' getting EINVAL on i/o to a direct lun with scsi passthrough enabled
Patch84: kvm-file-posix-try-BLKSECTGET-on-block-devices-too-do-no.patch

Reverting (i.e. building without) this patch fixes my problem.
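
For context, the name of the reverted patch suggests it makes qemu query the host block device's maximum transfer size via the BLKSECTGET ioctl and honour it for guest I/O. A quick way to inspect what the host reports for the multipath device (an illustrative sketch, not part of the original report; /dev/dm-3 is the device from the multipath output above):

blockdev --getmaxsect /dev/dm-3              # max sectors per request (BLKSECTGET)
cat /sys/block/dm-3/queue/max_sectors_kb     # current soft limit, in KiB
cat /sys/block/dm-3/queue/max_hw_sectors_kb  # hardware limit, in KiB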

Comment 3 Paolo Bonzini 2021-08-04 07:11:35 UTC
From email:

> [EU2][CHECK][root@cmp0019 ~]# sg_readcap /dev/dm-3
> Read Capacity results:
>     Last LBA=2097151999 (0x7cffffff), Number of logical blocks=2097152000
>     Logical block length=512 bytes

The disk has 2097152000 (~2 billion, corresponding to a capacity of 1 TB) logical blocks, but apparently the guest is requesting block 3146082984 (~3 billion).  Can you include the full dmesg up to the point where the disk goes read only?
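
(For reference, the arithmetic: 2097152000 blocks * 512 bytes/block = 1,073,741,824,000 bytes, i.e. 1000 GiB, roughly 1 TB, while the failing sector 3146082984 * 512 bytes is about 1.61 TB, well past the end of the device.)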

Comment 4 Igor Raits 2021-08-04 07:29:29 UTC
Created attachment 1810744 [details]
journal.log

Comment 5 Igor Raits 2021-08-04 07:30:56 UTC
(In reply to Paolo Bonzini from comment #3)
> From email:
> 
> > [EU2][CHECK][root@cmp0019 ~]# sg_readcap /dev/dm-3
> > Read Capacity results:
> >     Last LBA=2097151999 (0x7cffffff), Number of logical blocks=2097152000
> >     Logical block length=512 bytes
> 
> The disk has 2097152000 (~2 billion, corresponding to a capacity of 1 TB)
> logical blocks, but apparently the guest is requesting block 3146082984 (~3
> billion).  Can you include the full dmesg up to the point where the disk
> goes read only?

Attached. It is not from the same boot/VM because I deploy and terminate testing VMs each time, but as I use the same host and the same image/volume, I guess it should not matter much.

Comment 6 qing.wang 2021-08-04 07:32:42 UTC
It looks like this issue is related to specific hardware or environment? I cannot reproduce it on an FC device or a scsi_debug disk.

ENV:
Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-321.el8.x86_64
qemu-kvm-common-6.0.0-25.module+el8.5.0+11890+8e7c3f51.x86_64



1. Create a scsi_debug disk, or use an FC disk directly

modprobe scsi_debug physblk_exp=0 opt_xferlen_exp=6 opt_blks=0 sector_size=512 dev_size_mb=5120 add_host=2 vpd_use_hostno=0 lbprz=1 lbpu=1 lbpw=1 lbpws10=1 unmap_granularity=16 unmap_max_blocks=4096 write_same_length=4096 unmap_max_desc=11 scsi_level=6

(workspace) root@dell-per440-07 /home/kar $ multipath -l
mpathc (333333330000007d0) dm-3 Linux,scsi_debug     (scsi_debug)
size=5.0G features='0' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| `- 18:0:0:0 sde 8:64 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  `- 17:0:0:0 sdd 8:48 active undef running
mpatha (360050763008084e6e0000000000001a5) dm-4 IBM,2145  (FC)
size=200G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| `- 10:0:0:0 sdb 8:16 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  `- 10:0:1:0 sdc 8:32 active undef running

mkfs.ext4 /dev/mapper/mpatha 

2. Boot the VM
/usr/libexec/qemu-kvm \
  -name src_vm1 \
  -machine pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off \
  -m 8g \
  -no-user-config -nodefaults \
  -vga qxl \
  -device pcie-root-port,id=pcie.0-root-port-2,slot=2,bus=pcie.0,multifunction=on \
  -device pcie-root-port,id=pcie.0-root-port-2-1,chassis=3,bus=pcie.0,addr=0x2.0x1 \
  -device pcie-root-port,id=pcie.0-root-port-2-2,chassis=4,bus=pcie.0,addr=0x2.0x2 \
  -device pcie-root-port,id=pcie.0-root-port-3,slot=3,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-4,slot=4,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-5,slot=5,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-7,slot=7,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-8,slot=8,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-9,slot=9,bus=pcie.0 \
  -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -object iothread,id=iothread0 \
  -object iothread,id=iothread1 \
  -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-2-2,addr=0x0,iothread=iothread0 \
  -blockdev \
  driver=qcow2,file.driver=file,cache.direct=off,cache.no-flush=on,file.filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,node-name=drive_image1 \
  -device scsi-hd,id=os1,drive=drive_image1,bootindex=0 \
  \
  \
  -device virtio-scsi-pci,id=scsi1,bus=pcie.0-root-port-8,addr=0x0 \
  -blockdev node-name=host_device_stg,driver=host_device,aio=native,filename=/dev/mapper/mpathc,cache.direct=on,cache.no-flush=off,discard=unmap \
  -blockdev node-name=drive_stg,driver=raw,cache.direct=on,cache.no-flush=off,file=host_device_stg \
  -device virtio-blk-pci,iothread=iothread1,bus=pcie.0-root-port-4,addr=0x0,write-cache=on,id=stg,drive=drive_stg,rerror=report,werror=report \
  \
  -vnc \
  :5 \
  -qmp tcp:0:5955,server,nowait \
  -monitor stdio \
  \
  -device virtio-net-pci,mac=9a:b5:b6:b1:b2:b5,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0-root-port-5,addr=0x0 \
  -netdev tap,id=idxgXAlm

3. Mount the disk in the guest and run some I/O

[root@vm-198-88 ~]# lsblk
NAME                       MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda                          8:0    0  20G  0 disk 
├─sda1                       8:1    0   1G  0 part /boot
└─sda2                       8:2    0  19G  0 part 
  ├─rhel_vm--115--131-root 253:0    0  17G  0 lvm  /
  └─rhel_vm--115--131-swap 253:1    0   2G  0 lvm  [SWAP]
vda                        252:0    0   5G  0 disk 
[root@vm-198-88 ~]# mount /dev/vda /mnt/x
[root@vm-198-88 ~]# while true; do chown -R root:root /mnt/x; chown -R nobody:nobody /mnt/x; done

Open another console to generate files:

# while true; do dd if=/dev/zero of=/mnt/x/test.img bs=4M count=1000; done


I did not hit this issue.

Comment 7 Igor Raits 2021-08-04 08:34:32 UTC
> It looks like this issue related to specific hw or env?

Sure, it could be. Our volume resides on the NetApp, accessed via multipath, on an HPE ProLiant DL380 Gen10 with Xeon 6154 CPUs, etc.

Let me know if you need some more info about that.

> # while true; do dd if=/dev/zero of=/mnt/x/test.img bs=4M count=1000; done

Could it be that I have 536610 files totaling 700G while you have a single 4G file?

Comment 8 Igor Raits 2021-08-04 10:49:26 UTC
By the way, I checked whether I can reproduce this issue without iSCSI, using a local DM (LVM) volume instead.

1. NetApp iSCSI volume - the issue reproduces easily
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' iothread='2'/>
      <source dev='/dev/dm-3' index='3'/>
      <backingStore/>
      <target dev='vdc' bus='virtio'/>
      <serial>2f9aed39-d425-4a50-9335-53d075ce90df</serial>
      <alias name='virtio-disk2'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </disk>
2. Local LVM volume - can't reproduce the issue
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native' iothread='2'/>
      <source dev='/dev/mapper/volumes-volume--ec389373--0ccf--44dc--a5ad--5196adc68e6c' index='2'/>
      <backingStore/>
      <target dev='vdd' bus='virtio'/>
      <serial>ec389373-0ccf-44dc-a5ad-5196adc68e6c</serial>
      <alias name='virtio-disk3'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </disk>

Comment 9 qing.wang 2021-08-04 11:03:09 UTC
(In reply to Igor Raits from comment #7)
> > It looks like this issue related to specific hw or env?
> 
> Sure, it can be. Our volume resides on the NetApp → multipath on HPE
> Proliant Gen10 DL380 with some Xeon 6154, etc.
> 
> Let me know if you need some more info about that.
> 
> > # while true; do dd if=/dev/zero of=/mnt/x/test.img bs=4M count=1000; done
> 
> can it be that I have 536610 files worth 700G and you have one file 4G big?

My disk is only 200G; it does not look like this is related to disk size.

Changing the I/O operation as below, I did not hit this issue:

for ((n = 1; n < 1000; n++)); do     dd if=/dev/zero of=/mnt/x/test$n.img bs=1M count=200 oflag=direct;   done

Comment 10 Igor Raits 2021-08-04 11:32:54 UTC
> for ((n = 1; n < 1000; n++)); do     dd if=/dev/zero of=/mnt/x/test$n.img bs=1M count=200 oflag=direct;   done

I can't reproduce the issue with this command either… But I can consistently reproduce it with the existing data I rsynced earlier plus that chown loop.

Comment 11 Igor Raits 2021-08-04 12:04:40 UTC
But with this command I was able to get the same error:

i=0; while true; do dd if=/dev/zero of=/mnt/test$i.img bs=16K count=100 oflag=direct 2>/dev/null; i=$(( i + 1 )); done

Comment 12 Igor Raits 2021-08-04 13:05:52 UTC
But it is not consistently reproducible with just dd; with a chown over real data it is much easier :)

Comment 13 Igor Raits 2021-08-04 15:30:26 UTC
OK, I was able to reproduce it with something like this.

1. Create empty ext4 FS
2. Run the following for a couple of hours until it generates a huge amount of data (in my case I stopped at 276G)
   i=0; while true; do dd if=/dev/zero of=/mnt/test$i.img bs=16K count=100 oflag=direct 2>/dev/null; i=$(( i + 1 )); done
3. Run for some time
   while true; do chown -R root:root /mnt; chown -R nobody:nobody /mnt/; done

I started data generation at Aug 04 14:05:40. I'm not sure when I started the chown part, but in the end it failed.

Aug 04 17:14:51 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vde, sector 448805008 op 0x1:(WRITE) flags 0x107000 phys_seg 254 prio class 0

Comment 15 qing.wang 2021-08-05 10:28:29 UTC
My disk only has 200G. Do you have enough disk space, and does it auto-extend while running?

I tested for almost 24 hours and did not hit this issue; I think the I/O operation is not the point.
 
while true;do for ((n = 1; n < 100000; n++)); do     dd if=/dev/zero of=/mnt/x/test$n.img bs=16K count=100 oflag=direct; echo "$n";  done;done

Comment 16 Igor Raits 2021-08-06 13:17:58 UTC
Qing, can you try writing the 16K * 100 files until the disk is full, and then keep it running for some more time after that?
> i=0; while true; do dd if=/dev/zero of=/mnt/test$i.img bs=16K count=100 oflag=direct 2>/dev/null; i=$(( i + 1 )); done
Then run the chown part; for me it drops the drive to R/O mode within a few seconds.
>while true; do chown -R root:root /mnt; chown -R nobody:nobody /mnt/; done

Comment 17 Igor Raits 2021-08-07 17:52:03 UTC
P.S. If you still can't reproduce it, I probably don't have any other ideas… But I can easily trigger it with the NetApp AFF A300, be it an empty volume or an existing volume with data… I can compile any version of qemu you'd like and try anything out.

Comment 18 Paolo Bonzini 2021-08-09 09:10:51 UTC
> Ok, I was able to do it something like this.
> 
> 1. Create empty ext4 FS
> 2. Run following for couple hours until it generates some huge amount of data (in my case I stopped on 276G)
>    i=0; while true; do dd if=/dev/zero of=/mnt/test$i.img bs=16K count=100 oflag=direct 2>/dev/null; i=$(( i + 1 )); done
> 3. Run for some time
>    while true; do chown -R root:root /mnt; chown -R nobody:nobody /mnt/; done
> 
> I've started data generation at Aug 04 14:05:40. Not sure when I started chown part, but in the end it failed.
> 
> Aug 04 17:14:51 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vde, sector 448805008 op 0x1:(WRITE) flags 0x107000 phys_seg 254 prio class 0

Hi Igor,

can you do an "sg_readcap" on both the host and the guest?

Comment 19 Igor Raits 2021-08-10 11:12:32 UTC
(In reply to Paolo Bonzini from comment #18)
> > Ok, I was able to do it something like this.
> > 
> > 1. Create empty ext4 FS
> > 2. Run following for couple hours until it generates some huge amount of data (in my case I stopped on 276G)
> >    i=0; while true; do dd if=/dev/zero of=/mnt/test$i.img bs=16K count=100 oflag=direct 2>/dev/null; i=$(( i + 1 )); done
> > 3. Run for some time
> >    while true; do chown -R root:root /mnt; chown -R nobody:nobody /mnt/; done
> > 
> > I've started data generation at Aug 04 14:05:40. Not sure when I started chown part, but in the end it failed.
> > 
> > Aug 04 17:14:51 igorovo.eu2.intgdc.com kernel: blk_update_request: I/O error, dev vde, sector 448805008 op 0x1:(WRITE) flags 0x107000 phys_seg 254 prio class 0
> 
> Hi Igor,
> 
> can you do an "sg_readcap" on both the host and the guest?

Hi Paolo,

[EU2][PC-3801][root@cmp0022 ~]# sg_readcap /dev/dm-3 
Read Capacity results:
   Last LBA=2097151999 (0x7cffffff), Number of logical blocks=2097152000
   Logical block length=512 bytes
Hence:
   Device size: 1073741824000 bytes, 1024000.0 MiB, 1073.74 GB

[root@igorovo ~]# sg_readcap /dev/vdc 
read capacity(10): pass-through os error: Inappropriate ioctl for device
READ CAPACITY (10) failed: Inappropriate ioctl for device
sg_readcap failed: Inappropriate ioctl for device

Comment 20 Igor Raits 2021-08-18 10:09:46 UTC
Hi Paolo,

Is there anything more I can provide to help resolve this bug? It is quite an annoying one (in the sense that every time there is a new version of qemu-kvm in advanced-virtualization, we have to revert the patch)…

Happy to run / test anything!

Comment 21 Paolo Bonzini 2021-08-18 12:26:39 UTC
Sorry, the sg_readcap was not correct for virtio-blk.  You can use

cat /sys/block/vdc/size
cat /sys/block/vdc/queue/logical_block_size

instead to get the same information.
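
(Note: /sys/block/<dev>/size is reported in 512-byte units regardless of the logical block size, so the device size in bytes can be computed directly; vdc here follows the commands above.)

echo $(( $(cat /sys/block/vdc/size) * 512 ))   # device size in bytes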

Comment 22 Igor Raits 2021-08-18 16:32:24 UTC
[root@igorovo ~]# cat /sys/block/vdc/size
2097152000
[root@igorovo ~]# cat /sys/block/vdc/queue/logical_block_size
512

Comment 23 Paolo Bonzini 2021-08-18 21:57:29 UTC
The size of the disk is incorrect in the VM.  Can you check that it changes back to 1 TB when reverting the patch? If not, the patch would likely be a red herring.

Comment 24 qing.wang 2021-08-19 05:48:53 UTC
(In reply to Paolo Bonzini from comment #23)
> The size of the disk is incorrect in the VM.  Can you check that it changes
> back to 1 TB when reverting the patch? If not, the patch would likely be a
> red herring.

What is the expected value? The size is 2097152000 * 512 = 1T, which matches the size of the LUN.

Comment 25 Paolo Bonzini 2021-08-19 07:11:27 UTC
That was stupid, I counted zeros wrong.

But 2097152000 is less than the 3146092200 seen in the errors, so why is the filesystem trying to access a block beyond 2097152000?

Can you try an e2fsck on the host?
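
A read-only check is one way to do this without modifying anything (an illustrative invocation, assuming the device is not in use by the VM at the time; /dev/dm-3 is the multipath device from earlier in this report):

e2fsck -fn /dev/dm-3    # -f forces the check, -n answers "no" to every prompt so nothing is changed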

Comment 26 Igor Raits 2021-08-19 15:23:45 UTC
(In reply to Paolo Bonzini from comment #25)
> That was stupid, I counted zeros wrong.
> 
> But the 2097152000 is less than 3146092200 that is in the errors, so why is
> the filesystem trying to access a block above 2097152000?
> 
> Can you try an e2fsck on the host?

Sure, it shows errors since it was unmounted improperly… Before doing a test, I usually create a new filesystem with mkfs.ext4.

Comment 27 Paolo Bonzini 2021-08-19 16:04:05 UTC
Ok, I think we have reproduced the bug.

Comment 28 Igor Raits 2021-08-19 16:18:56 UTC
(In reply to Paolo Bonzini from comment #27)
> Ok, I think we have reproduced the bug.

🍾

Comment 29 qing.wang 2021-08-20 07:39:10 UTC
(In reply to Paolo Bonzini from comment #27)
> Ok, I think we have reproduced the bug.

Hi Paolo, what is the reproducer?

Comment 30 qing.wang 2021-08-24 09:57:26 UTC
What is the reproducer? And does this issue have the same root cause as Bug 1994494 - VM remains in paused state when trying to write on a resized disk resides on iscsi?

Comment 31 qing.wang 2021-08-25 03:15:22 UTC
I think this has the same root cause as Bug 1994494 - VM remains in paused state when trying to write on a resized disk resides on iscsi.

Reproduced on:
Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-330.el8.x86_64
qemu-kvm-common-6.0.0-27.module+el8.5.0+12121+c40c8708.x86_64

Test steps
1. Build an iSCSI target server
root@dell-per440-07 /home/tmp $ targetcli ls
o- / ........................................................................................... [...]
  o- backstores ................................................................................ [...]
  | o- block .................................................................... [Storage Objects: 0]
  | o- fileio ................................................................... [Storage Objects: 1]
  | | o- one ................................... [/home/iscsi/onex.img (15.0GiB) write-back activated]
  | |   o- alua ..................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ......................................... [ALUA state: Active/optimized]
  | o- pscsi .................................................................... [Storage Objects: 0]
  | o- ramdisk .................................................................. [Storage Objects: 0]
  o- iscsi .............................................................................. [Targets: 1]
  | o- iqn.2016-06.one.server:one-a ........................................................ [TPGs: 1]
  |   o- tpg1 ................................................................. [no-gen-acls, no-auth]
  |     o- acls ............................................................................ [ACLs: 2]
  |     | o- iqn.1994-05.com.redhat:clienta ......................................... [Mapped LUNs: 1]
  |     | | o- mapped_lun0 .................................................... [lun0 fileio/one (rw)]
  |     | o- iqn.1994-05.com.redhat:clientb ......................................... [Mapped LUNs: 1]
  |     |   o- mapped_lun0 .................................................... [lun0 fileio/one (rw)]
  |     o- luns ............................................................................ [LUNs: 1]
  |     | o- lun0 ............................. [fileio/one (/home/iscsi/onex.img) (default_tg_pt_gp)]
  |     o- portals ...................................................................... [Portals: 1]
  |       o- 0.0.0.0:3260 ....................................................................... [OK]
  o- loopback ........................................................................... [Targets: 0]
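
(The listing above shows the resulting layout; one possible sequence of targetcli commands that produces it is sketched below. The object names are taken from the listing, the exact syntax can vary between targetcli versions, and a default portal on 0.0.0.0:3260 is typically added automatically when the target is created.)

targetcli /backstores/fileio create one /home/iscsi/onex.img 15G
targetcli /iscsi create iqn.2016-06.one.server:one-a
targetcli /iscsi/iqn.2016-06.one.server:one-a/tpg1/luns create /backstores/fileio/one
targetcli /iscsi/iqn.2016-06.one.server:one-a/tpg1/acls create iqn.1994-05.com.redhat:clienta
targetcli /iscsi/iqn.2016-06.one.server:one-a/tpg1/acls create iqn.1994-05.com.redhat:clientb
targetcli saveconfig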

2. Attach the iSCSI disk on the host
iscsiadm -m discovery -t st -p 127.0.0.1
iscsiadm -m node -T iqn.2016-06.one.server:one-a  -p 127.0.0.1:3260 -l

3. Change max_sectors_kb to 64

echo 64 > /sys/block/sdd/queue/max_sectors_kb
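
(Lowering max_sectors_kb shrinks the host's per-request size limit, which presumably makes the failure easier to trigger. To confirm the value took effect; sdd is whatever node the iSCSI login created on this host:)

cat /sys/block/sdd/queue/max_sectors_kb      # should now print 64
cat /sys/block/sdd/queue/max_hw_sectors_kb   # the hardware limit is unchanged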

4. Create LVM volumes on the disk
pvcreate /dev/sdd
vgcreate vg /dev/sdd

lvcreate -L 2560M -n lv1 vg;lvcreate -L 2560M -n lv2 vg

#optional for qcow2 format
qemu-img create -f qcow2 /dev/vg/lv1 2G;qemu-img create -f qcow2 /dev/vg/lv2 2G
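
(If the optional qcow2 formatting is used, a quick check, illustrative and not part of the original steps, confirms the LV now carries a qcow2 header:)

qemu-img info /dev/vg/lv1    # should report "file format: qcow2" and a 2 GiB virtual size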

 
5. Boot the VM with the two LVs as virtio-blk devices


/usr/libexec/qemu-kvm \
  -name src_vm1 \
  -machine pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off \
  -m 8g \
  -no-user-config -nodefaults \
  -vga qxl \
  -device pcie-root-port,id=pcie.0-root-port-2,slot=2,bus=pcie.0,multifunction=on \
  -device pcie-root-port,id=pcie.0-root-port-2-1,chassis=3,bus=pcie.0,addr=0x2.0x1 \
  -device pcie-root-port,id=pcie.0-root-port-2-2,chassis=4,bus=pcie.0,addr=0x2.0x2 \
  -device pcie-root-port,id=pcie.0-root-port-3,slot=3,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-4,slot=4,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-5,slot=5,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-6,slot=6,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-7,slot=7,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-8,slot=8,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-9,slot=9,bus=pcie.0 \
  -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -object iothread,id=iothread0 \
  -object iothread,id=iothread1 \
  -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-2-2,addr=0x0,iothread=iothread0 \
  -device virtio-scsi-pci,id=scsi1,bus=pcie.0-root-port-8,addr=0x0 \
  -blockdev driver=qcow2,file.driver=file,cache.direct=off,cache.no-flush=on,file.filename=/home/kvm_autotest_root/images/rhel840-64-virtio-scsi.qcow2,node-name=drive_image1 \
  -device scsi-hd,id=os1,drive=drive_image1,bootindex=0 \
  \
   -blockdev node-name=host_device_stg,driver=host_device,aio=native,filename=/dev/vg/lv1,cache.direct=on,cache.no-flush=off,discard=unmap \
  -blockdev node-name=drive_stg,driver=raw,cache.direct=on,cache.no-flush=off,file=host_device_stg \
  -device virtio-blk-pci,iothread=iothread1,bus=pcie.0-root-port-4,addr=0x0,write-cache=on,id=stg,drive=drive_stg \
\
 -blockdev node-name=host_device_stg2,driver=host_device,aio=native,filename=/dev/vg/lv2,cache.direct=on,cache.no-flush=off,discard=unmap \
  -blockdev node-name=drive_stg2,driver=raw,cache.direct=on,cache.no-flush=off,file=host_device_stg2 \
  -device virtio-blk-pci,iothread=iothread1,bus=pcie.0-root-port-5,addr=0x0,write-cache=on,id=stg2,drive=drive_stg2 \
\
  \
  -vnc :5 \
  -qmp tcp:0:5955,server,nowait \
  -monitor stdio \
  -device virtio-net-pci,mac=9a:b5:b6:b1:b4:b5,id=idMmq1jH,vectors=4,netdev=idxgXAlm,bus=pcie.0-root-port-6,addr=0x0 \
  -netdev tap,id=idxgXAlm


6. Run I/O on the two disks in the guest

root@bootp-73-199-218 /home $ cat test.sh 
for i in $(seq 100); do
    printf "pass %03d ... " $i
    dd if=/dev/zero bs=1M count=2048 of=$1 conv=fsync status=none
    if [[ "$?" != "0" ]];then break;fi
    echo "ok"
  done

./test.sh /dev/vda
./test.sh /dev/vdb


I/O error messages found in dmesg:
[  544.716828] blk_update_request: I/O error, dev vdb, sector 81280 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  544.717745] blk_update_request: I/O error, dev vdb, sector 83312 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.970014] print_req_error: 25 callbacks suppressed
[  652.970017] blk_update_request: I/O error, dev vdb, sector 62992 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.971340] buffer_io_error: 8880 callbacks suppressed
[  652.971342] Buffer I/O error on dev vdb, logical block 7874, lost async page write
[  652.972427] Buffer I/O error on dev vdb, logical block 7875, lost async page write
[  652.973024] Buffer I/O error on dev vdb, logical block 7876, lost async page write
[  652.973666] Buffer I/O error on dev vdb, logical block 7877, lost async page write
[  652.974306] Buffer I/O error on dev vdb, logical block 7878, lost async page write
[  652.974904] Buffer I/O error on dev vdb, logical block 7879, lost async page write
[  652.975537] Buffer I/O error on dev vdb, logical block 7880, lost async page write
[  652.976162] Buffer I/O error on dev vdb, logical block 7881, lost async page write
[  652.976792] Buffer I/O error on dev vdb, logical block 7882, lost async page write
[  652.977394] Buffer I/O error on dev vdb, logical block 7883, lost async page write
[  652.987664] blk_update_request: I/O error, dev vdb, sector 65024 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.988678] blk_update_request: I/O error, dev vdb, sector 67056 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.989649] blk_update_request: I/O error, dev vdb, sector 69088 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.990612] blk_update_request: I/O error, dev vdb, sector 71120 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.991569] blk_update_request: I/O error, dev vdb, sector 73152 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.992520] blk_update_request: I/O error, dev vdb, sector 75184 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.993511] blk_update_request: I/O error, dev vdb, sector 77216 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0
[  652.994463] blk_update_request: I/O error, dev vdb, sector 79248 op 0x1:(WRITE) flags 0x104000 phys_seg 254 prio class 0

Comment 32 Igor Raits 2021-08-27 07:49:47 UTC
Qing, I see that the linked Bugzilla is already fixed. Can you share the patch with me so that I can try it out?

Comment 34 qing.wang 2021-08-27 08:22:12 UTC
Hi Kevin, I think this issue has the same root cause as https://bugzilla.redhat.com/show_bug.cgi?id=1994494.
Could you please share the patch with Igor? Thanks.

Comment 35 CongLi 2021-09-02 07:38:33 UTC
Hi Igor,

Can you try https://cbs.centos.org/koji/taskinfo?taskID=2528662 ?

Thanks.

Comment 36 Igor Raits 2021-09-04 09:27:26 UTC
(In reply to CongLi from comment #35)
> Hi Igor,

Hey,

> Can you try https://cbs.centos.org/koji/taskinfo?taskID=2528662 ?

It seems I'm unable to reproduce my bug with this build, which is amazing. Do you have any ETA for when this build will make it to stable advancedvirt?


Thank you!

Comment 37 CongLi 2021-09-05 08:25:09 UTC
(In reply to Igor Raits from comment #36)
> (In reply to CongLi from comment #35)
> > Hi Igor,
> 
> Hey,
> 
> > Can you try https://cbs.centos.org/koji/taskinfo?taskID=2528662 ?
> 
> Seems I'm unable to reproduce my bug with this build which is amazing. Do
> you have any ETA when this build will go through to stable advancedvirt?
> 
> 
> Thank you!

If you mean https://mirror.domainhizmetleri.com/centos/8-stream/virt/x86_64/advancedvirt-common/Packages/q/, qemu-kvm-6.0.0-29.el8s.x86_64.rpm has been added already. TBH, I have no idea how often the maintainer syncs the package to advancedvirt.

Comment 38 CongLi 2021-09-05 08:32:17 UTC
Based on comment 36, closing this bug. Please feel free to reopen if anything is wrong.

Thanks.

*** This bug has been marked as a duplicate of bug 1994494 ***

