Bug 1492559 - virtio-blk multiwrite merge causes too big IO
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.4
Hardware: x86_64 Linux
Priority: medium  Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Fam Zheng
QA Contact: Longxiang Lyu
: Automation
Depends On:
Blocks:
 
Reported: 2017-09-18 03:23 EDT by yisun
Modified: 2018-04-10 10:35 EDT
CC: 15 users

See Also:
Fixed In Version: qemu-kvm-1.5.3-143.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 10:35:07 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
pre_iscsi.sh (866 bytes, text/plain), 2017-09-18 21:01 EDT, yisun
strace.log (506.63 KB, text/plain), 2017-09-20 07:08 EDT, yisun


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:0816 normal SHIPPED_LIVE Low: qemu-kvm security, bug fix, and enhancement update 2018-04-10 08:47:03 EDT

Description yisun 2017-09-18 03:23:05 EDT
Description of problem:
Running mkfs.ext3/ext4 on an iSCSI disk returns a warning in the VM.

Version-Release number of selected component (if applicable):
qemu-kvm-common-1.5.3-141.el7_4.2.x86_64
qemu-kvm-1.5.3-141.el7_4.2.x86_64
qemu-img-1.5.3-141.el7_4.2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare an iSCSI LUN (see the attached pre_iscsi.sh).
2. Start a VM with an extra disk as follows:
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='iscsi' name='iqn.2016-03.com.virttest:logical-pool.target/0'>
        <host name='127.0.0.1' port='3260'/>
      </source>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>

The VM's QEMU process looks like this:
qemu     21943     1  1 15:08 ?        00:00:11 /usr/libexec/qemu-kvm -name avocado-vt-vm1 -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off,dump-guest-core=off -m 1024 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 19b984c1-07ec-43b6-a253-a2b3e23ad476 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-6-avocado-vt-vm1/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x8 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/var/lib/libvirt/images/RHEL-7.4-x86_64-latest.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=iscsi://127.0.0.1:3260/iqn.2016-03.com.virttest%3Alogical-pool.target/0,format=raw,if=none,id=drive-virtio-disk1 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=29,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:ac:95:18,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-6-avocado-vt-vm1/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 127.0.0.1:0 -vga cirrus -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on


3. login vm
4. run "mkfs.ext4 /dev/vdb" in vm
[root@localhost ~]# mkfs.ext4 /dev/vdb
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
64000 inodes, 256000 blocks
12800 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=262144000
8 block groups
32768 blocks per group, 32768 fragments per group
8000 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: [  295.964108] blk_update_request: I/O error, dev vdb, sector 548480
[  295.966615] Buffer I/O error on dev vdb, logical block 68560, lost async page write
[  295.967283] Buffer I/O error on dev vdb, logical block 68561, lost async page write
[  295.967952] Buffer I/O error on dev vdb, logical block 68562, lost async page write
[  295.968568] Buffer I/O error on dev vdb, logical block 68563, lost async page write
[  295.969222] Buffer I/O error on dev vdb, logical block 68564, lost async page write
[  295.969870] Buffer I/O error on dev vdb, logical block 68565, lost async page write
[  295.970486] Buffer I/O error on dev vdb, logical block 68566, lost async page write
[  295.971176] Buffer I/O error on dev vdb, logical block 68567, lost async page write
[  295.971831] Buffer I/O error on dev vdb, logical block 68568, lost async page write
[  295.972455] Buffer I/O error on dev vdb, logical block 68569, lost async page write
[  295.973151] blk_update_request: I/O error, dev vdb, sector 549488
[  295.973726] blk_update_request: I/O error, dev vdb, sector 550496
[  295.974340] blk_update_request: I/O error, dev vdb, sector 551504
[  295.974955] blk_update_request: I/O error, dev vdb, sector 552512
[  295.975531] blk_update_request: I/O error, dev vdb, sector 553520
[  295.976179] blk_update_request: I/O error, dev vdb, sector 554528
[  295.976855] blk_update_request: I/O error, dev vdb, sector 555536
[  295.977446] blk_update_request: I/O error, dev vdb, sector 556544
[  295.978863] blk_update_request: I/O error, dev vdb, sector 786432

Warning, had trouble writing out superblocks.


[root@localhost ~]# echo $?
5

** Sometimes only "Warning, had trouble writing out superblocks." appears, without any "... lost async page write" or "blk_update_request: I/O error, dev vdb ..." messages.

Actual results:
mkfs returns an error (exit code 5).

Expected results:
mkfs should complete successfully.

Additional info:
This is not reproducible after reinstalling QEMU with the following packages:
qemu-img-rhev-2.9.0-16.el7_4.8.x86_64
qemu-kvm-common-rhev-2.9.0-16.el7_4.8.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.8.x86_64
Comment 2 yisun 2017-09-18 03:25:30 EDT
By the way, the guest is installed with the latest OS tree:
RHEL-7.4-20170711.0-x86_64
Comment 3 yisun 2017-09-18 21:01 EDT
Created attachment 1327658 [details]
pre_iscsi.sh
Comment 4 Fam Zheng 2017-09-18 22:01:16 EDT
Did this work with previous guest versions? E.g. RHEL 7.3?
Comment 5 yisun 2017-09-18 23:41:42 EDT
With RHEL-7.3-20161019.0-x86_64 there are still some I/O errors, but mkfs's exit code is a normal 0.

[root@localhost ~]# uname -r
3.10.0-514.el7.x86_64

[root@localhost ~]# mkfs.ext4 /dev/vdb
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
64000 inodes, 256000 blocks
12800 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=262144000
8 block groups
32768 blocks per group, 32768 fragments per group
8000 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: [  192.808837] blk_update_request: I/O error, dev vdb, sector 548480
[  192.810756] Buffer I/O error on dev vdb, logical block 68560, lost async page write
[  192.811284] Buffer I/O error on dev vdb, logical block 68561, lost async page write
[  192.811815] Buffer I/O error on dev vdb, logical block 68562, lost async page write
[  192.812599] Buffer I/O error on dev vdb, logical block 68563, lost async page write
[  192.813366] Buffer I/O error on dev vdb, logical block 68564, lost async page write
[  192.814127] Buffer I/O error on dev vdb, logical block 68565, lost async page write
[  192.814906] Buffer I/O error on dev vdb, logical block 68566, lost async page write
[  192.815623] Buffer I/O error on dev vdb, logical block 68567, lost async page write
[  192.816372] Buffer I/O error on dev vdb, logical block 68568, lost async page write
[  192.817083] Buffer I/O error on dev vdb, logical block 68569, lost async page write
[  192.817830] blk_update_request: I/O error, dev vdb, sector 549488
[  192.818453] blk_update_request: I/O error, dev vdb, sector 550496
[  192.819131] blk_update_request: I/O error, dev vdb, sector 551504
[  192.819810] blk_update_request: I/O error, dev vdb, sector 552512
[  192.820474] blk_update_request: I/O error, dev vdb, sector 553520
[  192.821145] blk_update_request: I/O error, dev vdb, sector 554528
[  192.821804] blk_update_request: I/O error, dev vdb, sector 555536
[  192.823135] blk_update_request: I/O error, dev vdb, sector 556544
[  192.823754] blk_update_request: I/O error, dev vdb, sector 786432
done

[root@localhost ~]# echo $?
0
Comment 6 Fam Zheng 2017-09-20 02:23:34 EDT
I cannot reproduce that. Can you collect "strace -f -p $PID_OF_QEMU_KVM" output on the host (with $PID_OF_QEMU_KVM replaced by the actual PID) while you run the mkfs command in the guest?
Comment 7 yisun 2017-09-20 07:08 EDT
Created attachment 1328383 [details]
strace.log
Comment 8 yisun 2017-09-20 07:11:12 EDT
(In reply to Fam Zheng from comment #6)
> I cannot reproduce that, can you collect "strace -f -p $PID_OF_QEMU_KVM" at
> host (with $PID_OF_QEMU_KVM replaced by the actual pid number) when you run
> the mkfs command in guest?

The strace output is attached as https://bugzilla.redhat.com/attachment.cgi?id=1328383
It seems to contain a lot of redundant info; please check whether it is useful.
Comment 9 Fam Zheng 2017-09-20 21:17:23 EDT
Sorry, I was confused; strace won't help because it doesn't reflect the iSCSI driver's activity. You are not using the host kernel iSCSI initiator, even though pre_iscsi.sh logs in with:

iscsiadm --mode node --login --targetname iqn.2016-03.com.virttest:logical-pool.target --portal 127.0.0.1

Two more questions:

Are there any error messages from the host kernel?

Can you rule out disk failure (is it reproducible on a different host)? If so, it may be easier if you can give me access to the test machine.
Comment 11 Fam Zheng 2017-09-21 09:19:51 EDT
Thanks, that helps!

I now fully understand the problem and will work on a fix.

Summary: qemu-kvm implements a relatively old version of "write request merging" in virtio-blk, which can generate an erroneously large request to send to the iSCSI target when the guest writes out the new filesystem metadata, resulting in EIO. Upstream QEMU (2.10) and qemu-kvm-rhev have newer request-merging logic that does not have this problem, because the merging limit is correctly taken into account.
Comment 15 Wainer dos Santos Moschetta 2017-09-28 17:58:33 EDT
Fix included in qemu-kvm-1.5.3-143.el7
Comment 17 Longxiang Lyu 2017-09-29 00:49:50 EDT
Verification: Bug is fixed in qemu-kvm-1.5.3-143.el7.

Steps:
1. verify qemu-kvm version
# rpm -qa | grep qemu-kvm
qemu-kvm-common-1.5.3-143.el7.x86_64
qemu-kvm-tools-1.5.3-143.el7.x86_64
qemu-kvm-debuginfo-1.5.3-143.el7.x86_64
qemu-kvm-1.5.3-143.el7.x86_64

2. boot up a vm with an iscsi drive as second disk
/usr/libexec/qemu-kvm \
-name test \
-machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off,dump-guest-core=off \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-boot strict=on \
-drive file=/home/test/bugs/1492559/test.qcow2,format=qcow2,if=none,id=img0 \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=img0,id=disk0,bootindex=1 \
-drive file=iscsi://127.0.0.1:3260/iqn.2017-09.com.versace:logical-pool.target/0,format=raw,if=none,id=img1 \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=img1,id=disk1 \
-netdev tap,vhost=on,id=hostnet0 \
-device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:ac:95:18,bus=pci.0,addr=0x3 \
-vnc 127.0.0.1:1 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 \
-monitor stdio

3. disk info of guest
# lsblk
NAME                              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
fd0                                 2:0    1    4K  0 disk 
sr0                                11:0    1 1024M  0 rom  
vda                               252:0    0   20G  0 disk 
├─vda1                            252:1    0    1G  0 part /boot
└─vda2                            252:2    0   19G  0 part 
  ├─rhel_bootp--73--199--239-root 253:0    0   17G  0 lvm  /
  └─rhel_bootp--73--199--239-swap 253:1    0    2G  0 lvm  [SWAP]
vdb                               252:16   0  9.8G  0 disk 

4. format vdb
# mkfs.ext3 /dev/vdb 
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
640848 inodes, 2560000 blocks
128000 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2621440000
79 block groups
32768 blocks per group, 32768 fragments per group
8112 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done 

# echo $?
0

# mkfs.ext4 /dev/vdb
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
640848 inodes, 2560000 blocks
128000 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2151677952
79 block groups
32768 blocks per group, 32768 fragments per group
8112 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done 

# echo $?
0
Comment 22 errata-xmlrpc 2018-04-10 10:35:07 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0816
