Bug 1741939

Summary: qcow2 L2 cache size is incorrectly calculated
Product: Red Hat Enterprise Linux Advanced Virtualization
Component: qemu-kvm
Sub component: General
Reporter: Kevin Wolf <kwolf>
Assignee: Kevin Wolf <kwolf>
QA Contact: CongLi <coli>
Status: CLOSED ERRATA
Severity: unspecified
Priority: medium
CC: areis, coli, ddepaula, jinzhao, juzhang, virt-maint
Version: 8.0
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb
Last Closed: 2020-05-05 09:47:43 UTC
Type: Bug

Description Kevin Wolf 2019-08-16 13:42:29 UTC
In bug 1656507 we made the L2 cache size cover exactly the image size in most cases. However, a rounding bug means that the required cache size is rounded down instead of up, so the cache can end up smaller than necessary to cover the whole image.

Patch and detailed report on the qemu-block mailing list:
https://lists.gnu.org/archive/html/qemu-block/2019-08/msg00793.html

The report suggests reproducing the performance problem with a 1280 MB image. Images larger than 1 GB whose size is not a multiple of 512 MB are affected by the bug.
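
As a rough illustration (assuming the default 64 KiB cluster size): an L2 table is one cluster of 8-byte entries, so each table maps 8192 * 64 KiB = 512 MiB of guest data. A 1280 MiB image therefore needs ceil(1280 / 512) = 3 cached L2 tables (192 KiB of L2 cache), but with the rounding bug only floor(1280 / 512) = 2 tables are allocated, so the last 256 MiB of the image cannot keep its L2 table in the cache and accesses there cause extra metadata I/O.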

Comment 5 Tingting Mao 2019-12-03 02:22:55 UTC
Tried to verify this bug as follows.


Steps:
1. Create a data disk on tmpfs.
# mkdir tmpfs_dir
# mount tmpfs tmpfs_dir/ -t tmpfs -o size=10G
# qemu-img create -f qcow2 tmpfs_dir/data.qcow2 1280M
# qemu-img info tmpfs_dir/data.qcow2 -U
image: tmpfs_dir/data.qcow2
file format: qcow2
virtual size: 1.25 GiB (1342177280 bytes)
disk size: 196 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

2. Boot a guest with this data disk.
# /usr/libexec/qemu-kvm \
    -name 'guest-rhel7.7' \
    -machine q35 \
    -nodefaults \
    -vga qxl \
    -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \
    -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \
    -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
    -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \
    -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 \
    -device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 \
    -device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 \
    -device pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x2.0x7 \
    -blockdev driver=qcow2,cache.direct=off,cache.no-flush=off,node-name=my_file0,file.driver=file,file.filename=rhel77-64-virtio-scsi.qcow2 \
    -device virtio-blk-pci,id=virtio_blk_pci0,drive=my_file0,bus=pci.2,addr=0x0 \
    -blockdev driver=qcow2,cache.direct=off,cache.no-flush=off,node-name=my_file1,file.driver=file,file.filename=data.qcow2 \
    -device virtio-blk-pci,id=virtio_blk_pci1,drive=my_file1,bus=pci.3,addr=0x0 \
    -vnc :0 \
    -m 8192 \
    -smp 4 \
    -netdev tap,id=hostnet0,vhost=on \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:56:00:00:00:07,bus=pci.4,addr=0x0 \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/home/qmp-sock2,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -boot order=cdn,once=c,menu=off,strict=off  \
    -enable-kvm \
    -monitor stdio
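
Note: on an affected build, the same shortfall can be worked around (or the fix cross-checked) by pinning the qcow2 L2 cache explicitly via the blockdev runtime option l2-cache-size, replacing the data-disk -blockdev line above with, for example (value sized for this 1280 MiB image, i.e. three 64 KiB L2 tables; only an illustration):

    -blockdev driver=qcow2,cache.direct=off,cache.no-flush=off,l2-cache-size=196608,node-name=my_file1,file.driver=file,file.filename=data.qcow2 \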

3. Fill the data disk with zeroes to fully allocate it before the I/O test.
(guest)# dd if=/dev/zero of=/dev/vdb
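
(A larger block size speeds this step up; a hypothetical equivalent would be `dd if=/dev/zero of=/dev/vdb bs=1M oflag=direct`.)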


4. Test the I/O performance with `fio`.

With ‘qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb’ (fixed version):

# fio --filename=/dev/vdb --direct=1 --rw=randwrite --bs=4K --name=my_test --iodepth=1 --ioengine=libaio
my_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=66.4MiB/s][r=0,w=17.0k IOPS][eta 00m:00s]
my_test: (groupid=0, jobs=1): err= 0: pid=2990: Mon Dec  2 18:18:01 2019
  write: IOPS=16.9k, BW=66.0MiB/s (69.2MB/s)(1280MiB/19382msec)
    slat (nsec): min=7876, max=79291, avg=9098.93, stdev=794.69
    clat (usec): min=20, max=980, avg=43.62, stdev= 3.84
     lat (usec): min=36, max=990, avg=54.24, stdev= 4.01
    clat percentiles (usec):
     |  1.00th=[   37],  5.00th=[   40], 10.00th=[   42], 20.00th=[   43],
     | 30.00th=[   43], 40.00th=[   44], 50.00th=[   44], 60.00th=[   44],
     | 70.00th=[   45], 80.00th=[   45], 90.00th=[   46], 95.00th=[   47],
     | 99.00th=[   52], 99.50th=[   53], 99.90th=[   60], 99.95th=[   85],
     | 99.99th=[  139]
   bw (  KiB/s): min=66008, max=69784, per=100.00%, avg=67630.08, stdev=863.59, samples=38
   iops        : min=16502, max=17446, avg=16907.50, stdev=215.89, samples=38
  lat (usec)   : 50=98.37%, 100=1.61%, 250=0.02%, 500=0.01%, 750=0.01%
  lat (usec)   : 1000=0.01%
  cpu          : usr=14.74%, sys=22.17%, ctx=327684, majf=0, minf=33
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,327680,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=66.0MiB/s (69.2MB/s), 66.0MiB/s-66.0MiB/s (69.2MB/s-69.2MB/s), io=1280MiB (1342MB), run=19382-19382msec

Disk stats (read/write):
  vdb: ios=40/323644, merge=0/0, ticks=1/14184, in_queue=14108, util=73.33%

# fio --filename=/dev/vdb --direct=1 --rw=randread --bs=4K --name=my_test --iodepth=1 --ioengine=libaio
my_test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=65.2MiB/s,w=0KiB/s][r=16.7k,w=0 IOPS][eta 00m:00s]
my_test: (groupid=0, jobs=1): err= 0: pid=2998: Mon Dec  2 18:18:55 2019
   read: IOPS=17.0k, BW=66.4MiB/s (69.7MB/s)(1280MiB/19268msec)
    slat (nsec): min=7987, max=88676, avg=9089.05, stdev=730.47
    clat (usec): min=27, max=315, avg=43.14, stdev= 2.41
     lat (usec): min=48, max=422, avg=53.80, stdev= 2.67
    clat percentiles (usec):
     |  1.00th=[   41],  5.00th=[   42], 10.00th=[   42], 20.00th=[   43],
     | 30.00th=[   43], 40.00th=[   43], 50.00th=[   43], 60.00th=[   44],
     | 70.00th=[   44], 80.00th=[   44], 90.00th=[   45], 95.00th=[   46],
     | 99.00th=[   50], 99.50th=[   51], 99.90th=[   75], 99.95th=[   87],
     | 99.99th=[  127]
   bw (  KiB/s): min=66520, max=69368, per=100.00%, avg=68062.53, stdev=622.70, samples=38
   iops        : min=16630, max=17342, avg=17015.63, stdev=155.67, samples=38
  lat (usec)   : 50=99.29%, 100=0.69%, 250=0.02%, 500=0.01%
  cpu          : usr=15.25%, sys=22.20%, ctx=327686, majf=0, minf=36
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=327680,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=66.4MiB/s (69.7MB/s), 66.4MiB/s-66.4MiB/s (69.7MB/s-69.7MB/s), io=1280MiB (1342MB), run=19268-19268msec

Disk stats (read/write):
  vdb: ios=325693/0, merge=0/0, ticks=11027/0, in_queue=10944, util=56.88%



Downgrade the qemu-kvm package to 4.1, boot the guest again with the same data disk, and use the same command line as in Step 2.

With ‘qemu-kvm-4.1.0-14.module+el8.2.0+4670+07fe2774’ (before the fix):

# fio --filename=/dev/vdb --direct=1 --rw=randwrite --bs=4K --name=my_test --iodepth=1 --ioengine=libaio
my_test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=53.2MiB/s][r=0,w=13.6k IOPS][eta 00m:00s]
my_test: (groupid=0, jobs=1): err= 0: pid=3005: Mon Dec  2 18:10:47 2019
  write: IOPS=13.7k, BW=53.6MiB/s (56.2MB/s)(1280MiB/23901msec)
    slat (nsec): min=8196, max=76198, avg=9359.88, stdev=732.44
    clat (usec): min=28, max=525, avg=57.10, stdev=20.16
     lat (usec): min=39, max=535, avg=68.00, stdev=20.19
    clat percentiles (usec):
     |  1.00th=[   42],  5.00th=[   42], 10.00th=[   43], 20.00th=[   44],
     | 30.00th=[   44], 40.00th=[   45], 50.00th=[   46], 60.00th=[   47],
     | 70.00th=[   53], 80.00th=[   87], 90.00th=[   90], 95.00th=[   92],
     | 99.00th=[   97], 99.50th=[   99], 99.90th=[  109], 99.95th=[  125],
     | 99.99th=[  159]
   bw (  KiB/s): min=53024, max=57346, per=100.00%, avg=54843.96, stdev=978.77, samples=47
   iops        : min=13256, max=14336, avg=13711.00, stdev=244.74, samples=47
  lat (usec)   : 50=69.23%, 100=30.38%, 250=0.39%, 750=0.01%
  cpu          : usr=11.43%, sys=19.07%, ctx=327683, majf=0, minf=32
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,327680,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=53.6MiB/s (56.2MB/s), 53.6MiB/s-53.6MiB/s (56.2MB/s-56.2MB/s), io=1280MiB (1342MB), run=23901-23901msec

Disk stats (read/write):
  vdb: ios=40/327574, merge=0/0, ticks=4/19238, in_queue=19153, util=79.85%

# fio --filename=/dev/vdb --direct=1 --rw=randread --bs=4K --name=my_test --iodepth=1 --ioengine=libaio
my_test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=53.9MiB/s,w=0KiB/s][r=13.8k,w=0 IOPS][eta 00m:00s]
my_test: (groupid=0, jobs=1): err= 0: pid=3010: Mon Dec  2 18:11:22 2019
   read: IOPS=13.7k, BW=53.4MiB/s (56.0MB/s)(1280MiB/23967msec)
    slat (nsec): min=7952, max=88243, avg=9037.58, stdev=738.39
    clat (usec): min=31, max=1259, avg=57.70, stdev=20.36
     lat (usec): min=48, max=1269, avg=68.24, stdev=20.39
    clat percentiles (usec):
     |  1.00th=[   42],  5.00th=[   43], 10.00th=[   43], 20.00th=[   45],
     | 30.00th=[   45], 40.00th=[   46], 50.00th=[   46], 60.00th=[   47],
     | 70.00th=[   52], 80.00th=[   87], 90.00th=[   90], 95.00th=[   92],
     | 99.00th=[   96], 99.50th=[   99], 99.90th=[  103], 99.95th=[  110],
     | 99.99th=[  159]
   bw (  KiB/s): min=52704, max=56216, per=99.91%, avg=54638.47, stdev=853.21, samples=47
   iops        : min=13176, max=14054, avg=13659.62, stdev=213.30, samples=47
  lat (usec)   : 50=69.17%, 100=30.54%, 250=0.29%, 500=0.01%, 750=0.01%
  lat (usec)   : 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=11.47%, sys=18.47%, ctx=327683, majf=0, minf=37
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=327680,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=53.4MiB/s (56.0MB/s), 53.4MiB/s-53.4MiB/s (56.0MB/s-56.0MB/s), io=1280MiB (1342MB), run=23967-23967msec

Disk stats (read/write):
  vdb: ios=326530/0, merge=0/0, ticks=19143/0, in_queue=19039, util=79.37%
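
Result: with the fixed qemu-kvm-4.2.0 build, 4K random write and random read on the 1280 MiB data disk both reach about 66 MiB/s, while the same test on the 4.1.0 build without the fix drops to about 53 MiB/s, matching the expected impact of the undersized L2 cache.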

Comment 9 Ademar Reis 2020-02-05 23:03:20 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 12 errata-xmlrpc 2020-05-05 09:47:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2017