Bug 1150403 - iops_max and iops can not work as expected
Summary: iops_max and iops can not work as expected
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.1
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Assignee: Stefan Hajnoczi
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1146755 1247688 1247830
 
Reported: 2014-10-08 08:09 UTC by Jun Li
Modified: 2015-09-15 13:29 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1247688 (view as bug list)
Environment:
Last Closed: 2015-07-29 13:47:45 UTC
Target Upstream Version:
Embargoed:



Description Jun Li 2014-10-08 08:09:06 UTC
Description of problem:
When the guest is booted with "iops_max=10000,iops=100", throttling is not triggered as expected once the "I/O pool" is empty.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.0-5.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the following command line:
# /usr/libexec/qemu-kvm -M pc -m 2G -smp 2 -spice port=5931,disable-ticketing -monitor stdio -qmp tcp::8888,server,nowait \
-drive file=/home/RHEL-Server-7.0-64-virtio.qcow2,if=none,id=img,snapshot=on,bps_max=1024000000 \
-device virtio-blk-pci,drive=img,id=sys-img \
-netdev tap,id=tap0,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=tap0,id=net0,mac=24:be:05:11:15:11 \
-drive file=/home/data.qcow2,if=none,id=disk0,iops_max=10000,iops=100,cache=none \
-device virtio-blk-pci,drive=disk0,id=sys-disk0
2. Run fio inside the guest:
# fio --filename=/dev/vdb --direct=1 --rw=randwrite --bs=64k --size=1000M --name=test --iodepth=1 --ioengine=libaio --runtime=60s
3. Repeat step 2 after it finishes.

Actual results:
### After step 2, fio reports the following:
...
  write: io=994.77MB, bw=16976KB/s, iops=265, runt= 60005msec
...
Here, 265 * 60 = 15900 > 10000 (the iops_max value).

### After step 3, fio reports the following:
...
  write: io=567040KB, bw=9450.4KB/s, iops=147, runt= 60002msec
...

Expected results:
After step 3, the measured iops should be <= 100, not "iops=147".

Additional info:

Comment 3 Gu Nini 2015-03-12 07:56:02 UTC
The bug also occurs under POWER KVM; the detailed software versions are as follows:

Host kernel: 3.10.0-229.el7.ppc64
Qemu KVM: 
qemu-kvm-common-rhev-2.2.0-5.el7.ppc64
qemu-kvm-tools-rhev-2.2.0-5.el7.ppc64
qemu-img-rhev-2.2.0-5.el7.ppc64
qemu-kvm-rhev-2.2.0-5.el7.ppc64
qemu-kvm-rhev-debuginfo-2.2.0-5.el7.ppc64

That is, the combined option pairs iops/iops_max, bps/bps_max, iops_rd/iops_rd_max, iops_wr/iops_wr_max, bps_rd/bps_rd_max, and bps_wr/bps_wr_max do not work as expected, as described in this bug.

I have also changed the 'Hardware' field to 'All', since the issue exists on both x86_64 and ppc64.

Comment 5 Stefan Hajnoczi 2015-07-29 13:47:45 UTC
(In reply to Jun Li from comment #0)
> Description of problem:
> When boot guest with "iops_max=10000,iops=100", throttling can not be
> triggered once "I/O pool" is empty.
...
> 2.do testing inside guest using fio.
> # fio --filename=/dev/vdb --direct=1 --rw=randwrite --bs=64k --size=1000M
> --name=test --iodepth=1 --ioengine=libaio --runtime=60s
> 3.repeat step 2 after step 2 finished.
> 
> Actual results:
> ###After step 2, got results as followings:
> ...
>   write: io=994.77MB, bw=16976KB/s, iops=265, runt= 60005msec
> ...
> Here: 265*60 > 10000.
> 
> ###After step 3, got following results:
> ...
>   write: io=567040KB, bw=9450.4KB/s, iops=147, runt= 60002msec
> ...
> 
> Expected results:
> After step 3, should got the value of iops <= 100. Not "iops=147". 

The iops_max setting is the "burst" capacity that a guest is allowed to use before being affected by I/O throttling.

It allows the guest to temporarily exceed the iops limit, but then the guest needs to wait for time to pass before it can issue more requests.  At some point the guest is able to make use of the burst capacity again.

The results reported here look plausible and correct.  The configuration is extreme: the burst capacity is 100x the iops limit, which means the guest will not be limited solely by the iops limit.
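A quick back-of-the-envelope check supports this (a sketch of the burst-plus-limit accounting described above; the model and variable names are mine, not QEMU's):

```python
# Rough upper bound on what a greedy guest can do in one 60 s fio run
# with iops=100,iops_max=10000, assuming the burst is spent once and the
# sustained limit accrues for the whole run.
iops_limit = 100     # sustained limit (requests/sec)
iops_max = 10000     # one-time burst allowance (requests)
runtime_s = 60

max_requests = iops_max + iops_limit * runtime_s   # 16000 requests
max_avg_iops = max_requests / runtime_s            # ~266.7 average iops

# The 265 iops reported in comment 0 sits just under this bound, so the
# measurement is consistent with throttling working as designed.
```
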

Comment 6 Xiaoqing Wei 2015-08-10 07:58:05 UTC
Hi Stefan,

Is there a formula/rule to set the _max values ?

1) For example, if I set bps=102400 on -drive, what range of bps_max= is supported?
    As you mentioned in comment #5, 100x is extreme; I would like to know what value is reasonable to test.
2) And what happens when a guest exceeds its bps/iops limit?
    Does it burst I/O for a while and then get throttled for a while?


I'm curious because I found that with bps/bps_max set, the guest is also not limited solely by bps:



    -device scsi-hd,id=image2,drive=drive_image2 \
    -drive id=drive_image2,if=none,cache=none,snapshot=off,aio=native,format=raw,file="/dev/sdb",bps=102400,bps_max=10240000

--------------------------------

(qemu) info block
drive_image1: /home/staf-kvm-devel/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-6.7-64-virtio-scsi.qcow2 (qcow2)
    Cache mode:       writeback, direct

drive_image2: /dev/sdb (raw)
    Cache mode:       writeback, direct
    I/O throttling:   bps=102400 bps_rd=0 bps_wr=0 bps_max=10240000 bps_rd_max=0 bps_wr_max=0 iops=0 iops_rd=0 iops_wr=0 iops_max=0 iops_rd_max=0 iops_wr_max=0 iops_size=0


==================================


[root@dhcp-8-155 ~]# fio --filename=/dev/sdb --direct=1 --rw=read --bs=64k --size=1000M --name=test --iodepth=1 --ioengine=libaio --runtime=300
test: (g=0): rw=read, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=1
fio-2.0.13
Starting 1 process
Jobs: 1 (f=1): [R] [3.9% done] [127K/0K/0K /s] [1 /0 /0  iops] [eta 02h:03m:12s]
test: (groupid=0, jobs=1): err= 0: pid=2143: Mon Aug 10 14:33:08 2015
  read : io=40128KB, bw=136668 B/s, iops=2 , runt=300663msec
    slat (usec): min=14 , max=73 , avg=36.28, stdev= 8.32
    clat (msec): min=1 , max=680 , avg=479.48, stdev=277.02
     lat (msec): min=1 , max=680 , avg=479.52, stdev=277.02
    clat percentiles (usec):
     |  1.00th=[ 1576],  5.00th=[ 2544], 10.00th=[ 2544], 20.00th=[ 2576],
     | 30.00th=[618496], 40.00th=[626688], 50.00th=[643072], 60.00th=[643072],
     | 70.00th=[643072], 80.00th=[643072], 90.00th=[651264], 95.00th=[659456],
     | 99.00th=[667648], 99.50th=[675840], 99.90th=[684032], 99.95th=[684032],
     | 99.99th=[684032]
    bw (KB/s)  : min=   94, max= 9061, per=89.38%, avg=118.88, stdev=413.80
    lat (msec) : 2=1.44%, 4=23.44%, 50=0.16%, 100=0.16%, 750=74.80%
  cpu          : usr=0.00%, sys=0.01%, ctx=627, majf=0, minf=41
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=627/w=0/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=40128KB, aggrb=133KB/s, minb=133KB/s, maxb=133KB/s, mint=300663msec, maxt=300663msec

Disk stats (read/write):
  sdb: ios=626/0, merge=0/0, ticks=299950/0, in_queue=300396, util=100.00%




Thank you !
Xiaoqing Wei.

Comment 7 Stefan Hajnoczi 2015-08-11 14:54:04 UTC
(In reply to Xiaoqing Wei from comment #6)
> Is there a formula/rule to set the _max values ?
> 
> 1) like if I set bps=102400 to -drive, what range of bps_max= is supported ?
>     like you mentioned in C#5, 100x is extreme, I would like to know what
> value is reasonable to test.

A realistic range for the burst value is 0x to 4x the limit.

The rate at which the throttling algorithm recovers from a burst is determined by the limit value.  That means a huge burst value with a small limit will allow the guest to greatly exceed the limit, but the guest won't be able to use the full burst amount again for a long time.

bps=1M,bps_max=512K allows the guest to do more than 1 MB/s of I/O.  It can do up to 1.5 MB/s but then needs to recover for a while before 1.5 MB/s can be reached again.

An important thing to consider for testing is that fio will display a higher throughput than the limit value, since the burst is added on top of the limit.

> 2) and what would happen if a guest exceeded it's bps/iops etc limit ?
>     bursting IO for a while and banned for a while ?

If the guest consumes the burst amount it can still do more I/O up to the bps/iops limit.

If the burst + limit has been consumed, then requests are throttled.
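The behaviour described above can be sketched with a minimal leaky-bucket model (a simplification of mine with 1-second steps, not QEMU's util/throttle.c; the function name is made up):

```python
def simulate(bps, bps_max, demand_bps, seconds):
    """Bytes a greedy guest actually transfers in each 1 s step."""
    capacity = bps + bps_max   # new I/O is throttled once the level hits this
    level = 0.0                # recent I/O accounted in the bucket
    done = []
    for _ in range(seconds):
        level = max(0.0, level - bps)         # bucket leaks at the limit rate
        allowed = min(demand_bps, capacity - level)
        level += allowed
        done.append(allowed)
    return done

# bps=1M,bps_max=512K: ~1.5 MB in the first second, then 1 MB/s sustained
# until the guest goes idle long enough for the bucket to drain.
xfer = simulate(bps=1_000_000, bps_max=512_000, demand_bps=10_000_000, seconds=4)
```

With these numbers xfer comes out as [1512000, 1000000, 1000000, 1000000]: the guest gets the burst once, then falls back to the bps limit.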

Comment 8 Xiaoqing Wei 2015-08-12 05:43:55 UTC
(In reply to Stefan Hajnoczi from comment #7)
> ...


Thanks for the explanation, Stefan.

Hi Nini,

According to Stefan's explanation above, it seems the existing test cases need a revision, as I remember some cases use a 100x value.


Cheers,
Xiaoqing.

Comment 9 Gu Nini 2015-08-31 12:37:25 UTC
(In reply to Xiaoqing Wei from comment #8)
> 
> Hi Nini,
> 
> accord to Stefan's explain above, seems the existing test case would need a
> revise ? as I rem some case are using 100x time
> 
> 
> Cheers,
> Xiaoqing.

Xiaoqing,

Thanks a lot, I am checking it.


And Stefan,

'As you described above, the iops_max setting is the "burst" capacity that a guest is allowed to use before being affected by I/O throttling, it allows the guest to temporarily exceed the iops limit, but then the guest needs to wait for time to pass before it can issue more requests. At some point the guest is able to make use of the burst capacity again.'

My question is: at what point is the guest able to use the burst again? Can you describe it in more detail? Does it work in a cycle?

And regarding 'fio will display a higher throughput than the limit value, since the burst is added on top of the limit', which you said in comment 7: do you know in what way, and using what tools, we can test the 'FOO_max' function? In my test with 'bps=1048576,bps_max=2048000' and fio, the result was not clear.

Thanks in advance.

Nini Gu

Comment 10 Stefan Hajnoczi 2015-09-08 15:57:24 UTC
(In reply to Gu Nini from comment #9)
> ...
> 
> 'As you described above, the iops_max setting is the "burst" capacity that a
> guest is allowed to use before being affected by I/O throttling, it allows
> the guest to temporarily exceed the iops limit, but then the guest needs to
> wait for time to pass before it can issue more requests. At some point the
> guest is able to make use of the burst capacity again.'
> 
> My question is at what point the guest is able to used the burst again, can
> you describe in more detail, is it in a cycle?

The formula for throttling adds the burst and the limit value together.  If the total amount of recent I/O (called the "bucket level") exceeds this value, new requests are throttled.

The bucket level is reduced over time by looking only at the limit value.  This means that a high limit allows the guest to perform I/O again sooner.

The burst does not affect the rate at which the bucket level is reduced.

The logic for this is in util/throttle.c.  There are test cases in tests/test-throttle.c if you want to try different scenarios.

The burst can only be reused after the bucket level has reduced back below the FOO_max value.  This means the guest has to stop submitting new I/O for a while, so that the bucket level can become lower than FOO_max.

> And since 'fio will display a higher throughput than the limit value since
> the burst is added on top of the limit' since you said in comment 7, do you
> know in what way and using what tools we can test the 'FOO_max' function? In
> my test with 'bps=bps=1048576,bps_max=2048000' and fio, the test result is
> not clear.

The maximum is bps + bps_max bytes.  However, the guest needs to wait about 3 seconds afterwards to be able to send the next ~3 MB burst.  If the guest doesn't wait, then it won't be able to achieve 3 MB/s, because the bucket level will not reach 0 - so the effect of bps_max will be smaller.
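Using the same simplified bucket model (a sketch under my assumptions, not util/throttle.c itself), the effect of waiting can be made concrete for bps=1048576, bps_max=2048000:

```python
def available_after_idle(idle_s, bps, bps_max):
    """Bytes the guest may send at once after idling for idle_s seconds,
    starting from a completely full bucket (simplified model)."""
    capacity = bps + bps_max
    level = max(0.0, capacity - bps * idle_s)  # drains at the limit rate only
    return capacity - level

bps, bps_max = 1048576, 2048000
available_after_idle(1, bps, bps_max)   # ~1 MB: only the sustained share
available_after_idle(3, bps, bps_max)   # ~3 MB: the full burst is back
```
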

Comment 11 Gu Nini 2015-09-11 10:36:32 UTC
(In reply to Stefan Hajnoczi from comment #10)

Stefan,

Thanks for the detailed explanation; now I understand it. The problem in the original bug description is that it only counted the burst value, while in fact 'the formula for throttling adds the burst and the limit value together' and 'the max is bps + bps_max bytes'; and to clear the bucket pool and use the burst again, one should wait for enough time. So the way the test used to be done is wrong, and since it is also what we use in our current FOO_max tests, I will change it to the right way. Would you help to check whether the following bash script is correct enough to test a guest started with 'bps=1024000,bps_max=2048000'? If it is OK, I will change the test cases.


# cat IOthrottling_FOOmax.sh
#!/bin/bash

echo 'To full occupy the bucket pool:####'
/root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k --size=1000M --name=test --iodepth=1 --runtime=60

echo 'Check if io is throttled now:####'
/root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k --size=1000M --name=test --iodepth=1 --runtime=60

echo 'Wait for enough time to clear the io bucket:####'
sleep 240

echo 'To full occupy the bucket pool for the 2nd time:####'
/root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k --size=1000M --name=test --iodepth=1 --runtime=60

echo 'Check if io is throttled for the 2nd time:####'
/root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k --size=1000M --name=test --iodepth=1 --runtime=60


Thanks a lot!
Nini Gu

Comment 12 Stefan Hajnoczi 2015-09-15 13:29:53 UTC
(In reply to Gu Nini from comment #11)
> ...
> 
> 
> # cat IOthrottling_FOOmax.sh
> #!/bin/bash
> 
> echo 'To full occupy the bucket pool:####'
> /root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k
> --size=1000M --name=test --iodepth=1 --runtime=60
> 
> echo 'Check if io is throttled now:####'
> /root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k
> --size=1000M --name=test --iodepth=1 --runtime=60
> 
> echo 'Wait for enough time to clear the io bucket:####'
> sleep 240

sleep 240 is much longer than necessary; you can safely reduce it to sleep 5 or sleep 10.  Here is an explanation:

The rate at which the bucket empties is bps.  So with bps=1024000 the bucket level decreases by 1024000 bytes/sec.

The maximum bucket level is bps + bps_max so with bps_max=2048000 it is 3072000.

So it will take (bps + bps_max) / bps seconds to empty the bucket.  After 3 seconds the bucket should be 0 again.  To be on the safe side it's a good idea to round up to 5 seconds.
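The arithmetic above can be checked directly with the script's values (a trivial restatement of the formula, nothing more):

```python
# Drain-time check for the bps=1024000,bps_max=2048000 configuration
# used in the test script above.
bps = 1024000       # sustained limit; also the rate the bucket drains at
bps_max = 2048000   # burst allowance (bytes)

bucket_capacity = bps + bps_max        # 3,072,000 bytes when completely full
drain_seconds = bucket_capacity / bps  # 3.0 s for the bucket to empty
# rounding up to `sleep 5` leaves a comfortable safety margin
```
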

> echo 'To full occupy the bucket pool for the 2nd time:####'
> /root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k
> --size=1000M --name=test --iodepth=1 --runtime=60
> 
> echo 'Check if io is throttled for the 2nd time:####'
> /root/fio-2.1.10/fio --filename=/dev/sdb --direct=1 --rw=read --bs=4k
> --size=1000M --name=test --iodepth=1 --runtime=60

Yes, this looks good.

