Bug 1897572 - QEMU Ceph driver performance issues - krbd is 2-3x faster than librados driver
Summary: QEMU Ceph driver performance issues - krbd is 2-3x faster than librados driver
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.3
Assignee: Stefano Garzarella
QA Contact: zixchen
URL:
Whiteboard:
Depends On: 1813960
Blocks:
 
Reported: 2020-11-13 13:48 UTC by David Hill
Modified: 2023-12-15 20:06 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-16 17:18:36 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5514611 0 None None None 2022-03-16 15:04:26 UTC

Description David Hill 2020-11-13 13:48:46 UTC
What problem/issue/behavior are you having trouble with?  What do you expect to see?
Hi Red Hat Team,

We are looking into methods of improving our performance for database instances on Ceph. 
During our initial testing with this change in place, our test instances on Ceph saw a dramatic improvement in performance. The FIO profile we used is here [2], our baseline is here [3], and the run using krbd is here [4]. Bear in mind that we are not using LUKS encryption and have no plans to encrypt volumes. But the performance difference between the scenarios I presented is very dramatic. It's essentially the difference for our business to determine if we need a SAN based solution or if Ceph could perform to the level we need it to. Clearly from the results in [4], it can...

Th

Thanks so much,


[2]
[global]
bs=${BLKSIZE}
ioengine=libaio
iodepth=${IODEPTH}
norandommap
direct=1
time_based=1
numjobs=${NUMJOBS}
runtime=${RUNTIME}
filename=${DEVICE}
group_reporting
#cpus_allowed=${CPUS_ALLOWED}
cpus_allowed_policy=split
ramp_time=30

[seq-write]
rw=write
stonewall

[rand-write]
rw=randwrite
stonewall

[seq-read]
rw=read
stonewall

[rand-read]
rw=randread
stonewall

[rand-mix_80r_20w]
rw=randrw
rwmixwrite=20
stonewall

[rand-mix_50r_50w]
rw=randrw
rwmixwrite=50
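
For readers who want to reproduce a run, the profile in [2] takes its parameters from environment variables, which fio expands in job files. The following is a minimal sketch of how it might be launched. The job file name `ceph-profile.fio` is hypothetical, and the values (4k blocks, iodepth 32, one job, 120 seconds, `/dev/vdb`) are inferred from the output in [3]; they are not stated by the reporter. Note that the write jobs overwrite whatever is on the target device.

```python
import os


def build_fio_invocation(job_file, device, blksize="4k", iodepth=32,
                         numjobs=1, runtime=120):
    """Return (argv, env) for running the profile in [2].

    fio substitutes ${VAR} references in the job file from the environment,
    so every parameter is passed as an environment variable.
    Defaults are assumptions inferred from the fio output in [3].
    """
    env = dict(os.environ)
    env.update({
        "BLKSIZE": str(blksize),
        "IODEPTH": str(iodepth),
        "NUMJOBS": str(numjobs),
        "RUNTIME": str(runtime),   # seconds
        "DEVICE": device,          # WARNING: write jobs destroy data here
    })
    # CPUS_ALLOWED is commented out in the profile, so it is not set.
    return ["fio", job_file], env


# "ceph-profile.fio" is a hypothetical name for the job file shown in [2].
cmd, env = build_fio_invocation("ceph-profile.fio", "/dev/vdb")
print(" ".join(cmd))
# To actually run it (requires fio and a scratch device):
#   import subprocess; subprocess.run(cmd, env=env, check=True)
```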

[3]
seq-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-write: (g=1): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
seq-read: (g=2): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-read: (g=3): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_80r_20w: (g=4): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_50r_50w: (g=5): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.19
Starting 6 processes
Jobs: 1 (f=1): [_(5),m(1)][60.0%][r=41.9MiB/s,w=41.4MiB/s][r=10.7k,w=10.6k IOPS][eta 10m:00s]    
seq-write: (groupid=0, jobs=1): err= 0: pid=1574: Mon Oct 12 23:39:34 2020
  write: IOPS=38.2k, BW=149MiB/s (156MB/s)(17.5GiB/120002msec)
    slat (usec): min=3, max=1108, avg= 8.29, stdev= 2.30
    clat (usec): min=195, max=9198, avg=825.72, stdev=355.23
     lat (usec): min=207, max=9210, avg=834.86, stdev=355.41
    clat percentiles (usec):
     |  1.00th=[  375],  5.00th=[  416], 10.00th=[  429], 20.00th=[  474],
     | 30.00th=[  578], 40.00th=[  685], 50.00th=[  783], 60.00th=[  857],
     | 70.00th=[  963], 80.00th=[ 1123], 90.00th=[ 1336], 95.00th=[ 1483],
     | 99.00th=[ 1778], 99.50th=[ 1926], 99.90th=[ 2311], 99.95th=[ 2671],
     | 99.99th=[ 4948]
   bw (  KiB/s): min=130928, max=166504, per=100.00%, avg=153051.59, stdev=8805.20, samples=239
   iops        : min=32732, max=41626, avg=38262.88, stdev=2201.28, samples=239
  lat (usec)   : 250=0.01%, 500=22.55%, 750=24.06%, 1000=26.33%
  lat (msec)   : 2=26.71%, 4=0.33%, 10=0.01%
  cpu          : usr=16.90%, sys=45.37%, ctx=2877365, majf=0, minf=65
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,4584878,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-write: (groupid=1, jobs=1): err= 0: pid=1578: Mon Oct 12 23:39:34 2020
  write: IOPS=20.2k, BW=78.0MiB/s (82.8MB/s)(9478MiB/120001msec)
    slat (usec): min=3, max=923, avg= 8.08, stdev= 2.95
    clat (usec): min=260, max=10514, avg=1571.15, stdev=234.25
     lat (usec): min=265, max=10522, avg=1580.04, stdev=234.02
    clat percentiles (usec):
     |  1.00th=[ 1004],  5.00th=[ 1221], 10.00th=[ 1319], 20.00th=[ 1418],
     | 30.00th=[ 1467], 40.00th=[ 1516], 50.00th=[ 1565], 60.00th=[ 1614],
     | 70.00th=[ 1663], 80.00th=[ 1729], 90.00th=[ 1827], 95.00th=[ 1926],
     | 99.00th=[ 2180], 99.50th=[ 2311], 99.90th=[ 2704], 99.95th=[ 3097],
     | 99.99th=[ 5342]
   bw (  KiB/s): min=70538, max=83976, per=100.00%, avg=81007.11, stdev=1967.83, samples=239
   iops        : min=17634, max=20994, avg=20251.75, stdev=491.97, samples=239
  lat (usec)   : 500=0.07%, 750=0.19%, 1000=0.71%
  lat (msec)   : 2=95.92%, 4=3.08%, 10=0.02%, 20=0.01%
  cpu          : usr=9.09%, sys=22.70%, ctx=1281415, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2426274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
seq-read: (groupid=2, jobs=1): err= 0: pid=1581: Mon Oct 12 23:39:34 2020
  read: IOPS=7509, BW=29.3MiB/s (30.8MB/s)(3520MiB/120005msec)
    slat (nsec): min=3583, max=97226, avg=8003.26, stdev=1005.53
    clat (usec): min=255, max=9669, avg=4249.73, stdev=752.70
     lat (usec): min=264, max=9677, avg=4258.50, stdev=752.73
    clat percentiles (usec):
     |  1.00th=[ 1237],  5.00th=[ 2802], 10.00th=[ 3490], 20.00th=[ 3884],
     | 30.00th=[ 4080], 40.00th=[ 4228], 50.00th=[ 4359], 60.00th=[ 4490],
     | 70.00th=[ 4621], 80.00th=[ 4752], 90.00th=[ 5014], 95.00th=[ 5211],
     | 99.00th=[ 5538], 99.50th=[ 5669], 99.90th=[ 6259], 99.95th=[ 7177],
     | 99.99th=[ 8979]
   bw (  KiB/s): min=27920, max=38008, per=100.00%, avg=30079.14, stdev=1261.83, samples=239
   iops        : min= 6980, max= 9502, avg=7519.78, stdev=315.46, samples=239
  lat (usec)   : 500=0.07%, 750=0.32%, 1000=0.32%
  lat (msec)   : 2=1.28%, 4=24.14%, 10=73.87%
  cpu          : usr=3.50%, sys=9.54%, ctx=883627, majf=0, minf=66
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=901208,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-read: (groupid=3, jobs=1): err= 0: pid=1582: Mon Oct 12 23:39:34 2020
  read: IOPS=25.7k, BW=101MiB/s (105MB/s)(11.8GiB/120001msec)
    slat (usec): min=3, max=1636, avg= 7.98, stdev= 1.95
    clat (usec): min=238, max=19104, avg=1232.35, stdev=179.47
     lat (usec): min=246, max=19109, avg=1241.09, stdev=179.46
    clat percentiles (usec):
     |  1.00th=[ 1057],  5.00th=[ 1090], 10.00th=[ 1106], 20.00th=[ 1139],
     | 30.00th=[ 1139], 40.00th=[ 1156], 50.00th=[ 1172], 60.00th=[ 1188],
     | 70.00th=[ 1221], 80.00th=[ 1287], 90.00th=[ 1549], 95.00th=[ 1614],
     | 99.00th=[ 1729], 99.50th=[ 1778], 99.90th=[ 1991], 99.95th=[ 2704],
     | 99.99th=[ 5276]
   bw (  KiB/s): min=75864, max=112136, per=100.00%, avg=103059.16, stdev=10479.82, samples=239
   iops        : min=18966, max=28034, avg=25764.80, stdev=2619.96, samples=239
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.02%, 1000=0.27%
  lat (msec)   : 2=99.59%, 4=0.08%, 10=0.02%, 20=0.01%
  cpu          : usr=11.52%, sys=32.48%, ctx=3077788, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=3087819,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_80r_20w: (groupid=4, jobs=1): err= 0: pid=1583: Mon Oct 12 23:39:34 2020
  read: IOPS=18.9k, BW=73.8MiB/s (77.4MB/s)(8857MiB/120001msec)
    slat (usec): min=3, max=1389, avg= 7.92, stdev= 3.05
    clat (usec): min=301, max=9173, avg=1409.03, stdev=250.82
     lat (usec): min=315, max=9184, avg=1417.74, stdev=250.78
    clat percentiles (usec):
     |  1.00th=[  873],  5.00th=[ 1045], 10.00th=[ 1139], 20.00th=[ 1221],
     | 30.00th=[ 1287], 40.00th=[ 1336], 50.00th=[ 1385], 60.00th=[ 1434],
     | 70.00th=[ 1500], 80.00th=[ 1582], 90.00th=[ 1713], 95.00th=[ 1844],
     | 99.00th=[ 2089], 99.50th=[ 2180], 99.90th=[ 2606], 99.95th=[ 3163],
     | 99.99th=[ 5145]
   bw (  KiB/s): min=56000, max=81928, per=100.00%, avg=75685.10, stdev=6442.14, samples=239
   iops        : min=14000, max=20482, avg=18921.27, stdev=1610.52, samples=239
  write: IOPS=4728, BW=18.5MiB/s (19.4MB/s)(2216MiB/120001msec)
    slat (usec): min=3, max=740, avg= 8.43, stdev= 2.06
    clat (usec): min=52, max=5578, avg=1080.57, stdev=232.26
     lat (usec): min=62, max=5588, avg=1089.77, stdev=232.17
    clat percentiles (usec):
     |  1.00th=[  586],  5.00th=[  734], 10.00th=[  816], 20.00th=[  906],
     | 30.00th=[  963], 40.00th=[ 1012], 50.00th=[ 1057], 60.00th=[ 1106],
     | 70.00th=[ 1172], 80.00th=[ 1237], 90.00th=[ 1369], 95.00th=[ 1500],
     | 99.00th=[ 1729], 99.50th=[ 1795], 99.90th=[ 2008], 99.95th=[ 2311],
     | 99.99th=[ 4080]
   bw (  KiB/s): min=13560, max=21024, per=100.00%, avg=18940.05, stdev=1642.92, samples=239
   iops        : min= 3390, max= 5256, avg=4734.98, stdev=410.72, samples=239
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.09%, 750=1.28%, 1000=8.64%
  lat (msec)   : 2=88.49%, 4=1.47%, 10=0.02%
  cpu          : usr=10.13%, sys=28.72%, ctx=2232064, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2267278,567373,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_50r_50w: (groupid=5, jobs=1): err= 0: pid=1585: Mon Oct 12 23:39:34 2020
  read: IOPS=10.5k, BW=41.1MiB/s (43.1MB/s)(4931MiB/120001msec)
    slat (usec): min=3, max=1324, avg= 7.85, stdev= 2.77
    clat (usec): min=326, max=31820, avg=1668.01, stdev=283.46
     lat (usec): min=331, max=31829, avg=1676.65, stdev=283.33
    clat percentiles (usec):
     |  1.00th=[ 1123],  5.00th=[ 1287], 10.00th=[ 1369], 20.00th=[ 1450],
     | 30.00th=[ 1516], 40.00th=[ 1582], 50.00th=[ 1631], 60.00th=[ 1696],
     | 70.00th=[ 1778], 80.00th=[ 1876], 90.00th=[ 2024], 95.00th=[ 2147],
     | 99.00th=[ 2376], 99.50th=[ 2474], 99.90th=[ 3458], 99.95th=[ 4293],
     | 99.99th=[ 6390]
   bw (  KiB/s): min=33520, max=46720, per=100.00%, avg=42158.05, stdev=3281.20, samples=239
   iops        : min= 8380, max=11680, avg=10539.50, stdev=820.29, samples=239
  write: IOPS=10.5k, BW=41.1MiB/s (43.1MB/s)(4932MiB/120001msec)
    slat (usec): min=3, max=1250, avg= 8.33, stdev= 2.90
    clat (usec): min=49, max=6268, avg=1350.57, stdev=264.40
     lat (usec): min=57, max=6277, avg=1359.69, stdev=264.20
    clat percentiles (usec):
     |  1.00th=[  791],  5.00th=[  963], 10.00th=[ 1057], 20.00th=[ 1139],
     | 30.00th=[ 1205], 40.00th=[ 1270], 50.00th=[ 1319], 60.00th=[ 1385],
     | 70.00th=[ 1467], 80.00th=[ 1565], 90.00th=[ 1696], 95.00th=[ 1795],
     | 99.00th=[ 2008], 99.50th=[ 2114], 99.90th=[ 2540], 99.95th=[ 3163],
     | 99.99th=[ 4883]
   bw (  KiB/s): min=33192, max=46440, per=100.00%, avg=42167.66, stdev=3284.10, samples=239
   iops        : min= 8298, max=11610, avg=10541.90, stdev=821.02, samples=239
  lat (usec)   : 50=0.01%, 100=0.01%, 250=0.01%, 500=0.06%, 750=0.30%
  lat (usec)   : 1000=3.17%
  lat (msec)   : 2=90.36%, 4=6.07%, 10=0.04%, 20=0.01%, 50=0.01%
  cpu          : usr=8.92%, sys=24.62%, ctx=1594184, majf=0, minf=63
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=1262390,1262650,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=149MiB/s (156MB/s), 149MiB/s-149MiB/s (156MB/s-156MB/s), io=17.5GiB (18.8GB), run=120002-120002msec

Run status group 1 (all jobs):
  WRITE: bw=78.0MiB/s (82.8MB/s), 78.0MiB/s-78.0MiB/s (82.8MB/s-82.8MB/s), io=9478MiB (9938MB), run=120001-120001msec

Run status group 2 (all jobs):
   READ: bw=29.3MiB/s (30.8MB/s), 29.3MiB/s-29.3MiB/s (30.8MB/s-30.8MB/s), io=3520MiB (3691MB), run=120005-120005msec

Run status group 3 (all jobs):
   READ: bw=101MiB/s (105MB/s), 101MiB/s-101MiB/s (105MB/s-105MB/s), io=11.8GiB (12.6GB), run=120001-120001msec

Run status group 4 (all jobs):
   READ: bw=73.8MiB/s (77.4MB/s), 73.8MiB/s-73.8MiB/s (77.4MB/s-77.4MB/s), io=8857MiB (9287MB), run=120001-120001msec
  WRITE: bw=18.5MiB/s (19.4MB/s), 18.5MiB/s-18.5MiB/s (19.4MB/s-19.4MB/s), io=2216MiB (2324MB), run=120001-120001msec

Run status group 5 (all jobs):
   READ: bw=41.1MiB/s (43.1MB/s), 41.1MiB/s-41.1MiB/s (43.1MB/s-43.1MB/s), io=4931MiB (5171MB), run=120001-120001msec
  WRITE: bw=41.1MiB/s (43.1MB/s), 41.1MiB/s-41.1MiB/s (43.1MB/s-43.1MB/s), io=4932MiB (5172MB), run=120001-120001msec

Disk stats (read/write):
  vdb: ios=9423940/11049668, merge=0/0, ticks=16094933/12019245, in_queue=28109168, util=100.00%
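
One way to read the baseline numbers in [3]: with a fixed queue depth, throughput is bounded by per-request latency (Little's law, IOPS ≈ iodepth / mean latency). The sketch below is an illustrative cross-check, not part of the original report; it uses the mean `lat` and `iops avg` values copied from the single-direction jobs above.

```python
IODEPTH = 32  # from the fio job banner in [3]

# job name -> (mean total latency in usec, reported average IOPS), from [3]
baseline = {
    "seq-write": (834.86, 38262.88),
    "rand-write": (1580.04, 20251.75),
    "seq-read": (4258.50, 7519.78),
    "rand-read": (1241.09, 25764.80),
}


def predicted_iops(iodepth, lat_usec):
    """Little's law: requests in flight divided by mean time per request."""
    return iodepth / (lat_usec * 1e-6)


for job, (lat_usec, measured) in baseline.items():
    pred = predicted_iops(IODEPTH, lat_usec)
    print(f"{job:10s} predicted={pred:8.0f} measured={measured:8.0f}")
```

The predictions land within about one percent of the measured values, which suggests the baseline is latency-bound at this queue depth rather than limited by bandwidth.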

[4]
seq-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-write: (g=1): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
seq-read: (g=2): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-read: (g=3): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_80r_20w: (g=4): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_50r_50w: (g=5): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.19
Starting 6 processes
Jobs: 1 (f=1): [_(5),m(1)][60.0%][r=93.6MiB/s,w=93.8MiB/s][r=23.0k,w=24.0k IOPS][eta 10m:00s]    
seq-write: (groupid=0, jobs=1): err= 0: pid=1357: Mon Oct 12 23:54:45 2020
  write: IOPS=41.9k, BW=164MiB/s (171MB/s)(19.2GiB/120002msec)
    slat (usec): min=3, max=2278, avg= 7.24, stdev= 3.33
    clat (usec): min=341, max=17084, avg=753.49, stdev=407.17
     lat (usec): min=364, max=17091, avg=761.62, stdev=407.00
    clat percentiles (usec):
     |  1.00th=[  461],  5.00th=[  510], 10.00th=[  537], 20.00th=[  586],
     | 30.00th=[  627], 40.00th=[  660], 50.00th=[  693], 60.00th=[  734],
     | 70.00th=[  783], 80.00th=[  840], 90.00th=[  955], 95.00th=[ 1074],
     | 99.00th=[ 1713], 99.50th=[ 3064], 99.90th=[ 7177], 99.95th=[ 7898],
     | 99.99th=[ 9241]
   bw (  KiB/s): min=141472, max=183752, per=100.00%, avg=167682.66, stdev=5513.07, samples=239
   iops        : min=35368, max=45938, avg=41920.54, stdev=1378.32, samples=239
  lat (usec)   : 500=4.07%, 750=59.99%, 1000=28.31%
  lat (msec)   : 2=6.82%, 4=0.44%, 10=0.37%, 20=0.01%
  cpu          : usr=15.67%, sys=39.65%, ctx=759242, majf=0, minf=65
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,5024393,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-write: (groupid=1, jobs=1): err= 0: pid=1360: Mon Oct 12 23:54:45 2020
  write: IOPS=36.7k, BW=143MiB/s (150MB/s)(16.8GiB/120002msec)
    slat (usec): min=3, max=501, avg= 7.41, stdev= 3.89
    clat (usec): min=259, max=18125, avg=860.51, stdev=270.08
     lat (usec): min=450, max=18134, avg=868.81, stdev=269.80
    clat percentiles (usec):
     |  1.00th=[  570],  5.00th=[  619], 10.00th=[  652], 20.00th=[  701],
     | 30.00th=[  742], 40.00th=[  783], 50.00th=[  824], 60.00th=[  865],
     | 70.00th=[  914], 80.00th=[  988], 90.00th=[ 1090], 95.00th=[ 1188],
     | 99.00th=[ 1467], 99.50th=[ 1778], 99.90th=[ 4424], 99.95th=[ 5407],
     | 99.99th=[ 8029]
   bw (  KiB/s): min=135257, max=154552, per=100.00%, avg=147026.83, stdev=3200.22, samples=239
   iops        : min=33814, max=38638, avg=36756.55, stdev=800.09, samples=239
  lat (usec)   : 500=0.01%, 750=32.28%, 1000=49.36%
  lat (msec)   : 2=17.97%, 4=0.27%, 10=0.11%, 20=0.01%
  cpu          : usr=14.25%, sys=36.10%, ctx=687581, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,4405511,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
seq-read: (groupid=2, jobs=1): err= 0: pid=1366: Mon Oct 12 23:54:45 2020
  read: IOPS=65.7k, BW=256MiB/s (269MB/s)(30.1GiB/120001msec)
    slat (usec): min=3, max=273, avg= 7.45, stdev= 3.30
    clat (usec): min=88, max=19869, avg=476.26, stdev=165.47
     lat (usec): min=97, max=19876, avg=484.70, stdev=165.23
    clat percentiles (usec):
     |  1.00th=[  245],  5.00th=[  302], 10.00th=[  330], 20.00th=[  371],
     | 30.00th=[  400], 40.00th=[  429], 50.00th=[  453], 60.00th=[  482],
     | 70.00th=[  519], 80.00th=[  562], 90.00th=[  635], 95.00th=[  717],
     | 99.00th=[  906], 99.50th=[ 1090], 99.90th=[ 1795], 99.95th=[ 2442],
     | 99.99th=[ 4621]
   bw (  KiB/s): min=229632, max=303176, per=100.00%, avg=263067.15, stdev=16359.76, samples=239
   iops        : min=57408, max=75794, avg=65766.77, stdev=4089.87, samples=239
  lat (usec)   : 100=0.01%, 250=1.19%, 500=64.25%, 750=30.98%, 1000=2.94%
  lat (msec)   : 2=0.57%, 4=0.05%, 10=0.02%, 20=0.01%
  cpu          : usr=22.44%, sys=63.32%, ctx=359042, majf=0, minf=66
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=7878968,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-read: (groupid=3, jobs=1): err= 0: pid=1367: Mon Oct 12 23:54:45 2020
  read: IOPS=65.1k, BW=254MiB/s (267MB/s)(29.8GiB/120001msec)
    slat (usec): min=3, max=243, avg= 7.24, stdev= 3.43
    clat (usec): min=104, max=21925, avg=480.32, stdev=166.51
     lat (usec): min=114, max=21929, avg=488.59, stdev=166.37
    clat percentiles (usec):
     |  1.00th=[  227],  5.00th=[  302], 10.00th=[  334], 20.00th=[  375],
     | 30.00th=[  400], 40.00th=[  424], 50.00th=[  453], 60.00th=[  486],
     | 70.00th=[  529], 80.00th=[  586], 90.00th=[  660], 95.00th=[  725],
     | 99.00th=[  865], 99.50th=[  914], 99.90th=[ 1205], 99.95th=[ 1860],
     | 99.99th=[ 5014]
   bw (  KiB/s): min=236582, max=272902, per=100.00%, avg=260848.53, stdev=7181.96, samples=239
   iops        : min=59145, max=68225, avg=65212.05, stdev=1795.50, samples=239
  lat (usec)   : 250=1.71%, 500=62.18%, 750=32.23%, 1000=3.68%
  lat (msec)   : 2=0.16%, 4=0.02%, 10=0.02%, 20=0.01%, 50=0.01%
  cpu          : usr=22.95%, sys=62.21%, ctx=258569, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=7814294,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_80r_20w: (groupid=4, jobs=1): err= 0: pid=1368: Mon Oct 12 23:54:45 2020
  read: IOPS=46.6k, BW=182MiB/s (191MB/s)(21.3GiB/120001msec)
    slat (usec): min=3, max=233, avg= 6.76, stdev= 3.05
    clat (usec): min=105, max=20896, avg=465.69, stdev=232.91
     lat (usec): min=115, max=20902, avg=473.38, stdev=232.77
    clat percentiles (usec):
     |  1.00th=[  206],  5.00th=[  265], 10.00th=[  297], 20.00th=[  338],
     | 30.00th=[  371], 40.00th=[  404], 50.00th=[  441], 60.00th=[  482],
     | 70.00th=[  529], 80.00th=[  578], 90.00th=[  652], 95.00th=[  709],
     | 99.00th=[  840], 99.50th=[  906], 99.90th=[ 2704], 99.95th=[ 4621],
     | 99.99th=[ 9241]
   bw (  KiB/s): min=167888, max=201928, per=100.00%, avg=186459.86, stdev=5784.32, samples=239
   iops        : min=41972, max=50482, avg=46614.92, stdev=1446.11, samples=239
  write: IOPS=11.6k, BW=45.5MiB/s (47.7MB/s)(5458MiB/120001msec)
    slat (usec): min=3, max=172, avg= 7.53, stdev= 3.18
    clat (usec): min=456, max=20452, avg=833.32, stdev=260.92
     lat (usec): min=466, max=20464, avg=841.78, stdev=260.80
    clat percentiles (usec):
     |  1.00th=[  578],  5.00th=[  627], 10.00th=[  660], 20.00th=[  701],
     | 30.00th=[  742], 40.00th=[  766], 50.00th=[  799], 60.00th=[  840],
     | 70.00th=[  881], 80.00th=[  930], 90.00th=[ 1012], 95.00th=[ 1090],
     | 99.00th=[ 1434], 99.50th=[ 1713], 99.90th=[ 4146], 99.95th=[ 5473],
     | 99.99th=[ 9372]
   bw (  KiB/s): min=42368, max=51216, per=100.00%, avg=46631.67, stdev=1499.50, samples=239
   iops        : min=10592, max=12804, avg=11657.83, stdev=374.89, samples=239
  lat (usec)   : 250=2.95%, 500=48.16%, 750=33.19%, 1000=13.26%
  lat (msec)   : 2=2.28%, 4=0.09%, 10=0.07%, 20=0.01%, 50=0.01%
  cpu          : usr=20.93%, sys=53.68%, ctx=691490, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=5586486,1397204,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_50r_50w: (groupid=5, jobs=1): err= 0: pid=1370: Mon Oct 12 23:54:45 2020
  read: IOPS=24.0k, BW=97.6MiB/s (102MB/s)(11.4GiB/120002msec)
    slat (usec): min=3, max=286, avg= 6.75, stdev= 2.92
    clat (usec): min=115, max=21893, avg=434.83, stdev=233.57
     lat (usec): min=123, max=21904, avg=442.47, stdev=233.44
    clat percentiles (usec):
     |  1.00th=[  196],  5.00th=[  235], 10.00th=[  262], 20.00th=[  302],
     | 30.00th=[  343], 40.00th=[  379], 50.00th=[  416], 60.00th=[  453],
     | 70.00th=[  498], 80.00th=[  545], 90.00th=[  611], 95.00th=[  676],
     | 99.00th=[  807], 99.50th=[  881], 99.90th=[ 3097], 99.95th=[ 4686],
     | 99.99th=[ 8225]
   bw (  KiB/s): min=87232, max=107192, per=100.00%, avg=100127.59, stdev=2558.02, samples=239
   iops        : min=21808, max=26798, avg=25031.86, stdev=639.50, samples=239
  write: IOPS=24.0k, BW=97.6MiB/s (102MB/s)(11.4GiB/120002msec)
    slat (usec): min=3, max=517, avg= 7.25, stdev= 3.00
    clat (usec): min=453, max=21344, avg=824.22, stdev=276.74
     lat (usec): min=462, max=21351, avg=832.36, stdev=276.62
    clat percentiles (usec):
     |  1.00th=[  570],  5.00th=[  619], 10.00th=[  652], 20.00th=[  693],
     | 30.00th=[  725], 40.00th=[  750], 50.00th=[  791], 60.00th=[  824],
     | 70.00th=[  865], 80.00th=[  922], 90.00th=[ 1004], 95.00th=[ 1074],
     | 99.00th=[ 1483], 99.50th=[ 1827], 99.90th=[ 4621], 99.95th=[ 5669],
     | 99.99th=[ 9241]
   bw (  KiB/s): min=85112, max=107952, per=100.00%, avg=100117.97, stdev=2565.35, samples=239
   iops        : min=21278, max=26988, avg=25029.43, stdev=641.35, samples=239
  lat (usec)   : 250=3.85%, 500=31.62%, 750=33.00%, 1000=26.17%
  lat (msec)   : 2=5.08%, 4=0.18%, 10=0.09%, 20=0.01%, 50=0.01%
  cpu          : usr=19.05%, sys=46.56%, ctx=1002633, majf=0, minf=63
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2999565,2999201,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=164MiB/s (171MB/s), 164MiB/s-164MiB/s (171MB/s-171MB/s), io=19.2GiB (20.6GB), run=120002-120002msec

Run status group 1 (all jobs):
  WRITE: bw=143MiB/s (150MB/s), 143MiB/s-143MiB/s (150MB/s-150MB/s), io=16.8GiB (18.0GB), run=120002-120002msec

Run status group 2 (all jobs):
   READ: bw=256MiB/s (269MB/s), 256MiB/s-256MiB/s (269MB/s-269MB/s), io=30.1GiB (32.3GB), run=120001-120001msec

Run status group 3 (all jobs):
   READ: bw=254MiB/s (267MB/s), 254MiB/s-254MiB/s (267MB/s-267MB/s), io=29.8GiB (32.0GB), run=120001-120001msec

Run status group 4 (all jobs):
   READ: bw=182MiB/s (191MB/s), 182MiB/s-182MiB/s (191MB/s-191MB/s), io=21.3GiB (22.9GB), run=120001-120001msec
  WRITE: bw=45.5MiB/s (47.7MB/s), 45.5MiB/s-45.5MiB/s (47.7MB/s-47.7MB/s), io=5458MiB (5723MB), run=120001-120001msec

Run status group 5 (all jobs):
   READ: bw=97.6MiB/s (102MB/s), 97.6MiB/s-97.6MiB/s (102MB/s-102MB/s), io=11.4GiB (12.3GB), run=120002-120002msec
  WRITE: bw=97.6MiB/s (102MB/s), 97.6MiB/s-97.6MiB/s (102MB/s-102MB/s), io=11.4GiB (12.3GB), run=120002-120002msec

Disk stats (read/write):
  vdc: ios=30367968/16981945, merge=0/0, ticks=11543664/13226502, in_queue=24752840, util=100.00%
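
To put the two runs side by side, the sketch below computes the krbd-to-baseline IOPS ratio per job. It is only arithmetic on the `iops avg` values copied from [3] and [4]; the dictionary keys are labels chosen here for illustration. The `avg` values are used rather than the rounded `IOPS=` headline figures, which are slightly inconsistent with the issued I/O counts in a few places.

```python
# Average IOPS per job, copied from the "iops : ... avg=" lines.
baseline_iops = {            # run [3]
    "seq-write": 38262.88,
    "rand-write": 20251.75,
    "seq-read": 7519.78,
    "rand-read": 25764.80,
    "mix80-read": 18921.27,
    "mix80-write": 4734.98,
    "mix50-read": 10539.50,
    "mix50-write": 10541.90,
}
krbd_iops = {                # run [4]
    "seq-write": 41920.54,
    "rand-write": 36756.55,
    "seq-read": 65766.77,
    "rand-read": 65212.05,
    "mix80-read": 46614.92,
    "mix80-write": 11657.83,
    "mix50-read": 25031.86,
    "mix50-write": 25029.43,
}

# Ratio > 1 means krbd delivered more IOPS than the baseline for that job.
speedup = {job: krbd_iops[job] / baseline_iops[job] for job in baseline_iops}

for job, ratio in speedup.items():
    print(f"{job:12s} {ratio:5.2f}x")
```

On these figures, sequential write is nearly unchanged, most random and mixed workloads improve by roughly 1.8x to 2.5x, and sequential read is the outlier at close to 9x.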



Where are you experiencing the behavior? What environment?
Low performance from Ceph

When does the behavior occur? Frequency? Repeatedly? At certain times?
Repeatedly

What information can you provide around timeframes and the business impact?
Business may need to move off of Ceph due to slow performance

Comment 5 zixchen 2020-11-19 03:44:52 UTC
Please confirm whether this bug is a duplicate of Bug 1744525.
Thanks.

Comment 6 zixchen 2020-11-19 03:55:08 UTC
Hi Daniel,
Could you please confirm whether this bug is a duplicate of Bug 1744525?

Comment 7 Daniel Berrangé 2020-11-19 09:21:39 UTC
I think they are probably different, but I'll leave it up to the QEMU RBD maintainer to decide that.

Comment 8 David Hill 2020-11-19 15:29:21 UTC
No, they are not the same, from what I can see. The customer is complaining that a volume attached with krbd performs better than volumes attached with librados.

Comment 47 Klaus Heinrich Kiwi 2021-06-16 17:18:36 UTC
I think it's safe to close this bug for now, as the librbd package works and performance seems to be as expected.

