Bug 1897572

Summary: QEMU Ceph driver performance issues - krbd is 2-3x faster than librados driver
Product: Red Hat Enterprise Linux Advanced Virtualization Reporter: David Hill <dhill>
Component: qemu-kvm Assignee: Stefano Garzarella <sgarzare>
qemu-kvm sub component: Ceph QA Contact: zixchen
Status: CLOSED NOTABUG Docs Contact:
Severity: medium    
Priority: medium CC: alolivei, areis, berrange, bshephar, chayang, coli, ealcaniz, eharney, gcharot, gveitmic, idryomov, jferlan, jinzhao, juzhang, kkiwi, mkasturi, moddi, pgrist, sgarzare, shtiwari, smooney, vcojot, virt-maint, xuwei, yama, zixchen
Version: --- Keywords: RFE, Triaged
Target Milestone: rc   
Target Release: 8.3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-06-16 17:18:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1813960    
Bug Blocks:    

Description David Hill 2020-11-13 13:48:46 UTC
What problem/issue/behavior are you having trouble with?  What do you expect to see?
Hi Red Hat Team,

We are looking into methods of improving our performance for database instances on Ceph. 
During our initial testing with this change in place, our test instances on Ceph saw a dramatic improvement in performance. The fio profile we used is in [2], our baseline results are in [3], and the results using krbd are in [4]. Bear in mind that we are not using LUKS encryption and have no plans to encrypt volumes. Still, the performance difference between the two scenarios is dramatic. It essentially determines whether our business needs a SAN-based solution or whether Ceph can perform at the level we need. Clearly, from the results in [4], it can...

Th

Thanks so much,


[2]
[global]
bs=${BLKSIZE}
ioengine=libaio
iodepth=${IODEPTH}
norandommap
direct=1
time_based=1
numjobs=${NUMJOBS}
runtime=${RUNTIME}
filename=${DEVICE}
group_reporting
#cpus_allowed=${CPUS_ALLOWED}
cpus_allowed_policy=split
ramp_time=30

[seq-write]
rw=write
stonewall

[rand-write]
rw=randwrite
stonewall

[seq-read]
rw=read
stonewall

[rand-read]
rw=randread
stonewall

[rand-mix_80r_20w]
rw=randrw
rwmixwrite=20
stonewall

[rand-mix_50r_50w]
rw=randrw
rwmixwrite=50
stonewall
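
The profile in [2] takes its parameters from environment variables, which fio expands when it reads ${NAME} in a job file. The sketch below shows one way the variables could be set before launching fio. The values mirror what the output in [3] reports (4096-byte blocks, iodepth=32, one job per group, roughly 120 seconds per job); the job file name ceph-profile.fio and the device path /dev/vdb are assumptions for illustration and do not come from the report.

```python
import os
import shlex


def build_fio_invocation(job_file, device, blksize="4k", iodepth=32,
                         numjobs=1, runtime=120):
    """Return (env, argv) for running the job file with its ${...} variables set.

    Nothing is executed here. The write jobs in the profile overwrite the
    target device, so only point DEVICE at a scratch volume.
    """
    env = dict(os.environ)
    env.update({
        "BLKSIZE": str(blksize),    # bs=${BLKSIZE}
        "IODEPTH": str(iodepth),    # iodepth=${IODEPTH}
        "NUMJOBS": str(numjobs),    # numjobs=${NUMJOBS}
        "RUNTIME": str(runtime),    # runtime=${RUNTIME}, in seconds
        "DEVICE": device,           # filename=${DEVICE}
    })
    argv = ["fio", job_file]
    return env, argv


# Hypothetical job file name and device path.
env, argv = build_fio_invocation("ceph-profile.fio", "/dev/vdb")

names = ("BLKSIZE", "IODEPTH", "NUMJOBS", "RUNTIME", "DEVICE")
assignments = " ".join(f"{n}={shlex.quote(env[n])}" for n in names)
print(assignments, " ".join(shlex.quote(a) for a in argv))
```

To actually run the benchmark, the pair could be passed to subprocess.run(argv, env=env, check=True) inside the guest.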

[3]
seq-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-write: (g=1): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
seq-read: (g=2): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-read: (g=3): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_80r_20w: (g=4): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_50r_50w: (g=5): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.19
Starting 6 processes
Jobs: 1 (f=1): [_(5),m(1)][60.0%][r=41.9MiB/s,w=41.4MiB/s][r=10.7k,w=10.6k IOPS][eta 10m:00s]    
seq-write: (groupid=0, jobs=1): err= 0: pid=1574: Mon Oct 12 23:39:34 2020
  write: IOPS=38.2k, BW=149MiB/s (156MB/s)(17.5GiB/120002msec)
    slat (usec): min=3, max=1108, avg= 8.29, stdev= 2.30
    clat (usec): min=195, max=9198, avg=825.72, stdev=355.23
     lat (usec): min=207, max=9210, avg=834.86, stdev=355.41
    clat percentiles (usec):
     |  1.00th=[  375],  5.00th=[  416], 10.00th=[  429], 20.00th=[  474],
     | 30.00th=[  578], 40.00th=[  685], 50.00th=[  783], 60.00th=[  857],
     | 70.00th=[  963], 80.00th=[ 1123], 90.00th=[ 1336], 95.00th=[ 1483],
     | 99.00th=[ 1778], 99.50th=[ 1926], 99.90th=[ 2311], 99.95th=[ 2671],
     | 99.99th=[ 4948]
   bw (  KiB/s): min=130928, max=166504, per=100.00%, avg=153051.59, stdev=8805.20, samples=239
   iops        : min=32732, max=41626, avg=38262.88, stdev=2201.28, samples=239
  lat (usec)   : 250=0.01%, 500=22.55%, 750=24.06%, 1000=26.33%
  lat (msec)   : 2=26.71%, 4=0.33%, 10=0.01%
  cpu          : usr=16.90%, sys=45.37%, ctx=2877365, majf=0, minf=65
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,4584878,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-write: (groupid=1, jobs=1): err= 0: pid=1578: Mon Oct 12 23:39:34 2020
  write: IOPS=20.2k, BW=78.0MiB/s (82.8MB/s)(9478MiB/120001msec)
    slat (usec): min=3, max=923, avg= 8.08, stdev= 2.95
    clat (usec): min=260, max=10514, avg=1571.15, stdev=234.25
     lat (usec): min=265, max=10522, avg=1580.04, stdev=234.02
    clat percentiles (usec):
     |  1.00th=[ 1004],  5.00th=[ 1221], 10.00th=[ 1319], 20.00th=[ 1418],
     | 30.00th=[ 1467], 40.00th=[ 1516], 50.00th=[ 1565], 60.00th=[ 1614],
     | 70.00th=[ 1663], 80.00th=[ 1729], 90.00th=[ 1827], 95.00th=[ 1926],
     | 99.00th=[ 2180], 99.50th=[ 2311], 99.90th=[ 2704], 99.95th=[ 3097],
     | 99.99th=[ 5342]
   bw (  KiB/s): min=70538, max=83976, per=100.00%, avg=81007.11, stdev=1967.83, samples=239
   iops        : min=17634, max=20994, avg=20251.75, stdev=491.97, samples=239
  lat (usec)   : 500=0.07%, 750=0.19%, 1000=0.71%
  lat (msec)   : 2=95.92%, 4=3.08%, 10=0.02%, 20=0.01%
  cpu          : usr=9.09%, sys=22.70%, ctx=1281415, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2426274,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
seq-read: (groupid=2, jobs=1): err= 0: pid=1581: Mon Oct 12 23:39:34 2020
  read: IOPS=7509, BW=29.3MiB/s (30.8MB/s)(3520MiB/120005msec)
    slat (nsec): min=3583, max=97226, avg=8003.26, stdev=1005.53
    clat (usec): min=255, max=9669, avg=4249.73, stdev=752.70
     lat (usec): min=264, max=9677, avg=4258.50, stdev=752.73
    clat percentiles (usec):
     |  1.00th=[ 1237],  5.00th=[ 2802], 10.00th=[ 3490], 20.00th=[ 3884],
     | 30.00th=[ 4080], 40.00th=[ 4228], 50.00th=[ 4359], 60.00th=[ 4490],
     | 70.00th=[ 4621], 80.00th=[ 4752], 90.00th=[ 5014], 95.00th=[ 5211],
     | 99.00th=[ 5538], 99.50th=[ 5669], 99.90th=[ 6259], 99.95th=[ 7177],
     | 99.99th=[ 8979]
   bw (  KiB/s): min=27920, max=38008, per=100.00%, avg=30079.14, stdev=1261.83, samples=239
   iops        : min= 6980, max= 9502, avg=7519.78, stdev=315.46, samples=239
  lat (usec)   : 500=0.07%, 750=0.32%, 1000=0.32%
  lat (msec)   : 2=1.28%, 4=24.14%, 10=73.87%
  cpu          : usr=3.50%, sys=9.54%, ctx=883627, majf=0, minf=66
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=901208,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-read: (groupid=3, jobs=1): err= 0: pid=1582: Mon Oct 12 23:39:34 2020
  read: IOPS=25.7k, BW=101MiB/s (105MB/s)(11.8GiB/120001msec)
    slat (usec): min=3, max=1636, avg= 7.98, stdev= 1.95
    clat (usec): min=238, max=19104, avg=1232.35, stdev=179.47
     lat (usec): min=246, max=19109, avg=1241.09, stdev=179.46
    clat percentiles (usec):
     |  1.00th=[ 1057],  5.00th=[ 1090], 10.00th=[ 1106], 20.00th=[ 1139],
     | 30.00th=[ 1139], 40.00th=[ 1156], 50.00th=[ 1172], 60.00th=[ 1188],
     | 70.00th=[ 1221], 80.00th=[ 1287], 90.00th=[ 1549], 95.00th=[ 1614],
     | 99.00th=[ 1729], 99.50th=[ 1778], 99.90th=[ 1991], 99.95th=[ 2704],
     | 99.99th=[ 5276]
   bw (  KiB/s): min=75864, max=112136, per=100.00%, avg=103059.16, stdev=10479.82, samples=239
   iops        : min=18966, max=28034, avg=25764.80, stdev=2619.96, samples=239
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.02%, 1000=0.27%
  lat (msec)   : 2=99.59%, 4=0.08%, 10=0.02%, 20=0.01%
  cpu          : usr=11.52%, sys=32.48%, ctx=3077788, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=3087819,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_80r_20w: (groupid=4, jobs=1): err= 0: pid=1583: Mon Oct 12 23:39:34 2020
  read: IOPS=18.9k, BW=73.8MiB/s (77.4MB/s)(8857MiB/120001msec)
    slat (usec): min=3, max=1389, avg= 7.92, stdev= 3.05
    clat (usec): min=301, max=9173, avg=1409.03, stdev=250.82
     lat (usec): min=315, max=9184, avg=1417.74, stdev=250.78
    clat percentiles (usec):
     |  1.00th=[  873],  5.00th=[ 1045], 10.00th=[ 1139], 20.00th=[ 1221],
     | 30.00th=[ 1287], 40.00th=[ 1336], 50.00th=[ 1385], 60.00th=[ 1434],
     | 70.00th=[ 1500], 80.00th=[ 1582], 90.00th=[ 1713], 95.00th=[ 1844],
     | 99.00th=[ 2089], 99.50th=[ 2180], 99.90th=[ 2606], 99.95th=[ 3163],
     | 99.99th=[ 5145]
   bw (  KiB/s): min=56000, max=81928, per=100.00%, avg=75685.10, stdev=6442.14, samples=239
   iops        : min=14000, max=20482, avg=18921.27, stdev=1610.52, samples=239
  write: IOPS=4728, BW=18.5MiB/s (19.4MB/s)(2216MiB/120001msec)
    slat (usec): min=3, max=740, avg= 8.43, stdev= 2.06
    clat (usec): min=52, max=5578, avg=1080.57, stdev=232.26
     lat (usec): min=62, max=5588, avg=1089.77, stdev=232.17
    clat percentiles (usec):
     |  1.00th=[  586],  5.00th=[  734], 10.00th=[  816], 20.00th=[  906],
     | 30.00th=[  963], 40.00th=[ 1012], 50.00th=[ 1057], 60.00th=[ 1106],
     | 70.00th=[ 1172], 80.00th=[ 1237], 90.00th=[ 1369], 95.00th=[ 1500],
     | 99.00th=[ 1729], 99.50th=[ 1795], 99.90th=[ 2008], 99.95th=[ 2311],
     | 99.99th=[ 4080]
   bw (  KiB/s): min=13560, max=21024, per=100.00%, avg=18940.05, stdev=1642.92, samples=239
   iops        : min= 3390, max= 5256, avg=4734.98, stdev=410.72, samples=239
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.09%, 750=1.28%, 1000=8.64%
  lat (msec)   : 2=88.49%, 4=1.47%, 10=0.02%
  cpu          : usr=10.13%, sys=28.72%, ctx=2232064, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2267278,567373,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_50r_50w: (groupid=5, jobs=1): err= 0: pid=1585: Mon Oct 12 23:39:34 2020
  read: IOPS=10.5k, BW=41.1MiB/s (43.1MB/s)(4931MiB/120001msec)
    slat (usec): min=3, max=1324, avg= 7.85, stdev= 2.77
    clat (usec): min=326, max=31820, avg=1668.01, stdev=283.46
     lat (usec): min=331, max=31829, avg=1676.65, stdev=283.33
    clat percentiles (usec):
     |  1.00th=[ 1123],  5.00th=[ 1287], 10.00th=[ 1369], 20.00th=[ 1450],
     | 30.00th=[ 1516], 40.00th=[ 1582], 50.00th=[ 1631], 60.00th=[ 1696],
     | 70.00th=[ 1778], 80.00th=[ 1876], 90.00th=[ 2024], 95.00th=[ 2147],
     | 99.00th=[ 2376], 99.50th=[ 2474], 99.90th=[ 3458], 99.95th=[ 4293],
     | 99.99th=[ 6390]
   bw (  KiB/s): min=33520, max=46720, per=100.00%, avg=42158.05, stdev=3281.20, samples=239
   iops        : min= 8380, max=11680, avg=10539.50, stdev=820.29, samples=239
  write: IOPS=10.5k, BW=41.1MiB/s (43.1MB/s)(4932MiB/120001msec)
    slat (usec): min=3, max=1250, avg= 8.33, stdev= 2.90
    clat (usec): min=49, max=6268, avg=1350.57, stdev=264.40
     lat (usec): min=57, max=6277, avg=1359.69, stdev=264.20
    clat percentiles (usec):
     |  1.00th=[  791],  5.00th=[  963], 10.00th=[ 1057], 20.00th=[ 1139],
     | 30.00th=[ 1205], 40.00th=[ 1270], 50.00th=[ 1319], 60.00th=[ 1385],
     | 70.00th=[ 1467], 80.00th=[ 1565], 90.00th=[ 1696], 95.00th=[ 1795],
     | 99.00th=[ 2008], 99.50th=[ 2114], 99.90th=[ 2540], 99.95th=[ 3163],
     | 99.99th=[ 4883]
   bw (  KiB/s): min=33192, max=46440, per=100.00%, avg=42167.66, stdev=3284.10, samples=239
   iops        : min= 8298, max=11610, avg=10541.90, stdev=821.02, samples=239
  lat (usec)   : 50=0.01%, 100=0.01%, 250=0.01%, 500=0.06%, 750=0.30%
  lat (usec)   : 1000=3.17%
  lat (msec)   : 2=90.36%, 4=6.07%, 10=0.04%, 20=0.01%, 50=0.01%
  cpu          : usr=8.92%, sys=24.62%, ctx=1594184, majf=0, minf=63
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=1262390,1262650,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=149MiB/s (156MB/s), 149MiB/s-149MiB/s (156MB/s-156MB/s), io=17.5GiB (18.8GB), run=120002-120002msec

Run status group 1 (all jobs):
  WRITE: bw=78.0MiB/s (82.8MB/s), 78.0MiB/s-78.0MiB/s (82.8MB/s-82.8MB/s), io=9478MiB (9938MB), run=120001-120001msec

Run status group 2 (all jobs):
   READ: bw=29.3MiB/s (30.8MB/s), 29.3MiB/s-29.3MiB/s (30.8MB/s-30.8MB/s), io=3520MiB (3691MB), run=120005-120005msec

Run status group 3 (all jobs):
   READ: bw=101MiB/s (105MB/s), 101MiB/s-101MiB/s (105MB/s-105MB/s), io=11.8GiB (12.6GB), run=120001-120001msec

Run status group 4 (all jobs):
   READ: bw=73.8MiB/s (77.4MB/s), 73.8MiB/s-73.8MiB/s (77.4MB/s-77.4MB/s), io=8857MiB (9287MB), run=120001-120001msec
  WRITE: bw=18.5MiB/s (19.4MB/s), 18.5MiB/s-18.5MiB/s (19.4MB/s-19.4MB/s), io=2216MiB (2324MB), run=120001-120001msec

Run status group 5 (all jobs):
   READ: bw=41.1MiB/s (43.1MB/s), 41.1MiB/s-41.1MiB/s (43.1MB/s-43.1MB/s), io=4931MiB (5171MB), run=120001-120001msec
  WRITE: bw=41.1MiB/s (43.1MB/s), 41.1MiB/s-41.1MiB/s (43.1MB/s-43.1MB/s), io=4932MiB (5172MB), run=120001-120001msec

Disk stats (read/write):
  vdb: ios=9423940/11049668, merge=0/0, ticks=16094933/12019245, in_queue=28109168, util=100.00%
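
A quick consistency check on the baseline numbers in [3]: with a fixed queue depth, Little's law suggests IOPS should be close to iodepth divided by mean total latency. The sketch below applies that to the four single-direction jobs, using the reported "lat ... avg" values and IOPS figures copied from [3]. It is an illustration of why the baseline throughput tracks per-request latency, not part of the original report.

```python
IODEPTH = 32  # from the "iodepth=32" job headers in [3]

# (reported IOPS, mean total latency in microseconds), copied from [3]
baseline = {
    "seq-write": (38200, 834.86),
    "rand-write": (20200, 1580.04),
    "seq-read": (7509, 4258.50),
    "rand-read": (25700, 1241.09),
}


def predicted_iops(iodepth, mean_lat_usec):
    """Little's law: throughput = concurrency / mean latency."""
    return iodepth / (mean_lat_usec * 1e-6)


estimates = {}
for name, (reported, lat_usec) in baseline.items():
    estimates[name] = predicted_iops(IODEPTH, lat_usec)
    error_pct = 100.0 * (estimates[name] - reported) / reported
    print(f"{name:10s} reported={reported:6d} "
          f"predicted={estimates[name]:8.0f} ({error_pct:+.2f}%)")
```

The predictions land within about one percent of the reported IOPS, so at this queue depth any throughput difference between attachment methods is essentially a difference in per-request latency.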

[4]
seq-write: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-write: (g=1): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
seq-read: (g=2): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-read: (g=3): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_80r_20w: (g=4): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
rand-mix_50r_50w: (g=5): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.19
Starting 6 processes
Jobs: 1 (f=1): [_(5),m(1)][60.0%][r=93.6MiB/s,w=93.8MiB/s][r=23.0k,w=24.0k IOPS][eta 10m:00s]    
seq-write: (groupid=0, jobs=1): err= 0: pid=1357: Mon Oct 12 23:54:45 2020
  write: IOPS=41.9k, BW=164MiB/s (171MB/s)(19.2GiB/120002msec)
    slat (usec): min=3, max=2278, avg= 7.24, stdev= 3.33
    clat (usec): min=341, max=17084, avg=753.49, stdev=407.17
     lat (usec): min=364, max=17091, avg=761.62, stdev=407.00
    clat percentiles (usec):
     |  1.00th=[  461],  5.00th=[  510], 10.00th=[  537], 20.00th=[  586],
     | 30.00th=[  627], 40.00th=[  660], 50.00th=[  693], 60.00th=[  734],
     | 70.00th=[  783], 80.00th=[  840], 90.00th=[  955], 95.00th=[ 1074],
     | 99.00th=[ 1713], 99.50th=[ 3064], 99.90th=[ 7177], 99.95th=[ 7898],
     | 99.99th=[ 9241]
   bw (  KiB/s): min=141472, max=183752, per=100.00%, avg=167682.66, stdev=5513.07, samples=239
   iops        : min=35368, max=45938, avg=41920.54, stdev=1378.32, samples=239
  lat (usec)   : 500=4.07%, 750=59.99%, 1000=28.31%
  lat (msec)   : 2=6.82%, 4=0.44%, 10=0.37%, 20=0.01%
  cpu          : usr=15.67%, sys=39.65%, ctx=759242, majf=0, minf=65
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,5024393,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-write: (groupid=1, jobs=1): err= 0: pid=1360: Mon Oct 12 23:54:45 2020
  write: IOPS=36.7k, BW=143MiB/s (150MB/s)(16.8GiB/120002msec)
    slat (usec): min=3, max=501, avg= 7.41, stdev= 3.89
    clat (usec): min=259, max=18125, avg=860.51, stdev=270.08
     lat (usec): min=450, max=18134, avg=868.81, stdev=269.80
    clat percentiles (usec):
     |  1.00th=[  570],  5.00th=[  619], 10.00th=[  652], 20.00th=[  701],
     | 30.00th=[  742], 40.00th=[  783], 50.00th=[  824], 60.00th=[  865],
     | 70.00th=[  914], 80.00th=[  988], 90.00th=[ 1090], 95.00th=[ 1188],
     | 99.00th=[ 1467], 99.50th=[ 1778], 99.90th=[ 4424], 99.95th=[ 5407],
     | 99.99th=[ 8029]
   bw (  KiB/s): min=135257, max=154552, per=100.00%, avg=147026.83, stdev=3200.22, samples=239
   iops        : min=33814, max=38638, avg=36756.55, stdev=800.09, samples=239
  lat (usec)   : 500=0.01%, 750=32.28%, 1000=49.36%
  lat (msec)   : 2=17.97%, 4=0.27%, 10=0.11%, 20=0.01%
  cpu          : usr=14.25%, sys=36.10%, ctx=687581, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,4405511,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
seq-read: (groupid=2, jobs=1): err= 0: pid=1366: Mon Oct 12 23:54:45 2020
  read: IOPS=65.7k, BW=256MiB/s (269MB/s)(30.1GiB/120001msec)
    slat (usec): min=3, max=273, avg= 7.45, stdev= 3.30
    clat (usec): min=88, max=19869, avg=476.26, stdev=165.47
     lat (usec): min=97, max=19876, avg=484.70, stdev=165.23
    clat percentiles (usec):
     |  1.00th=[  245],  5.00th=[  302], 10.00th=[  330], 20.00th=[  371],
     | 30.00th=[  400], 40.00th=[  429], 50.00th=[  453], 60.00th=[  482],
     | 70.00th=[  519], 80.00th=[  562], 90.00th=[  635], 95.00th=[  717],
     | 99.00th=[  906], 99.50th=[ 1090], 99.90th=[ 1795], 99.95th=[ 2442],
     | 99.99th=[ 4621]
   bw (  KiB/s): min=229632, max=303176, per=100.00%, avg=263067.15, stdev=16359.76, samples=239
   iops        : min=57408, max=75794, avg=65766.77, stdev=4089.87, samples=239
  lat (usec)   : 100=0.01%, 250=1.19%, 500=64.25%, 750=30.98%, 1000=2.94%
  lat (msec)   : 2=0.57%, 4=0.05%, 10=0.02%, 20=0.01%
  cpu          : usr=22.44%, sys=63.32%, ctx=359042, majf=0, minf=66
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=7878968,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-read: (groupid=3, jobs=1): err= 0: pid=1367: Mon Oct 12 23:54:45 2020
  read: IOPS=65.1k, BW=254MiB/s (267MB/s)(29.8GiB/120001msec)
    slat (usec): min=3, max=243, avg= 7.24, stdev= 3.43
    clat (usec): min=104, max=21925, avg=480.32, stdev=166.51
     lat (usec): min=114, max=21929, avg=488.59, stdev=166.37
    clat percentiles (usec):
     |  1.00th=[  227],  5.00th=[  302], 10.00th=[  334], 20.00th=[  375],
     | 30.00th=[  400], 40.00th=[  424], 50.00th=[  453], 60.00th=[  486],
     | 70.00th=[  529], 80.00th=[  586], 90.00th=[  660], 95.00th=[  725],
     | 99.00th=[  865], 99.50th=[  914], 99.90th=[ 1205], 99.95th=[ 1860],
     | 99.99th=[ 5014]
   bw (  KiB/s): min=236582, max=272902, per=100.00%, avg=260848.53, stdev=7181.96, samples=239
   iops        : min=59145, max=68225, avg=65212.05, stdev=1795.50, samples=239
  lat (usec)   : 250=1.71%, 500=62.18%, 750=32.23%, 1000=3.68%
  lat (msec)   : 2=0.16%, 4=0.02%, 10=0.02%, 20=0.01%, 50=0.01%
  cpu          : usr=22.95%, sys=62.21%, ctx=258569, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=7814294,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_80r_20w: (groupid=4, jobs=1): err= 0: pid=1368: Mon Oct 12 23:54:45 2020
  read: IOPS=46.6k, BW=182MiB/s (191MB/s)(21.3GiB/120001msec)
    slat (usec): min=3, max=233, avg= 6.76, stdev= 3.05
    clat (usec): min=105, max=20896, avg=465.69, stdev=232.91
     lat (usec): min=115, max=20902, avg=473.38, stdev=232.77
    clat percentiles (usec):
     |  1.00th=[  206],  5.00th=[  265], 10.00th=[  297], 20.00th=[  338],
     | 30.00th=[  371], 40.00th=[  404], 50.00th=[  441], 60.00th=[  482],
     | 70.00th=[  529], 80.00th=[  578], 90.00th=[  652], 95.00th=[  709],
     | 99.00th=[  840], 99.50th=[  906], 99.90th=[ 2704], 99.95th=[ 4621],
     | 99.99th=[ 9241]
   bw (  KiB/s): min=167888, max=201928, per=100.00%, avg=186459.86, stdev=5784.32, samples=239
   iops        : min=41972, max=50482, avg=46614.92, stdev=1446.11, samples=239
  write: IOPS=11.6k, BW=45.5MiB/s (47.7MB/s)(5458MiB/120001msec)
    slat (usec): min=3, max=172, avg= 7.53, stdev= 3.18
    clat (usec): min=456, max=20452, avg=833.32, stdev=260.92
     lat (usec): min=466, max=20464, avg=841.78, stdev=260.80
    clat percentiles (usec):
     |  1.00th=[  578],  5.00th=[  627], 10.00th=[  660], 20.00th=[  701],
     | 30.00th=[  742], 40.00th=[  766], 50.00th=[  799], 60.00th=[  840],
     | 70.00th=[  881], 80.00th=[  930], 90.00th=[ 1012], 95.00th=[ 1090],
     | 99.00th=[ 1434], 99.50th=[ 1713], 99.90th=[ 4146], 99.95th=[ 5473],
     | 99.99th=[ 9372]
   bw (  KiB/s): min=42368, max=51216, per=100.00%, avg=46631.67, stdev=1499.50, samples=239
   iops        : min=10592, max=12804, avg=11657.83, stdev=374.89, samples=239
  lat (usec)   : 250=2.95%, 500=48.16%, 750=33.19%, 1000=13.26%
  lat (msec)   : 2=2.28%, 4=0.09%, 10=0.07%, 20=0.01%, 50=0.01%
  cpu          : usr=20.93%, sys=53.68%, ctx=691490, majf=0, minf=64
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=5586486,1397204,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32
rand-mix_50r_50w: (groupid=5, jobs=1): err= 0: pid=1370: Mon Oct 12 23:54:45 2020
  read: IOPS=24.0k, BW=97.6MiB/s (102MB/s)(11.4GiB/120002msec)
    slat (usec): min=3, max=286, avg= 6.75, stdev= 2.92
    clat (usec): min=115, max=21893, avg=434.83, stdev=233.57
     lat (usec): min=123, max=21904, avg=442.47, stdev=233.44
    clat percentiles (usec):
     |  1.00th=[  196],  5.00th=[  235], 10.00th=[  262], 20.00th=[  302],
     | 30.00th=[  343], 40.00th=[  379], 50.00th=[  416], 60.00th=[  453],
     | 70.00th=[  498], 80.00th=[  545], 90.00th=[  611], 95.00th=[  676],
     | 99.00th=[  807], 99.50th=[  881], 99.90th=[ 3097], 99.95th=[ 4686],
     | 99.99th=[ 8225]
   bw (  KiB/s): min=87232, max=107192, per=100.00%, avg=100127.59, stdev=2558.02, samples=239
   iops        : min=21808, max=26798, avg=25031.86, stdev=639.50, samples=239
  write: IOPS=24.0k, BW=97.6MiB/s (102MB/s)(11.4GiB/120002msec)
    slat (usec): min=3, max=517, avg= 7.25, stdev= 3.00
    clat (usec): min=453, max=21344, avg=824.22, stdev=276.74
     lat (usec): min=462, max=21351, avg=832.36, stdev=276.62
    clat percentiles (usec):
     |  1.00th=[  570],  5.00th=[  619], 10.00th=[  652], 20.00th=[  693],
     | 30.00th=[  725], 40.00th=[  750], 50.00th=[  791], 60.00th=[  824],
     | 70.00th=[  865], 80.00th=[  922], 90.00th=[ 1004], 95.00th=[ 1074],
     | 99.00th=[ 1483], 99.50th=[ 1827], 99.90th=[ 4621], 99.95th=[ 5669],
     | 99.99th=[ 9241]
   bw (  KiB/s): min=85112, max=107952, per=100.00%, avg=100117.97, stdev=2565.35, samples=239
   iops        : min=21278, max=26988, avg=25029.43, stdev=641.35, samples=239
  lat (usec)   : 250=3.85%, 500=31.62%, 750=33.00%, 1000=26.17%
  lat (msec)   : 2=5.08%, 4=0.18%, 10=0.09%, 20=0.01%, 50=0.01%
  cpu          : usr=19.05%, sys=46.56%, ctx=1002633, majf=0, minf=63
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=2999565,2999201,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
  WRITE: bw=164MiB/s (171MB/s), 164MiB/s-164MiB/s (171MB/s-171MB/s), io=19.2GiB (20.6GB), run=120002-120002msec

Run status group 1 (all jobs):
  WRITE: bw=143MiB/s (150MB/s), 143MiB/s-143MiB/s (150MB/s-150MB/s), io=16.8GiB (18.0GB), run=120002-120002msec

Run status group 2 (all jobs):
   READ: bw=256MiB/s (269MB/s), 256MiB/s-256MiB/s (269MB/s-269MB/s), io=30.1GiB (32.3GB), run=120001-120001msec

Run status group 3 (all jobs):
   READ: bw=254MiB/s (267MB/s), 254MiB/s-254MiB/s (267MB/s-267MB/s), io=29.8GiB (32.0GB), run=120001-120001msec

Run status group 4 (all jobs):
   READ: bw=182MiB/s (191MB/s), 182MiB/s-182MiB/s (191MB/s-191MB/s), io=21.3GiB (22.9GB), run=120001-120001msec
  WRITE: bw=45.5MiB/s (47.7MB/s), 45.5MiB/s-45.5MiB/s (47.7MB/s-47.7MB/s), io=5458MiB (5723MB), run=120001-120001msec

Run status group 5 (all jobs):
   READ: bw=97.6MiB/s (102MB/s), 97.6MiB/s-97.6MiB/s (102MB/s-102MB/s), io=11.4GiB (12.3GB), run=120002-120002msec
  WRITE: bw=97.6MiB/s (102MB/s), 97.6MiB/s-97.6MiB/s (102MB/s-102MB/s), io=11.4GiB (12.3GB), run=120002-120002msec

Disk stats (read/write):
  vdc: ios=30367968/16981945, merge=0/0, ticks=11543664/13226502, in_queue=24752840, util=100.00%
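
To put the two runs side by side, the sketch below computes the krbd-to-baseline bandwidth ratio per workload. The figures are the aggregate MiB/s values copied from the "Run status" lines of [3] (baseline) and [4] (krbd); the helper name speedups is only illustrative.

```python
# Aggregate bandwidth in MiB/s from the "Run status" lines of [3].
baseline = {
    "seq-write": 149.0,
    "rand-write": 78.0,
    "seq-read": 29.3,
    "rand-read": 101.0,
    "rand-mix_80r_20w read": 73.8,
    "rand-mix_80r_20w write": 18.5,
    "rand-mix_50r_50w read": 41.1,
    "rand-mix_50r_50w write": 41.1,
}

# Aggregate bandwidth in MiB/s from the "Run status" lines of [4].
krbd = {
    "seq-write": 164.0,
    "rand-write": 143.0,
    "seq-read": 256.0,
    "rand-read": 254.0,
    "rand-mix_80r_20w read": 182.0,
    "rand-mix_80r_20w write": 45.5,
    "rand-mix_50r_50w read": 97.6,
    "rand-mix_50r_50w write": 97.6,
}


def speedups(before, after):
    """Return after/before for every workload present in `before`."""
    return {name: after[name] / before[name] for name in before}


ratios = speedups(baseline, krbd)
for name, ratio in ratios.items():
    print(f"{name:24s} {baseline[name]:6.1f} -> {krbd[name]:6.1f} MiB/s "
          f"({ratio:.2f}x)")
```

On these numbers the random workloads come out roughly 1.8x to 2.5x faster with krbd, sequential write about 1.1x, and sequential read about 8.7x.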



Where are you experiencing the behavior? What environment?
Low performance from Ceph

When does the behavior occur? Frequency? Repeatedly? At certain times?
Repeatedly

What information can you provide around timeframes and the business impact?
Business may need to move off of Ceph due to slow performance

Comment 5 zixchen 2020-11-19 03:44:52 UTC
Please confirm whether this bug is a duplicate of Bug 1744525.
Thanks.

Comment 6 zixchen 2020-11-19 03:55:08 UTC
Hi Daniel,
Could you please confirm whether this bug is a duplicate of Bug 1744525?

Comment 7 Daniel Berrangé 2020-11-19 09:21:39 UTC
I think they are probably different, but I'll leave it up to the QEMU RBD maintainer to decide that.

Comment 8 David Hill 2020-11-19 15:29:21 UTC
No, they are not the same, from what I can see. The customer is complaining that a volume attached with krbd performs better than volumes attached with librados.

Comment 47 Klaus Heinrich Kiwi 2021-06-16 17:18:36 UTC
I think it's safe to close this bug for now, as the librbd package works and performance seems to be as expected.