Description of problem:
When using either the 'rbd export' or 'rbd export-diff' command with the '--rbd-concurrent-management-ops' flag, no impact on performance is observed.
Sample output from current attempt:
# snapshot created for initial export
$ rbd snap ls rbd/foobar
SNAPID NAME        SIZE
  1106 161224-0935 102400 GB
$ rbd info rbd/foobar@161224-0935
rbd image 'foobar':
size 102400 GB in 26214400 objects
order 22 (4096 kB objects)
$ rbd export-diff --rbd-concurrent-management-ops 50 rbd/foobar@161224-0935 - | pv | rbd import-diff --rbd-concurrent-management-ops 50 - expandtest/foobar
Importing image diff: 1% complete... 1TiB 7:39:35 [37.2MiB/s]
Importing image diff: 2% complete... 2TiB 19:53:12 [36.7MiB/s]
Importing image diff: 3% complete... 3TiB 27:30:54 [43.2MiB/s]
Importing image diff: 4% complete... 4TiB 34:33:08 [7.97MiB/s]
Importing image diff: 5% complete... 5TiB 41:44:14 [42.4MiB/s]
Version-Release number of selected component (if applicable):
0.94.x (will request specific ceph -v)
Steps to Reproduce:
1. Create snapshot from an rbd
2. Export/import and specify the '--rbd-concurrent-management-ops' flag, for example:
rbd export-diff --rbd-concurrent-management-ops 50 rbd/foobar@161224-0935 - | pv | rbd import-diff --rbd-concurrent-management-ops 50 - expandtest/foobar
Actual results:
Slow performance when performing an rbd export/export-diff.
Expected results:
Higher throughput when '--rbd-concurrent-management-ops' is set to a high value; a Ceph cluster should have ample parallelism available to it.
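As a rough sanity check on that expectation, the throughput of latency-bound ops follows Little's law: with N ops in flight, throughput is roughly N * object_size / per-op latency. A minimal sketch (all parameter values below are illustrative assumptions, not measurements from this cluster):

```python
def expected_throughput_mib_s(concurrent_ops, object_size_mib=4.0, op_latency_s=0.1):
    """Little's law estimate: throughput ~= ops in flight * object size / latency.

    object_size_mib and op_latency_s are assumed illustrative values
    (4 MiB objects per the 'order 22' above; 100 ms is a guess).
    """
    return concurrent_ops * object_size_mib / op_latency_s

# One op in flight at 100 ms per 4 MiB object caps out near 40 MiB/s,
# which is close to the flat rate seen in the pv output above. With 50
# ops in flight the ceiling should rise ~50x, so a flat rate suggests
# the flag is not actually increasing effective concurrency.
print(expected_throughput_mib_s(1))   # ~40 MiB/s
print(expected_throughput_mib_s(50))  # ~2000 MiB/s
```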
* Providing fio profile and job output which exhibits the concurrency expected in the cluster vs the image diff import above.
* Debug data is en route
* It's also worth noting that 'export-diff' may not actually honor 'rbd-concurrent-management-ops', even though the option appears to be consumed in the code:
ExportDiffContext edc(&image, fd, info.size,
                      g_conf->rbd_concurrent_management_ops, no_progress);
r = image.diff_iterate2(fromsnapname, 0, info.size, true, whole_object,
                        &C_ExportDiff::export_diff_cb, (void *)&edc);
The customer has also seen the same behavior on just 'export'.
Created attachment 1238092 [details]
brief sequential rbd write & read fio job file
Created attachment 1238093 [details]
fio job output against rbds in the pool we are exporting from
I would suggest adding "--debug-rbd 20" to the rbd CLI and using the "librbd::io::AioCompletion" log messages to see whether more than one AIO request is in flight concurrently.
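To make that check concrete, one way to gauge concurrency from the debug log is to count overlapping AioCompletion lifetimes. A rough sketch, assuming (hypothetically) that each op emits matchable "start" and "finish" lines keyed by the completion's address; the patterns below are placeholders and would need adjusting to the actual --debug-rbd 20 output format:

```python
import re

# Hypothetical log-line shapes; adapt the patterns to the real
# "librbd::io::AioCompletion" messages in the collected log.
START = re.compile(r"AioCompletion.*\b(0x[0-9a-f]+)\b.*start")
FINISH = re.compile(r"AioCompletion.*\b(0x[0-9a-f]+)\b.*finish")

def max_in_flight(lines):
    """Return the peak number of AIO ops in flight across the log."""
    in_flight, peak = set(), 0
    for line in lines:
        m = START.search(line)
        if m:
            in_flight.add(m.group(1))
            peak = max(peak, len(in_flight))
            continue
        m = FINISH.search(line)
        if m:
            in_flight.discard(m.group(1))
    return peak
```

If the peak stays at 1 while '--rbd-concurrent-management-ops 50' is set, that would confirm the export path is not issuing AIO requests concurrently.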
Thank you Jason!
Verified in ceph version 12.2.0-1.el7cp
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.