Description of problem:

Importing a file with an RBD Python bindings script takes roughly 3x longer than importing the same file with the rbd CLI.

Version-Release number of selected component (if applicable):
Red Hat Ceph Storage 1.3.z

How reproducible:
Always

Steps to Reproduce:
- Import the image with the rbd CLI, then import the same image with the Python bindings and compare the elapsed times.

----

- Initially we thought this was the issue described in http://tracker.ceph.com/issues/17178 and https://bugzilla.redhat.com/show_bug.cgi?id=1373643, since we were focusing on slow glance image uploads.
- As mentioned in http://tracker.ceph.com/issues/17178, the RBD Python bindings still do not support skipping fully zeroed blocks when importing an image into an RBD pool.
- We assumed that was why a plain rbd CLI import completes in 20 seconds for a 1.1G raw image while the Python bindings take nearly 4 minutes for the same image.

----

# du -sh /root/rhel-guest-image-7.2-20151102.0.x86_64.raw
1.1G    /root/rhel-guest-image-7.2-20151102.0.x86_64.raw

# time python image-upload.py rhel7.1 /root/rhel-guest-image-7.2-20151102.0.x86_64.raw 8
[......]
Writing data at offset 10645143552(MB: 10152)
Writing data at offset 10653532160(MB: 10160)
Writing data at offset 10661920768(MB: 10168)
Writing data at offset 10670309376(MB: 10176)
Writing data at offset 10678697984(MB: 10184)
Writing data at offset 10687086592(MB: 10192)
Writing data at offset 10695475200(MB: 10200)
Writing data at offset 10703863808(MB: 10208)
Writing data at offset 10712252416(MB: 10216)
Writing data at offset 10720641024(MB: 10224)
Writing data at offset 10729029632(MB: 10232)
done

real    4m25.849s
user    0m4.523s
sys     0m7.037s

[root@dell-per630-11 ceph]# rbd info images/rhel7.1
rbd image 'rhel7.1':
        size 10240 MB in 1280 objects
        order 23 (8192 kB objects)
        block_name_prefix: rbd_data.17632b6238e1f29
        format: 2
        features: layering
        flags:

# time rbd --id=glance --image-format=2 -p images import /root/rhel-guest-image-7.2-20151102.0.x86_64.raw rhel7
Importing image: 100% complete...done.
real    0m20.950s
user    0m9.691s
sys     0m3.212s

# rbd info images/rhel7
rbd image 'rhel7':
        size 10240 MB in 2560 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.1743256238e1f29
        format: 2
        features: layering
        flags:

----

The script we used for the Python bindings is based on the RBD glance driver:

----------------------------------------------------------------------
import os
import sys
import math

from oslo_utils import units

try:
    import rados
    import rbd
except ImportError:
    rados = None
    rbd = None

if len(sys.argv) < 4:
    sys.exit('Usage: %s <image_name> <image_path> <chunk_size_in_MB>' % sys.argv[0])

image_name = sys.argv[1]
image_path = sys.argv[2]
chunk = int(sys.argv[3])
chunk_size = chunk * units.Mi

pool = 'images'
#pool = 'svl-fab1-aos-glance-pool-01'
user = 'glance'
#user = 'svl-fab1-aos-glance-usr'
conf_file = '/etc/ceph/ceph.conf'
radosid = 'glance'
#radosid = 'svl-fab1-aos-glance-usr'
connect_timeout = 0

image_size = os.path.getsize(image_path)
order = int(math.log(chunk_size, 2))

print image_name
print image_path
print image_size
print chunk
print chunk_size
print pool
print user
print conf_file
print connect_timeout
print radosid

with rados.Rados(conffile=conf_file, rados_id=radosid) as cluster:
    with cluster.open_ioctx(pool) as ioctx:
        rbd_inst = rbd.RBD()
        size = image_size
        rbd_inst.create(ioctx, image_name, size, order, old_format=False, features=1)
        with rbd.Image(ioctx, image_name) as image:
            f = open(image_path, "rb")
            try:
                offset = 0
                data = f.read(chunk_size)
                while data != "":
                    print "Writing data at offset " + str(offset) + "(MB: " + str(offset / units.Mi) + ")"
                    offset += image.write(data, offset)
                    data = f.read(chunk_size)
            finally:
                f.close()

print "done"
----------------------------------------------------------------------
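A side note on the differing object counts above: the script derives the image order from the chunk size argument (8 MB -> order 23), while the CLI import in this run used the default order 22 (4 MB objects), which is why "rbd info" shows 1280 vs 2560 objects for the same 10240 MB image. For completeness, here is a minimal sketch (not part of the original report) of reading that metadata back through the bindings instead of the CLI, assuming the same pool, user, and conf file as the script:

----------
import rados
import rbd

# Read back the metadata that "rbd info" prints, via Image.stat().
with rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='glance') as cluster:
    with cluster.open_ioctx('images') as ioctx:
        with rbd.Image(ioctx, 'rhel7.1') as image:
            info = image.stat()
            print "size %d bytes in %d objects" % (info['size'], info['num_objs'])
            print "order %d (%d kB objects)" % (info['order'], info['obj_size'] / 1024)
            print "block_name_prefix: %s" % info['block_name_prefix']
----------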
- To avoid the zero-block factor we created 9.3G and 1G files from /dev/urandom and reproduced the test. The results are much the same; the only difference is that the rbd CLI no longer has the same performance, since it now has to write the whole file.

-------------------------------------

- I created a 9.3G testfile from /dev/urandom:

# du -sh /etc/ceph/testfile
9.3G    /etc/ceph/testfile

# time rbd --id=glance --image-format=2 --order=23 -p images import /etc/ceph/testfile testfile.rbd.cli
Importing image: 100% complete...done.

real    1m41.499s
user    0m7.554s
sys     0m9.129s

# rbd info images/testfile.rbd.cli
rbd image 'testfile.rbd.cli':
        size 9462 MB in 1183 objects
        order 23 (8192 kB objects)
        block_name_prefix: rbd_data.1743a3f238e1f29
        format: 2
        features: layering
        flags:

^^ The rbd CLI, which previously took 20 seconds, now takes 1 minute 41 seconds, while the Python bindings take nearly the same time as before.

# time python image-upload.py testfile.rbd.python.bindings /etc/ceph/testfile 8
[......]
Writing data at offset 9873391616(MB: 9416)
Writing data at offset 9881780224(MB: 9424)
Writing data at offset 9890168832(MB: 9432)
Writing data at offset 9898557440(MB: 9440)
Writing data at offset 9906946048(MB: 9448)
Writing data at offset 9915334656(MB: 9456)
done

real    4m4.008s
user    0m4.093s
sys     0m6.611s

# rbd info images/testfile.rbd.python.bindings
rbd image 'testfile.rbd.python.bindings':
        size 9462 MB in 1183 objects
        order 23 (8192 kB objects)
        block_name_prefix: rbd_data.1763af9238e1f29
        format: 2
        features: layering
        flags:

-----------------------------

For the 1G file:
------------

# du -sh /etc/ceph/testfile.1G
1.0G    /etc/ceph/testfile.1G

# time rbd --id=glance --order=23 --image-format=2 -p images import /etc/ceph/testfile.1G testfile.rbd.cli.1G
Importing image: 100% complete...done.

real    0m11.504s
user    0m0.870s
sys     0m1.432s

# rbd info images/testfile.rbd.cli.1G
rbd image 'testfile.rbd.cli.1G':
        size 1024 MB in 128 objects
        order 23 (8192 kB objects)
        block_name_prefix: rbd_data.176787e238e1f29
        format: 2
        features: layering
        flags:

# time python image-upload.py testfile.rbd.python.bindings.1G /etc/ceph/testfile.1G 8
[.....]
Writing data at offset 1031798784(MB: 984)
Writing data at offset 1040187392(MB: 992)
Writing data at offset 1048576000(MB: 1000)
Writing data at offset 1056964608(MB: 1008)
Writing data at offset 1065353216(MB: 1016)
done

real    0m26.687s
user    0m0.578s
sys     0m0.816s

# rbd info images/testfile.rbd.python.bindings.1G
rbd image 'testfile.rbd.python.bindings.1G':
        size 1024 MB in 128 objects
        order 23 (8192 kB objects)
        block_name_prefix: rbd_data.17678ab238e1f29
        format: 2
        features: layering
        flags:

---------------------------------------------------------
There are two factors at play: (1) the rbd CLI skips zeroed, object-size extents, and (2) the rbd CLI uses AIO to keep 10 read/write requests in flight concurrently (controlled by the "rbd concurrent management ops" config value). Therefore, this BZ is not comparing apples to apples.
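For illustration, a minimal sketch of factor (1) applied to the write loop of the script in the description: skip chunks that are entirely zero and rely on RBD thin provisioning for the unwritten extents. This is not the CLI's actual implementation, just the same idea expressed against the script's existing variables (f, image, chunk_size):

----------
# Sketch only: leave all-zero chunks unwritten; RBD images are thin
# provisioned, so unwritten extents read back as zeros anyway.
ZERO_CHUNK = "\0" * chunk_size

offset = 0
data = f.read(chunk_size)
while data != "":
    if data != ZERO_CHUNK[:len(data)]:
        image.write(data, offset)
    offset += len(data)
    data = f.read(chunk_size)
----------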
Hi Jason! Well, for users the end result is what matters. Glance is leveraging the Python bindings, and we needed to know where this performance difference came from. Thanks for the explanation! How could one increase the concurrency of the Python bindings? Would this be an RFE for the library code, or should the user of the Python rbd bindings handle the threading to create concurrent read/write requests in application code, meaning this should be improved in glance's code?
(In reply to Andreas Karis from comment #3)
> How could one increase the concurrency of the Python bindings? Would this
> be an RFE for the library code, or should the user of the Python rbd
> bindings handle the threading to create concurrent read/write requests in
> application code, meaning this should be improved in glance's code?

Hey Andreas,

I discussed this with Jason, and the right approach is to open an RFE to enhance the glance RBD driver's import code to make use of concurrency.
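Until such an RFE lands, concurrency has to come from the application side. Below is a rough sketch (not the glance driver's code, and not tested here) of how an uploader could keep several image.write() calls in flight with plain threads; it assumes librbd handles concurrent writes to distinct offsets safely and reuses the ioctx, image_name, image_path, and chunk_size values from the script in the description. Error handling is omitted for brevity.

----------
import threading
import Queue  # Python 2; "queue" on Python 3

import rbd

NUM_WORKERS = 10  # roughly mirrors the CLI's default of 10 in-flight ops

def concurrent_upload(ioctx, image_name, image_path, chunk_size):
    work = Queue.Queue(maxsize=NUM_WORKERS * 2)

    with rbd.Image(ioctx, image_name) as image:
        def worker():
            while True:
                item = work.get()
                if item is None:      # sentinel: no more chunks
                    break
                offset, data = item
                image.write(data, offset)
                work.task_done()

        threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
        for t in threads:
            t.start()

        with open(image_path, "rb") as f:
            offset = 0
            data = f.read(chunk_size)
            while data != "":
                work.put((offset, data))
                offset += len(data)
                data = f.read(chunk_size)

        work.join()                   # wait for all queued writes to finish
        for _ in threads:
            work.put(None)            # tell workers to exit
        for t in threads:
            t.join()
----------

If the installed bindings expose asynchronous write calls, those could replace the thread pool, but this sketch only relies on the synchronous write() already used by the script.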
Created an RFE against glance: https://bugzilla.redhat.com/show_bug.cgi?id=1389529
BZ1389529 covers the ability to sparse upload images to RBD-backed Glance.