Bug 1389529 - [RFE] Implement threading in glance_store/_drivers/rbd.py
Summary: [RFE] Implement threading in glance_store/_drivers/rbd.py
Keywords:
Status: CLOSED DUPLICATE of bug 1647041
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-glance-store
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: Upstream M1
Target Release: ---
Assignee: Cyril Roelandt
QA Contact: Mike Abrams
URL:
Whiteboard:
Depends On:
Blocks: 1389112 ciscoosp13features, ciscoosp13rfe 1476900
 
Reported: 2016-10-27 19:56 UTC by Andreas Karis
Modified: 2020-11-12 11:12 UTC
CC: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-12 11:12:24 UTC
Target Upstream Version:
Embargoed:



Description Andreas Karis 2016-10-27 19:56:23 UTC
Description of problem:
There is a noticeable performance gap between Glance image uploads into Ceph and direct imports into Ceph via the rbd CLI.

"There are two factors are play: (1) the rbd CLI will skip zeroed, object-size extents and (2) the rbd CLI uses aio to have 10 concurrent read/write requests in flight concurrently (controlled by the "rbd concurrent management ops" config value). Therefore, this BZ is not comparing apples to apples."

Version-Release number of selected component (if applicable):
all versions

How reproducible:
Follow the procedure and comments in
https://bugzilla.redhat.com/show_bug.cgi?id=1389112

Steps to Reproduce:
----
# du -sh /root/rhel-guest-image-7.2-20151102.0.x86_64.raw
1.1G	/root/rhel-guest-image-7.2-20151102.0.x86_64.raw


# time python image-upload.py rhel7.1 /root/rhel-guest-image-7.2-20151102.0.x86_64.raw 8

[......]


Writing data at offset 10645143552(MB: 10152)
Writing data at offset 10653532160(MB: 10160)
Writing data at offset 10661920768(MB: 10168)
Writing data at offset 10670309376(MB: 10176)
Writing data at offset 10678697984(MB: 10184)
Writing data at offset 10687086592(MB: 10192)
Writing data at offset 10695475200(MB: 10200)
Writing data at offset 10703863808(MB: 10208)
Writing data at offset 10712252416(MB: 10216)
Writing data at offset 10720641024(MB: 10224)
Writing data at offset 10729029632(MB: 10232)
done

real	4m25.849s
user	0m4.523s
sys	0m7.037s
[root@dell-per630-11 ceph]# rbd info images/rhel7.1
rbd image 'rhel7.1':
	size 10240 MB in 1280 objects
	order 23 (8192 kB objects)
	block_name_prefix: rbd_data.17632b6238e1f29
	format: 2
	features: layering
	flags: 

# time rbd --id=glance --image-format=2 -p images import /root/rhel-guest-image-7.2-20151102.0.x86_64.raw rhel7 
Importing image: 100% complete...done.

real	0m20.950s
user	0m9.691s
sys	0m3.212s

# rbd info images/rhel7
rbd image 'rhel7':
	size 10240 MB in 2560 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.1743256238e1f29
	format: 2
	features: layering
	flags:
---
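
For reference, the gap above: 4m25.849s is about 265.8 s versus 20.95 s, so the Python-binding upload is roughly 12.7x slower than the rbd CLI import. Note also that the two images use different object sizes (order 23 = 2^23 bytes = 8 MiB objects for the script, order 22 = 4 MiB objects for the CLI), hence the differing object counts (1280 vs 2560).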

---- Script used to test the Python bindings; its write loop mirrors the RBD Glance driver ----------------------------------------------------------------------


import os
import sys
import math

from oslo_utils import units

try:
    import rados
    import rbd
except ImportError:
    rados = None
    rbd = None


if len(sys.argv) < 4:
    sys.exit('Usage: %s <image_name> <image_path> <chunk_size_in_MB>' % sys.argv[0])
image_name = sys.argv[1]
image_path = sys.argv[2]
chunk = int(sys.argv[3])

chunk_size = chunk * units.Mi
pool = 'images'
#pool = 'svl-fab1-aos-glance-pool-01'
user = 'glance'
#user = 'svl-fab1-aos-glance-usr'
conf_file = '/etc/ceph/ceph.conf'
radosid = 'glance'
#radosid='svl-fab1-aos-glance-usr'
connect_timeout = 0
image_size = os.path.getsize(image_path)
# RBD object size is 2**order bytes, so order = log2(chunk_size).
order = int(math.log(chunk_size, 2))

print image_name
print image_path
print image_size
print chunk
print chunk_size
print pool
print user
print conf_file
print connect_timeout
print radosid

with rados.Rados(conffile=conf_file, rados_id=radosid) as cluster:
    with cluster.open_ioctx(pool) as ioctx:
        rbd_inst = rbd.RBD()
        size = image_size
        # features=1 enables layering only, matching the Glance driver.
        rbd_inst.create(ioctx, image_name, size, order, old_format=False, features=1)
        with rbd.Image(ioctx, image_name) as image:
            f = open(image_path, "rb")
            try:
                offset = 0
                data = f.read(chunk_size)
                while data != "":
                    print "Writing data at offset " + str(offset) + "(MB: " + str(offset / units.Mi) + ")"
                    # write() returns the number of bytes written.
                    offset += image.write(data, offset)
                    data = f.read(chunk_size)
            finally:
                f.close()

print "done"

-----------

Additional info:

This is the glance code:

glance_store/_drivers/rbd.py

        with self.store.get_connection(conffile=self.conf_file,
                                       rados_id=self.user) as conn:
            with conn.open_ioctx(self.pool) as ioctx:
                with rbd.Image(ioctx, self.name,
                               snapshot=self.snapshot) as image:
                    img_info = image.stat()
                    size = img_info['size']
                    bytes_left = size
                    while bytes_left > 0:
                        length = min(self.chunk_size, bytes_left)
                        data = image.read(size - bytes_left, length)
                        bytes_left -= len(data)
                        yield data
                    raise StopIteration()
    except rbd.ImageNotFound:
        raise exceptions.NotFound(
            _('RBD image %s does not exist') % self.name)

Comment 2 Andreas Karis 2016-10-27 19:58:35 UTC
The main reason here is that the rbd CLI uses AIO to keep up to 10 concurrent read/write requests in flight (controlled by the "rbd concurrent management ops" config value). The Python API bindings for RBD do not use AIO; thus, threading needs to be implemented in the client code (i.e., Glance) to improve performance.
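
As a rough illustration of what client-side threading could look like with the synchronous bindings (a sketch only, not a proposed glance_store patch; it assumes librbd tolerates concurrent writes to distinct offsets on one image handle, which the C API does, and all helper names are made up):

    # Sketch: bounded thread-pool upload using the synchronous bindings.
    from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

    def threaded_upload(path, image, chunk=8 * 1024 * 1024, max_inflight=10):
        pending = set()
        with open(path, "rb") as f, ThreadPoolExecutor(max_inflight) as pool:
            offset = 0
            while True:
                data = f.read(chunk)
                if not data:
                    break
                pending.add(pool.submit(image.write, data, offset))
                offset += len(data)
                if len(pending) >= max_inflight:
                    # Cap memory and concurrency: wait for at least one write.
                    done, pending = wait(pending, return_when=FIRST_COMPLETED)
                    for fut in done:
                        fut.result()  # re-raise any write error
        for fut in pending:
            fut.result()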

Comment 6 Jason Dillaman 2017-07-11 15:47:35 UTC
@Sean: why was this closed? There is an associated upstream Glance review in-progress for this feature [1]

[1] https://review.openstack.org/#/c/430641/

Comment 11 Jason Dillaman 2020-04-24 00:24:57 UTC
Just to rehash this BZ once more: the RBD Python bindings now offer AIO interfaces (since the kraken release) [1] so there is no need to use "threading" to solve this performance bottleneck. Instead, offer a new "max concurrent IOs"-like config override (we use 10 in the rbd CLI) and issue up to the configured max concurrent IO limit when reading from and writing to RBD from Glance. 
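
For example, a minimal sketch of that approach, bounding in-flight AIO writes with a semaphore (illustrative only; it assumes Image.aio_write(data, offset, oncomplete) from the kraken+ bindings, where oncomplete is invoked from a librbd callback thread):

    import threading

    def aio_upload(path, image, chunk=8 * 1024 * 1024, max_inflight=10):
        slots = threading.Semaphore(max_inflight)
        errors = []

        def on_complete(completion):
            # Runs on a librbd callback thread when the write lands.
            ret = completion.get_return_value()
            if ret < 0:
                errors.append(ret)
            slots.release()

        with open(path, "rb") as f:
            offset = 0
            while True:
                data = f.read(chunk)
                if not data:
                    break
                slots.acquire()  # blocks once max_inflight writes are queued
                image.aio_write(data, offset, on_complete)
                offset += len(data)
        # Drain: reacquiring every slot guarantees all callbacks have fired.
        for _ in range(max_inflight):
            slots.acquire()
        if errors:
            raise IOError("aio_write returned %s" % errors)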

[1] https://github.com/ceph/ceph/blob/luminous/src/pybind/rbd/rbd.pyx#L2534

Comment 12 Cyril Roelandt 2020-05-05 17:42:14 UTC
> (we use 10 in the rbd CLI) 

@Jason: could you point us to the relevant code in the rbd CLI?

Comment 13 Jason Dillaman 2020-05-05 17:47:40 UTC
(In reply to Cyril Roelandt from comment #12)
> > (we use 10 in the rbd CLI) 
> 
> @Jason: could you point us to the relevant code in the rbd CLI?

The rbd CLI is written in C++ not Python, but it's here [1].

[1] https://github.com/ceph/ceph/blob/master/src/tools/rbd/action/Import.cc#L743

Comment 14 Gregory Charot 2020-11-12 11:12:24 UTC
Closing as a duplicate of the sparse image RFE.

This RFE is meant to improve Glance RBD image upload performance, which will be addressed by:

1. Support for sparse images - https://bugzilla.redhat.com/show_bug.cgi?id=1647041
2. Ramp up rbd resize to avoid excessive calls - https://bugzilla.redhat.com/show_bug.cgi?id=1690726 (see the sketch after this list)
3. Set default rbd concurrent management ops = 20 - https://bugzilla.redhat.com/show_bug.cgi?id=1886175, plus Comment 11, noting that the "RBD Python bindings now offer AIO interfaces"
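
On item 2, the idea of a resize ramp as a rough sketch (hypothetical helper, not the actual fix in bug 1690726): grow the image geometrically instead of once per chunk, then trim to the final size once the upload finishes.

    def ensure_capacity(image, needed, current):
        # Double the image size whenever more room is needed, so the
        # number of rbd resize calls grows with log2(image size)
        # rather than linearly with the number of chunks written.
        if needed > current:
            current = max(needed, current * 2)
            image.resize(current)
        return current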

Feel free to reopen this RFE and add more input if the above does not meet expectations.

*** This bug has been marked as a duplicate of bug 1647041 ***

