Description of problem:
Customer uses glance to export rbd images. Using the luminous client, a ~6GB image takes roughly a minute to export the first time (subsequent reads get faster, presumably due to caching). Using mimic or nautilus, the export takes ~6 seconds.

Version-Release number of selected component (if applicable):
ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
vs
ceph version 14.2.4-29.0.TEST.bz1777064.el7cp (7680734a26329b5cbae64651a22ac1605344d785) nautilus (stable)
(also tested with mimic; similar results vs. luminous)

How reproducible:
Consistent

Steps to Reproduce:
1. List glance images
2. Export image

Actual results:
Luminous takes ~60 seconds, vs ~6 seconds for mimic or nautilus.

Expected results:
Consistent behavior across versions.

Additional info:
I did some further tests where I generated a new random 6.7GB file and did an RBD import & export with each client, to avoid any caching interfering with the results. The three clients connected to the same Ceph cluster (running Luminous) using the same client configuration, same credentials, and same pool. The luminous client is approximately 10 times slower on the first RBD export than both the Mimic and Nautilus clients.

Ceph cluster
============
ceph version 12.2.8-128.el7cp (030358773c5213a14c1444a5147258672b2dc15f) luminous (stable)
8 nodes, 72 OSDs

Client config
=============
# Same config for all clients
$ cat ceph.conf_devprod
[global]
fsid = 45e17393-84f4-4d31-bce3-0765a92314cb
mon host = s-65kw2w2.sys.az1.eng.pdx.wd,s-65dy2w2.sys.az1.eng.pdx.wd,s-65l13w2.sys.az1.eng.pdx.wd

Luminous client
===============
$ ceph --version
ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)

# Create a random 6.7GB file
$ dd if=<(openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt < /dev/zero) bs=1M count=6800 iflag=fullblock of=/tmp/large.image
6800+0 records in
6800+0 records out
7130316800 bytes (7.1 GB) copied, 10.2746 s, 694 MB/s

# Import file as an RBD image
$ uuid=$(uuidgen); time rbd import -c /etc/ceph/ceph.conf_devprod -p wpc4-pg1-glance --user wpc4-pg1-glance -k /etc/ceph/ceph.client.wpc4-pg1-glance.keyring_devprod /tmp/large.image $uuid
rbd: --pool is deprecated for import, use --dest-pool
Importing image: 100% complete...done.

real    0m5.836s
user    0m4.387s
sys     0m4.525s

# Export RBD image (not saving to disk)
$ time rbd export -c /etc/ceph/ceph.conf_devprod -p wpc4-pg1-glance --user wpc4-pg1-glance -k /etc/ceph/ceph.client.wpc4-pg1-glance.keyring_devprod $uuid - > /dev/null
Exporting image: 100% complete...done.

real    1m9.896s
user    0m1.401s
sys     0m6.136s

Mimic client
============
$ ceph --version
ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)

# Create a random 6.7GB file
$ dd if=<(openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt < /dev/zero) bs=1M count=6800 iflag=fullblock of=/tmp/large.image
6800+0 records in
6800+0 records out
7130316800 bytes (7.1 GB) copied, 8.93855 s, 798 MB/s

# Import file as an RBD image
$ uuid=$(uuidgen); time rbd import -c ceph.conf_devprod --dest-pool wpc4-pg1-glance --user wpc4-pg1-glance -k ceph.client.wpc4-pg1-glance.keyring_devprod /tmp/large.image $uuid
Importing image: 100% complete...done.

real    0m5.571s
user    0m3.943s
sys     0m4.021s

# Export RBD image (not saving to disk)
$ time rbd export -c ceph.conf_devprod -p wpc4-pg1-glance --user wpc4-pg1-glance -k ceph.client.wpc4-pg1-glance.keyring_devprod $uuid - > /dev/null
Exporting image: 100% complete...done.

real    0m8.800s
user    0m1.033s
sys     0m6.756s

Nautilus client
===============
$ ceph --version
ceph version 14.2.4-29.0.TEST.bz1777064.el7cp (7680734a26329b5cbae64651a22ac1605344d785) nautilus (stable)

# Create a random 6.7GB file
$ dd if=<(openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt < /dev/zero) bs=1M count=6800 iflag=fullblock of=/tmp/large.image
6800+0 records in
6800+0 records out
7130316800 bytes (7.1 GB) copied, 8.0649 s, 884 MB/s

# Import file as an RBD image
$ uuid=$(uuidgen); time rbd import -c ceph.conf_devprod --dest-pool wpc4-pg1-glance --user wpc4-pg1-glance -k ceph.client.wpc4-pg1-glance.keyring_devprod /tmp/large.image $uuid
Importing image: 100% complete...done.

real    0m5.742s
user    0m3.810s
sys     0m5.302s

# Export RBD image (not saving to disk)
$ time rbd export -c ceph.conf_devprod -p wpc4-pg1-glance --user wpc4-pg1-glance -k ceph.client.wpc4-pg1-glance.keyring_devprod $uuid - > /dev/null
Exporting image: 100% complete...done.

real    0m9.494s
user    0m1.502s
sys     0m11.228s
I also repeated the tests pointing the three clients to an 8-node, 72-OSD Red Hat Ceph Storage 4 cluster (ceph version 14.2.4-29.0.TEST.bz1777064.el7cp (7680734a26329b5cbae64651a22ac1605344d785) nautilus (stable)) and noticed the same slowdown with the Nautilus client.
Older releases of Ceph (including older Mimic and Nautilus releases) limited the read queue depth to 1 when exporting to STDOUT. This was improved in later point releases (v14.2.3 was the first Nautilus release to include the improvement).
I'm also seeing a significant slowdown when using the Python RBD library to read the image. I wrote a simple Python script that simulates what the Glance RBD driver does when reading an image from Ceph. I'll attach the Python script to this ticket. It is based on the Glance RBD driver:

https://github.com/openstack/glance_store/blob/master/glance_store/_drivers/rbd.py#L223

I ran this script from one of the Ceph nodes inside this RHCS 3 cluster (Luminous) and it also took more than a minute to read a new 7GB image in 4MB chunks.

# ceph --version
ceph version 12.2.8-128.el7cp (030358773c5213a14c1444a5147258672b2dc15f) luminous (stable)

# dd if=<(openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt < /dev/zero) bs=1M count=6800 iflag=fullblock of=/tmp/large.image
6800+0 records in
6800+0 records out
7130316800 bytes (7.1 GB) copied, 8.14506 s, 875 MB/s

# uuid=$(uuidgen); time rbd import --dest-pool wpc4-pg1-glance /tmp/large.image $uuid
Importing image: 100% complete...done.

real    0m6.675s
user    0m3.405s
sys     0m4.231s

# time ./rbdread.py $uuid
Image size: 7130316800 bytes
Image read successfully

real    1m1.907s
user    0m2.314s
sys     0m5.409s
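The attached script is not reproduced here, but the read pattern it exercises is roughly the minimal sketch below: one 4MB read outstanding at a time (queue depth 1). The conffile path and function name are illustrative assumptions; the pool name is taken from the transcript above.

#!/usr/bin/env python
# Minimal sketch of a sequential RBD read (queue depth 1): one 4MB read
# outstanding at a time. Conffile path is an assumption; pool name comes
# from the transcript above.
import sys

import rados
import rbd

CHUNK = 4 * 1024 * 1024  # 4MB, matching the chunk size used in the test above

def read_image(image_name):
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('wpc4-pg1-glance')
        try:
            with rbd.Image(ioctx, image_name, read_only=True) as image:
                size = image.size()
                print('Image size: %d bytes' % size)
                offset = 0
                while offset < size:
                    length = min(CHUNK, size - offset)
                    image.read(offset, length)  # blocks until this chunk returns
                    offset += length
                print('Image read successfully')
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

if __name__ == '__main__':
    read_image(sys.argv[1])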
Created attachment 1655080 [details] Python script to read an RBD image from Ceph similar to the RBD driver for Glance
(In reply to Sergio de Carvalho from comment #5) > Created attachment 1655080 [details] > Python script to read an RBD image from Ceph similar to the RBD driver for > Glance This is essentially a queue depth of 1 just like Glance, and therefore suffers the same performance problems due to client<->OSD read latency and lack of parallelism. See BZ1389529 for the RFE against Glance to support queue depth >1 for Glance import operations.
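For illustration only, the sketch below shows one way a client could keep several 4MB reads in flight (approximating a read queue depth greater than 1) by issuing synchronous librbd reads from a thread pool over the same image handle. This is not the Glance driver and not the rbd CLI change referenced above; the conffile path, pool name, and queue depth value are placeholders, and it assumes concurrent.futures is available (Python 3, or the futures backport on Python 2) and that the rbd bindings allow concurrent reads on one handle.

#!/usr/bin/env python
# Sketch only: keep several 4MB reads in flight at once by issuing synchronous
# librbd reads from a thread pool. Not the Glance driver and not the rbd CLI
# change referenced above; conffile path, pool name and QUEUE_DEPTH are
# placeholders for this environment.
import sys
from concurrent.futures import ThreadPoolExecutor

import rados
import rbd

CHUNK = 4 * 1024 * 1024
QUEUE_DEPTH = 16  # number of reads kept in flight

def read_image_parallel(image_name):
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('wpc4-pg1-glance')
        try:
            with rbd.Image(ioctx, image_name, read_only=True) as image:
                size = image.size()
                offsets = range(0, size, CHUNK)
                # librbd is thread-safe, so several threads can read from the
                # same image handle concurrently.
                with ThreadPoolExecutor(max_workers=QUEUE_DEPTH) as pool:
                    chunks = pool.map(
                        lambda off: image.read(off, min(CHUNK, size - off)),
                        offsets)
                    for _ in chunks:
                        pass  # discard data; a real consumer would reassemble in order
                print('Read %d bytes with up to %d reads in flight'
                      % (size, QUEUE_DEPTH))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

if __name__ == '__main__':
    read_image_parallel(sys.argv[1])

Whether this helps in practice depends on client<->OSD read latency dominating the single-threaded case, which is what the ~60s vs ~6s numbers above suggest.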
That makes sense now, Jason. I did a test exporting the image to a file and noticed an improvement in download speed: even though writing the actual file to disk added a significant delay, the total export time dropped from 70s to about 35s. I see what you mean about the Glance RBD driver being inefficient. I guess there isn't much we can do there other than writing a threaded version ourselves. Thanks very much for your help.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:1320