Created attachment 1186276 [details]
bt

Description of problem:
rbd bench-write: segmentation fault when value of "--io-size" is greater than or equal to image size.

Version-Release number of selected component (if applicable):
ceph version 10.2.2-26.el7cp

How reproducible:
always

Steps to Reproduce:
1. create an image of size 100M
   --cmd: rbd create test_rbd/cephRBD --size 100M
2. start 'rbd bench-write' on this image with '--io-size' as 101M
   --cmd: rbd --cluster ceph bench-write test_rbd/cephRBD --io-size 101M --io-threads 3 --io-total 1M --io-pattern rand
   --result: segmentation fault seen
3. create an image of size 4096M
4. start 'rbd bench-write' on this image with '--io-size' as 4096M
   --cmd: rbd --cluster ceph bench-write -p test_rbd --image test0 --io-size 4096M --io-threads 3 --io-total 1M --io-pattern rand
   --result: segmentation fault seen

Actual results:
segmentation fault observed after executing step 2 and 4.

Expected results:

Additional info:
1) Please see the attachment with bt for both the cases.
2) Image sizes:

[root@magna105 ubuntu]# rbd info test_rbd/test0
rbd image 'test0':
        size 4096 MB in 1024 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.658322ae8944a
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:

[root@magna105 ubuntu]# rbd info test_rbd/cephRBD
rbd image 'cephRBD':
        size 102400 kB in 25 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.6bb7e238e1f29
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
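The object counts in the `rbd info` output follow from the image size and the object order. A minimal sketch of that arithmetic, assuming the "MB" and "kB" shown by rbd are binary units (1 MB = 1024 kB = 2^20 bytes); the function name `object_count` is illustrative and not part of any Ceph API:

```python
MIB = 1 << 20  # bytes per "M" as used by the rbd size arguments (assumed binary)


def object_count(image_bytes: int, order: int) -> int:
    """Number of objects backing an image: ceil(image size / 2**order)."""
    object_bytes = 1 << order  # order 22 means 4 MiB (4096 kB) objects
    return (image_bytes + object_bytes - 1) // object_bytes


# The two images from the report, both with order 22.
print(object_count(100 * MIB, 22))   # cephRBD: 102400 kB image
print(object_count(4096 * MIB, 22))  # test0: 4096 MB image
```

Both results agree with the `rbd info` output above (25 and 1024 objects).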
Upstream pull request: https://github.com/ceph/ceph/pull/10708
Still facing this issue whenever both the io size and the image size are greater than or equal to 4096M and the io size is less than or equal to the chosen image size (4096M <= io_size <= image_size).

Regards,
Vasishta
Please provide the exact command you executed and the exact result you witnessed. What you are describing doesn't match the bug (which was a crash when io_size was *greater* than the image size).
Hi Jason,

As mentioned in Comment 0 ('segmentation fault when value of "--io-size" is greater than or equal to image size'), both cases (1. greater than and 2. equal) have been tried.

1. When io_size is greater than image_size
   - An error message is displayed, saying this is not possible.
2. When io_size is equal to image_size
   a. If image_size and io_size are less than 4096
      sudo rbd --cluster ceph bench-write images/img2 --io-size 4096M --io-threads 3 --io-total 100M --io-pattern rand
Sorry Jason, Please ignore Comment 11. It is incomplete. I'll provide full info shortly. Regards, Vasishta
Ack -- so comment 9 should read something along the lines of the following: rbd bench-write will still crash if the io_size is *equal to* the image size so long as the image_size/io_size is less than X. The original wording stated that it was crashing as long as io_size was less than the image size, which is a major issue if true. Given that this is a small corner case, I am going to move this to the next release.
The actual issue is that the "--io-size=4096M" (i.e. 4GB io size) is leading to memory corruption due to overflow. The "rbd bench-write" command actually works properly if "--io-size" is equal to the image size so long as it doesn't pass the 4GB size boundary.
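To illustrate why 4096M is the boundary, here is a small sketch of what happens when a byte count is squeezed into an unsigned 32-bit value. That the io size passes through a 32-bit quantity is an assumption made for illustration; this comment only states that an overflow occurs at the 4GB boundary. The helper `truncate_to_u32` is hypothetical.

```python
MIB = 1 << 20
U32_MASK = 0xFFFFFFFF  # largest value an unsigned 32-bit integer can hold


def truncate_to_u32(n: int) -> int:
    """Emulate storing n in an unsigned 32-bit integer (keeps the low 32 bits)."""
    return n & U32_MASK


for mib in (4095, 4096, 5000):
    size = mib * MIB
    print(f"{mib}M = {size} bytes -> {truncate_to_u32(size)} as u32")
```

Under that assumption, 4095M survives unchanged, while 4096M (exactly 2^32 bytes) wraps to 0, so any buffer or length derived from the truncated value no longer matches the requested io size.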
Hi all,

I couldn't complete Comment 11, sorry for that.

As Jason mentioned in Comment 15, the segmentation fault occurred whenever io-size was >= 4096M and image-size was >= 4096M, i.e. image_size >= io-size >= 4096M (after the first fix).

Now io-size has been capped to 4095M, irrespective of image size.

$ sudo rbd bench data/im2 --io-type write --io-size 4096M --io-threads 3 --io-total 100M --io-pattern rand --cluster 12_luminous
rbd: io-size should be less than 4G
bench failed: (22) Invalid argument

$ sudo rbd bench data/im2 --io-type write --io-size 10240M --io-threads 3 --io-total 100M --io-pattern rand --cluster 12_luminous
rbd: io-size should be less than 4G
bench failed: (22) Invalid argument

$ sudo rbd bench data/im3 --io-type write --io-size 5000M --io-threads 3 --io-total 100M --io-pattern rand --cluster 12_luminous
rbd: io-size 5000 MB larger than image size 102400 kB
bench failed: (22) Invalid argument

Moving to VERIFIED state.

Regards,
Vasishta
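The behaviour verified above can be summarised as two checks on the requested io size. The sketch below is not the actual rbd code: the function name, the message wording in bytes, and the order of the checks are assumptions inferred from the output shown (the 5000M request against the 102400 kB image reports the image-size error rather than the 4G error). The size used for `im2` is also hypothetical, since it is not stated in this comment.

```python
from typing import Optional

MIB = 1 << 20
GIB = 1 << 30


def validate_io_size(io_size: int, image_size: int) -> Optional[str]:
    """Return an error message if io_size is rejected, or None if it is accepted."""
    # Assumed order: the image-size check comes first, matching the im3 output.
    if io_size > image_size:
        return f"io-size {io_size} B larger than image size {image_size} B"
    if io_size >= 4 * GIB:
        return "io-size should be less than 4G"
    return None


im2_size = 10240 * MIB  # hypothetical: large enough to reach the 4G check
im3_size = 100 * MIB    # 102400 kB, as reported in the error message

print(validate_io_size(4096 * MIB, im2_size))
print(validate_io_size(10240 * MIB, im2_size))
print(validate_io_size(5000 * MIB, im3_size))
print(validate_io_size(4095 * MIB, im2_size))
```

With these assumptions, the first two calls hit the 4G limit, the third hits the image-size limit, and 4095M is accepted.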
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387