Created attachment 1847075 [details]
IO error log and scripts

Description of problem (please be as detailed as possible and provide log snippets):

A test executed as in the script rbd-mirror-io-test.sh (attached), where RBD mirroring is set up to mirror on a 1-minute schedule across 2 Ceph clusters, shows the following behaviour:

- After the script (rbd-mirror-io-test.sh) is executed, start IO on the image against the primary instance using the following command:
  rbd bench --io-type write --io-size 4K --io-threads 2 --io-total 10M --io-pattern rand replicapool/test-demote-sb-1
- In parallel, watch in a loop for rbd snapshots being created for the image under test:
  while true; do date -u; rbd snap ls replicapool/test-demote-sb-1 --all; echo "============================="; sleep 5; done
- When a snapshot schedule is triggered (at the minute), the following is observed on the terminal performing IO on the image:
  - IO starts slowing down drastically
  - IO errors out with logs as follows (also attached as rbd-io-error.log)

Log snippet:

2021-12-20T14:08:04.660+0000 7f4ccd8c3700 -1 librbd::ImageWatcher: 0x7f4cb40078e0 image watch failed: 139967279926704, (107) Transport endpoint is not connected
2021-12-20T14:08:04.660+0000 7f4ccd8c3700 -1 librbd::Watcher: 0x7f4cb40078e0 handle_error: handle=139967279926704: (107) Transport endpoint is not connected
    30      1460   11.5875   46 KiB/s
    31      1468   10.425    42 KiB/s
2021-12-20T14:08:05.999+0000 7f4ccd8c3700 -1 librbd::io::ObjectRequest: 0x7f4cac007150 handle_write_object: rbd_data.1705844598362.000000000000024a failed to write object: (108) Cannot send after transport endpoint shutdown
2021-12-20T14:08:05.999+0000 7f4ccd8c3700 -1 librbd::cache::WriteAroundObjectDispatch: 0x7f4cac00aea0 handle_in_flight_io_complete: IO error encountered: tid=3: (108) Cannot send after transport endpoint shutdown
2021-12-20T14:08:06.024+0000 7f4ccd8c3700 -1 librbd::io::ObjectRequest: 0x7f4cc400abe0 handle_write_object: rbd_data.1705844598362.000000000000413a failed to write object: (108) Cannot send after transport endpoint shutdown
2021-12-20T14:08:06.024+0000 7f4ccd8c3700 -1 librbd::cache::WriteAroundObjectDispatch: 0x7f4cac00aea0 handle_in_flight_io_complete: IO error encountered: tid=4: (108) Cannot send after transport endpoint shutdown
2021-12-20T14:08:06.029+0000 7f4ccd8c3700 -1 librbd::io::ObjectRequest: 0x7f4cb401aa10 handle_write_object: rbd_data.1705844598362.0000000000003ad5 failed to write object: (108) Cannot send after transport endpoint shutdown
2021-12-20T14:08:06.029+0000 7f4ccd8c3700 -1 librbd::cache::WriteAroundObjectDispatch: 0x7f4cac00aea0 handle_in_flight_io_complete: IO error encountered: tid=6: (108) Cannot send after transport endpoint shutdown
2021-12-20T14:08:06.042+0000 7f4ccd8c3700 -1 librbd::io::ObjectRequest: 0x7f4cb401f9e0 handle_write_object: rbd_data.1705844598362.0000000000001915 failed to write object: (108) Cannot send after transport endpoint shutdown
2021-12-20T14:08:06.042+0000 7f4ccd8c3700 -1 librbd::cache::WriteAroundObjectDispatch: 0x7f4cac00aea0 handle_in_flight_io_complete: IO error encountered: tid=7: (108) Cannot send after transport endpoint shutdown

Further observations are as follows:
- IO errors out even if the secondary cluster is shut down (i.e. image replay is not in flight)
- IO slowness (~65 KiB/s) is observed only when mirroring is in play; it does not happen for a regular (non-mirrored) image for which a snapshot is taken during the same IO test

Problems to look at:
- IO errors
- IO slowness when the image is mirroring-enabled

Version of all relevant components (if applicable):
- ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific (stable)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
- If IO errors out when mirroring is in effect, this would impact clients using the image
- The IO slowdown is quite drastic and would impact client access to the image

Is there any workaround available to the best of your knowledge?
- No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
- 2

Is this issue reproducible?
- Yes

Can this issue be reproduced from the UI?
- NA

If this is a regression, please provide more details to justify this:
- No

Steps to Reproduce:
- As above (a consolidated sketch of the test is shown below)
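For convenience, the two terminals used in the description can be combined into one script. This is only a minimal reproduction sketch, assuming the two-cluster mirroring setup from the attached rbd-mirror-io-test.sh is already in place and replicapool/test-demote-sb-1 already has the 1-minute mirror snapshot schedule:

#!/usr/bin/env bash
# Minimal reproduction sketch (assumes the mirroring setup from
# rbd-mirror-io-test.sh is already in place).
set -euo pipefail

POOL=replicapool
IMAGE=test-demote-sb-1

# Terminal 2 equivalent: watch mirror snapshots being created for the image.
(
  while true; do
    date -u
    rbd snap ls "${POOL}/${IMAGE}" --all
    echo "============================="
    sleep 5
  done
) &
WATCH_PID=$!

# Terminal 1 equivalent: drive small random writes against the primary image.
# Around the minute boundary (when the scheduled mirror snapshot is taken),
# the bench throughput drops and eventually errors out with (107)/(108).
rbd bench --io-type write --io-size 4K --io-threads 2 \
          --io-total 10M --io-pattern rand "${POOL}/${IMAGE}" || true

kill "${WATCH_PID}"

With this running, the bench output shows the throughput drop and the "Transport endpoint is not connected" / "Cannot send after transport endpoint shutdown" errors shortly after a scheduled mirror snapshot is taken.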
Further observations:
- When IO is interrupted, the client performing the IO is blocklisted
- This was determined by watching the blocklist entries against the OSDs while IO was ongoing and a mirror snapshot schedule got triggered
- The output looks as follows:

while true; do date -u; rbd snap ls replicapool/test-demote-sb-1 --all; ceph osd blocklist ls; echo "============================="; sleep 5; done
...
Mon Dec 20 21:50:07 UTC 2021
SNAPID  NAME                                                                                        SIZE     PROTECTED  TIMESTAMP                 NAMESPACE
 11621  .mirror.primary.a2ef2d48-0cbb-4e05-8d29-d93706f42a22.6c67935b-5f0a-4490-ac65-55bca3ab24dc  100 GiB             Mon Dec 20 14:44:01 2021  mirror (primary peer_uuids:[49a8473a-9cfd-4797-8266-443eba34dfdf])
 11630  .mirror.primary.a2ef2d48-0cbb-4e05-8d29-d93706f42a22.6687c762-78d3-4e75-ad75-a88de4ea513b  100 GiB             Mon Dec 20 14:45:39 2021  mirror (primary peer_uuids:[49a8473a-9cfd-4797-8266-443eba34dfdf])
 13319  .mirror.primary.a2ef2d48-0cbb-4e05-8d29-d93706f42a22.e51f1b07-d2cb-4baa-9442-3914e5f5e989  100 GiB             Mon Dec 20 21:49:01 2021  mirror (primary peer_uuids:[49a8473a-9cfd-4797-8266-443eba34dfdf])
listed 0 entries
=============================
Mon Dec 20 21:50:14 UTC 2021
SNAPID  NAME                                                                                        SIZE     PROTECTED  TIMESTAMP                 NAMESPACE
 11621  .mirror.primary.a2ef2d48-0cbb-4e05-8d29-d93706f42a22.6c67935b-5f0a-4490-ac65-55bca3ab24dc  100 GiB             Mon Dec 20 14:44:01 2021  mirror (primary peer_uuids:[49a8473a-9cfd-4797-8266-443eba34dfdf])
 11630  .mirror.primary.a2ef2d48-0cbb-4e05-8d29-d93706f42a22.6687c762-78d3-4e75-ad75-a88de4ea513b  100 GiB             Mon Dec 20 14:45:39 2021  mirror (primary peer_uuids:[49a8473a-9cfd-4797-8266-443eba34dfdf])
 13319  .mirror.primary.a2ef2d48-0cbb-4e05-8d29-d93706f42a22.e51f1b07-d2cb-4baa-9442-3914e5f5e989  100 GiB             Mon Dec 20 21:49:01 2021  mirror (primary peer_uuids:[49a8473a-9cfd-4797-8266-443eba34dfdf])
172.17.0.2:0/4264307978 2021-12-20T22:50:12.108020+0000
listed 1 entries
=============================

NOTE: After the IO errors occur, the blocklist starts listing the client that was performing the IO.

As the image has fast-diff (and hence exclusive-lock) enabled, it is possibly the client not giving up the exclusive lock that causes it to be blocklisted by the mirror process, which results in the IO errors.
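To correlate the blocklisted address with the exclusive-lock holder while the test runs, the lock owner and watchers on the image can be inspected alongside the blocklist. A small sketch (pool/image names taken from this report; these are standard rbd/ceph inspection commands and were not part of the original monitoring loop):

#!/usr/bin/env bash
POOL=replicapool
IMAGE=test-demote-sb-1

# Who currently holds the exclusive lock on the image?
rbd lock ls "${POOL}/${IMAGE}"

# Which clients are watching the image (includes the bench client)?
rbd status "${POOL}/${IMAGE}"

# Current OSD blocklist; after the IO errors, the bench client's address
# (e.g. 172.17.0.2:0/4264307978 above) shows up here.
ceph osd blocklist ls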
Thanks Shyam for providing the dev environment; I was able to reproduce this issue. On debugging, I found the cause: since list_snaps is issued in parallel with the I/Os, both of these go through rados_api.execute (the boost::asio path) against the same object, and this makes the list_snaps operation block the running I/Os. This should not be a blocking issue, as we do not expect users to generally be constantly monitoring the snapshot listing with `snap ls`. The issue is not observed when a monitoring shell running `snap ls` is not kept running in parallel with the I/Os.
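As a sanity check of the above, the two cases can be contrasted directly. This is only a sketch built from the commands already in this BZ (same setup and image names), comparing a bench run without and with the parallel `snap ls` loop:

#!/usr/bin/env bash
POOL=replicapool
IMAGE=test-demote-sb-1

# Run A: bench only, no snapshot-listing loop -- the issue is not observed.
rbd bench --io-type write --io-size 4K --io-threads 2 \
          --io-total 10M --io-pattern rand "${POOL}/${IMAGE}"

# Run B: bench with a parallel `snap ls` loop -- reproduces the slowdown/errors.
( while true; do rbd snap ls "${POOL}/${IMAGE}" --all >/dev/null; sleep 5; done ) &
LOOP_PID=$!
rbd bench --io-type write --io-size 4K --io-threads 2 \
          --io-total 10M --io-pattern rand "${POOL}/${IMAGE}" || true
kill "${LOOP_PID}"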
Deepika, what is the plan for this BZ? Are we going to fix it? If yes, do you want to open a clone Ceph BZ?
Not a 4.10 blocker, moving DR BZs to 4.10.z
The Ceph BZ has been moved to 6.2.