Description of problem:
While testing a scenario where the rbd_support client is blocklisted, the error message for `ceph rbd task add flatten` was empty.

Version-Release number of selected component (if applicable):
17.2.6-50.el9cp

How reproducible:
Many times in the scenario

Steps to Reproduce:
1. Blocklist the rbd_support client
2. Try to add a ceph task to flatten a clone

Actual results:
# ceph rbd task add flatten mirror_pool/r_clone
Error EAGAIN:
#

Expected results:
A user-understandable error message

Additional info:
- Created snapshot mirror_pool/r@snap and clone mirror_pool/r_clone
- Blocklisted the rbd_support client:

# CLIENT_ADDR=$(ceph mgr dump | jq .active_clients[] | jq 'select(.name == "rbd_support")' | jq -r '[.addrvec[0].addr, "/", .addrvec[0].nonce|tostring] | add')
ceph osd blocklist add $CLIENT_ADDR
ceph osd blocklist ls | grep $CLIENT_ADDR

blocklisting 1x.x.1xx.2x:0/3400216612 until 2023-05-29T08:48:03.770899+0000 (3600 sec)
listed 16 entries
1x.x.1xx.2x:0/3400216612 2023-05-29T08:48:03.770899+0000

- Tried adding a ceph task to flatten the rbd clone a few times before the rbd_support client recovered:

# date; ceph rbd task add flatten mirror_pool/r_clone
Mon May 29 07:48:24 AM UTC 2023
Error EAGAIN:
# date; ceph rbd task add flatten mirror_pool/r_clone
Mon May 29 07:48:36 AM UTC 2023
Error EAGAIN:

- Finally, after the rbd_support module recovered:

# date; ceph rbd task add flatten mirror_pool/r_clone
Mon May 29 07:49:02 AM UTC 2023
{"sequence": 1, "id": "0b0bff02-440a-4a64-a449-708275a7894b", "message": "Flattening image mirror_pool/r_clone", "refs": {"action": "flatten", "pool_name": "mirror_pool", "pool_namespace": "", "image_name": "r_clone", "image_id": "607c134e4c879"}}
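For clarity, the jq pipeline above (extract the rbd_support client's address and nonce from `ceph mgr dump` to feed `ceph osd blocklist add`) can be sketched in Python. This is only an illustration of the parsing, not part of the reported fix; the embedded mgr dump sample is hypothetical, reduced to just the fields the pipeline reads.

```python
import json

# Hypothetical, trimmed-down `ceph mgr dump` output: only the
# active_clients fields that the jq pipeline actually touches.
SAMPLE_MGR_DUMP = """
{
  "active_clients": [
    {"name": "dashboard",
     "addrvec": [{"type": "v2", "addr": "10.0.0.1:0", "nonce": 11}]},
    {"name": "rbd_support",
     "addrvec": [{"type": "v2", "addr": "10.0.0.2:0", "nonce": 3400216612}]}
  ]
}
"""

def rbd_support_client_addr(mgr_dump_json: str) -> str:
    """Return the "addr/nonce" string to pass to `ceph osd blocklist add`."""
    dump = json.loads(mgr_dump_json)
    for client in dump["active_clients"]:
        if client["name"] == "rbd_support":
            endpoint = client["addrvec"][0]
            return f'{endpoint["addr"]}/{endpoint["nonce"]}'
    raise LookupError("rbd_support client not found in mgr dump")

print(rbd_support_client_addr(SAMPLE_MGR_DUMP))  # -> 10.0.0.2:0/3400216612
```

In a real reproduction the JSON would come from `subprocess.run(["ceph", "mgr", "dump"], ...)` instead of the embedded sample.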
Tested using ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable).

Created a pool, an image, and a snapshot of that image, then cloned the snapshot to a new image:

[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# rbd snap ls rep_pool_jDNsGQuutX/rep_image_KJXaGSnDfq
SNAPID  NAME   SIZE   PROTECTED  TIMESTAMP
     5  snap1  1 GiB  yes        Thu Jul 20 12:48:46 2023
     6  snap2  1 GiB             Thu Jul 20 13:09:39 2023

Protected the snapshot before cloning to the child image (the clone fails until the parent snapshot is protected):

[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# rbd clone rep_pool_jDNsGQuutX/rep_image_KJXaGSnDfq@snap2 pool2/image4
2023-07-20T13:10:58.488+0000 7f88b9c5d640 -1 librbd::image::CloneRequest: 0x5561a89d1a40 validate_parent: parent snapshot must be protected
rbd: clone error: (22) Invalid argument
[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# rbd snap protect rep_pool_jDNsGQuutX/rep_image_KJXaGSnDfq@snap2
[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# rbd clone rep_pool_jDNsGQuutX/rep_image_KJXaGSnDfq@snap2 pool2/image4
[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# rbd du pool2/image4
NAME    PROVISIONED  USED
image4  1 GiB        0 B

Performed client blocklisting using this script: http://pastebin.test.redhat.com/1101663

[root@ceph-rbd1-s-r-avxinc-node2 cephuser]# ./block_client.sh
Blocking 10.0.211.92:0/3556506075
blocklisting 10.0.211.92:0/3556506075 until 2023-07-20T14:12:53.762610+0000 (3600 sec)
listed 3 entries
Confirmed 10.0.211.92:0/3556506075 got blocklisted
**************************************************
Blocking 10.0.211.92:0/3556506075
blocklisting 10.0.211.92:0/3556506075 until 2023-07-20T14:13:00.631901+0000 (3600 sec)
listed 3 entries
Confirmed 10.0.211.92:0/3556506075 got blocklisted
**************************************************

While this script was running, trying to add a task to flatten the child image threw the error below:

[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# ceph rbd task add flatten pool2/image4
Error EAGAIN: [errno 108] RBD connection was shutdown (error opening image b'image4' at snapshot None)

@Ram Raja, is this the expected error message added for it? Can you please suggest the steps to see the error message added in PR https://github.com/ceph/ceph/pull/52064/files, "rbd_support module is not ready, try again"?

NOTE: after stopping the script, I was able to add a task to flatten the child image:

[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# ceph rbd task add flatten pool2/image4
{"sequence": 1, "id": "7865a8f9-44b2-4c0a-9e5e-f07b84a83e8b", "message": "Flattening image pool2/image4", "refs": {"action": "flatten", "pool_name": "pool2", "pool_namespace": "", "image_name": "image4", "image_id": "3bac5a603593"}}
(In reply to Sunil Angadi from comment #8)
> During this ongoing script execution when i try to add a task to flatten the
> child image,
> It throws below error
>
> [ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# ceph rbd task add
> flatten pool2/image4
> Error EAGAIN: [errno 108] RBD connection was shutdown (error opening image
> b'image4' at snapshot None)
>
> @Ram Raja, Is this the expected error message added for it?
>
> Can you please suggest the steps to see ?
> Error message added in the PR
> https://github.com/ceph/ceph/pull/52064/files
> "rbd_support module is not ready, try again"

You won't always see the above message when you execute an `rbd_support` module command after the module's client is blocklisted. Seeing "Error EAGAIN: [errno 108] RBD connection was shutdown (error opening image b'image4' at snapshot None)" is also expected behavior. The error message you see depends on when exactly the blocklist is detected by the rbd_support module and its handlers.

Instead of running the script, I suggest that you manually blocklist the `rbd_support` module's client as described in https://bugzilla.redhat.com/show_bug.cgi?id=2210716#c0. Immediately after that, execute `ceph rbd task add flatten <image>` several times in quick succession or in a loop. You will eventually see the stderr output "rbd_support module is not ready, try again" as the rbd_support module tries to recover from blocklisting.

> Just NOTE: after stopping the script able to add a task to flatten the child
> image
>
> [ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# ceph rbd task add
> flatten pool2/image4
> {"sequence": 1, "id": "7865a8f9-44b2-4c0a-9e5e-f07b84a83e8b", "message":
> "Flattening image pool2/image4", "refs": {"action": "flatten", "pool_name":
> "pool2", "pool_namespace": "", "image_name": "image4", "image_id":
> "3bac5a603593"}}

This is good. It means that the rbd_support module was able to recover from blocklisting.
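The "several times in quick succession or in a loop" suggestion above can be sketched as follows. This is only an illustration of the retry pattern, not part of the fix; the `run` parameter is a hypothetical injection point so the loop can be shown (and exercised) without a live cluster.

```python
import subprocess
import time

def add_flatten_task(image_spec, retries=30, delay=1.0, run=None):
    """Retry `ceph rbd task add flatten <image_spec>` until it succeeds
    or `retries` attempts are exhausted. Returns the last CompletedProcess.

    While the rbd_support module recovers from blocklisting, attempts may
    fail with either "rbd_support module is not ready, try again" or
    "[errno 108] RBD connection was shutdown (...)" on stderr.
    """
    if run is None:
        # Default: invoke the real CLI (requires a reachable cluster).
        run = lambda: subprocess.run(
            ["ceph", "rbd", "task", "add", "flatten", image_spec],
            capture_output=True, text=True)
    result = None
    for _ in range(retries):
        result = run()
        if result.returncode == 0:
            break  # task was queued; stdout holds the task JSON
        time.sleep(delay)
    return result
```

With the default `run`, this is equivalent to re-running the `ceph rbd task add flatten` command by hand until the module finishes recovering.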
Thanks for the info, Ram. I was able to get the error message as per the PR:

[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# ceph rbd task add flatten pool2/image5
Error EAGAIN: rbd_support module is not ready, try again

This error message was also seen, which is expected:

[ceph: root@ceph-rbd1-s-r-avxinc-node1-installer /]# ceph rbd task add flatten pool2/image4
Error EAGAIN: [errno 108] RBD connection was shutdown (error opening image b'image4' at snapshot None)

Fix working as expected. Verified using ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 6.1 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:4473