Bug 2277692 - nvmeof namespace resize fails while expanding image/namespace
Summary: nvmeof namespace resize fails while expanding image/namespace
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: NVMeOF
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 7.1
Assignee: Gil Bregman
QA Contact: Rahul Lepakshi
Docs Contact: ceph-doc-bot
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-04-29 08:14 UTC by Rahul Lepakshi
Modified: 2024-10-12 04:25 UTC
CC List: 5 users

Fixed In Version: ceph-18.2.1-175.el9cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-06-13 14:32:13 UTC
Embargoed:




Links
System ID                                 Last Updated
Red Hat Issue Tracker RHCEPH-8897         2024-04-29 08:26:29 UTC
Red Hat Product Errata RHSA-2024:3925     2024-06-13 14:32:16 UTC

Description Rahul Lepakshi 2024-04-29 08:14:19 UTC
Description of problem:
I am trying to expand a namespace/RBD image from 1TB to 2TB and it fails as below:

[root@ceph-ibm-ha-v2-81ocsz-node4 cephuser]# nvmeof namespace list -n nqn.2016-06.io.spdk:cnode1 --nsid 1
Namespace 1 in subsystem nqn.2016-06.io.spdk:cnode1:
╒════════╤════════════════════════╤════════╤═════════════╤═════════╤═════════╤═════════════════════╤═════════════╤═══════════╤═══════════╤════════════╤═════════════╕
│   NSID │ Bdev                   │ RBD    │ RBD         │ Image   │ Block   │ UUID                │        Load │ R/W IOs   │ R/W MBs   │ Read MBs   │ Write MBs   │
│        │ Name                   │ Pool   │ Image       │ Size    │ Size    │                     │   Balancing │ per       │ per       │ per        │ per         │
│        │                        │        │             │         │         │                     │       Group │ second    │ second    │ second     │ second      │
╞════════╪════════════════════════╪════════╪═════════════╪═════════╪═════════╪═════════════════════╪═════════════╪═══════════╪═══════════╪════════════╪═════════════╡
│      1 │ bdev_77566f15-5d5d-    │ rbd    │ 6A9L-image1 │ 1 TiB   │ 512 B   │ 77566f15-5d5d-4672- │           1 │ unlimited │ unlimited │ unlimited  │ unlimited   │
│        │ 4672-b94d-fedbacadee6d │        │             │         │         │ b94d-fedbacadee6d   │             │           │           │            │             │
╘════════╧════════════════════════╧════════╧═════════════╧═════════╧═════════╧═════════════════════╧═════════════╧═══════════╧═══════════╧════════════╧═════════════╛
[root@ceph-ibm-ha-v2-81ocsz-node4 cephuser]# nvmeof namespace resize --size 2TB --nsid 1 -n nqn.2016-06.io.spdk:cnode1
Failure resizing namespace: Failure resizing bdev bdev_77566f15-5d5d-4672-b94d-fedbacadee6d: Cannot send after transport endpoint shutdown
[root@ceph-ibm-ha-v2-81ocsz-node4 cephuser]# nvmeof namespace list -n nqn.2016-06.io.spdk:cnode1 --nsid 1
Namespace 1 in subsystem nqn.2016-06.io.spdk:cnode1:
╒════════╤════════════════════════╤════════╤═════════════╤═════════╤═════════╤═════════════════════╤═════════════╤═══════════╤═══════════╤════════════╤═════════════╕
│   NSID │ Bdev                   │ RBD    │ RBD         │ Image   │ Block   │ UUID                │        Load │ R/W IOs   │ R/W MBs   │ Read MBs   │ Write MBs   │
│        │ Name                   │ Pool   │ Image       │ Size    │ Size    │                     │   Balancing │ per       │ per       │ per        │ per         │
│        │                        │        │             │         │         │                     │       Group │ second    │ second    │ second     │ second      │
╞════════╪════════════════════════╪════════╪═════════════╪═════════╪═════════╪═════════════════════╪═════════════╪═══════════╪═══════════╪════════════╪═════════════╡
│      1 │ bdev_77566f15-5d5d-    │ rbd    │ 6A9L-image1 │ 1 TiB   │ 512 B   │ 77566f15-5d5d-4672- │           1 │ unlimited │ unlimited │ unlimited  │ unlimited   │
│        │ 4672-b94d-fedbacadee6d │        │             │         │         │ b94d-fedbacadee6d   │             │           │           │            │             │
╘════════╧════════════════════════╧════════╧═════════════╧═════════╧═════════╧═════════════════════╧═════════════╧═══════════╧═══════════╧════════════╧═════════════╛

Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: [29-Apr-2024 07:56:51] INFO grpc.py:1536: Received request to resize namespace using NSID 1 on nqn.2016-06.io.spdk:cnode1 to 1907349 MiB, context: <grpc._server._Context object at 0x7f38e2114730>, client address: IPv4 10.88.1.59:49504
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: [29-Apr-2024 07:56:51] INFO grpc.py:371: Received request to resize bdev bdev_77566f15-5d5d-4672-b94d-fedbacadee6d to 1907349 MiB, client address: IPv4 10.88.1.59:49504
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: [2024-04-29 07:56:51.843995] bdev_rbd.c:1349:bdev_rbd_resize: *ERROR*: failed to resize the ceph bdev.
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: [29-Apr-2024 07:56:51] ERROR grpc.py:382: Failure resizing bdev bdev_77566f15-5d5d-4672-b94d-fedbacadee6d
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: Traceback (most recent call last):
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   File "/remote-source/ceph-nvmeof/app/control/grpc.py", line 374, in resize_bdev
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:     ret = rpc_bdev.bdev_rbd_resize(
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   File "/usr/lib/python3.9/site-packages/spdk/rpc/bdev.py", line 1158, in bdev_rbd_resize
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:     return client.call('bdev_rbd_resize', params)
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   File "/usr/lib/python3.9/site-packages/spdk/rpc/client.py", line 203, in call
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:     raise JSONRPCException(msg)
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: spdk.rpc.client.JSONRPCException: request:
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: {
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   "name": "bdev_77566f15-5d5d-4672-b94d-fedbacadee6d",
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   "new_size": 1907349,
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   "method": "bdev_rbd_resize",
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   "req_id": 173528
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: }
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: Got JSON-RPC error response
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: response:
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: {
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   "code": -108,
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]:   "message": "Cannot send after transport endpoint shutdown"
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: }
Apr 29 03:56:51 ceph-ibm-ha-v2-81ocsz-node4 ceph-b09b9892-00a1-11ef-9976-fa163e070688-nvmeof-nvmeof_pool-ceph-ibm-ha-v2-81ocsz-node4-tulxkw[1633011]: [29-Apr-2024 07:56:51] ERROR grpc.py:1555: Failure resizing namespace: Failure resizing bdev bdev_77566f15-5d5d-4672-b94d-fedbacadee6d: Cannot send after transport endpoint shutdown
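For reference, the JSON-RPC "code": -108 in the response above is the negated Linux errno ESHUTDOWN, whose strerror text is exactly the "Cannot send after transport endpoint shutdown" message. A quick Python check (illustrative only):

import errno, os

print(errno.ESHUTDOWN)               # 108; SPDK JSON-RPC reports it negated as -108
print(os.strerror(errno.ESHUTDOWN))  # Cannot send after transport endpoint shutdown

So the error denotes a closed transport or image handle rather than a size validation failure, which is consistent with comment 3's observation that RBD was asked to resize an image which had been closed.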


Version-Release number of selected component (if applicable):
# ceph version
ceph version 18.2.1-149.el9cp (6944266a2186e8940baeefc45140e9c798b90141) reef (stable)

How reproducible: Always


Steps to Reproduce:
1. Deploy NVMe-oF on a Ceph cluster.
2. Add a namespace to a subsystem.
3. Try to expand it using the namespace resize command as in the description.

Actual results: Resizing the namespace fails.


Expected results: Resizing the namespace should succeed.


Additional info:

Comment 1 Rahul Lepakshi 2024-04-29 08:19:46 UTC
If the size is given as 2T, the command executes successfully, but if the size is given as 2TB it fails as described above.

The help output lists both decimal (KB, MB, GB, TB) and binary (KiB, MiB, GiB, TiB) units:
[root@ceph-ibm-ha-v2-81ocsz-node4 cephuser]# nvmeof namespace resize -h
usage: python3 -m control.cli namespace resize [-h] --subsystem SUBSYSTEM
                                               [--uuid UUID] [--nsid NSID]
                                               --size SIZE

Resize a namespace

optional arguments:
  -h, --help            show this help message and exit
  --subsystem SUBSYSTEM, -n SUBSYSTEM
                        Subsystem NQN
  --uuid UUID, -u UUID  UUID
  --nsid NSID           Namespace ID
  --size SIZE           Size in bytes or specified unit (KB, KiB, MB, MiB, GB,
                        GiB, TB, TiB)



[root@ceph-ibm-ha-v2-81ocsz-node4 ~]# podman run --rm cp.stg.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:1.2.4-1  --server-address 10.0.209.20 namespace resize -n nqn.2016-06.io.spdk:cnode1 --nsid 1 --size 2T
Resizing namespace 1 in nqn.2016-06.io.spdk:cnode1 to 2097152 MiB: Successful
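The two MiB values in the logs line up with a decimal versus binary reading of the size suffix: "2TB" was parsed as 2 * 10^12 bytes (~1907349 MiB, the failing request in the description), while "2T" was treated as 2 TiB = 2 * 2^40 bytes (exactly 2097152 MiB, the successful request above). A quick arithmetic check in Python:

MIB = 2**20  # bytes per MiB

print(round(2 * 10**12 / MIB))  # 1907349 -> the failing "2TB" request
print(2 * 2**40 // MIB)         # 2097152 -> the successful "2T" request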

FYI - Aviv

Comment 3 Gil Bregman 2024-05-02 09:59:17 UTC
@rlepaksh was there anything else running there? As far as we can see, this error comes from RBD when trying to resize an image that was already closed.

Comment 4 Aviv Caro 2024-05-06 06:51:59 UTC
In 1.2.6 we allow only binary (1024-based) units for image creation, as supported by Ceph RBD (https://docs.ceph.com/en/quincy/man/8/rbd/). I'm still not sure this is the root cause of this issue. Please try to reproduce once you get a new build with GW 1.2.6.

Comment 5 Aviv Caro 2024-05-07 05:25:47 UTC
Rahul, please retest with the 1.2.6-1 downstream build.

Comment 6 Rahul Lepakshi 2024-05-08 07:26:19 UTC
Unable to reproduce this issue on 1.2.6-1.

But for shrink, it would be good if the CLI gave a clearer message such as "shrink not supported" instead of "Invalid argument":

# nvmeof namespace resize --size 4TB --nsid 1 -n nqn.2016-06.io.spdk:cnode1
Failure resizing namespace: Failure resizing bdev bdev_8e413907-e62c-498d-8983-59854d2c2112: Invalid argument

[08-May-2024 07:07:42] INFO grpc.py:1536: Received request to resize namespace using NSID 1 on nqn.2016-06.io.spdk:cnode1 to 4194304 MiB, context: <grpc._server._Co>
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: [08-May-2024 07:07:42] INFO grpc.py:371: Received request to resize bdev bdev_8e413907-e62c-498d-8983-59854d2c2112 to 4194304 MiB, client address: IPv4 10.88.0.35:5>
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: [2024-05-08 07:07:42.505360] bdev_rbd.c:1496:bdev_rbd_resize: *ERROR*: The new bdev size must be larger than current bdev size.
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: [08-May-2024 07:07:42] ERROR grpc.py:382: Failure resizing bdev bdev_8e413907-e62c-498d-8983-59854d2c2112
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: Traceback (most recent call last):
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   File "/remote-source/ceph-nvmeof/app/control/grpc.py", line 374, in resize_bdev
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:     ret = rpc_bdev.bdev_rbd_resize(
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   File "/usr/local/lib/python3.9/site-packages/spdk/rpc/bdev.py", line 1276, in bdev_rbd_resize
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:     return client.call('bdev_rbd_resize', params)
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   File "/usr/local/lib/python3.9/site-packages/spdk/rpc/client.py", line 205, in call
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:     raise JSONRPCException(msg)
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: spdk.rpc.client.JSONRPCException: request:
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: {
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   "name": "bdev_8e413907-e62c-498d-8983-59854d2c2112",
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   "new_size": 4194304,
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   "method": "bdev_rbd_resize",
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   "req_id": 78498
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: }
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: Got JSON-RPC error response
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: response:
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: {
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   "code": -22,
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]:   "message": "Invalid argument"
-ceph-ibm-ha-v2-msxv5d-node4-qatkoy[29025]: }
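Note that in this build "4TB" was parsed as 4 TiB (4 * 2^20 MiB = 4194304 MiB), consistent with the binary-units change in comment 4, and the image was already at least that large, so SPDK rejected the implied shrink with EINVAL (-22). A gateway-side pre-check could surface the clearer message suggested above; a minimal sketch in Python with hypothetical names, not the actual ceph-nvmeof code:

def check_resize(current_size_mib: int, new_size_mib: int) -> None:
    """Hypothetical pre-check: reject shrink attempts before calling bdev_rbd_resize."""
    if new_size_mib <= current_size_mib:
        raise ValueError(
            f"shrink not supported: requested {new_size_mib} MiB, "
            f"current size is {current_size_mib} MiB")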

Comment 9 Rahul Lepakshi 2024-05-20 10:33:42 UTC
The issue is not seen on recent builds.

Comment 10 errata-xmlrpc 2024-06-13 14:32:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925

Comment 11 Red Hat Bugzilla 2024-10-12 04:25:25 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

