Bug 2241040 - SPDK reactor (poll_group_0) becomes a bottleneck under I/O load
Summary: SPDK reactor (poll_group_0) becomes a bottleneck under I/O load
Keywords:
Status: CLOSED COMPLETED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: NVMeOF
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 7.0
Assignee: Aviv Caro
QA Contact: Manohar Murthy
Docs Contact: Rivka Pollack
URL:
Whiteboard:
Depends On:
Blocks: 2237662
 
Reported: 2023-09-27 19:48 UTC by Paul Cuzner
Modified: 2024-05-24 04:25 UTC
CC List: 10 users

Fixed In Version: 0.0.4-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-12-13 15:23:41 UTC
Embargoed:




Links:
Red Hat Issue Tracker RHCEPH-7570 (last updated 2023-09-27 19:51:46 UTC)
Red Hat Product Errata RHBA-2023:7780 (last updated 2023-12-13 15:23:45 UTC)

Comment 1 Vikhyat Umrao 2023-09-27 19:57:49 UTC
Moving it to 7.0 and changing it to ASSIGNED, as the patch will be part of Ceph 7.0.

Comment 8 Rahul Lepakshi 2023-10-04 06:47:28 UTC
IMO, the expectation from this BZ is to have 8 reactors, as discussed in the Slack thread at https://ibm-systems-storage.slack.com/archives/C04QC5EGBPU/p1696275905395639?thread_ts=1695844215.107879&cid=C04QC5EGBPU
But I am seeing only reactor_0 getting started on nvmeof service deployment.

Below are observations and logs:

NVMeOF GW version - registry-proxy.engineering.redhat.com/rh-osbs/ceph-nvmeof:0.0.4-1

[ceph: root@ceph-nvmf3-hrhd31-node1-installer /]# ceph version
ceph version 18.2.0-72.el9cp (3f281315a9c7d4bb2281729a5f3c3366ad99193d) reef (stable)

[root@ceph-nvmf3-hrhd31-node5 cephuser]# podman exec -it fee3c2414157 /bin/bash

[root@ceph-nvmf3-hrhd31-node5 /]# ./usr/libexec/spdk/scripts/rpc.py framework_get_reactors
{
  "tick_rate": 2290000000,
  "reactors": [
    {
      "lcore": 0,
      "busy": 44811720374,
      "idle": 16390563017344,
      "in_interrupt": false,
      "lw_threads": [
        {
          "name": "app_thread",
          "id": 1,
          "cpumask": "1",
          "elapsed": 16435377140690
        },
        {
          "name": "nvmf_tgt_poll_group_0",
          "id": 2,
          "cpumask": "1",
          "elapsed": 16435204058622
        }
      ]
    }
  ]
}
[root@ceph-nvmf3-hrhd31-node5 /]# ./usr/libexec/spdk/scripts/rpc.py thread_get_pollers
{
  "tick_rate": 2290000000,
  "threads": [
    {
      "name": "app_thread",
      "id": 1,
      "active_pollers": [],
      "timed_pollers": [
        {
          "name": "rpc_subsystem_poll",
          "id": 1,
          "state": "waiting",
          "run_count": 1795366,
          "busy_count": 1795366,
          "period_ticks": 9160000
        },
        {
          "name": "nvmf_tcp_accept",
          "id": 2,
          "state": "waiting",
          "run_count": 718330,
          "busy_count": 0,
          "period_ticks": 22900000
        }
      ],
      "paused_pollers": []
    },
    {
      "name": "nvmf_tgt_poll_group_0",
      "id": 2,
      "active_pollers": [
        {
          "name": "nvmf_poll_group_poll",
          "id": 1,
          "state": "waiting",
          "run_count": 71462271532,
          "busy_count": 0
        },
        {
          "name": "accel_comp_poll",
          "id": 2,
          "state": "waiting",
          "run_count": 71460946591,
          "busy_count": 0
        }
      ],
      "timed_pollers": [],
      "paused_pollers": []
    }
  ]
}
[root@ceph-nvmf3-hrhd31-node5 /]# ./usr/libexec/spdk/scripts/rpc.py thread_get_stats
{
  "tick_rate": 2290000000,
  "threads": [
    {
      "name": "app_thread",
      "id": 1,
      "cpumask": "1",
      "busy": 44867777158,
      "idle": 7791316594912,
      "active_pollers_count": 0,
      "timed_pollers_count": 2,
      "paused_pollers_count": 0
    },
    {
      "name": "nvmf_tgt_poll_group_0",
      "id": 2,
      "cpumask": "1",
      "busy": 30718892,
      "idle": 8656715169684,
      "active_pollers_count": 2,
      "timed_pollers_count": 0,
      "paused_pollers_count": 0
    }
  ]
}


[root@ceph-nvmf3-hrhd31-node5 cephuser]# top -p 24061
top - 02:35:20 up  2:11,  1 user,  load average: 1.05, 1.09, 1.11
Tasks:   1 total,   1 running,   0 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.5 us,  0.0 sy,  0.0 ni, 87.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  15724.9 total,   1047.3 free,   9979.9 used,   5288.3 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   5745.0 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  24061 root      20   0   71.9g 268520  29696 R 100.0   1.7 120:06.24 reactor_0


Oct 04 00:34:46 ceph-nvmf3-hrhd31-node5 systemd[1]: Starting Ceph nvmeof.rbd.ceph-nvmf3-hrhd31-node5.begrve for 33fddd3a-626e-11ee-8bcb-fa163e0c7e19...
Oct 04 00:34:46 ceph-nvmf3-hrhd31-node5 bash[23971]: Trying to pull registry-proxy.engineering.redhat.com/rh-osbs/ceph-nvmeof:0.0.4-1...
Oct 04 00:34:47 ceph-nvmf3-hrhd31-node5 bash[23971]: Getting image source signatures
Oct 04 00:34:47 ceph-nvmf3-hrhd31-node5 bash[23971]: Copying blob sha256:920975eb20bc5ffc05b1e0e8667d21a888ff961462740c338ecdd44748722d5d
Oct 04 00:34:47 ceph-nvmf3-hrhd31-node5 bash[23971]: Copying blob sha256:35e8d0567610305e5133f45eac553d3f57e4f33e2f764a1f16bab4f3bf24ad86
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 bash[23971]: Copying config sha256:69c2cf6e1104312b4bd816e9911d0e5494a8de22797e35c29f1a17808f1c1c81
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 bash[23971]: Writing manifest to image destination
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 bash[23971]: Storing signatures
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 podman[23971]:
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 podman[23971]: 2023-10-04 00:35:08.147382731 -0400 EDT m=+21.639621888 container create fee3c241415706fc196d2c1f91a46cd22d736bf7bd5950f59a203f201601c7ed (image=registry-proxy.engineering.redhat.com/rh-osbs/ceph-nvmeof:0.0.4-1, name=ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve, build-date=2023-09-28T18:35:50, vcs-type=git, io.openshift.expose-services=, url=https://access.redhat.com/containers/#/registry.access.redhat.com/ceph-nvmeof/images/0.0.4-1, description=Ceph NVMe over Fabrics Gateway, architecture=x86_64, io.buildah.version=1.29.0, io.k8s.description=Ceph NVMe over Fabrics Gateway, io.openshift.tags=minimal rhel9, name=ceph-nvmeof, summary=Service to provide block storage on top of Ceph for platforms (e.g.: VMWare) without native Ceph support (RBD), replacing existing approaches (iSCSI) with a newer and more versatile standard (NVMe-oF)., com.redhat.license_terms=https://www.redhat.com/agreements, com.redhat.component=ceph-nvmeof-container, vcs-ref=1bf80844db1f8190085b23b96d894dd34ee5e7f5, vendor=Red Hat, Inc., maintainer=Alexander Indenbaum <aindenba>, distribution-scope=public, version=0.0.4, release=1, io.k8s.display-name=Red Hat Universal Base Image 9 Minimal)
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 podman[23971]: 2023-10-04 00:34:46.529800398 -0400 EDT m=+0.022039534 image pull  registry-proxy.engineering.redhat.com/rh-osbs/ceph-nvmeof:0.0.4-1
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 podman[23971]: 2023-10-04 00:35:08.198964668 -0400 EDT m=+21.691203781 container init fee3c241415706fc196d2c1f91a46cd22d736bf7bd5950f59a203f201601c7ed (image=registry-proxy.engineering.redhat.com/rh-osbs/ceph-nvmeof:0.0.4-1, name=ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve, io.k8s.description=Ceph NVMe over Fabrics Gateway, io.k8s.display-name=Red Hat Universal Base Image 9 Minimal, io.openshift.expose-services=, io.buildah.version=1.29.0, build-date=2023-09-28T18:35:50, io.openshift.tags=minimal rhel9, architecture=x86_64, com.redhat.license_terms=https://www.redhat.com/agreements, com.redhat.component=ceph-nvmeof-container, release=1, url=https://access.redhat.com/containers/#/registry.access.redhat.com/ceph-nvmeof/images/0.0.4-1, description=Ceph NVMe over Fabrics Gateway, summary=Service to provide block storage on top of Ceph for platforms (e.g.: VMWare) without native Ceph support (RBD), replacing existing approaches (iSCSI) with a newer and more versatile standard (NVMe-oF)., maintainer=Alexander Indenbaum <aindenba>, version=0.0.4, name=ceph-nvmeof, vcs-type=git, vendor=Red Hat, Inc., vcs-ref=1bf80844db1f8190085b23b96d894dd34ee5e7f5, distribution-scope=public)
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 podman[23971]: 2023-10-04 00:35:08.203373732 -0400 EDT m=+21.695612835 container start fee3c241415706fc196d2c1f91a46cd22d736bf7bd5950f59a203f201601c7ed (image=registry-proxy.engineering.redhat.com/rh-osbs/ceph-nvmeof:0.0.4-1, name=ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve, version=0.0.4, name=ceph-nvmeof, release=1, maintainer=Alexander Indenbaum <aindenba>, summary=Service to provide block storage on top of Ceph for platforms (e.g.: VMWare) without native Ceph support (RBD), replacing existing approaches (iSCSI) with a newer and more versatile standard (NVMe-oF)., description=Ceph NVMe over Fabrics Gateway, distribution-scope=public, build-date=2023-09-28T18:35:50, url=https://access.redhat.com/containers/#/registry.access.redhat.com/ceph-nvmeof/images/0.0.4-1, vcs-ref=1bf80844db1f8190085b23b96d894dd34ee5e7f5, io.openshift.expose-services=, io.openshift.tags=minimal rhel9, vcs-type=git, architecture=x86_64, com.redhat.license_terms=https://www.redhat.com/agreements, com.redhat.component=ceph-nvmeof-container, io.k8s.description=Ceph NVMe over Fabrics Gateway, io.k8s.display-name=Red Hat Universal Base Image 9 Minimal, vendor=Red Hat, Inc., io.buildah.version=1.29.0)
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 bash[23971]: fee3c241415706fc196d2c1f91a46cd22d736bf7bd5950f59a203f201601c7ed
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 systemd[1]: Started Ceph nvmeof.rbd.ceph-nvmf3-hrhd31-node5.begrve for 33fddd3a-626e-11ee-8bcb-fa163e0c7e19.
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.server:Starting gateway client.nvmeof.rbd.ceph-nvmf3-hrhd31-node5.begrve
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.server:Starting serve
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.server:Configuring server client.nvmeof.rbd.ceph-nvmf3-hrhd31-node5.begrve
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.server:SPDK Target Path: /usr/local/bin/nvmf_tgt
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.server:SPDK Socket: /var/tmp/spdk.sock
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.server:Starting /usr/local/bin/nvmf_tgt -u -r /var/tmp/spdk.sock
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.server:Attempting to initialize SPDK: rpc_socket: /var/tmp/spdk.sock, conn_retries: 300, timeout: 60.0
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO: Setting log level to WARN
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:JSONRPCClient(/var/tmp/spdk.sock):Setting log level to WARN
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 04:35:08.378455] Starting SPDK v23.01.1 / DPDK 22.11.0 initialization...
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 04:35:08.378524] [ DPDK EAL parameters: nvmf --no-shconf -c 0x1 --no-pci --huge-unlink --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --iova-mode=pa --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk_pid3 ]
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: TELEMETRY: No legacy callbacks, legacy socket not created
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 04:35:08.499367] app.c: 712:spdk_app_start: *NOTICE*: Total cores available: 1
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 04:35:08.564323] reactor.c: 926:reactor_run: *NOTICE*: Reactor started on core 0
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 04:35:08.614055] accel_sw.c: 681:sw_accel_module_init: *NOTICE*: Accel framework software module initialized.
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.server:create_transport: tcp options: {"in_capsule_data_size": 8192, "max_io_qpairs_per_ctrlr": 7}
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 04:35:08.755984] tcp.c: 629:nvmf_tcp_create: *NOTICE*: *** TCP Transport Init ***
Oct 04 00:35:08 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.state:First gateway: created object nvmeof.None.state
Oct 04 01:44:12 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:Received request to create subsystem nqn.2016-06.io.spdk:cnode1
Oct 04 01:44:12 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:create_subsystem nqn.2016-06.io.spdk:cnode1: True
Oct 04 01:44:12 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.state:omap_key generated: subsystem_nqn.2016-06.io.spdk:cnode1
Oct 04 01:44:14 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:Received request to create client.nvmeof.rbd.ceph-nvmf3-hrhd31-node5.begrve TCP listener for nqn.2016-06.io.spdk:cnode1 at 10.0.211.212:5001.
Oct 04 01:44:14 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 05:44:14.713298] tcp.c: 850:nvmf_tcp_listen: *NOTICE*: *** NVMe/TCP Target Listening on 10.0.211.212 port 5001 ***
Oct 04 01:44:14 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:create_listener: True
Oct 04 01:44:14 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.state:omap_key generated: listener_nqn.2016-06.io.spdk:cnode1_client.nvmeof.rbd.ceph-nvmf3-hrhd31-node5.begrve_TCP_10.0.211.212_5001
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:Received request to allow any host to nqn.2016-06.io.spdk:cnode1
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:add_host *: True
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.state:omap_key generated: host_nqn.2016-06.io.spdk:cnode1_*
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:Received request to create bdev U7U9-bdev1 from rbd/U7U9-image1 with block size 4096
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:Allocating cluster name='cluster_context_0'
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: [2023-10-04 05:44:15.838915] bdev_rbd.c:1199:bdev_rbd_create: *NOTICE*: Add U7U9-bdev1 rbd disk to lun
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:create_bdev: U7U9-bdev1
Oct 04 01:44:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.state:omap_key generated: bdev_U7U9-bdev1
Oct 04 01:44:16 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:Received request to add U7U9-bdev1 to nqn.2016-06.io.spdk:cnode1
Oct 04 01:44:16 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: INFO:control.grpc:add_namespace: 1
Oct 04 01:44:16 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[24032]: DEBUG:control.state:omap_key generated: namespace_nqn.2016-06.io.spdk:cnode1_1

Comment 9 Aviv Caro 2023-10-04 08:28:34 UTC
Paul, I think this is fixed in 0.0.4. Please confirm.

Comment 10 Paul Cuzner 2023-10-04 19:44:30 UTC
This patch doesn't make the default number of reactors 8 - that's still governed by the cpumask setting in the ceph-nvmeof.conf file. Unless you specify the mask with tgt_cmd_extra_args in the spec, you'll only get one reactor.

The focus of the backport is to prevent the serialisation of all I/O through reactor_0.
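
For reference, a minimal sketch of the corresponding ceph-nvmeof.conf setting (the [spdk] section and key names are assumptions based on the gateway config layout; verify against the file actually deployed on your gateway):

[spdk]
tgt_path = /usr/local/bin/nvmf_tgt
tgt_cmd_extra_args = --cpumask=0xFF --msg-mempool-size=524288

With --cpumask=0xFF the target is allowed to run on 8 cores, so SPDK starts 8 reactors instead of the single reactor_0 seen in the logs above.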

Comment 11 Paul Cuzner 2023-10-04 23:24:12 UTC
Here's an example of deploying a gateway with 8 reactors:

service_type: nvmeof
service_id: gw
placement:
  label: nvmeof
spec:
  pool: rbd
  tgt_cmd_extra_args: --cpumask=0xFF --msg-mempool-size=524288


However, there is a known issue where this parameter is not formatted correctly when the config is written by cephadm.
See: https://tracker.ceph.com/issues/62838

The parameter is not dynamic, so for testing you can manually update the conf file and restart the gateway. Restarting the gateway is quick, and the hosts don't detect it as a timeout - at least in my tests!
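
A rough sketch of that manual workaround (the config path, container ID, and systemd unit name are assumptions based on the usual cephadm layout; substitute your cluster's fsid, daemon name, and container ID):

# On the gateway host, edit the gateway config written by cephadm and add the cpumask
vi /var/lib/ceph/<fsid>/nvmeof.rbd.<host>.<id>/ceph-nvmeof.conf
    [spdk]
    tgt_cmd_extra_args = --cpumask=0xFF --msg-mempool-size=524288

# Restart the cephadm-managed gateway unit
systemctl restart ceph-<fsid>@nvmeof.rbd.<host>.<id>.service

# Confirm the reactor count from inside the gateway container
podman exec -it <container-id> /usr/libexec/spdk/scripts/rpc.py framework_get_reactors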

Comment 12 Rahul Lepakshi 2023-10-05 06:01:20 UTC
Thanks Paul. 
Moving the BZ to VERIFIED, but the constraint remains https://tracker.ceph.com/issues/62838, as Paul mentioned. I was able to start only 2 reactors.

Comment 13 Akash Raj 2023-10-11 09:14:20 UTC
Hi Aviv.

Please confirm if this needs to be added to the 7.0 release notes. If so, please provide the doc type and text.

Thanks.

Comment 15 errata-xmlrpc 2023-12-13 15:23:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7780

Comment 16 Red Hat Bugzilla 2024-05-24 04:25:08 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

