Bug 2240169
| Summary: | Gateway fails to load subsystems upon nvmeof service restart | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Rahul Lepakshi <rlepaksh> |
| Component: | NVMeOF | Assignee: | Aviv Caro <aviv.caro> |
| Status: | CLOSED ERRATA | QA Contact: | Manohar Murthy <mmurthy> |
| Severity: | medium | Docs Contact: | Rivka Pollack <rpollack> |
| Priority: | unspecified | | |
| Version: | 7.0 | CC: | acaro, akraj, aviv.caro, cephqe-warriors, tserlin |
| Target Milestone: | --- | | |
| Target Release: | 7.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ceph-nvmeof-container-0.0.5-1 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-12-13 15:23:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Rahul Lepakshi
2023-09-22 08:26:14 UTC
Another instance: the OMAP entries are intact after the nvmeof service restart, but the gateway appears unable to consume them when it comes back up - http://pastebin.test.redhat.com/1109748 (a sketch for confirming the entries directly appears at the end of this report).

Retested with a newer gateway and Ceph build:

ceph version 18.2.0-72.el9cp (3f281315a9c7d4bb2281729a5f3c3366ad99193d) reef (stable)
registry-proxy.engineering.redhat.com/rh-osbs/ceph-nvmeof:0.0.4-1

The issue appears to be intermittent: the gateway fails to load on restart when a get_subsystems command is issued while the service is still coming up. Logs at http://magna002.ceph.redhat.com/cephci-jenkins/nvmeof_gw_restart1.log

Journal excerpt:

```
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Received request to create bdev U7U9-bdev245 from rbd/U7U9-image245 with block size 4096
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: [2023-10-04 10:42:15.388799] bdev_rbd.c:1199:bdev_rbd_create: *NOTICE*: Add U7U9-bdev245 rbd disk to lun
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:create_bdev: U7U9-bdev245
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Received request to create bdev U7U9-bdev195 from rbd/U7U9-image195 with block size 4096
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Allocating cluster name='cluster_context_4'
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: [2023-10-04 10:42:15.440532] bdev_rbd.c:1199:bdev_rbd_create: *NOTICE*: Add U7U9-bdev195 rbd disk to lun
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:create_bdev: U7U9-bdev195
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Received request to create bdev U7U9-bdev49 from rbd/U7U9-image49 with block size 4096
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Received request to get subsystems
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: [2023-10-04 10:42:15.461168] bdev_rbd.c:1199:bdev_rbd_create: *NOTICE*: Add U7U9-bdev49 rbd disk to lun
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:create_bdev: U7U9-bdev49
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Received request to create bdev U7U9-bdev243 from rbd/U7U9-image243 with block size 4096
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:create_bdev: []
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: Exception in thread Thread-1:
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: Traceback (most recent call last):
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: self.run()
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: File "/usr/lib64/python3.9/threading.py", line 917, in run
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: self._target(*self._args, **self._kwargs)
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: File "/remote-source/ceph-nvmeof/app/control/state.py", line 420, in _update_caller
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: self.update()
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: File "/remote-source/ceph-nvmeof/app/control/state.py", line 465, in update
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: self._update_call_rpc(grouped_added, True, prefix_list)
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: File "/remote-source/ceph-nvmeof/app/control/state.py", line 487, in _update_call_rpc
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: self.gateway_rpc_caller(component_update, True)
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: File "/remote-source/ceph-nvmeof/app/control/server.py", line 297, in gateway_rpc_caller
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: self.gateway_rpc.create_bdev(req)
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: File "/remote-source/ceph-nvmeof/app/control/grpc.py", line 128, in create_bdev
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: return pb2.bdev(bdev_name=bdev_name, status=True)
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: TypeError: bad argument type for built-in operation
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: [2023-10-04 10:42:15.478168] bdev_rbd.c:1199:bdev_rbd_create: *NOTICE*: Add U7U9-bdev243 rbd disk to lun
Oct 04 06:42:15 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:get_subsystems: U7U9-bdev243
Oct 04 06:42:17 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Received request to get subsystems
Oct 04 06:42:17 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:get_subsystems: []
Oct 04 06:42:19 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:Received request to get subsystems
Oct 04 06:42:19 ceph-nvmf3-hrhd31-node5 ceph-33fddd3a-626e-11ee-8bcb-fa163e0c7e19-nvmeof-rbd-ceph-nvmf3-hrhd31-node5-begrve[207500]: INFO:control.grpc:get_subsystems: []
```
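The `INFO:control.grpc:create_bdev: []` entry immediately before the exception suggests the underlying bdev-create call returned an empty result during state replay, and that value was then passed as `bdev_name` when building the protobuf reply in `grpc.py`. The following is a minimal sketch of that failure mode and a defensive check, not the gateway's actual code; `build_reply` and `rpc_result` are illustrative names only:

```python
# Hedged sketch: validate the result of the bdev-create RPC before building
# the gRPC reply, so a failed creation surfaces as a clear error instead of
# a TypeError from the protobuf message constructor.

def build_reply(rpc_result):
    """Return reply fields only if the RPC produced a usable bdev name."""
    if not isinstance(rpc_result, str) or not rpc_result:
        # An empty list/None here matches the "create_bdev: []" entry seen
        # just before the TypeError in the traceback above.
        raise RuntimeError(f"bdev creation failed, RPC returned {rpc_result!r}")
    # Stand-in for pb2.bdev(bdev_name=rpc_result, status=True)
    return {"bdev_name": rpc_result, "status": True}


if __name__ == "__main__":
    print(build_reply("U7U9-bdev245"))   # normal case
    try:
        build_reply([])                   # what the log excerpt suggests happened
    except RuntimeError as err:
        print(err)
```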
Rahul, I believe this is fixed with 0.0.5. Please validate.

Yes Aviv, it is, but it is not downstream yet to validate. This will be tested again once it is downstream, and then the BZ will be marked closed.

Fixed in 0.0.5. Please verify. No need to add it to the release notes; it is fixed in the build we have for 7.0.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7780

The needinfo request(s) on this closed bug have been removed as they have been unresolved for 120 days.
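For reference, the report's claim that the OMAP entries survive the restart can be checked directly against the gateway's state object using the python-rados bindings. A minimal sketch, assuming the state lives in an object named `nvmeof.state` in the `rbd` pool (both names are assumptions; adjust them to the deployment's gateway configuration):

```python
# Hedged sketch: list the OMAP keys of the gateway state object to confirm
# the entries are still present after an nvmeof service restart.
# Pool name "rbd" and object name "nvmeof.state" are assumed, not confirmed.
import rados

with rados.Rados(conffile="/etc/ceph/ceph.conf") as cluster:
    ioctx = cluster.open_ioctx("rbd")
    try:
        with rados.ReadOpCtx() as read_op:
            # Fetch up to 1000 OMAP key/value pairs starting from the beginning.
            omap_iter, _ret = ioctx.get_omap_vals(read_op, "", "", 1000)
            ioctx.operate_read_op(read_op, "nvmeof.state")
            for key, _val in omap_iter:
                print(key)
    finally:
        ioctx.close()
```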