Bug 2147561

Summary: snapshot schedule fails to be scheduled
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Elvir Kuric <ekuric>
Component: odf-dr
Assignee: Madhu Rajanna <mrajanna>
odf-dr sub component: volume-replication-operator
QA Contact: krishnaram Karthick <kramdoss>
Status: CLOSED NOTABUG
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: idryomov, madam, muagarwa, ocs-bugs, odf-bz-bot
Version: 4.12
Flags: mrajanna: needinfo? (ekuric)
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-11-24 12:04:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 3 Madhu Rajanna 2022-11-24 10:55:44 UTC
>debug 2022-11-24T02:21:51.096+0000 7f4806ed4700 -1 librbd::mirror::snapshot::CreatePrimaryRequest: 0x55bba0a3cf70 handle_create_snapshot: failed to create mirror snapshot: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.096+0000 7f4806ed4700 -1 librbd::io::AioCompletion: 0x55bb9fbccdc0 fail: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.097+0000 7f48076d5700  0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_create_snapshot for 1//10bf6aad31987: r=-108, snap_id=None
debug 2022-11-24T02:21:51.097+0000 7f48076d5700  0 [rbd_support ERROR root] error when creating snapshot for 1//10bf6aad31987: -108
debug 2022-11-24T02:21:51.097+0000 7f48076d5700  0 [rbd_support DEBUG root] CreateSnapshotRequests.close_image 1//10bf6aad31987
debug 2022-11-24T02:21:51.097+0000 7f47f8d78700  0 [rbd_support CRITICAL root] Fatal runtime error: [errno 108] RBD connection was shutdown (error opening image b'10bf66fd5c311' at snapshot None)
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/rbd_support/mirror_snapshot_schedule.py", line 327, in run
    refresh_delay = self.refresh_images()
  File "/usr/share/ceph/mgr/rbd_support/mirror_snapshot_schedule.py", line 363, in refresh_images
    self.load_schedules()
  File "/usr/share/ceph/mgr/rbd_support/mirror_snapshot_schedule.py", line 352, in load_schedules
    schedules.load(namespace_validator, image_validator)
  File "/usr/share/ceph/mgr/rbd_support/schedule.py", line 375, in load
    image_validator)
  File "/usr/share/ceph/mgr/rbd_support/schedule.py", line 405, in load_from_pool
    image_validator)
  File "/usr/share/ceph/mgr/rbd_support/schedule.py", line 191, in from_id
    read_only=True) as image:
     File "rbd.pyx", line 2894, in rbd.Image.__init__
rbd.ConnectionShutdown: [errno 108] RBD connection was shutdown (error opening image b'10bf66fd5c311' at snapshot None)

debug 2022-11-24T02:21:51.097+0000 7f4806ed4700 -1 librbd::image::RefreshRequest: failed to retrieve mutable metadata: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.097+0000 7f4806ed4700 -1 librbd::io::AioCompletion: 0x55bb9fb7c580 fail: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.097+0000 7f4806ed4700 -1 librbd::exclusive_lock::PreReleaseRequest: 0x55bba046ac80 handle_set_require_lock: failed to set lock: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.098+0000 7f48076d5700 -1 librbd::object_map::UnlockRequest: failed to release object map lock: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.099+0000 7f48076d5700 -1 librbd::image::OpenRequest: failed to stat v2 image header: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.100+0000 7f4806ed4700 -1 librbd::ImageState: 0x55bba0a7e580 failed to open image: (108) Cannot send after transport endpoint shutdown
debug 2022-11-24T02:21:51.100+0000 7f48243de700  0 [rbd_support CRITICAL root] Fatal runtime error: [errno 108] RBD connection was shutdown (error opening image b'csi-vol-9384ccfd-2c93-4959-a2de-b692e23052e9' at snapshot None)
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/rbd_support/module.py", line 174, in handle_command
    inbuf, prefix[29:], cmd)
  File "/usr/share/ceph/mgr/rbd_support/mirror_snapshot_schedule.py", line 579, in handle_command
    image_validator)
  File "/usr/share/ceph/mgr/rbd_support/schedule.py", line 130, in from_name
    read_only=True) as image:
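
Every failure in the log above carries the same error number, written in four different ways: "(108)", "r=-108", ": -108" and "[errno 108]". The following is a minimal sketch for tallying those codes from a saved mgr log, to check whether anything other than 108 shows up. It assumes Linux errno numbering, where 108 is ESHUTDOWN; the names `ERRNO_PATTERN`, `LINUX_ERRNO_NAMES` and `count_errnos` are made up for this illustration and are not part of Ceph.

```python
import re
from collections import Counter

# Assumption: Linux errno numbering, where 108 is ESHUTDOWN
# ("Cannot send after transport endpoint shutdown").
LINUX_ERRNO_NAMES = {108: "ESHUTDOWN"}

# The four spellings seen in the log: "(108) ", "r=-108",
# "[errno 108]" and a trailing ": -108".
ERRNO_PATTERN = re.compile(r"\((\d+)\) |r=-(\d+)|\[errno (\d+)\]|: -(\d+)$")


def count_errnos(lines):
    """Count how often each error number appears, one hit per log line."""
    counts = Counter()
    for line in lines:
        match = ERRNO_PATTERN.search(line.rstrip())
        if match:
            # Exactly one alternative matched; take its captured digits.
            code = int(next(g for g in match.groups() if g))
            counts[code] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "librbd::io::AioCompletion: 0x55bb9fbccdc0 fail: (108) Cannot send after transport endpoint shutdown",
        "[rbd_support ERROR root] error when creating snapshot for 1//10bf6aad31987: -108",
    ]
    for code, hits in count_errnos(sample).items():
        print(code, LINUX_ERRNO_NAMES.get(code, "unknown"), hits)
```

If the tally contains only 108, the failures all share one cause, the shut-down RBD connection, rather than being independent image errors.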



@Elvir, I would suggest running the test with rbd CLI commands. It looks like you might be hitting an RBD bug.
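
As a rough illustration of what running the test with the rbd CLI could look like, here is a small sketch that builds the mirror snapshot schedule commands and prints them (dry run by default). The pool name, image name and interval are placeholders, and `mirror_snapshot_commands` and `run_all` are hypothetical helpers written for this comment, not an existing tool. It assumes the `rbd` binary and a working ceph.conf/keyring are available wherever the commands are actually executed, for example in the toolbox pod.

```python
import shlex
import subprocess

# Placeholder names (assumptions); substitute the pool and image from the failing test.
POOL = "mypool"
IMAGE = "myimage"


def mirror_snapshot_commands(pool, image, interval="5m"):
    """Build rbd CLI invocations that exercise mirror snapshot scheduling."""
    spec = f"{pool}/{image}"
    sched = ["rbd", "mirror", "snapshot", "schedule"]
    target = ["--pool", pool, "--image", image]
    return [
        sched + ["add"] + target + [interval],      # add a schedule for the image
        sched + ["ls"] + target,                    # confirm the schedule was stored
        sched + ["status"] + target,                # show when the next snapshot is due
        ["rbd", "mirror", "image", "snapshot", spec],  # create one mirror snapshot by hand
        ["rbd", "mirror", "image", "status", spec],    # check the mirroring state
    ]


def run_all(commands, dry_run=True, runner=subprocess.run):
    """Print each command; execute it only when dry_run is False."""
    results = []
    for argv in commands:
        print(shlex.join(argv))
        if dry_run:
            results.append((argv, None))
            continue
        proc = runner(argv, capture_output=True, text=True)
        results.append((argv, proc.returncode))
    return results


if __name__ == "__main__":
    run_all(mirror_snapshot_commands(POOL, IMAGE))
```

If the manual `rbd mirror image snapshot` call succeeds while the scheduled one keeps failing with errno 108, that would point at the rbd_support module in the mgr rather than at the image itself.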

@Ilya, any idea why there is a panic in the mgr pod? It looks like a connection/network issue.