Bug 2080982 - [rbd-mirror] : force promote crashed for some images - snapshot/PromoteRequest.cc: 261: FAILED ceph_assert(info != nullptr)
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD-Mirror
Version: 5.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 7.0
Assignee: Ilya Dryomov
QA Contact:
URL:
Whiteboard:
Duplicates: 2102107
Depends On:
Blocks: 2135372
Reported: 2022-05-02 14:04 UTC by Vasishta
Modified: 2023-07-31 21:50 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:


Links
System                     ID           Private  Priority  Status  Summary  Last Updated
Ceph Project Bug Tracker   53537        0        None      None    None     2022-07-26 16:17:41 UTC
Red Hat Issue Tracker      RHCEPH-4223  0        None      None    None     2022-05-02 14:27:29 UTC

Description Vasishta 2022-05-02 14:04:25 UTC
Description of problem:
When a force promote was issued on a set of images, the promote operation crashed for some of them.

Version-Release number of selected component (if applicable):
16.2.7-109.el8cp

How reproducible:
Tried only once

Steps to Reproduce:
1. Configure snapshot-based mirroring between two clusters.
2. Create 50 images (with an EC pool) on one cluster and 50 images (without an EC pool) on the other cluster.
3. Promote the images on the opposite clusters with the --force option.
4. The issue was observed while the non-EC-pool images were being promoted on one of the clusters. (The mirroring daemon of the opposite cluster had gone down because rbd_mirror_die_after_seconds was being tested.)
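
The steps above can be sketched as a list of rbd CLI invocations. This is only an illustration: the pool name, the image naming scheme, the image size, and the use of --data-pool for the EC-backed images are assumptions, not details taken from this report, and peer setup between the clusters is omitted.

```python
from typing import List, Optional


def image_names(count: int, prefix: str = "img") -> List[str]:
    """Hypothetical naming scheme: img0, img1, ..."""
    return [f"{prefix}{i}" for i in range(count)]


def reproduction_plan(pool: str, count: int = 50,
                      data_pool: Optional[str] = None,
                      size: str = "1G") -> List[List[str]]:
    """Commands to create images and enable snapshot-based mirroring.

    Pass data_pool to place image data in an EC pool (an assumption about
    how the "with ec pool" images were created).
    """
    plan = []
    for name in image_names(count):
        spec = f"{pool}/{name}"
        create = ["rbd", "create", "--size", size]
        if data_pool is not None:
            create += ["--data-pool", data_pool]
        create.append(spec)
        plan.append(create)
        plan.append(["rbd", "mirror", "image", "enable", spec, "snapshot"])
    return plan


def force_promote_cmds(pool: str, count: int = 50) -> List[List[str]]:
    """Force-promote commands, intended to be run on the opposite cluster."""
    return [["rbd", "mirror", "image", "promote", "--force", f"{pool}/{name}"]
            for name in image_names(count)]
```

The functions only build argument lists; running them (for example with subprocess.run) against the right cluster is left to the caller.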

Actual results:
    -2> 2022-05-02T13:06:32.854+0000 7fc111a46700 10 monclient: get_auth_request con 0x7fc0f400b830 auth_method 0
    -1> 2022-05-02T13:06:32.871+0000 7fc110a44700 -1 /builddir/build/BUILD/ceph-16.2.7/src/librbd/mirror/snapshot/PromoteRequest.cc: In function 'void librbd::mirror::snapshot::PromoteRequest<ImageCtxT>::rollback() [with ImageCtxT = librbd::ImageCtx]' thread 7fc110a44700 time 2022-05-02T13:06:32.871080+0000
/builddir/build/BUILD/ceph-16.2.7/src/librbd/mirror/snapshot/PromoteRequest.cc: 261: FAILED ceph_assert(info != nullptr)

Impact:
Some images do not get promoted.

Expected results:
The promote operation should not crash.

Workaround:
Note which images were not promoted and retry the promote on them.
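
A minimal sketch of this workaround, assuming that a failed or crashed promote shows up as a non-zero exit status of the rbd command (a process killed by the assert would report a negative return code through subprocess). The image specs and the attempt limit are hypothetical.

```python
import subprocess
from typing import Callable, Iterable, List


def promote_cmd(image_spec: str, force: bool = True) -> List[str]:
    """Build the rbd promote command for a pool/image spec."""
    cmd = ["rbd", "mirror", "image", "promote"]
    if force:
        cmd.append("--force")
    cmd.append(image_spec)
    return cmd


def run_rbd(cmd: List[str]) -> int:
    """Run the command and return its exit status (negative if signalled)."""
    return subprocess.run(cmd, capture_output=True, text=True).returncode


def retry_promote(image_specs: Iterable[str],
                  run: Callable[[List[str]], int] = run_rbd,
                  max_attempts: int = 3) -> List[str]:
    """Promote each image, retrying the ones that fail.

    Returns the specs that were still not promoted after max_attempts passes.
    """
    pending = list(image_specs)
    for _ in range(max_attempts):
        if not pending:
            break
        # Keep only the images whose promote did not exit cleanly.
        pending = [spec for spec in pending if run(promote_cmd(spec)) != 0]
    return pending
```

For example, `retry_promote(["pool1/img0", "pool1/img1"])` returns an empty list if every image was eventually promoted; anything left in the result still needs attention.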

Additional information:
# ceph crash ls
ID                                                                ENTITY        NEW  
2022-05-02T13:04:17.888612Z_212a8a21-a839-4d29-a36b-b8f68dc913e8  client.admin   *   
2022-05-02T13:04:49.777751Z_a40ab99b-d13e-4c90-830f-b5002d11515b  client.admin   *   
2022-05-02T13:06:32.873214Z_b0b9c415-4008-4031-ab0c-4478824d707f  client.admin   *
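
Each crash ID above begins with a UTC timestamp, so it can be matched against the time of the assert in the log. The sketch below parses the listing and picks the crash closest to a given moment; it assumes the ID layout shown here (timestamp, underscore, UUID) and the helper names are hypothetical. The selected ID can then be passed to `ceph crash info <id>`.

```python
from datetime import datetime, timezone
from typing import List, Tuple

SAMPLE = """\
ID                                                                ENTITY        NEW
2022-05-02T13:04:17.888612Z_212a8a21-a839-4d29-a36b-b8f68dc913e8  client.admin   *
2022-05-02T13:04:49.777751Z_a40ab99b-d13e-4c90-830f-b5002d11515b  client.admin   *
2022-05-02T13:06:32.873214Z_b0b9c415-4008-4031-ab0c-4478824d707f  client.admin   *
"""


def parse_crash_ls(text: str) -> List[Tuple[datetime, str]]:
    """Return (timestamp, crash_id) pairs from `ceph crash ls` output."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or "_" not in fields[0]:
            continue  # header, prompt, or blank line
        crash_id = fields[0]
        stamp = crash_id.partition("_")[0]
        try:
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
        except ValueError:
            continue  # first column is not in the assumed ID layout
        entries.append((when.replace(tzinfo=timezone.utc), crash_id))
    return entries


def closest_crash(entries: List[Tuple[datetime, str]], moment: datetime) -> str:
    """Return the crash ID whose timestamp is nearest to `moment`."""
    return min(entries, key=lambda e: abs(e[0] - moment))[1]


# Time of the failed assert, taken from the log in "Actual results".
assert_time = datetime(2022, 5, 2, 13, 6, 32, 871080, tzinfo=timezone.utc)
match = closest_crash(parse_crash_ls(SAMPLE), assert_time)
```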

Comment 3 Ilya Dryomov 2022-06-29 13:57:36 UTC
*** Bug 2102107 has been marked as a duplicate of this bug. ***

Comment 13 Scott Ostapovicz 2023-02-06 16:53:36 UTC
Missed the 5.3 z1 window. Moving to 6.1. Please advise if this is a problem.

Comment 14 Josh Durgin 2023-03-22 23:03:57 UTC
As discussed in the DR meetings, force promote fixes will take longer to land. Moving out of 6.1.

