Bug 1344235

Summary: Hitting a Crash "librbd/AioImageRequestWQ.cc: 212: FAILED assert(!m_shutdown)"
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tanay Ganguly <tganguly>
Component: RBD
Assignee: Jason Dillaman <jdillama>
Status: CLOSED ERRATA
QA Contact: Tanay Ganguly <tganguly>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 2.0
CC: ceph-eng-bugs, hnallurv, kurs
Target Milestone: rc
Target Release: 2.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: RHEL: ceph-10.2.1-18.el7cp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-23 19:41:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Crash Log (flags: none)

Description Tanay Ganguly 2016-06-09 08:51:16 UTC
Created attachment 1166209: Crash Log

Description of problem:
While working on bug https://bugzilla.redhat.com/show_bug.cgi?id=1344212, I hit a crash.

Version-Release number of selected component (if applicable):
rbd-mirror-10.2.1-12.el7cp.x86_64

How reproducible:
Once

Steps to Reproduce:
The exact sequence of steps is not yet known; I will try to work it out. I am filing this defect now so that the backtrace and log are not lost.

Actual results:


Expected results:


Additional info:
Log

--------------------------------------------------------------------------------

   -11> 2016-06-09 12:54:26.710972 7ff3d7fff700 20 rbd::mirror::ImageReplayer: 0x7ff3bc006790 [1/08cda886-7639-47dd-beed-4974a2c8fba2] set_state_description: 0 bootstrapping, OPEN_REMOTE_IMAGE
   -10> 2016-06-09 12:54:26.710975 7ff3d7fff700 20 rbd::mirror::ImageReplayer: 0x7ff3bc006790 [1/08cda886-7639-47dd-beed-4974a2c8fba2] update_mirror_image_status:
    -9> 2016-06-09 12:54:26.710977 7ff3d7fff700 20 rbd::mirror::ImageReplayer: 0x7ff3bc006790 [1/08cda886-7639-47dd-beed-4974a2c8fba2] start_mirror_image_status_update: already sending update
    -8> 2016-06-09 12:54:27.535100 7ff3d67fc700 20 rbd::mirror::image_replayer::BootstrapRequest: 0x7ff3bc028260 handle_open_remote_image: r=-2
    -7> 2016-06-09 12:54:27.535106 7ff3d67fc700 -1 rbd::mirror::image_replayer::BootstrapRequest: 0x7ff3bc028260 handle_open_remote_image: failed to open remote image: (2) No such file or directory
    -6> 2016-06-09 12:54:27.535111 7ff3d67fc700 20 rbd::mirror::image_replayer::BootstrapRequest: 0x7ff3bc028260 close_remote_image
    -5> 2016-06-09 12:54:27.535112 7ff3d67fc700 20 rbd::mirror::image_replayer::BootstrapRequest: 0x7ff3bc028260 update_progress: CLOSE_REMOTE_IMAGE
    -4> 2016-06-09 12:54:27.535115 7ff3d67fc700 20 rbd::mirror::ImageReplayer: 0x7ff3bc006790 [1/08cda886-7639-47dd-beed-4974a2c8fba2] set_state_description: 0 bootstrapping, CLOSE_REMOTE_IMAGE
    -3> 2016-06-09 12:54:27.535117 7ff3d67fc700 20 rbd::mirror::ImageReplayer: 0x7ff3bc006790 [1/08cda886-7639-47dd-beed-4974a2c8fba2] update_mirror_image_status:
    -2> 2016-06-09 12:54:27.535126 7ff3d67fc700 20 rbd::mirror::ImageReplayer: 0x7ff3bc006790 [1/08cda886-7639-47dd-beed-4974a2c8fba2] start_mirror_image_status_update: already sending update
    -1> 2016-06-09 12:54:27.535128 7ff3d67fc700 20 rbd::mirror::image_replayer::CloseImageRequest: 0x7ff39c001080 close_image
     0> 2016-06-09 12:54:27.536809 7ff3d67fc700 -1 librbd/AioImageRequestWQ.cc: In function 'void librbd::AioImageRequestWQ::shut_down(Context*)' thread 7ff3d67fc700 time 2016-06-09 12:54:27.535131
librbd/AioImageRequestWQ.cc: 212: FAILED assert(!m_shutdown)

 ceph version 10.2.1-12.el7cp (939056d19a2a523223611ef08194666b41086b03)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7ff4282aabb5]
 2: (librbd::AioImageRequestWQ::shut_down(Context*)+0x3c4) [0x7ff4281ba134]
 3: (librbd::image::CloseRequest<librbd::ImageCtx>::send_shut_down_aio_queue()+0xe0) [0x7ff42816c3a0]
 4: (librbd::image::CloseRequest<librbd::ImageCtx>::send_unregister_image_watcher()+0x1f5) [0x7ff42816c7f5]
 5: (librbd::ImageState<librbd::ImageCtx>::send_close_unlock()+0xc9) [0x7ff428104f89]
 6: (librbd::ImageState<librbd::ImageCtx>::close(Context*)+0xbb) [0x7ff4281068fb]
 7: (rbd::mirror::image_replayer::CloseImageRequest<librbd::ImageCtx>::close_image()+0xa0) [0x7ff4280d5880]
 8: (rbd::mirror::image_replayer::BootstrapRequest<librbd::ImageCtx>::close_remote_image()+0xf2) [0x7ff4280ce432]
 9: (rbd::mirror::image_replayer::BootstrapRequest<librbd::ImageCtx>::handle_open_remote_image(int)+0x30e) [0x7ff4280d103e]
 10: (Context::complete(int)+0x9) [0x7ff4280ab1b9]
 11: (librbd::ImageState<librbd::ImageCtx>::complete_action_unlock(librbd::ImageState<librbd::ImageCtx>::State, int)+0x11d) [0x7ff42810596d]
 12: (librbd::ImageState<librbd::ImageCtx>::handle_open(int)+0xc9) [0x7ff428105b29]
 13: (Context::complete(int)+0x9) [0x7ff4280ab1b9]
 14: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa7e) [0x7ff42829be9e]
 15: (ThreadPool::WorkThread::entry()+0x10) [0x7ff42829cd70]
 16: (()+0x7dc5) [0x7ff41d7afdc5]
 17: (clone()+0x6d) [0x7ff41c6961cd]
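
One possible reading of the backtrace above, not confirmed anywhere in this report: handle_open_remote_image receives r=-2, close_remote_image then closes an image whose open already failed, and shut_down() trips assert(!m_shutdown) because the queue had been shut down once before. The C++ sketch below illustrates only that general double-shutdown pattern. The classes WorkQueue and Image and the helper open_then_close are hypothetical stand-ins, not librbd code, and the idea that a failed open shuts the queue down by itself is an assumption made for illustration. An exception replaces the assert so the sketch can be exercised without aborting.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a work queue that may be shut down only once.
class WorkQueue {
public:
  // Same shape as the failed check in the log: a second call trips the guard.
  // The real code asserts; this sketch throws so callers can observe it.
  void shut_down() {
    if (m_shutdown) {
      throw std::logic_error("FAILED assert(!m_shutdown)");
    }
    m_shutdown = true;
  }

  bool is_shut_down() const { return m_shutdown; }

private:
  bool m_shutdown = false;
};

// Hypothetical image handle. ASSUMPTION: a failed open cleans up after
// itself, including shutting down its queue.
class Image {
public:
  int open(bool exists) {
    if (!exists) {
      m_queue.shut_down();  // cleanup on the failure path
      return -2;            // -ENOENT, matching r=-2 in the log
    }
    return 0;
  }

  // Closing shuts the queue down; invalid after a failed open.
  void close() { m_queue.shut_down(); }

private:
  WorkQueue m_queue;
};

// Caller pattern that avoids the double shutdown: close only an image
// that was opened successfully.
int open_then_close(Image& image, bool exists) {
  int r = image.open(exists);
  if (r < 0) {
    return r;  // open failed, so there is nothing left to close
  }
  image.close();
  return 0;
}
```

Under these assumptions, calling close() on an Image whose open() returned -2 raises the guard error, while open_then_close() returns the error code without touching the queue a second time.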

Comment 2 Jason Dillaman 2016-06-09 11:53:17 UTC
The fix is already downstream.

Comment 5 Tanay Ganguly 2016-06-30 09:38:25 UTC
Unable to reproduce the crash.

Marking it as Verified.
ceph version 10.2.2-9.el7cp

Comment 7 errata-xmlrpc 2016-08-23 19:41:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html