Bug 1316287 - Possible QEMU deadlock after creating image snapshots
Summary: Possible QEMU deadlock after creating image snapshots
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RBD
Version: 1.3.2
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Target Release: 1.3.3
Assignee: Jason Dillaman
QA Contact: Tejas
Docs Contact: Bara Ancincova
Depends On: 1335269
Blocks: 1372735
Reported: 2016-03-09 21:41 UTC by Jason Dillaman
Modified: 2017-07-30 15:35 UTC

Fixed In Version: RHEL: ceph-0.94.7-5.el7cp Ubuntu: ceph_0.94.7-3redhat1trusty
Doc Type: Bug Fix
Doc Text:
.The QEMU process no longer hangs when creating snapshots on images
When the RADOS Block Device (RBD) cache was enabled, creating a snapshot on an image with active I/O operations could cause the QEMU process to become unresponsive. With this update, the QEMU process no longer hangs in the described scenario.
Clone Of:
Last Closed: 2016-09-29 12:57:02 UTC
Target Upstream Version:

System | ID | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 13755 | None | None | None | 2016-03-09 21:43:37 UTC
Ceph Project Bug Tracker | 14988 | None | None | None | 2016-03-09 21:43:48 UTC
Red Hat Product Errata | RHSA-2016:1972 | normal | SHIPPED_LIVE | Moderate: Red Hat Ceph Storage 1.3.3 security, bug fix, and enhancement update | 2016-09-29 16:51:21 UTC

Description Jason Dillaman 2016-03-09 21:41:39 UTC
Description of problem:
Creating a snapshot out-of-band (e.g. rbd snap create image@snap, run from a client other than the QEMU process) on an image with active I/O can cause all I/O within the QEMU process to hang. This occurs only when the RBD cache is enabled.

Version-Release number of selected component (if applicable):

How reproducible:
Requires that the RBD cache have pending writeback IO when a snapshot is created.

Steps to Reproduce:
1. Start a write-intensive operation within a VM
2. Create a snapshot of the RBD image
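
The two steps above can be sketched as shell commands. This is a minimal sketch, not the reporter's exact procedure: the pool name "rbd", the image name "vm-disk", the snapshot name "snap1", the guest file path, and the use of dd as the write workload are all assumptions made for illustration. Only the "rbd snap create image@snap" form comes from the description. With DRY_RUN=1 (the default in this sketch) each command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical names; adjust them to match your environment.
POOL="${POOL:-rbd}"
IMAGE="${IMAGE:-vm-disk}"
SNAP="${SNAP:-snap1}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 (inside the guest, not on this host): keep writes in flight so the
# RBD cache holds pending writeback I/O. One assumed way to do that:
#   dd if=/dev/zero of=/mnt/test/fill bs=1M count=4096

# Step 2 (on a Ceph client host, while the guest is still writing):
# create the snapshot out-of-band.
run rbd snap create "${POOL}/${IMAGE}@${SNAP}"
```

Set DRY_RUN=0 to run the snapshot command against a real cluster. On an affected 1.3.x build, guest I/O may stall after this point if the cache had dirty data.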

Actual results:
QEMU IO will hang

Expected results:
QEMU IO continues without issue

Additional info:

Comment 2 Jason Dillaman 2016-03-23 12:02:06 UTC
This issue only affects 1.3.x -- it doesn't affect 2.0.  Resetting the flags to account for the change.

Comment 3 Ken Dreyer (Red Hat) 2016-08-02 20:16:18 UTC
Fixed in v0.94.7 upstream. We'll take this BZ as part of the rebase (bz 1335269).

Comment 8 Tejas 2016-09-12 10:39:58 UTC
Verified in ceph version:
ceph version 0.94.9-1.el7cp

Steps followed:
1. Enable RBD caching.
2. Attach an RBD image to a KVM instance.
3. Start I/O on the RBD image from the VM.
4. Take snapshots of the RBD image.
5. Clone a snapshot.
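
Steps 4 and 5 might look like the following on a Ceph client host. This is a sketch under stated assumptions, not the commands actually used for verification: the pool, image, snapshot, and clone names are hypothetical, and the snapshot count of three is arbitrary. Cloning requires a format 2 image and a protected snapshot, so the sketch protects the snapshot first. With DRY_RUN=1 (the default in this sketch) each command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical names; adjust them to match your environment.
POOL="${POOL:-rbd}"
IMAGE="${IMAGE:-vm-disk}"
CLONE="${CLONE:-vm-disk-clone}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 is assumed to be done already, e.g. "rbd cache = true" in the
# [client] section of ceph.conf on the hypervisor.

# Step 4: take several snapshots while the guest keeps writing.
for i in 1 2 3; do
    run rbd snap create "${POOL}/${IMAGE}@verify-${i}"
done

# Step 5: a snapshot must be protected before it can be cloned.
run rbd snap protect "${POOL}/${IMAGE}@verify-1"
run rbd clone "${POOL}/${IMAGE}@verify-1" "${POOL}/${CLONE}"
```

While these commands run, guest I/O can be watched from inside the VM (for example with iostat) to confirm that it does not stall.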

No impact on I/O.

Moving this to Verified.


Comment 11 errata-xmlrpc 2016-09-29 12:57:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

