Bug 1325932
Summary: | Seeing a Crash while writing and re-sizing on a RBD Image in parallel
---|---
Product: | [Red Hat Storage] Red Hat Ceph Storage
Reporter: | Tanay Ganguly <tganguly>
Component: | RBD
Assignee: | Jason Dillaman <jdillama>
Status: | CLOSED ERRATA
QA Contact: | ceph-qe-bugs <ceph-qe-bugs>
Severity: | high
Priority: | unspecified
Version: | 2.0
CC: | ceph-eng-bugs, jdillama, kdreyer, kurs, nlevine, vakulkar
Target Milestone: | rc
Target Release: | 2.0
Hardware: | x86_64
OS: | Linux
Fixed In Version: | ceph-10.2.0-1.el7cp
Doc Type: | Bug Fix
Last Closed: | 2016-08-23 19:35:36 UTC
Type: | Bug
Created attachment 1145989: BT (backtrace)
This is a crash within 'rbd bench-write' because you shrank the image below its in-flight write extent. The write failed (correctly) because it was out of bounds, but the CLI expects that the write won't fail. We can remove the assert and just have bench-write exit with a failure code, but in general it doesn't make sense to shrink images with live IO within the deleted region.

The fix is to gracefully stop the "rbd bench-write" operation when an IO error is encountered (e.g. the write was out of bounds).

Upstream PR: https://github.com/ceph/ceph/pull/8565

The above PR is present in v10.2.0.

Marking it Verified.
ceph version 10.2.1-6.el7cp

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-1755.html
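The failure mode and the fix described in the comments above can be illustrated with a small simulation. This is only a sketch, not the code from the upstream PR: `FakeImage`, `bench_write`, the `on_io` hook, and the `-EINVAL` error code are all hypothetical names and values chosen for illustration. It shows a benchmark loop that samples the image extent once, has the image shrunk underneath it, and then stops with a failure code on the first failed write instead of asserting.

```python
import errno
import random


class FakeImage:
    """In-memory stand-in for an RBD image; it only tracks a size in bytes."""

    def __init__(self, size):
        self.size = size

    def resize(self, new_size):
        self.size = new_size

    def write(self, offset, length):
        # Assumption for illustration: an out-of-bounds write is rejected
        # with a negative errno rather than being applied.
        if offset + length > self.size:
            return -errno.EINVAL
        return length


def bench_write(image, io_size, io_total, rng, on_io=None):
    """Issue random writes and stop at the first IO error.

    Returns (exit_code, bytes_written). exit_code is 0 on success, 1 on error.
    """
    written = 0
    # The extent is sampled once, so a later shrink can make offsets invalid.
    extent = image.size
    while written < io_total:
        offset = rng.randrange(0, extent - io_size + 1)
        if on_io is not None:
            on_io(written)  # hook used here to simulate a concurrent resize
        r = image.write(offset, io_size)
        if r < 0:
            # An "assert r >= 0" here would crash the tool; returning a
            # failure code lets it exit cleanly instead.
            return 1, written
        written += r
    return 0, written


if __name__ == "__main__":
    img = FakeImage(1 << 20)

    def shrink_midway(written):
        if written >= 8192:
            img.resize(1024)  # smaller than one IO, so every later write fails

    print(bench_write(img, 4096, 1 << 16, random.Random(0), shrink_midway))
```

In this sketch the caller decides what to do with the non-zero exit code; the point is only that a failed write is treated as an expected outcome rather than a broken invariant.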
Created attachment 1145988: Resize Script

Description of problem:
Resizing an image while writing to it (using rbd bench-write) causes rbd to crash.

Version-Release number of selected component (if applicable):
ceph-release-1-1.el7.noarch
ceph-selinux-10.1.0-1.el7cp.x86_64
ceph-osd-10.1.0-1.el7cp.x86_64
libcephfs1-10.1.0-1.el7cp.x86_64
ceph-base-10.1.0-1.el7cp.x86_64
ceph-10.1.0-1.el7cp.x86_64
python-cephfs-10.1.0-1.el7cp.x86_64
ceph-mds-10.1.0-1.el7cp.x86_64
ceph-common-10.1.0-1.el7cp.x86_64
ceph-mon-10.1.0-1.el7cp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create an image.
   rbd create Tanay-RBD/BIG_Image1 --size 2048000 --image-format 2 --image-feature layering --image-feature exclusive-lock --image-feature object-map --image-feature fast-diff --image-feature deep-flatten
2. Take a snapshot.
   rbd snap create Tanay-RBD/BIG_Image1@snap
3. Protect it.
   rbd snap protect Tanay-RBD/BIG_Image1@snap
4. Create a clone.
   rbd clone Tanay-RBD/BIG_Image1@snap Tanay-RBD/BUG-CLONE --image-feature layering --image-feature exclusive-lock --image-feature object-map --image-feature fast-diff --image-feature deep-flatten
5. Start rbd bench-write.
   rbd bench-write -p Tanay-RBD --image BUG-CLONE --io-size 10240 --io-pattern rand
6. While the write is running, run the resize script (attached).

Actual results:
Initially, step 5 acquires the lock and step 6 waits for the lock to be released, as shown below:

2016-04-11 22:37:43.050150 7f6df3fff700 -1 librbd::image_watcher::NotifyLockOwner: 0x7f6dec0020c0 handle_notify: no lock owners detected
2016-04-11 22:37:48.052145 7f6df3fff700 -1 librbd::image_watcher::NotifyLockOwner: 0x7f6dec0020c0 handle_notify: no lock owners detected

After a while, rbd bench-write crashed, the lock was released to the resize, and the resize completed successfully.

Expected results:
There should not be any crash.

Additional info:
Crash Dump
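The attached resize script is not reproduced in this report. As a rough stand-in for step 6, the sketch below builds a sequence of `rbd resize` commands that alternately shrink and grow the clone. The helper name `resize_commands` and the target sizes are hypothetical and are not taken from the attachment; the script only prints the commands (a dry run) rather than executing them.

```python
import shlex


def resize_commands(image_spec, current_mb, sizes_mb):
    """Return one `rbd resize` argv per target size, in order.

    --allow-shrink is added only when a target is smaller than the size
    the image would have at that point in the sequence.
    """
    cmds = []
    prev = current_mb
    for size in sizes_mb:
        argv = ["rbd", "resize", "--size", str(size)]
        if size < prev:
            argv.append("--allow-shrink")
        argv.append(image_spec)
        cmds.append(argv)
        prev = size
    return cmds


if __name__ == "__main__":
    # Hypothetical sizes in MB; the real script's values are unknown.
    for argv in resize_commands("Tanay-RBD/BUG-CLONE", 2048000, [1024, 2048000]):
        print(shlex.join(argv))
```

To actually run the commands against a test cluster, each argv could be passed to `subprocess.run`. Shrinking an image discards data, so this should only be pointed at a disposable image.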