Bug 1334182 - crash in librbd when write size is large (~99429910 bytes)
Summary: crash in librbd when write size is large (~99429910 bytes)
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RBD
Version: 2.0
Hardware: x86_64
OS: Linux
Target Milestone: rc
Target Release: 2.0
Assignee: Jason Dillaman
QA Contact: Tanay Ganguly
Depends On:
Reported: 2016-05-09 06:22 UTC by Tanay Ganguly
Modified: 2017-07-30 15:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2016-08-23 19:37:59 UTC
Target Upstream Version:

Attachments
Crash (24.32 KB, text/plain), 2016-05-09 06:22 UTC, Tanay Ganguly
Script (644 bytes, text/x-python), 2016-05-09 06:25 UTC, Tanay Ganguly

System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 15791 | 0 | None | None | None | 2016-05-09 21:12:37 UTC
Red Hat Product Errata | RHBA-2016:1755 | 0 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 2.0 bug fix and enhancement update | 2016-08-23 23:23:52 UTC

Description Tanay Ganguly 2016-05-09 06:22:40 UTC
Created attachment 1155149

Description of problem:
librbd crashes when a single large write (~99429910 bytes) is issued to an RBD image.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create an RBD image with all features enabled:
rbd image 'testing3':
        size 102400 MB in 25600 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.128b238e1f29
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, journaling
        journal: 128b238e1f29
        mirroring state: disabled

2. Write ~90M of data to the image (see the attached script).
3. Observe the crash.
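
The attached script is not reproduced in this report. A minimal sketch of what steps 1 and 2 might look like with the Python rados/rbd bindings is given below. The pool name "rbd" and the config path are assumptions; the image name, the write size, and the 4 MiB object size come from the report. The objects_spanned helper is plain arithmetic on the image info and makes no claim about the root cause.

```python
WRITE_SIZE = 99429910   # bytes, from the bug title
OBJECT_SIZE = 1 << 22   # order 22 -> 4096 kB objects, from the image info


def objects_spanned(length, offset=0, object_size=OBJECT_SIZE):
    """Number of backing objects touched by a write of `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return last - first + 1


def write_large(image, size=WRITE_SIZE, offset=0):
    """Issue one write of `size` bytes; `image` must provide write(data, offset)."""
    data = b"x" * size
    return image.write(data, offset)


def main(pool="rbd", image_name="testing3", conffile="/etc/ceph/ceph.conf"):
    # Pool name and conffile path are assumptions, not taken from the report.
    # Imported here so the helpers above stay usable without a cluster.
    import rados
    import rbd

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            image = rbd.Image(ioctx, image_name)
            try:
                written = write_large(image)
                print("wrote", written, "bytes spanning",
                      objects_spanned(WRITE_SIZE), "objects")
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Calling main() against a test cluster that has the image from step 1 would issue the single large write; on an affected build this is where the assert would be expected to fire.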

Actual results:
librbd crashes with FAILED assert(pending_count == 0).

Expected results:
The write completes without a crash.

Additional info:
Log attached.

<rados.Ioctx object at 0x7f1ae13a3520>
librbd/AioCompletion.cc: In function 'void librbd::AioCompletion::fail(CephContext*, int)' thread 7f1aba364700 time 2016-05-08 19:02:46.422238
librbd/AioCompletion.cc: 142: FAILED assert(pending_count == 0)
 ceph version 10.2.0-1.el7cp (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
 1: (()+0x2765b5) [0x7f1ad043d5b5]
 2: (()+0x749b7) [0x7f1ad023b9b7]
 3: (()+0xe8253) [0x7f1ad02af253]
 4: (()+0x721f9) [0x7f1ad02391f9]
 5: (()+0xa2ac4) [0x7f1ad0269ac4]
 6: (()+0x26713e) [0x7f1ad042e13e]
 7: (()+0x268010) [0x7f1ad042f010]
 8: (()+0x7dc5) [0x7f1ae0d10dc5]
 9: (clone()+0x6d) [0x7f1ae033528d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
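
The frames above carry no symbol names, only offsets. As a rough aid for interpreting them, the sketch below groups frames by the value of (address - offset), which for frames of the form `(()+0x...)` is presumably the load base of the shared object they belong to. This is an assumption about how the backtrace was formatted, not something the report states. Frames with a symbol name, such as `clone()`, are skipped because their offset is relative to the symbol rather than the module.

```python
import re
from collections import defaultdict

# Matches lines such as " 1: (()+0x2765b5) [0x7f1ad043d5b5]".
FRAME_RE = re.compile(r"^\s*(\d+): \((\w*)\(\)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]")


def group_frames_by_base(backtrace):
    """Map presumed load base -> list of (frame number, offset) pairs."""
    groups = defaultdict(list)
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match is None or match.group(2):
            # Not a frame line, or the offset is symbol-relative.
            continue
        frame = int(match.group(1))
        offset = int(match.group(3), 16)
        address = int(match.group(4), 16)
        groups[address - offset].append((frame, offset))
    return dict(groups)


BACKTRACE = """
 1: (()+0x2765b5) [0x7f1ad043d5b5]
 2: (()+0x749b7) [0x7f1ad023b9b7]
 3: (()+0xe8253) [0x7f1ad02af253]
 4: (()+0x721f9) [0x7f1ad02391f9]
 5: (()+0xa2ac4) [0x7f1ad0269ac4]
 6: (()+0x26713e) [0x7f1ad042e13e]
 7: (()+0x268010) [0x7f1ad042f010]
 8: (()+0x7dc5) [0x7f1ae0d10dc5]
 9: (clone()+0x6d) [0x7f1ae033528d]
"""

for base, frames in sorted(group_frames_by_base(BACKTRACE).items()):
    print(hex(base), [(n, hex(off)) for n, off in frames])
```

Under that assumption, frames 1 to 7 share one base and frame 8 sits in a different object. The per-frame offsets could then be resolved with a tool such as addr2line against the matching library and debuginfo for ceph 10.2.0-1.el7cp; which library each base corresponds to is not stated in the report.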

Comment 1 Tanay Ganguly 2016-05-09 06:25:14 UTC
Created attachment 1155151

Comment 4 Ken Dreyer (Red Hat) 2016-05-10 13:00:05 UTC
From Jason's email today:
> This should be 2.z -- once bz 1331267 merges this error won't
> reproduce but it is still an issue.

Re-targeting to 2.1.

Comment 5 Jason Dillaman 2016-06-12 23:57:53 UTC
Merged, upstream Jewel PR: https://github.com/ceph/ceph/pull/9611

Comment 8 Tanay Ganguly 2016-06-21 11:42:17 UTC
Working Fine.

Marking it as Verified.
ceph version 10.2.2-5.el7cp

Comment 10 errata-xmlrpc 2016-08-23 19:37:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

