Bug 1454355 - FAILED assert(0) in OSD::shutdown(), wrong ref count when snap trimming
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 2.2
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: 2.3
Assignee: Greg Farnum
QA Contact: Vasishta
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-05-22 14:30 UTC by Harald Klein
Modified: 2020-07-16 09:36 UTC

Fixed In Version: RHEL: ceph-10.2.7-23.el7cp Ubuntu: ceph_10.2.7-25redhat1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-19 13:33:29 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 19931 | 0 | None | None | None | 2017-05-22 21:56:34 UTC
Red Hat Knowledge Base (Solution) | 3093771 | 0 | None | None | None | 2017-06-26 20:47:27 UTC
Red Hat Product Errata | RHBA-2017:1497 | 0 | normal | SHIPPED_LIVE | Red Hat Ceph Storage 2.3 bug fix and enhancement update | 2017-06-19 17:24:11 UTC

Comment 9 Vikhyat Umrao 2017-05-23 12:58:50 UTC
OSD log with debug_osd = 20 and debug_ms = 1
------------------------------------------------------------------------------

    -6> 2017-05-22 11:05:33.388819 7f19afe3f700 30 osd.1 pg_epoch: 99671 pg[118.5dds1( v 99666'2885404 (99663'2882372,99666'2885404] local-les=97801 n=64075 ec=88336 les/c/f 97801/97803/0 97800/97800/93925) [19,1,38,60,91,72,129,113,143] r=1 lpr=97800 pi=93920-97799/17 luod=0'0 crt=99663'2885402 active NIBBLEWISE] lock

    -5> 2017-05-22 11:05:33.400876 7f19afe3f700 20 osd.1 99671  kicking pg 118.5fas6

    -4> 2017-05-22 11:05:33.400884 7f19afe3f700 30 osd.1 pg_epoch: 99671 pg[118.5fas6( v 99666'2877560 (99663'2874491,99666'2877560] local-les=97801 n=63833 ec=88336 les/c/f 97801/97803/0 97800/97800/94024) [52,89,63,143,118,124,1,36,21] r=6 lpr=97800 pi=43320-97799/1083 luod=0'0 crt=99663'2877558 active NIBBLEWISE] lock

    -3> 2017-05-22 11:05:33.412726 7f19afe3f700 20 osd.1 99671  kicking pg 118.5ffs0

    -2> 2017-05-22 11:05:33.412732 7f19afe3f700 30 osd.1 pg_epoch: 99671 pg[118.5ffs0( v 99666'2881222 (99663'2878149,99666'2881222] local-les=97804 n=64090 ec=88336 les/c/f 97804/97821/0 97800/97800/97800) [1,39,15,124,142,111,66,79,53] r=0 lpr=97800 luod=0'0 crt=99663'2881220 lcod 99666'2881221 mlcod 0'0 active+clean+snaptrim_wait NIBBLEWISE] lock

    -1> 2017-05-22 11:05:33.412741 7f19afe3f700 -1 osd.1 99671 pgid 118.5ffs0 has ref count of 2

     0> 2017-05-22 11:05:33.426260 7f19afe3f700 -1 osd/OSD.cc: In function 'int OSD::shutdown()' thread 7f19afe3f700 time 2017-05-22 11:05:33.412745
osd/OSD.cc: 2738: FAILED assert(0)

 ceph version 10.2.5-37.0.hotfix.bz1436752.el7cp (c0c8ee4a0dc2b9c639d5688b144b47623e2505a2)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f19f88460c5]
 2: (OSD::shutdown()+0x194e) [0x7f19f81a322e]
 3: (OSD::handle_signal(int)+0x126) [0x7f19f81a36f6]
 4: (SignalHandler::entry()+0x127) [0x7f19f8749c87]
 5: (()+0x7dc5) [0x7f19f6765dc5]
 6: (clone()+0x6d) [0x7f19f4df173d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
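
When checking other OSD logs for the same failure, it may help to pull the offending PG and its ref count out of the "has ref count of" line that precedes the assert. The following is a minimal sketch, not part of the original report. The function and class names are hypothetical, and the regular expression assumes the line format shown above, with the PG id read as pool.seed (seed in hex) plus an optional sN shard suffix.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed line shape, copied from the log above:
#   "... osd.1 99671 pgid 118.5ffs0 has ref count of 2"
# The pgid is read as <pool>.<hex seed>[s<shard>]; this is an assumption.
REF_COUNT_RE = re.compile(
    r"pgid (?P<pool>\d+)\.(?P<seed>[0-9a-f]+)(?:s(?P<shard>\d+))?"
    r" has ref count of (?P<refs>\d+)"
)


class PgRefReport(NamedTuple):
    """One 'has ref count of' report (hypothetical helper type)."""
    pool: int
    seed: int             # PG seed, parsed from hex
    shard: Optional[int]  # shard suffix, None if absent
    refs: int             # ref count printed by the OSD


def find_ref_count_reports(log_text: str) -> List[PgRefReport]:
    """Return every PG ref count report found in an OSD log."""
    reports = []
    for line in log_text.splitlines():
        match = REF_COUNT_RE.search(line)
        if match is None:
            continue
        shard = match.group("shard")
        reports.append(
            PgRefReport(
                pool=int(match.group("pool")),
                seed=int(match.group("seed"), 16),
                shard=int(shard) if shard is not None else None,
                refs=int(match.group("refs")),
            )
        )
    return reports


if __name__ == "__main__":
    sample = (
        "    -1> 2017-05-22 11:05:33.412741 7f19afe3f700 -1 "
        "osd.1 99671 pgid 118.5ffs0 has ref count of 2"
    )
    for report in find_ref_count_reports(sample):
        print(report)
```

Matching on the message text rather than on the "-1>" prefix keeps the sketch usable on both the recent-events dump and the regular log stream, since the prefix only appears in the dump.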

Comment 21 errata-xmlrpc 2017-06-19 13:33:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1497

