Bug 1327540 - qemu-kvm crashes with double free or corruption in cephx code after hotfix in bz1296722
Summary: qemu-kvm crashes with double free or corruption in cephx code after hotfix in bz1296722
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 1.3.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 1.3.2
Assignee: Ali Maredia
QA Contact: Vasu Kulkarni
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-04-15 10:37 UTC by Vikhyat Umrao
Modified: 2019-10-10 11:53 UTC
CC List: 11 users

Fixed In Version: RHEL: ceph-0.94.5-12.el7cp Ubuntu: ceph_0.94.5-6redhat1trusty
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-06 18:40:06 UTC
Embargoed:


Links:
- Ceph Project Bug Tracker 14958 (last updated 2016-04-15 11:31:54 UTC)
- Red Hat Bugzilla 1296722 (urgent, CLOSED): qemu-kvm crashes with double free or corruption in cephx code (last updated 2021-02-22 00:41:40 UTC)
- Red Hat Knowledge Base (Solution) 2260091 (last updated 2016-05-09 10:53:02 UTC)
- Red Hat Product Errata RHBA-2016:0721 (normal, SHIPPED_LIVE): Red Hat Ceph Storage 1.3 bug fix and enhancement update (last updated 2016-05-06 22:39:03 UTC)

Internal Links: 1296722

Description Vikhyat Umrao 2016-04-15 10:37:51 UTC
Description of problem:
qemu-kvm crashes with double free or corruption in cephx code after hotfix in bz1296722


(gdb) bt
#0  0x00007fa2519a05d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007fa2519a1cc8 in __GI_abort () at abort.c:90
#2  0x00007fa2519e0e07 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7fa251ae98c8 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  0x00007fa2519e81fd in malloc_printerr (ptr=<optimized out>, str=0x7fa251ae99a0 "double free or corruption (!prev)", action=3) at malloc.c:4972
#4  _int_free (av=0x7fa251d25760 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:3804
#5  0x00007fa25bf7a424 in PK11_DestroyContext (context=0x7fa23c594870, freeit=1) at pk11cxt.c:68
#6  0x00007fa255cb0b7f in nss_aes_operation (op=op@entry=261, mechanism=<optimized out>, key=<optimized out>, param=<optimized out>, in=..., out=..., error=0x7fa247b23c30) at auth/Crypto.cc:246
#7  0x00007fa255cb163a in CryptoAESKeyHandler::decrypt (this=<optimized out>, in=..., out=..., error=<optimized out>) at auth/Crypto.cc:320
#8  0x00007fa255ca18cc in decrypt (cct=0x7fa247b23c30, error=0x7fa247b23c30, out=..., in=..., this=0x7fa247b23960) at auth/Crypto.h:114
#9  decode_decrypt_enc_bl<ceph::buffer::list> (cct=cct@entry=0x7fa25e00f930, t=..., key=..., bl_enc=..., error="") at auth/cephx/CephxProtocol.h:436
#10 0x00007fa255ca2160 in decode_decrypt<ceph::buffer::list> (cct=0x7fa25e00f930, t=..., key=..., iter=..., error="") at auth/cephx/CephxProtocol.h:474
#11 0x00007fa255c9c0ac in CephXTicketHandler::verify_service_ticket_reply (this=this@entry=0x7fa23c001d98, secret=..., indata=...) at auth/cephx/CephxProtocol.cc:162
#12 0x00007fa255c9db9b in CephXTicketManager::verify_service_ticket_reply (this=this@entry=0x7fa23c001a00, secret=..., indata=...) at auth/cephx/CephxProtocol.cc:276
#13 0x00007fa255c91d11 in CephxClientHandler::handle_response (this=0x7fa23c001950, ret=<optimized out>, indata=...) at auth/cephx/CephxClientHandler.cc:118
#14 0x00007fa255b2f3d1 in MonClient::handle_auth (this=this@entry=0x7fa25e015610, m=m@entry=0x7f9e24d1ad50) at mon/MonClient.cc:507
#15 0x00007fa255b312e9 in MonClient::ms_dispatch (this=0x7fa25e015610, m=0x7f9e24d1ad50) at mon/MonClient.cc:281
#16 0x00007fa255c2760a in ms_deliver_dispatch (m=0x7f9e24d1ad50, this=0x7fa25e02efd0) at msg/Messenger.h:567
#17 DispatchQueue::entry (this=0x7fa25e02f198) at msg/simple/DispatchQueue.cc:185
#18 0x00007fa255c5525d in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/simple/DispatchQueue.h:103
#19 0x00007fa25b4bfdf5 in start_thread (arg=0x7fa247b25700) at pthread_create.c:308
#20 0x00007fa251a611ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113



Version-Release number of selected component (if applicable):
Red Hat Ceph Storage 1.3 with hotfix in bz1296722
ceph-common-0.94.1-19.el7cp.0.hotfix.bz1296722.x86_64
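
The abort comes from glibc's malloc detecting that PK11_DestroyContext (frame 5, called from nss_aes_operation at auth/Crypto.cc:246) is freeing a context that was already freed or whose memory has been corrupted. For background, below is an illustrative sketch of the single-ownership discipline that makes a second PK11_DestroyContext on the same context impossible; this is NOT the fix that shipped in ceph-0.94.5-12.el7cp (the upstream change is tracked in Ceph Project Bug Tracker 14958, linked above). The helper name aes_op_once and the +16 output slack are assumptions for this sketch; the NSS calls mirror the shape of nss_aes_operation().

// Illustrative sketch only, assuming NSS headers are available
// (compile with: g++ ... $(pkg-config --cflags --libs nss)).
#include <memory>
#include <nss.h>
#include <pk11pub.h>

// Deleter runs PK11_DestroyContext once; unique_ptr forbids a second call.
struct PK11ContextDeleter {
  void operator()(PK11Context* ctx) const {
    if (ctx)
      PK11_DestroyContext(ctx, PR_TRUE);  // PR_TRUE: also free the struct
  }
};
using ScopedPK11Context = std::unique_ptr<PK11Context, PK11ContextDeleter>;

// Hypothetical helper mirroring nss_aes_operation(): every exit path,
// success or error, destroys the context exactly once.
bool aes_op_once(CK_MECHANISM_TYPE mech, CK_ATTRIBUTE_TYPE op,
                 PK11SymKey* key, SECItem* param,
                 const unsigned char* in, int in_len,
                 unsigned char* out, int* out_len) {
  ScopedPK11Context ectx(PK11_CreateContextBySymKey(mech, op, key, param));
  if (!ectx)
    return false;  // creation failed: nothing to destroy
  if (PK11_CipherOp(ectx.get(), out, out_len, in_len + 16, in, in_len)
      != SECSuccess)
    return false;  // deleter destroys ectx here: exactly once
  unsigned int final_len = 0;
  if (PK11_DigestFinal(ectx.get(), out + *out_len, &final_len,
                       in_len + 16 - *out_len) != SECSuccess)
    return false;  // and here
  *out_len += static_cast<int>(final_len);
  return true;     // and here on success
}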

Comment 2 Vikhyat Umrao 2016-04-15 10:40:26 UTC
(gdb) f 6
#6  0x00007fa255cb0b7f in nss_aes_operation (op=op@entry=261, mechanism=<optimized out>, key=<optimized out>, param=<optimized out>, in=..., out=..., error=0x7fa247b23c30) at auth/Crypto.cc:246
246	    PK11_DestroyContext(ectx, PR_TRUE);

(gdb) p *ectx
$9 = {operation = 0, key = 0x7fa23c11b9a0, slot = 0x7fa25e0258a0, session = 19383812, sessionLock = 0x7fa23c526bc0, ownSession = 1, cx = 0x0, savedData = 0x0, savedLength = 140334768384896, 
  param = 0x7fa23c011c30, init = 0, type = 4229, fortezzaHack = 0}

(gdb) p *ectx->sessionLock
$10 = {mutex = {__data = {__lock = 1012046864, __count = 32674, __owner = 1009215584, __nusers = 32674, __kind = -1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
    __size = "\020\234R<\242\177\000\000`h'<\242\177\000\000\377\377\377\377", '\000' <repeats 19 times>, __align = 140334773476368}, notified = {length = 0, cv = {{cv = 0x0, times = 0}, {cv = 0x0, times = 0}, 
      {cv = 0x0, times = 0}, {cv = 0x0, times = 0}, {cv = 0x0, times = 0}, {cv = 0x0, times = 0}}, link = 0x0}, locked = 0, owner = 140334964299520}
(gdb) 

- From this backtrace, frame 6 shows the session lock as held by thread __owner = 1009215584, yet no thread with that ID appears anywhere in the full backtrace of this core.

- The lock contents look like garbage rather than a live mutex: the __size bytes decode to pointer-like values (0x7fa23c529c10 and 0x7fa23c276860), and __owner = 1009215584 is exactly 0x3c276860, the low 32 bits of the second pointer. This is consistent with the context memory having already been freed and reused, i.e. the ceph code is passing an already-freed context into PK11_DestroyContext, which is why it is crashing.
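
To make the "garbage value" observation concrete, the mutex bytes gdb printed can be reinterpreted as little-endian 64-bit words. The following standalone check was written for this analysis (it is not code from the bug); only the byte values are taken from the gdb output above.

// Hypothetical sanity check: a healthy pthread mutex would not contain
// canonical x86_64 user-space pointers (0x00007f...); finding them here
// means the lock's memory was freed and reused.
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
  // First 16 bytes of __size = "\020\234R<\242\177\000\000`h'<\242\177\000\000..."
  const unsigned char raw[16] = {0x10, 0x9c, 0x52, 0x3c, 0xa2, 0x7f, 0x00, 0x00,
                                 0x60, 0x68, 0x27, 0x3c, 0xa2, 0x7f, 0x00, 0x00};
  for (int i = 0; i < 2; ++i) {
    std::uint64_t v;
    std::memcpy(&v, raw + 8 * i, sizeof v);  // x86_64 is little-endian
    std::printf("word %d = 0x%016" PRIx64 "%s\n", i, v,
                (v >> 40) == 0x7f ? "  <- heap-pointer-like" : "");
  }
  // Prints 0x00007fa23c529c10 and 0x00007fa23c276860; the low 32 bits of
  // the second word, 0x3c276860 = 1009215584, equal the bogus __owner.
  return 0;
}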

Comment 36 Vasu Kulkarni 2016-04-28 17:47:24 UTC
Verified: RBD sanity + QEMU regression runs.

Comment 40 errata-xmlrpc 2016-05-06 18:40:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0721.html

