Bug 1291511 - SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
Status: POST
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RADOS
Version: 1.3.1
Hardware: x86_64 Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: 1.3.4
Assigned To: David Zafman
QA Contact: ceph-qe-bugs
Keywords: Patch
Depends On:
Blocks:
Reported: 2015-12-14 21:23 EST by Brad Hubbard
Modified: 2017-07-30 11:19 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker                   ID    Priority  Status  Summary  Last Updated
Ceph Project Bug Tracker  8885  None      None    None     Never

Description Brad Hubbard 2015-12-14 21:23:47 EST
Description of problem:

A customer is seeing the following crash in ceph-osd.

 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: /usr/bin/ceph-osd() [0xa20592]
 2: (()+0xf100) [0x7f4f5e2e4100]
 3: (gsignal()+0x37) [0x7f4f5ccfc5f7]
 4: (abort()+0x148) [0x7f4f5ccfdce8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f4f5d6009d5]
 6: (()+0x5e946) [0x7f4f5d5fe946]
 7: (()+0x5e973) [0x7f4f5d5fe973]
 8: (()+0x5f4df) [0x7f4f5d5ff4df]
 9: (TrackedOp::dump(utime_t, ceph::Formatter*) const+0x238) [0x6e8278]
 10: (OpTracker::dump_ops_in_flight(ceph::Formatter*)+0xa7) [0x6e8d17]
 11: (OSD::asok_command(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, std::ostream&)+0x216) [0x63bb96]
 12: (OSDSocketHook::call(std::string, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::string, ceph::buffer::list&)+0x25b) [0x6a764b]
 13: (AdminSocket::do_accept()+0xf36) [0xb13e36]
 14: (AdminSocket::entry()+0x280) [0xb15960]
 15: (()+0x7dc5) [0x7f4f5e2dcdc5]
 16: (clone()+0x6d) [0x7f4f5cdbd21d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

$ git describe --contains a8e6de307f2e4f4ff5e7faad3809ed708e05fdde
v0.88~171

# gdb -q /usr/bin/ceph-osd
Reading symbols from /usr/bin/ceph-osd...Reading symbols from /usr/lib/debug/usr/bin/ceph-osd.debug...done.
done.
(gdb) disass /m 0x6e8278
Dump of assembler code for function TrackedOp::dump(utime_t, ceph::Formatter*) const:
322     {
   0x00000000006e8040 <+0>:     push   %r15
   0x00000000006e8042 <+2>:     push   %r14
   0x00000000006e8044 <+4>:     push   %r13
   0x00000000006e8046 <+6>:     push   %r12
   0x00000000006e8048 <+8>:     mov    %rdi,%r12
   0x00000000006e804b <+11>:    push   %rbp
   0x00000000006e804c <+12>:    mov    %rdx,%rbp
   0x00000000006e804f <+15>:    push   %rbx
   0x00000000006e8050 <+16>:    sub    $0x248,%rsp
   0x00000000006e8057 <+23>:    mov    %fs:0x28,%rax
   0x00000000006e8060 <+32>:    mov    %rax,0x238(%rsp)
   0x00000000006e8068 <+40>:    xor    %eax,%eax
   0x00000000006e807a <+58>:    mov    %rsi,0x18(%rsp)
   0x00000000006e807f <+63>:    mov    %esi,0x30(%rsp)

323       stringstream name;
324       _dump_op_descriptor_unlocked(name);
   0x00000000006e8265 <+549>:   mov    0x10(%rsp),%rcx
   0x00000000006e826a <+554>:   mov    (%r12),%rax
   0x00000000006e826e <+558>:   mov    %r12,%rdi
   0x00000000006e8271 <+561>:   lea    0x10(%rcx),%rsi
   0x00000000006e8275 <+565>:   callq  *0x10(%rax)

325       f->dump_string("description", name.str().c_str()); // this TrackedOp
   0x00000000006e8278 <+568>:   mov    0x0(%rbp),%rax     <----------HERE

Breakpoint 1, TrackedOp::dump (this=0x623fc00, now=now@entry=..., f=f@entry=0x5614680) at common/TrackedOp.cc:322
322     {
(gdb) n
...
(gdb) n
325       f->dump_string("description", name.str().c_str()); // this TrackedOp
(gdb) i r rbp
rbp            0x5614680        0x5614680
(gdb) p f
$4 = (ceph::Formatter *) 0x5614680

So it looks to me as though the address of the formatter is invalid at the time of the crash.
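
As a cross-check on the disassembly above, the offset in frame 9 of the backtrace can be recomputed from the two addresses gdb printed. The following is only a small sketch of that arithmetic; the helper name frame_offset is hypothetical and is not part of Ceph or gdb. Note also that for frames other than the innermost one, a backtrace normally prints the return address, so 0x6e8278 may simply be the instruction following the callq at <+565> rather than the instruction that was executing.

```cpp
#include <cstdint>

// Hypothetical helper (not a Ceph or gdb function): distance of an address
// from the start of its function, i.e. the N in "<+N>" or "+0xN".
constexpr std::uint64_t frame_offset(std::uint64_t addr, std::uint64_t func_start) {
    return addr - func_start;
}

// Addresses copied from the report above.
constexpr std::uint64_t kDumpStart = 0x6e8040; // <+0> of TrackedOp::dump in the disassembly
constexpr std::uint64_t kFrame9    = 0x6e8278; // frame 9 in the backtrace

// 0x6e8278 - 0x6e8040 = 0x238 = 568, which matches both "+0x238" in the
// backtrace and the "<+568>" line marked HERE in the disassembly.
static_assert(frame_offset(kFrame9, kDumpStart) == 0x238, "offset matches +0x238");
static_assert(frame_offset(kFrame9, kDumpStart) == 568, "offset matches <+568>");
```

The static_assert lines are checked at compile time, so building the snippet is enough to confirm that the backtrace offset and the disassembly offset refer to the same instruction.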

Version-Release number of selected component (if applicable):
ceph-osd-0.94.3-3.el7cp.x86_64

How reproducible:
Unknown
Comment 1 Brad Hubbard 2015-12-14 21:25:56 EST
I've reopened http://tracker.ceph.com/issues/8885 as this is definitely the same issue.
Comment 3 David Zafman 2016-03-14 14:28:25 EDT
A pull request for upstream is in progress:

https://github.com/ceph/ceph/pull/8044
Comment 6 David Zafman 2016-06-09 19:02:41 EDT
I believe that the 8044 pull request is the complete fix.
Comment 7 Ken Dreyer (Red Hat) 2016-08-03 17:31:13 EDT
https://github.com/ceph/ceph/pull/10255 is the hammer backport, still in progress.
