Bug 1309556 - ceph-mon segfaults in PrebufferedStreambuf::overflow
Summary: ceph-mon segfaults in PrebufferedStreambuf::overflow
Keywords:
Status: CLOSED DUPLICATE of bug 1312587
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 1.3.2
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: 2.0
Assignee: Samuel Just
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-02-18 06:07 UTC by Brad Hubbard
Modified: 2017-07-30 15:10 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-29 23:40:55 UTC
Embargoed:


Attachments
Debug logs (74.40 KB, application/x-gzip)
2016-02-18 06:07 UTC, Brad Hubbard
Core dump (1.14 MB, application/x-gzip)
2016-02-18 06:07 UTC, Brad Hubbard


Links
System: Ceph Project Bug Tracker
ID: 13826
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2016-02-18 06:07:09 UTC

Description Brad Hubbard 2016-02-18 06:07:09 UTC
Created attachment 1128147
Debug logs

Description of problem:

This was seen by a QE intern and reported on IRC.

# /usr/bin/ceph-mon -i hp-dl385pg8-09 --debug_mon 20 --debug_ms 20 --pid-file /var/run/ceph/mon.hp-dl385pg8-09.pid -c /etc/ceph/ceph.conf --cluster ceph -f                                                         
starting mon.hp-dl385pg8-09 rank 1 at 10.XX.XX.XXX:6789/0 mon_data /var/lib/ceph/mon/ceph-hp-dl385pg8-09 fsid 3b255df8-8fd2-4278-b579-89c9eb54dcca
*** Caught signal (Segmentation fault) **
 in thread 7fccb99b1700
Segmentation fault

Attaching debug logs and core file.


Version-Release number of selected component (if applicable):
ceph-mon-0.94.3-6.el7cp.x86_64

Comment 1 Brad Hubbard 2016-02-18 06:07:56 UTC
Created attachment 1128148
Core dump

Comment 2 Brad Hubbard 2016-02-18 06:12:29 UTC
2016-02-18 13:46:06.704898 7fccbbab6700 20 mon.hp-dl385pg8-09@0(probing) e0  caps allow *
2016-02-18 13:46:06.704905 7fccbbab6700 20 -- XX.XX.XXX.199:6789/0 done calling dispatch on 0x53f2ac0
2016-02-18 13:46:06.704925 7fccbe650700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2016-02-18 13:46:06.704943 7fccbe650700  1 -- XX.XX.XXX.199:6789/0 --> XX.XX.XXX.199:6789/0 -- log(1 entries) v1 -- ?+0 0x53f2400 con 0x5238dc0
2016-02-18 13:46:06.704949 7fccbe650700 20 -- XX.XX.XXX.199:6789/0 submit_message log(1 entries) v1 local
2016-02-18 13:46:06.704979 7fccbbab6700  1 -- XX.XX.XXX.199:6789/0 <== mon.0 XX.XX.XXX.199:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x53f2400 con 0x5238dc0
2016-02-18 13:46:06.704989 7fccbe650700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2016-02-18 13:46:06.705008 7fccbe650700  1 -- XX.XX.XXX.199:6789/0 --> XX.XX.XXX.199:6789/0 -- log(1 entries) v1 -- ?+0 0x53f2f40 con 0x5238dc0
2016-02-18 13:46:06.705025 7fccbe650700 20 -- XX.XX.XXX.199:6789/0 submit_message log(1 entries) v1 local
2016-02-18 13:46:06.705043 7fccbbab6700 20 mon.hp-dl385pg8-09@0(probing) e0 have connection
2016-02-18 13:46:06.705047 7fccbbab6700 20 mon.hp-dl385pg8-09@0(probing) e0 ms_dispatch existing session MonSession: mon.0 XX.XX.XXX.199:6789/0 is openallow * for mon.0 XX.XX.XXX.199:6789/0
2016-02-18 13:46:06.705053 7fccbbab6700 20 mon.hp-dl385pg8-09@0(probing) e0  caps allow *
2016-02-18 13:46:06.705066 7fccbbab6700 20 -- XX.XX.XXX.199:6789/0 done calling dispatch on 0x53f2400
2016-02-18 13:46:06.705072 7fccbbab6700  1 -- XX.XX.XXX.199:6789/0 <== mon.0 XX.XX.XXX.199:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x53f2f40 con 0x5238dc0
2016-02-18 13:46:06.705079 7fccbbab6700 20 mon.hp-dl385pg8-09@0(probing) e0 have connection
2016-02-18 13:46:06.705081 7fccbbab6700 20 mon.hp-dl385pg8-09@0(probing) e0 ms_dispatch existing session MonSession: mon.0 XX.XX.XXX.199:6789/0 is openallow * for mon.0 XX.XX.XXX.199:6789/0
2016-02-18 13:46:06.705086 7fccbbab6700 20 mon.hp-dl385pg8-09@0(probing) e0  caps allow *
2016-02-18 13:46:06.705096 7fccbbab6700 20 -- XX.XX.XXX.199:6789/0 done calling dispatch on 0x53f2f40
2016-02-18 13:46:06.716842 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).fault done waiting or woke up
2016-02-18 13:46:06.716883 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).writer: state = connecting policy.server=0
2016-02-18 13:46:06.716892 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).connect 0
2016-02-18 13:46:06.717053 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).connecting to 0.0.0.0:0/1
2016-02-18 13:46:06.717121 7fccc2cb1700  2 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).connect error 0.0.0.0:0/1, (111) Connection refused
2016-02-18 13:46:06.717147 7fccc2cb1700  2 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).fault (111) Connection refused
2016-02-18 13:46:06.717158 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).fault waiting 0.800000
2016-02-18 13:46:06.717666 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).fault done waiting or woke up
2016-02-18 13:46:06.717784 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).writer: state = connecting policy.server=0
2016-02-18 13:46:06.717790 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).connect 0
2016-02-18 13:46:06.717870 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).connecting to XX.XX.XXX.155:6789/0
2016-02-18 13:46:06.718209 7fccba2b3700  2 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).connect error XX.XX.XXX.155:6789/0, (111) Connection refused
2016-02-18 13:46:06.718290 7fccba2b3700  2 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).fault (111) Connection refused
2016-02-18 13:46:06.718303 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).fault waiting 0.800000
2016-02-18 13:46:07.879075 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).fault done waiting or woke up
2016-02-18 13:46:07.879090 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).writer: state = connecting policy.server=0
2016-02-18 13:46:07.879096 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).connect 0
2016-02-18 13:46:07.879134 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).connecting to 0.0.0.0:0/1
2016-02-18 13:46:07.879173 7fccc2cb1700  2 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).connect error 0.0.0.0:0/1, (111) Connection refused
2016-02-18 13:46:07.879187 7fccc2cb1700  2 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).fault (111) Connection refused
2016-02-18 13:46:07.879201 7fccc2cb1700 10 -- XX.XX.XXX.199:6789/0 >> 0.0.0.0:0/1 pipe(0x53b6000 sd=15 :0 s=1 pgs=0 cs=0 l=0 c=0x5238f20).fault waiting 1.600000
2016-02-18 13:46:07.879257 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).fault done waiting or woke up
2016-02-18 13:46:07.879349 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).writer: state = connecting policy.server=0
2016-02-18 13:46:07.879373 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).connect 0
2016-02-18 13:46:07.879452 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).connecting to XX.XX.XXX.155:6789/0
2016-02-18 13:46:07.879748 7fccba2b3700  2 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).connect error XX.XX.XXX.155:6789/0, (111) Connection refused
2016-02-18 13:46:07.879770 7fccba2b3700  2 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).fault (111) Connection refused
2016-02-18 13:46:07.879784 7fccba2b3700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.155:6789/0 pipe(0x53bb000 sd=13 :0 s=1 pgs=0 cs=0 l=0 c=0x5239080).fault waiting 1.600000
2016-02-18 13:46:07.885942 7fccbaab4700 20 accepter.accepter poll got 1
2016-02-18 13:46:07.885974 7fccbaab4700 10 accepter.pfd.revents=1
2016-02-18 13:46:07.885999 7fccbaab4700 10 accepter.accepted incoming on sd 22
2016-02-18 13:46:07.886094 7fccbaab4700 20 accepter.accepter calling poll
2016-02-18 13:46:07.886130 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> :/0 pipe(0x540c000 sd=22 :0 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept
2016-02-18 13:46:07.886289 7fccb99b1700  1 -- XX.XX.XXX.199:6789/0 >> :/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept sd=22 XX.XX.XXX.81:46368/0
2016-02-18 13:46:07.886692 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept peer addr is XX.XX.XXX.81:6789/0
2016-02-18 13:46:07.888145 7fccb99b1700 20 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept got peer connect_seq 0 global_seq 6528
2016-02-18 13:46:07.888165 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept of host_type 1, policy.lossy=0 policy.server=0 policy.standby=1 policy.resetcheck=1
2016-02-18 13:46:07.888177 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept my proto 13, their proto 13
2016-02-18 13:46:07.888189 7fccb99b1700 10 mon.hp-dl385pg8-09@0(probing) e0 ms_verify_authorizer XX.XX.XXX.81:6789/0 mon protocol 2
2016-02-18 13:46:07.888325 7fccb99b1700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2016-02-18 13:46:07.888335 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept new session^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
2016-02-18 13:46:07.888348 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept new session

Core was generated by `/usr/bin/ceph-mon -i hp-dl385pg8-09 --pid-file /var/run/ceph/mon.hp-dl385pg8-09'.
Program terminated with signal 11, Segmentation fault.
#0  __memcpy_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:664
664             movaps  %xmm1, -0x10(%rdi)
Missing separate debuginfos, use: debuginfo-install gperftools-libs-2.4-7.el7.x86_64 leveldb-1.12.0-5.el7cp.x86_64 libunwind-1.1-5.el7_2.2.x86_64 nss-softokn-3.16.2.3-13.el7_1.x86_64 nss-softokn-freebl-3.16.2.3-13.el7_1.x86_64 nss-util-3.19.1-4.el7_1.x86_64 snappy-1.1.0-3.el7.x86_64 sqlite-3.7.17-8.el7.x86_64
(gdb) bt
#0  __memcpy_ssse3 () at ../sysdeps/x86_64/multiarch/memcpy-ssse3.S:664
#1  0x00000000008a23d1 in memcpy (__len=101, __src=0x3beb7a3, __dest=0x8dfc70 <CephxSessionHandler::~CephxSessionHandler()>) at /usr/include/bits/string3.h:51
#2  ceph::BackTrace::print (this=this@entry=0x7f2a4b9178b0, out=...) at common/BackTrace.cc:43
#3  0x0000000000902cdf in handle_fatal_signal (signum=11) at global/signal_handler.cc:95
#4  <signal handler called>
#5  std::string::_Rep::_S_create (__capacity=80, __old_capacity=<optimized out>, __alloc=...) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:609
#6  0x00007f2a531408ab in std::string::_Rep::_M_clone (this=0x7f2a5338a3e0 <std::string::_Rep::_S_empty_rep_storage>, __alloc=..., __res=<optimized out>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:629
#7  0x00007f2a53140954 in std::string::reserve (this=this@entry=0x3b77730, __res=<optimized out>, __res@entry=80)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:510
#8  0x00007f2a53140d36 in std::string::append (this=0x3b77730, __n=80, __c=<optimized out>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:288
#9  0x0000000000794e10 in resize (__n=<optimized out>, this=0x3b77730) at /usr/include/c++/4.8.2/bits/basic_string.h:754
#10 PrebufferedStreambuf::overflow (this=0x3b776e0, c=56) at common/PrebufferedStreambuf.cc:18
#11 0x00007f2a53121d66 in std::basic_streambuf<char, std::char_traits<char> >::xsputn (this=0x3b776e0, __s=0x7f2a4b918b35 "8912\314\020S*\177", __n=3)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/streambuf.tcc:98
#12 0x00007f2a5310cd3d in sputn (__n=3, __s=<optimized out>, this=0x3b776e0) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/streambuf:451
#13 _M_put (__len=3, __ws=<optimized out>, this=<synthetic pointer>) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/streambuf_iterator.h:282
#14 __write<char> (__len=<optimized out>, __ws=<optimized out>, __s=...) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/locale_facets.h:114
#15 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long> (this=<optimized out>, __s=..., __io=..., __fill=<optimized out>, __v=<optimized out>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/locale_facets.tcc:928
#16 0x00007f2a5310cf2d in std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put (this=<optimized out>, __s=..., __io=..., __fill=<optimized out>, __v=<optimized out>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/locale_facets.h:2480
#17 0x00007f2a531191ee in put (__v=891, __fill=<optimized out>, __io=..., __s=..., this=0x7f2a533898f0 <(anonymous namespace)::num_put_c>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/locale_facets.h:2341
#18 std::ostream::_M_insert<unsigned long> (this=this@entry=0x7f2a4b918d30, __v=__v@entry=891) at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:73
#19 0x00000000008bc1d5 in operator<< (__n=891, this=<optimized out>) at /usr/include/c++/4.8.2/ostream:196
#20 Pipe::_pipe_prefix (this=this@entry=0x3ccc000, _dout=_dout@entry=0x7f2a4b918d30) at msg/simple/Pipe.cc:45
#21 0x00000000008d58ee in Pipe::reader (this=0x3ccc000) at msg/simple/Pipe.cc:1521
#22 0x00000000008d996d in Pipe::Reader::entry (this=<optimized out>) at msg/simple/Pipe.h:50
#23 0x00007f2a53dd1dc5 in start_thread (arg=0x7f2a4b919700) at pthread_create.c:308
#24 0x00007f2a528901cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Comment 4 Brad Hubbard 2016-02-18 06:54:19 UTC
(gdb) f
#10 PrebufferedStreambuf::overflow (this=0x3b776e0, c=56) at common/PrebufferedStreambuf.cc:18
18          m_overflow.resize(m_buf_len);

(gdb) p *this
$4 = {
  <std::basic_streambuf<char, std::char_traits<char> >> = {
    _vptr.basic_streambuf = 0xc32d30 <vtable for PrebufferedStreambuf+16>, 
    _M_in_beg = 0x0, 
    _M_in_cur = 0x0, 
    _M_in_end = 0x0, 
    _M_out_beg = 0x3b77690 "-- 10.73.4.199:6789/0 >> 10.73.194.81:6789/0 pipe(0x3ccc000 sd=22 :6789 s=2 pgs=0", <incomplete sequence \303>, 
    _M_out_cur = 0x3b776e0 "0", <incomplete sequence \303>, 
    _M_out_end = 0x3b776e0 "0", <incomplete sequence \303>, 
    _M_buf_locale = {
      static none = 0, 
      static ctype = 1, 
      static numeric = 2, 
      static collate = 4, 
      static time = 8, 
      static monetary = 16, 
      static messages = 32, 
      static all = 63, 
      _M_impl = 0x7f2a53389e00 <(anonymous namespace)::c_locale_impl>, 
      static _S_classic = 0x7f2a53389e00 <(anonymous namespace)::c_locale_impl>, 
      static _S_global = 0x7f2a53389e00 <(anonymous namespace)::c_locale_impl>, 
      static _S_categories = 0x7f2a5336c920 <__gnu_cxx::category_names>, 
      static _S_once = 2
    }
  }, 
  members of PrebufferedStreambuf: 
  m_buf = 0x3b77690 "-- 10.XX.4.199:6789/0 >> 10.XX.194.81:6789/0 pipe(0x3ccc000 sd=22 :6789 s=2 pgs=0", <incomplete sequence \303>, 
  m_buf_len = 80, 
  m_overflow = ""
}

(gdb) x/100c 0x3b77690
0x3b77690:      45 '-'  45 '-'  32 ' '  49 '1'  48 '0'  46 '.'  55 '7'  51 '3'
0x3b77698:      46 '.'  52 '4'  46 '.'  49 '1'  57 '9'  57 '9'  58 ':'  54 '6'
0x3b776a0:      55 '7'  56 '8'  57 '9'  47 '/'  48 '0'  32 ' '  62 '>'  62 '>'
0x3b776a8:      32 ' '  49 '1'  48 '0'  46 '.'  55 '7'  51 '3'  46 '.'  49 '1'
0x3b776b0:      57 '9'  52 '4'  46 '.'  56 '8'  49 '1'  58 ':'  54 '6'  55 '7'
0x3b776b8:      56 '8'  57 '9'  47 '/'  48 '0'  32 ' '  112 'p' 105 'i' 112 'p'
0x3b776c0:      101 'e' 40 '('  48 '0'  120 'x' 51 '3'  99 'c'  99 'c'  99 'c'
0x3b776c8:      48 '0'  48 '0'  48 '0'  32 ' '  115 's' 100 'd' 61 '='  50 '2'
0x3b776d0:      50 '2'  32 ' '  58 ':'  54 '6'  55 '7'  56 '8'  57 '9'  32 ' '
0x3b776d8:      115 's' 61 '='  50 '2'  32 ' '  112 'p' 103 'g' 115 's' 61 '='
0x3b776e0:      48 '0'  45 '-'  -61 '\303'      0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'
0x3b776e8:      0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'
0x3b776f0:      0 '\000'        0 '\000'        0 '\000'        0 '\000'

Looks like the basic_streambuf got corrupted (but I guess that's obvious :) )

The final log entries look highly suspicious: one of them apparently ends in a run of unprintable characters (shown as ^@).

2016-02-18 13:46:07.888325 7fccb99b1700  0 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
2016-02-18 13:46:07.888335 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept new session^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
2016-02-18 13:46:07.888348 7fccb99b1700 10 -- XX.XX.XXX.199:6789/0 >> XX.XX.XXX.81:6789/0 pipe(0x540c000 sd=22 :6789 s=0 pgs=0 cs=0 l=0 c=0x5239340).accept new session
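
For anyone following frame #10, here is a minimal sketch of the prebuffer-then-spill pattern that a streambuf of this kind typically implements. It is not the Ceph source. It only assumes what the gdb output above shows: writes first fill a fixed buffer (m_buf, m_buf_len = 80), and overflow() then grows a std::string. The names SpillStreambuf and format_with_prebuffer are hypothetical. The point is that the first overflow() on an empty spill string necessarily ends in a std::string heap allocation (frames #5 to #9), which is where the signal was raised.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical sketch, not Ceph code: output goes to a caller-supplied fixed
// buffer first and spills into a heap-allocated std::string once it is full.
class SpillStreambuf : public std::streambuf {
public:
  SpillStreambuf(char* buf, std::size_t len) : buf_(buf), len_(len) {
    assert(buf != nullptr && len > 0);
    setp(buf_, buf_ + len_);  // put area initially covers only the prebuffer
  }

  // Return everything written so far (prebuffer plus any spilled bytes).
  std::string str() const {
    if (overflow_.empty()) {
      return std::string(buf_, static_cast<std::size_t>(pptr() - buf_));
    }
    std::string out(buf_, len_);
    out.append(overflow_.data(),
               static_cast<std::size_t>(pptr() - overflow_.data()));
    return out;
  }

protected:
  // Called by the stream when the current put area is full.
  int_type overflow(int_type c) override {
    const std::size_t used = overflow_.size();
    // On the first call `used` is 0, so this resize is the first heap
    // allocation made on behalf of the message being formatted.
    overflow_.resize(used + len_);
    // The string may have been reallocated, so re-point the put area at it.
    setp(&overflow_[used], &overflow_[0] + overflow_.size());
    if (!traits_type::eq_int_type(c, traits_type::eof())) {
      *pptr() = traits_type::to_char_type(c);
      pbump(1);
    }
    return traits_type::not_eof(c);
  }

private:
  char* buf_;
  std::size_t len_;
  std::string overflow_;
};

// Hypothetical helper: format `text` through a prebuffer of `prebuf_len` bytes.
std::string format_with_prebuffer(const std::string& text,
                                  std::size_t prebuf_len) {
  std::string storage(prebuf_len, '\0');
  SpillStreambuf sb(&storage[0], storage.size());
  std::ostream os(&sb);
  os << text;
  return sb.str();
}
```

With an 80-byte prebuffer, a message of 80 characters or fewer never reaches overflow(); the 81st character triggers the first resize. That matches the dump above, where _M_out_cur equals _M_out_end at m_buf + 80 and m_overflow is still empty.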

Comment 5 Samuel Just 2016-02-18 14:58:03 UTC
How would one ssh into the machine where this happened?

Comment 6 Samuel Just 2016-02-18 15:02:47 UTC
What was being done before starting the mon?  Was it a fresh cluster?  Was this after an upgrade?

Comment 7 Brad Hubbard 2016-02-18 22:49:39 UTC
I can't currently log into this machine either. Setting needinfo on Guohao Wang (QE intern who reported the issue).

Comment 8 guowang 2016-02-19 02:21:50 UTC
(In reply to Samuel Just from comment #6)
> What was being done before starting the mon?  Was it a fresh cluster?  Was
> this after an upgrade?

Hi, the old machine has been returned, and this issue has been reproduced on hp-dl388g8-16.rhts.eng.pek2.redhat.com.
My steps follow the book for the 280 course (OpenStack II).


#ceph-deploy new hp-dl388g8-16  hp-dl388g8-17 hp-dl385pg8-03
#ceph-deploy install new hp-dl388g8-16  hp-dl388g8-17 hp-dl385pg8-03
#ceph-deploy mon create new hp-dl388g8-16  hp-dl388g8-17 hp-dl385pg8-03 
#ceph-deploy gatherkeys 

#ceph -s
2016-02-19 10:16:42.347167 7fcc5b659700  0 librados: client.admin authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError
When I ran ceph -s, I got the error above and found that the ceph-mon process had not been started.
The messages log also reports an error.

[root@hp-dl388g8-16 ceph-deploy]# /usr/bin/ceph-mon -i hp-dl388g8-16 --pid-file /var/run/ceph/mon.hp-dl388g8-16.pid -c /etc/ceph/ceph.conf --cluster ceph -f
starting mon.hp-dl388g8-16 rank 0 at 10.73.194.81:6789/0 mon_data /var/lib/ceph/mon/ceph-hp-dl388g8-16 fsid bb695966-4bc0-4163-abe8-7125d6ebd58e
*** Caught signal (Segmentation fault) **
 in thread 7fc559d95700
 ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
 1: /usr/bin/ceph-mon() [0x902ca2]
 2: (()+0xf100) [0x7fc558fdb100]
 3: (std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&)+0x59) [0x7fc558354c69]
 4: (std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long)+0x1b) [0x7fc5583558ab]
 5: (std::string::reserve(unsigned long)+0x44) [0x7fc558355954]
 6: (std::string::append(unsigned long, char)+0x46) [0x7fc558355d36]
 7: (PrebufferedStreambuf::overflow(int)+0x30) [0x794e10]
 8: (std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long)+0x36) [0x7fc558336d66]
 9: (std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)+0x215) [0x7fc55832dab5]
 10: (Pipe::_pipe_prefix(std::ostream*)+0x198) [0x8bc1c8]
 11: (Pipe::reader()+0x5fe) [0x8d58ee]
 12: (Pipe::Reader::entry()+0xd) [0x8d996d]
 13: (()+0x7dc5) [0x7fc558fd3dc5]
 14: (clone()+0x6d) [0x7fc557ab41cd]
Segmentation fault

Comment 10 Samuel Just 2016-02-19 15:17:08 UTC
I have logged in and don't seem to be able to reproduce the crash with the above command:

/usr/bin/ceph-mon -i hp-dl388g8-16 --pid-file /var/run/ceph/mon.hp-dl388g8-16.pid -c /etc/ceph/ceph.conf --cluster ceph -f

Can you still reproduce it?

Comment 14 Samuel Just 2016-02-29 23:40:55 UTC

*** This bug has been marked as a duplicate of bug 1312587 ***

