Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1087748

Summary: condor_startd segfault when restarting condor daemons
Product: Red Hat Enterprise MRG
Reporter: Tomas Rusnak <trusnak>
Component: condor
Assignee: grid-maint-list <grid-maint-list>
Status: CLOSED WONTFIX
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2.5
CC: matt, sgraf
Target Milestone: ---
Target Release: ---
Hardware: i686
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-26 19:29:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
core file dump (flags: none)

Description Tomas Rusnak 2014-04-15 08:26:28 UTC
Created attachment 886398
core file dump

Description of problem:
A core file was generated by condor_startd during "service condor restart".

Version-Release number of selected component (if applicable):
condor-7.8.9-0.7.el6.i686

How reproducible:
Very rare; seen on 32-bit only.

Steps to Reproduce:
1. Run "service condor restart" in a loop.
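
The restart loop above can be scripted. The following is a minimal sketch, not the script used by the reporter: the function names restart_loop and count_failures, the iteration count, and the delay are all assumptions chosen for illustration. It only records the exit code of each restart command. A crash of condor_startd would not necessarily show up in that exit code, so core files still have to be checked for separately.

```python
import subprocess
import sys
import time


def restart_loop(command, iterations, delay_seconds=0.0):
    """Run `command` `iterations` times and return the list of exit codes.

    `command` is an argument list, e.g. ["service", "condor", "restart"].
    `delay_seconds` is a pause after each run so daemons can come up.
    """
    exit_codes = []
    for attempt in range(1, iterations + 1):
        # check=False: a failing restart must not abort the whole loop.
        result = subprocess.run(command, check=False)
        exit_codes.append(result.returncode)
        if result.returncode != 0:
            print(
                "attempt %d: command exited with %d" % (attempt, result.returncode),
                file=sys.stderr,
            )
        time.sleep(delay_seconds)
    return exit_codes


def count_failures(exit_codes):
    """Return how many runs ended with a non-zero exit code."""
    return sum(1 for code in exit_codes if code != 0)
```

A possible invocation (as root, on the affected 32-bit host) would be restart_loop(["service", "condor", "restart"], iterations=100, delay_seconds=30). Since the problem is reported as very rare, a large iteration count is probably needed, and the delay should be long enough for the daemons to start before the next restart.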

Actual results:
condor_startd segfaults and a core file is generated.

Expected results:
No segfault.

Additional info:

See the attachment for the full core file dump.

  #0  0x00cab416 in __kernel_vsyscall ()
  #1  0x007c4a88 in send () from /lib/libpthread.so.0
  #2  0x00656142 in condor_write (peer_description=0x21b3f10 "daemon at <10.34.44.175:39930>", fd=18, buf=0x223f2c0 "\001", sz=503, timeout=390, flags=0) at /usr/src/debug/condor-7.8.9/src/condor_io/condor_rw.cpp:374
  #3  0x00658645 in Buf::write (this=0x21b4d70, peer_description=0x21b3f10 "daemon at <10.34.44.175:39930>", sockd=18, sz=503, timeout=390) at /usr/src/debug/condor-7.8.9/src/condor_io/buffers.cpp:94
  #4  0x006586f3 in Buf::flush (this=0x21b4d70, peer_description=0x21b3f10 "daemon at <10.34.44.175:39930>", sockd=18, hdr=0xbfbb5f07, sz=5, timeout=390) at /usr/src/debug/condor-7.8.9/src/condor_io/buffers.cpp:140
  #5  0x00644eab in ReliSock::SndMsg::snd_packet (this=0x21b4d64, peer_description=0x21b3f10 "daemon at <10.34.44.175:39930>", _sock=18, end=1, _timeout=390) at /usr/src/debug/condor-7.8.9/src/condor_io/reli_sock.cpp:782
  #6  0x00645b96 in ReliSock::end_of_message (this=0x21b4b20) at /usr/src/debug/condor-7.8.9/src/condor_io/reli_sock.cpp:469
  #7  0x0065238b in SecManStartCommand::sendAuthInfo_inner (this=0x22422d0) at /usr/src/debug/condor-7.8.9/src/condor_io/condor_secman.cpp:1641
  #8  0x00652d3b in SecManStartCommand::startCommand_inner (this=0x22422d0) at /usr/src/debug/condor-7.8.9/src/condor_io/condor_secman.cpp:1212
  #9  0x006536a1 in SecManStartCommand::startCommand (this=0x22422d0) at /usr/src/debug/condor-7.8.9/src/condor_io/condor_secman.cpp:1150
  #10 0x006537ed in SecMan::startCommand (this=0x21b3aec, cmd=60008, sock=0x21b4b20, raw_protocol=false, errstack=0x21b3eb0, subcmd=0, callback_fn=0, misc_data=0x0, nonblocking=false, cmd_description=0x6d4ad5 "DC_CHILDALIVE", sec_session_id_hint=0x0) at /usr/src/debug/condor-7.8.9/src/condor_io/condor_secman.cpp:1039
  #11 0x0067ab02 in Daemon::startCommand (cmd=60008, sock=0x21b4b20, timeout=390, errstack=0x21b3eb0, callback_fn=0, misc_data=0x0, nonblocking=false, cmd_description=0x6d4ad5 "DC_CHILDALIVE", sec_man=0x21b3aec, raw_protocol=false, sec_session_id=0x0) at /usr/src/debug/condor-7.8.9/src/condor_daemon_client/daemon.cpp:558
  #12 0x0067c95d in Daemon::startCommand (this=0x21b3aa0, cmd=60008, st=Stream::reli_sock, sock=0xbfbb632c, timeout=390, errstack=0x21b3eb0, callback_fn=0, misc_data=0x0, nonblocking=false, cmd_description=0x6d4ad5 "DC_CHILDALIVE", raw_protocol=false, sec_session_id=0x0) at /usr/src/debug/condor-7.8.9/src/condor_daemon_client/daemon.cpp:622
  #13 0x0067cae7 in Daemon::startCommand (this=0x21b3aa0, cmd=60008, st=Stream::reli_sock, timeout=390, errstack=0x21b3eb0, cmd_description=0x6d4ad5 "DC_CHILDALIVE", raw_protocol=false, sec_session_id=0x0) at /usr/src/debug/condor-7.8.9/src/condor_daemon_client/daemon.cpp:631
  #14 0x00668f7a in DCMessenger::sendBlockingMsg (this=0x21b4028, msg=...) at /usr/src/debug/condor-7.8.9/src/condor_daemon_client/dc_message.cpp:366
  #15 0x0067ba13 in Daemon::sendBlockingMsg (this=0x21b3aa0, msg=...) at /usr/src/debug/condor-7.8.9/src/condor_daemon_client/daemon.cpp:2278
  #16 0x0069a4df in DaemonCore::SendAliveToParent (this=0x21a1948) at /usr/src/debug/condor-7.8.9/src/condor_daemon_core.V6/daemon_core.cpp:9132
  #17 0x006b3f2c in TimerManager::Timeout (this=0x78c548, pNumFired=0xbfbb6708, pruntime=0xbfbb66f8) at /usr/src/debug/condor-7.8.9/src/condor_daemon_core.V6/timer_manager.cpp:428
  #18 0x0069f240 in DaemonCore::Driver (this=0x21a1948) at /usr/src/debug/condor-7.8.9/src/condor_daemon_core.V6/daemon_core.cpp:3167
  #19 0x0068a6b7 in dc_main (argc=1, argv=0xbfbb6d68) at /usr/src/debug/condor-7.8.9/src/condor_daemon_core.V6/daemon_core_main.cpp:2410
  #20 0x00402676 in main (argc=2, argv=0xbfbb6d64) at /usr/src/debug/condor-7.8.9/src/condor_startd.V6/startd_main.cpp:815

Comment 2 Anne-Louise Tangring 2016-05-26 19:29:25 UTC
MRG-Grid is in maintenance and only customer escalations will be considered. This issue can be reopened if a customer escalation associated with it occurs.