Bug 454456 - Unable to start carod immediately after stopped
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: grid
Version: 1.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: 1.1
Target Release: ---
Assigned To: Robert Rati
QA Contact: Kim van der Riet
Blocks: 454430
Reported: 2008-07-08 12:19 EDT by Robert Rati
Modified: 2009-02-04 11:06 EST
Doc Type: Bug Fix
Last Closed: 2009-02-04 11:06:04 EST

Description Robert Rati 2008-07-08 12:19:13 EDT
Description of problem:
Starting carod immediately after stopping it fails.  This is because the
port that carod listens on doesn't seem to be closed cleanly and goes
into the CLOSE_WAIT state.  The code attempts to cleanly shut down all sockets,
but something must have been missed.

Comment 1 Robert Rati 2008-09-19 09:40:12 EDT
A socket was found that was not being closed.  With that fixed, and with carod now daemonized, the problem seems to have been solved.  I have been able to stop and restart carod many times back-to-back.
Comment 3 Jeff Needle 2008-12-05 16:38:06 EST
Dec  5 15:37:34 north-11 carod: socket error 98: Address already in use
Dec  5 15:37:34 north-11 carod: Failed to listen on 127.0.0.1:10000
Dec  5 15:37:37 north-11 hook_fetch_work.py: socket error 107: Transport endpoint is not connected
Dec  5 15:37:38 north-11 hook_fetch_work.py: socket error 107: Transport endpoint is not connected
Dec  5 15:37:38 north-11 carod: socket error 98: Address already in use
Dec  5 15:37:38 north-11 carod: Failed to listen on 127.0.0.1:10000

This is with -8, so it is still happening.
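
For reference, the numeric codes in the log lines above can be mapped to their symbolic names. This is only a small sketch using the Python standard library; the helper name describe is made up for illustration, and the values 98 and 107 are the Linux errno numbers as they appear in the log (other platforms number them differently).

```python
import errno
import os


def describe(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    # errno.errorcode maps numeric values to names such as "EADDRINUSE".
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# 98 and 107 are the values from the syslog excerpt (Linux numbering).
for code in (98, 107):
    name, message = describe(code)
    print("%d %s: %s" % (code, name, message))
```

On Linux, 98 is EADDRINUSE (the bind/listen failure reported by carod) and 107 is ENOTCONN (the failure reported by hook_fetch_work.py).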
Comment 4 Robert Rati 2008-12-05 18:08:29 EST
Needed to set SO_REUSEADDR on the listen socket carod uses.  Fixed in:
condor-job-hooks-1.0-4
condor-low-latency-1.0-5
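
As an illustration of the kind of change described in this comment, here is a minimal sketch of a listen socket with the address-reuse option set before bind. It is not the actual carod code: the function name make_listener and the backlog value are assumptions, the option shown is the standard SO_REUSEADDR socket option, and the address 127.0.0.1:10000 is taken from the log in comment 3.

```python
import socket


def make_listener(host="127.0.0.1", port=10000, backlog=5):
    """Create a TCP listen socket that allows quick rebinding after a restart.

    Hypothetical helper for illustration only, not taken from carod.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind().  On Linux this typically lets bind()
        # succeed while connections from a previous instance are still
        # lingering (for example in TIME_WAIT).
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        # Do not leak the descriptor if bind/listen fails.
        sock.close()
        raise
    return sock
```

A daemon would call make_listener() once at startup and close the returned socket on shutdown. Note that the option does not allow two processes to listen on the same address at the same time on Linux; it only relaxes the check against leftover connections.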
Comment 6 errata-xmlrpc 2009-02-04 11:06:04 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0036.html
