Bug 694453

Summary: SECMAN:2007:Failed to end classad from condor_status with shared port configured
Product: Red Hat Enterprise MRG
Reporter: Tomas Rusnak <trusnak>
Component: condor
Assignee: Timothy St. Clair <tstclair>
Status: CLOSED NOTABUG
QA Contact: Tomas Rusnak <trusnak>
Severity: medium
Docs Contact:
Priority: medium
Version: Development
CC: iboverma, jneedle, matt
Target Milestone: 2.0
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-04-15 14:47:26 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 691821

Description Tomas Rusnak 2011-04-07 12:40:03 UTC
Description of problem:
condor_status returns no output, or a communication error, when the shared port is configured.

Version-Release number of selected component (if applicable):
$CondorVersion: 7.6.0 Mar 24 2011 BuildID: RH-7.6.0-0.3.el5 PRE-RELEASE-GRID $
$CondorPlatform: X86_64-Redhat_5.6 $

How reproducible:
always

Steps to Reproduce:
1. configuration:

Node1:
USE_SHARED_PORT=True
DAEMON_SOCKET_DIR = /var/lock/condor/daemon_soc
SHARED_PORT_ARGS = "-p 4080"
DAEMON_LIST=$(DAEMON_LIST),SHARED_PORT
SHARED_PORT_DEBUG = D_COMMAND
 
Node2:

CONDOR_HOST = <node1_hostname>
DAEMON_LIST=MASTER SCHEDD STARTD

2. set up the firewall on node1 to drop everything except ports 4080 and 22

Chain INPUT (policy DROP 21407 packets, 6391K bytes)
 pkts bytes target     prot opt in     out     source               destination         
1409K 1563M ACCEPT     all  --  any    any     anywhere             anywhere            state RELATED,ESTABLISHED 
13437 1264K ACCEPT     tcp  --  any    any     anywhere             anywhere            tcp dpt:4080 
 3454  262K ACCEPT     tcp  --  any    any     anywhere             anywhere            tcp dpt:ssh 
60298 3618K ACCEPT     tcp  --  any    any     network/21           anywhere

3. run condor_status, or condor_status -direct <node1>:4080, from node2
  
Actual results:

Without -direct it provides no output:
$condor_status
$

With -direct it prints an error message like this:
$condor_status -direct <hostname>:4080
Error: communication error
SECMAN:2007:Failed to end classad message.
Error: Failed to contact startd <hostname>:4080 at <IP:4080>

At the same time:
==> /var/log/condor/SharedPortLog <==
04/07/11 08:25:33 Calling Handler <DaemonCore::HandleReqSocketHandler> (2)
04/07/11 08:25:33 Return from Handler <DaemonCore::HandleReqSocketHandler> 0.0002s


Expected results:
condor_status returns the pool status with only the shared port (4080) open on node1.

Additional info:
I'm also using settings to allow all connections to avoid issues with permissions:

ALLOW_WRITE = *
ALLOW_READ = *
ALLOW_NEGOTIATOR = *
ALLOW_ADMINISTRATOR_READ = *

Comment 3 Timothy St. Clair 2011-04-15 14:47:26 UTC
Presently you still need port 9618 open for the collector, and the shared port (4080) open for all other (direct) requests.
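
A plain TCP connect test from node2 can confirm which of those ports the firewall actually lets through. The sketch below is only a reachability check under stated assumptions: the helper names are hypothetical, the hostname in the usage note is a placeholder, and a successful connect says nothing about whether a Condor daemon answers correctly on that port.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (dropped packets), and DNS failures.
        return False

def check_ports(host, ports=(9618, 4080)):
    """Map each port to its reachability.

    9618 is the collector port named in this comment; 4080 is the shared
    port configured in this report.
    """
    return {port: port_is_open(host, port) for port in ports}
```

Calling check_ports("node1.example.com") from node2 (with node1's real hostname substituted) returns a dictionary keyed by port. With the firewall rules from the description, 9618 would be expected to time out while 4080 connects.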

If you wish to route all traffic through 4080, including the collector, you will
need to configure:

COLLECTOR_HOST = $(CONDOR_HOST):4080?sock=collector
COLLECTOR_ARGS = -sock collector
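
The ?sock=collector suffix is what lets the collector sit behind the same TCP port as the other daemons: the address carries both the shared port and the name of the daemon socket to reach through it. As an illustration only (this is not HTCondor's own parser, and parse_shared_port_address is a hypothetical helper), an address of that form splits as follows:

```python
from urllib.parse import parse_qs

def parse_shared_port_address(address):
    """Split 'host:port?sock=name' into (host, port, sock_name or None)."""
    hostport, _, query = address.partition("?")
    host, _, port = hostport.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {hostport!r}")
    # parse_qs returns a list per key; take the first 'sock' value if present.
    sock = parse_qs(query).get("sock", [None])[0]
    return host, int(port), sock
```

For example, parse_shared_port_address("node1.example.com:4080?sock=collector") yields the host, the port 4080, and the socket name "collector", while an address without a query string yields None for the socket name. The hostname here is a placeholder.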