Bug 845545 - aviary jobserver timeouts for ssl connections
Summary: aviary jobserver timeouts for ssl connections
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor-aviary
Version: 2.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 2.3
Target Release: ---
Assignee: Pete MacKinnon
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-08-03 12:01 UTC by Martin Kudlej
Modified: 2013-01-04 15:39 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-10-04 11:43:50 UTC
Target Upstream Version:
Embargoed:


Description Martin Kudlej 2012-08-03 12:01:51 UTC
Description of problem:
I tried to write an Aviary client with SSL for our testing API. I took the code from the Aviary examples and refactored it. I use the same SSL code, with the same valid certificates, both for submitting jobs and for querying information about jobs. Submitting via Aviary over SSL works, but a query to the job server ends with a timeout.

I've checked my code many times and stepped through it with the Python debugger, but I haven't found anything wrong, nor any difference from your code except for the different classes. The strange thing is that the same code works without SSL, and the same SSL code works for submitting. I've also tried both a short (10 s) and a long (1 minute) timeout, and neither worked.

My code for ssl:
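import socket
import urllib2

import M2Crypto.SSL
import M2Crypto.httpslib
import suds.transport.http

import mrg_utils  # our own helper module (not shown here)

class HTTPSTransport(suds.transport.http.HttpTransport):
  def __init__(self, timeout=300, *args, **kwargs):
    suds.transport.http.HttpTransport.__init__(self, *args, **kwargs)
    # get_env_var just reads the variable from the bash environment
    ca_dir = mrg_utils.MRGEnv.get_env_var('MRG_GRID_CERTIFICATES_DIR', '/etc/condor/certs/')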
    self.key = mrg_utils.MRGEnv.get_env_var('MRG_OPENSSL_CLIENT_KEY', ca_dir + 'client.key')
    self.cert = mrg_utils.MRGEnv.get_env_var('MRG_OPENSSL_CLIENT_CRT', ca_dir + 'client.crt')
    self.ca_cert = mrg_utils.MRGEnv.get_env_var('MRG_OPENSSL_CA_CRT', ca_dir + 'ca.crt')
    self.timeout = timeout

  def u2open(self, u2request):
    url = urllib2.build_opener(HTTPSAuthHandler(self.key, self.cert, self.ca_cert, self.timeout))
    if self.u2ver() < 2.6:
      socket.setdefaulttimeout(self.timeout)
      return url.open(u2request)
    else:
      return url.open(u2request, timeout=self.timeout)

class HTTPSAuthHandler(urllib2.HTTPSHandler):
  def __init__(self, key, cert, ca_cert, timeout=300):
    urllib2.HTTPSHandler.__init__(self)
    self.key = key
    self.cert = cert
    self.ca_cert = ca_cert
    self.timeout = timeout

  def https_open(self, req):
    #print req.get_full_url()
    return self.do_open(self._get_connection, req)

  def _get_connection(self, host, timeout=300):
    return HTTPSConnection(host, key=self.key, cert=self.cert, ca_cert=self.ca_cert, timeout=self.timeout)

class HTTPSConnection(M2Crypto.httpslib.HTTPSConnection):
  def __init__(self, host, port=None, key=None, cert=None,
             ca_cert=None, strict=None, timeout=None):
    self.my_timeout = timeout
    ctx = M2Crypto.SSL.Context()
    ctx.load_cert(cert, key)
    ctx.load_verify_locations(cafile=ca_cert)
    ctx.load_client_CA(cafile=ca_cert)
    ctx.set_verify(M2Crypto.SSL.verify_peer | M2Crypto.SSL.verify_fail_if_no_peer_cert, depth=9)
    M2Crypto.httpslib.HTTPSConnection.__init__(self, host, port, strict, key_file=key, cert_file=cert, ssl_context=ctx)

  def connect(self):
    self.set_debuglevel(10)
    M2Crypto.httpslib.HTTPSConnection.connect(self)
    if self.my_timeout is not None:
      self.sock.settimeout(self.my_timeout)
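
For comparison only, here is a minimal sketch of the same idea (client certificate, CA verification, per-request timeout) using nothing but the Python 3 standard library instead of urllib2 and M2Crypto. It is not the code used in this report. The function names make_client_context, build_https_opener and open_with_timeout are hypothetical, and note that ssl.create_default_context also checks the server host name, which the M2Crypto code above does not do explicitly.

```python
import ssl
import urllib.request


def make_client_context(ca_cert, cert, key):
    """Build an SSL context that verifies the server and presents a client cert."""
    # create_default_context turns on CERT_REQUIRED and host name checking.
    ctx = ssl.create_default_context(cafile=ca_cert)
    # Present the client certificate and key during the handshake.
    ctx.load_cert_chain(certfile=cert, keyfile=key)
    return ctx


def build_https_opener(ctx):
    """Return a urllib opener whose HTTPS connections use the given context."""
    return urllib.request.build_opener(urllib.request.HTTPSHandler(context=ctx))


def open_with_timeout(opener, request, timeout=300):
    """Open the request; the timeout is applied to the underlying socket."""
    return opener.open(request, timeout=timeout)
```

With the default paths from the code above, the context would be built as make_client_context('/etc/condor/certs/ca.crt', '/etc/condor/certs/client.crt', '/etc/condor/certs/client.key').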

Message sent by the suds client, and the reply:
send: u'POST /services/query/getJobDetails HTTP/1.1\r\nAccept-Encoding: identity\r\nContent-length: 395\r\nSoapaction: "http://grid.redhat.com/aviary-query/job/details"\r\nHost: mrg-qe-04.lab.eng.brq.redhat.com:9091\r\nUser-agent: Python-urllib/2.4\r\nConnection: close\r\nContent-type: text/xml; charset=utf-8\r\n\r\n'
send: '<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:ns0="http://query.aviary.grid.redhat.com" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Header/><ns1:Body><ns0:GetJobDetails><ids><job>3.0</job></ids></ns0:GetJobDetails></ns1:Body></SOAP-ENV:Envelope>'
reply: ''

Submit works ok:
send: 'POST /services/job/submitJob HTTP/1.1\r\nAccept-Encoding: identity\r\nContent-length: 1136\r\nSoapaction: "http://grid.redhat.com/aviary-job/submit"\r\nHost: mrg-qe-04.lab.eng.brq.redhat.com:9090\r\nUser-agent: Python-urllib/2.4\r\nConnection: close\r\nContent-type: text/xml; charset=utf-8\r\n\r\n'
send: '<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://job.aviary.grid.redhat.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Header/><ns0:Body><ns1:SubmitJob><cmd>/bin/sleep</cmd><args>71</args><owner>condor</owner><iwd>/tmp</iwd><submission_name>/bin/sleep 71</submission_name><extra><name>Requirements</name><type>EXPRESSION</type><value>(FileSystemDomain =!= UNDEFINED &amp;&amp; Arch =!= UNDEFINED)</value></extra><extra><name>JobUniverse</name><type>INTEGER</type><value>5</value></extra><extra><name>Err</name><type>STRING</type><value>/tmp/mrg_1.1.err43xJL</value></extra><extra><name>Out</name><type>STRING</type><value>/tmp/mrg_1.1.outDqcVv</value></extra><extra><name>UserLog</name><type>STRING</type><value>/tmp/mrg_1.1.logUK7wV</value></extra><extra><name>WhenToTransferOutput</name><type>STRING</type><value>ON_EXIT</value></extra><extra><name>ShouldTransferFiles</name><type>STRING</type><value>IF_NEEDED</value></extra></ns1:SubmitJob></ns0:Body></SOAP-ENV:Envelope>'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Thu Aug  2 06:08:35 2012 GMT^M
header: Server: Axis2C/1.6.0 (Simple Axis2 HTTP Server)^M
header: Content-Type: text/xml;charset=utf-8^M
header: Connection: close^M
header: Content-Length: 456^M

If I try the same operation with the client from the Aviary examples, it works and I get a proper response.
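
One difference is visible in the two traces above: the failing getJobDetails request line is printed as a unicode string (send: u'POST ...'), while the working submitJob request line is a plain string (send: 'POST ...'). Whether that difference is related to the timeout is not established in this report. The sketch below only makes the comparison mechanical; describe_send is a hypothetical helper, written for Python 3, and it assumes the httplib debug output format shown above.

```python
import re

# Matches httplib debug lines such as  send: u'POST /path ...'  or  send: 'POST /path ...'
SEND_LINE = re.compile(r"^send: (u?)(['\"])(\S+) (\S+)")


def describe_send(line):
    """Return (method, path, is_unicode) for a 'send:' request line, else None."""
    m = SEND_LINE.match(line)
    # Body lines (e.g. the SOAP envelope) do not start with an upper-case HTTP method.
    if m is None or not m.group(3).isupper():
        return None
    return m.group(3), m.group(4), m.group(1) == "u"


failing = describe_send("send: u'POST /services/query/getJobDetails HTTP/1.1\\r\\n'")
working = describe_send("send: 'POST /services/job/submitJob HTTP/1.1\\r\\n'")
```

Here failing is ('POST', '/services/query/getJobDetails', True) and working is ('POST', '/services/job/submitJob', False).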

Version-Release number of selected component (if applicable):
condor-7.6.5-0.19.el5
condor-aviary-7.6.5-0.19.el5
condor-classads-7.6.5-0.19.el5
condor-qmf-7.6.5-0.19.el5
condor-wallaby-base-db-1.22-5.el5
condor-wallaby-client-4.1.2-1.el5
condor-wallaby-tools-4.1.2-1.el5
python-condorutils-1.5-4.el5
python-wallabyclient-4.1.2-1.el5
ruby-wallaby-0.12.5-10.el5
wallaby-0.12.5-10.el5
wallaby-utils-0.12.5-10.el5
wso2-axis2-2.1.0-8.el5
wso2-wsf-cpp-2.1.0-8.el5
wso2-wsf-cpp-devel-2.1.0-8.el5


How reproducible:
100%

  
Actual results:
I get a timeout from the job server, but no timeout from the scheduler, using the same code for SSL and for creating the suds client.

Comment 2 Pete MacKinnon 2012-08-03 13:59:53 UTC
[Thu Aug  2 06:14:52 2012] [info]  [ssl] Client verified OK
[Thu Aug  2 06:15:52 2012] [error] http_request_line.c(117) Invalid status line or invalid request line
[Thu Aug  2 06:15:52 2012] [error] simple_http_svr_conn.c(161) Invalid status line or invalid request line
[Thu Aug  2 06:15:52 2012] [error] /builddir/build/BUILD/condor-7.6.4/src/condor_contrib/aviary/src/Axis2SoapProvider.cpp(273) Could not create request

Hmmm, getSubmissionSummary is OK so there's something about the getJobDetails call.
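
The timestamps in the log above show the SSL handshake completing ("Client verified OK") and the request-line error following 60 seconds later. A small sketch for computing that gap from the log lines is given below. parse_log_time is a hypothetical helper, the format string assumes an English/C locale, and reading the 60 seconds as the server waiting for a request line it can parse is an assumption, not something the log confirms.

```python
from datetime import datetime

# Timestamp layout used in the job server log, e.g. "Thu Aug  2 06:14:52 2012"
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_log_time(line):
    """Extract and parse the leading [timestamp] of a log line."""
    stamp = line[line.index("[") + 1:line.index("]")]
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


verified = parse_log_time(
    "[Thu Aug  2 06:14:52 2012] [info]  [ssl] Client verified OK")
failed = parse_log_time(
    "[Thu Aug  2 06:15:52 2012] [error] http_request_line.c(117) "
    "Invalid status line or invalid request line")

# Seconds between the successful handshake and the request-line error.
gap_seconds = (failed - verified).total_seconds()  # 60.0
```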

Comment 4 Martin Kudlej 2012-10-04 11:43:50 UTC
I've found that this is caused by bug 863070 in python. --> NOTABUG

