Bug 606391 - condor_triggerd (and others) does nothing to operate with a secured broker
Summary: condor_triggerd (and others) does nothing to operate with a secured broker
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 1.3
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: 2.0
Assignee: Robert Rati
QA Contact: Jan Sarenik
URL:
Whiteboard:
Depends On:
Blocks: 693778
Reported: 2010-06-21 14:17 UTC by Jan Sarenik
Modified: 2011-06-23 15:41 UTC (History)

Fixed In Version: condor-7.5.6-0.2
Doc Type: Bug Fix
Doc Text:
C: The amqp broker (qpid) is configured to restrict access
C: The condor_triggerd, condor_job_server, and QMF plugins would not connect to the broker
F: New configuration parameters allow setting authentication information:
QMF_BROKER_AUTH_MECH - The authentication mechanism (PLAIN or ANONYMOUS)
QMF_BROKER_USERNAME - The user to authenticate to the broker
QMF_BROKER_PASSWORD_FILE - The location of the file containing the broker password (in clear text). The security of the password file is the admin's responsibility.
R: The above daemons/plugins will be able to connect to a secured broker
Clone Of:
Environment:
Last Closed: 2011-06-23 15:41:30 UTC
Target Upstream Version:
Embargoed:


Attachments
short example (833 bytes, application/x-gzip)
2010-06-24 09:22 UTC, Jan Sarenik


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2011:0889 0 normal SHIPPED_LIVE Red Hat Enterprise MRG Grid 2.0 Release 2011-06-23 15:35:53 UTC

Description Jan Sarenik 2010-06-21 14:17:25 UTC
I reported this a long time ago, and now it has come back to bite me.
Even though I run qpidd with --auth=no (which is nowadays also
explicitly stated in /etc/qpidd.conf after install), there
are circumstances in which the broker rejects QMF agents.

After installing the cyrus-sasl-plain SASL module, everything runs fine.

  qpid-cpp-server-0.7.946106-4.el5
  (the same on all the previous versions)

How reproducible: 100%

Steps to Reproduce:
1. yum remove cyrus-sasl-plain
2. Set up condor as described in Bug 563818 comment #18
3. Run qpidd from command line with trace enabled (qpidd -t --auth=no)
4. Run condor (service condor start)
  
Actual results (qpidd trace):
trace RECV [127.0.0.1:33691]: Frame[BEbe; channel=0; {ConnectionCloseBody: reply-code=501; reply-text=internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245); }]
warning Client closed connection with 501: internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)

Last time we closed this off with "cyrus-sasl-plain is part of base system".
But the base system is not clearly defined. I am testing with
  yum --installroot=$PWD/temproot install condor-qmf
so I always have a clean root and catch such weirdness.

Comment 1 Gordon Sim 2010-06-21 14:28:22 UTC
What is the 'fix' you would like to see here? 

There is no requirement to have that specific plugin installed in order to run qpidd (nor should there be in my view). All that is required is that plugins for the desired mechanisms are available. When using auth=no, the options are anonymous or plain, so the clients must have one of these installed. Other deployments however may want to explicitly remove these two options in favour of something more secure.

Comment 2 Jan Sarenik 2010-06-22 13:02:00 UTC
I am not sure where in the code the problem lies, but I would
like either the qpid packages to depend on cyrus-sasl-plain
or to be able to run a QMF agent against a qpid broker
run with --auth=no and no cyrus-sasl-plain module installed
(as is possible now without breaking dependencies).

Comment 3 Jan Sarenik 2010-06-22 13:04:48 UTC
Excuse me for unclear wording.

It is possible to have qpid broker and client, along
with condor-qmf installed on a system without
cyrus-sasl-plain package.

But it is not possible for the QMF console, condor_trigger_config,
to connect to the broker while cyrus-sasl-plain is not installed,
even though the broker is running with --auth=no.

Comment 4 Jan Sarenik 2010-06-23 08:20:52 UTC
I think the problem is the following, even when the broker is run
with '--auth=no' (these are consecutive messages from qpidd running
in trace mode):

debug RECV [127.0.0.1:33292] INIT(0-10)
debug SASL: No Authentication Performed
trace SENT 127.0.0.1:33292 INIT(0-10)

(1) Broker advertises authentication mechanisms:

trace SENT [127.0.0.1:33292]: Frame[BEbe; channel=0; {ConnectionStartBody: server-properties={qpid.federation_tag:V2:36:str16(a1d7dc5c-8ec0-4753-a75d-71a8892e3e5d)}; mechanisms=str16{V2:9:str16(ANONYMOUS), V2:5:str16(PLAIN)}; locales=str16{V2:5:str16(en_US)}; }]

(2) Client gets confused:

trace RECV [127.0.0.1:33292]: Frame[BEbe; channel=0; {ConnectionCloseBody: reply-code=501; reply-text=internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245); }]

(3) And the broker warns about it:

warning Client closed connection with 501: internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)

============================== AND ===================================

The problem actually has nothing to do with condor_trigger_config,
which authenticates fine even with no cyrus-sasl-plain module installed.
It just does not find condor_triggerd, which as far as I understand is a
QMF agent and which is not able to connect to qpidd, as can be seen
in the following lines:

# rpm -qf /usr/sbin/condor_triggerd
condor-qmf-7.4.3-0.21.el5

# condor_triggerd
warning Closing connection due to internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)

And as soon as I install the cyrus-sasl-plain package (even without
restarting the already running 'qpidd --auth=no -t'), and start
condor_triggerd again, it suddenly connects with no problem and
'condor_trigger_config -l localhost' lists the triggers.

Comment 5 Gordon Sim 2010-06-23 09:02:37 UTC
The client should support anonymous, the plugin for which is part of the cyrus-sasl-lib package. 

Can you reproduce this error for any simple messaging/qmf client? E.g. qpid-config, qpid-client-test or the c++ qmf console examples? 

I cannot and suspect that condor-qmf/condor_triggerd is specifically requesting the PLAIN mechanism. (You see the same error if for example you force qpid-client-test to use --mechanism PLAIN).
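
The suspected failure mode can be illustrated with a small sketch. It is a simplified model, not the actual Cyrus SASL selection logic; the function name chooseMechanism and its parameters are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of SASL mechanism selection. This is NOT the real
// Cyrus SASL algorithm; it only mirrors the behaviour described above.
//   offered:   mechanisms the broker advertises in ConnectionStart
//   installed: mechanism plugins available on the client host
//   forced:    mechanism the client insists on ("" means no preference)
// Returns the chosen mechanism, or "" when none is usable, which is the
// situation reported as "no mechanism available".
std::string chooseMechanism(const std::vector<std::string>& offered,
                            const std::vector<std::string>& installed,
                            const std::string& forced)
{
    for (const std::string& mech : offered) {
        if (!forced.empty() && mech != forced)
            continue;  // client refuses anything but the forced mechanism
        if (std::find(installed.begin(), installed.end(), mech) != installed.end())
            return mech;  // first offered mechanism with a plugin wins
    }
    return "";
}
```

With the broker offering ANONYMOUS and PLAIN and only the ANONYMOUS plugin installed, a client with no preference selects ANONYMOUS, while a client that forces PLAIN ends up with no usable mechanism.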

Comment 6 Jan Sarenik 2010-06-23 09:27:37 UTC
Right, I have changed the product to grid-condor. Thank you, Gordon.

Comment 7 Jan Sarenik 2010-06-23 09:58:24 UTC
See Bug 601828 for an easy code example.

Comment 8 Matthew Farrellee 2010-06-23 12:17:30 UTC
What program is having a problem here? condor_trigger_config was re-written in Python (from C++) to solve Bug 563818, and connects to the broker without a username/password. condor_triggerd connects to the broker through QMF only, also without a username/password.

Comment 9 Matthew Farrellee 2010-06-23 12:18:05 UTC
The condor_trigger_config program used to set username and password on its ConnectionSettings. If that requires also installing cyrus-sasl-plain, I'd suggest making cyrus-sasl-plain a dep on the qpid-cpp-client to avoid complexity.

Comment 10 Gordon Sim 2010-06-23 12:23:38 UTC
Setting a username/password does *not* require cyrus-sasl-plain. Setting PLAIN as the authentication mechanism does.

Comment 11 Matthew Farrellee 2010-06-23 17:11:12 UTC
The console application is not aware that it is requesting the PLAIN mechanism via QMF. The client library should install with all the bits necessary for use ootb, possibly meaning a dep on -plain and -anonymous.

Comment 12 Gordon Sim 2010-06-24 07:21:02 UTC
A dependency on plain is *incorrect*. Requiring plain is actually in itself a potential bug as the component in question would not work with an adequately secured broker. Why that is happening in the case of condor_triggerd is an open question (and may indeed be due to some quirk with QMF). Modifying the c++ console examples does *not* result in the same behaviour.

Comment 13 Jan Sarenik 2010-06-24 09:22:00 UTC
Created attachment 426503 [details]
short example

Here I am attaching the shortest example I was able to prepare
for now. Compiled with qmf-devel-0.7.946106-4.el5 and the same
version of qpid-cpp-client-devel, it outputs

----------------------------------------------------------------------
2010-06-24 11:16:38 warning Closing connection due to internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)
----------------------------------------------------------------------

when run against 'qpidd -t --auth=no'. The broker says:

----------------------------------------------------------------------
2010-06-24 11:17:11 debug RECV [127.0.0.1:49734] INIT(0-10)
2010-06-24 11:17:11 debug SASL: No Authentication Performed
2010-06-24 11:17:11 trace SENT 127.0.0.1:49734 INIT(0-10)
2010-06-24 11:17:11 trace SENT [127.0.0.1:49734]: Frame[BEbe; channel=0; {ConnectionStartBody: server-properties={qpid.federation_tag:V2:36:str16(5f1186d1-516e-4c56-ab5a-0d77e203c6a0)}; mechanisms=str16{V2:9:str16(ANONYMOUS), V2:5:str16(PLAIN)}; locales=str16{V2:5:str16(en_US)}; }]
2010-06-24 11:17:11 trace RECV [127.0.0.1:49734]: Frame[BEbe; channel=0; {ConnectionCloseBody: reply-code=501; reply-text=internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245); }]
2010-06-24 11:17:11 warning Client closed connection with 501: internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)
2010-06-24 11:17:11 trace SENT [127.0.0.1:49734]: Frame[BEbe; channel=0; {ConnectionCloseOkBody: }]
2010-06-24 11:17:11 trace SEND raiseEvent (v1) class=org.apache.qpid.broker.clientDisconnect
----------------------------------------------------------------------

The same happens with condor_triggerd. So look at the code and
the appropriate functions to find out what should be fixed.

But after all, it seems to me the problem is not on Grid side,
but rather the client library which the agent links with.

Comment 14 Gordon Sim 2010-06-24 09:36:25 UTC
ManagementAgent::init() is defined as:

    virtual void init(const std::string& brokerHost = "localhost",
                      uint16_t brokerPort = 5672,
                      uint16_t intervalSeconds = 10,
                      bool useExternalThread = false,
                      const std::string& storeFile = "",
                      const std::string& uid = "guest",
                      const std::string& pwd = "guest",
                      const std::string& mech = "PLAIN",
                      const std::string& proto = "tcp") = 0;

Note PLAIN as the default mechanism in the header. I would say that is a poor default (likewise the uid and password). However your test program can override the defaults in which case it should work. i.e. agent->init("localhost", 5672, 10, true, "", "", "", "", "tcp"); or use the version of init() that takes a ConnectionSettings instance.

Does condor_triggerd use the ManagementAgent interface as well as the console? If so does it rely on the defaults in the above method? This would certainly explain the issue.

Comment 15 Gordon Sim 2010-06-24 10:45:37 UTC
In http://git.fedorahosted.org/git/?p=grid.git;a=blob;f=src/condor_triggerd/Triggerd.cpp;h=6c51a442411d9dee2a0531f78da3a85e28fceaf6;hb=44c83fb5302ea58d3d53f0f069f40368693e60fd, I suspect you will resolve the issue for condor_triggerd by changing line 214 from:

  agent->init(std::string(host), port, interval, true, storefile);

to e.g.:

  agent->init(std::string(host), port, interval, true, storefile, "guest", "guest", "");


Obviously the hardcoded use of the guest user (with default password) prevents use in a real secured environment. If that is acceptable, the following would be a nicer way to indicate it:

  agent->init(std::string(host), port, interval, true, storefile, "", "", "ANONYMOUS");

The defaults in that init method encourage users of the API to ignore important questions.
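
To make the effect of those defaults concrete, here is a minimal sketch using a mock. MockAgent is a hypothetical stand-in, not part of the qpid API: it copies the parameter list of the init() quoted above and merely records the authentication arguments instead of opening a connection.

```cpp
#include <cstdint>
#include <string>

// Hypothetical mock, for illustration only. It reuses the parameter list
// and default values of the init() declaration quoted above and stores
// the authentication arguments so they can be inspected afterwards.
struct MockAgent {
    std::string uid;
    std::string pwd;
    std::string mech;

    void init(const std::string& /*brokerHost*/ = "localhost",
              std::uint16_t /*brokerPort*/ = 5672,
              std::uint16_t /*intervalSeconds*/ = 10,
              bool /*useExternalThread*/ = false,
              const std::string& /*storeFile*/ = "",
              const std::string& uidArg = "guest",
              const std::string& pwdArg = "guest",
              const std::string& mechArg = "PLAIN",
              const std::string& /*proto*/ = "tcp")
    {
        uid = uidArg;
        pwd = pwdArg;
        mech = mechArg;
    }
};
```

Calling init("localhost", 5672, 10, true, "") on this mock leaves mech set to "PLAIN" and uid set to "guest", even though the caller never mentioned authentication. Passing "", "", "ANONYMOUS" as the sixth to eighth arguments records ANONYMOUS with an empty username.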

Comment 16 Gordon Sim 2010-06-24 10:57:20 UTC
I've created a separate BZ for addressing the defaults in the API:

  https://bugzilla.redhat.com/show_bug.cgi?id=607547

Unless you actually want condor_triggerd to not support authentication you will need to change that anyway, at least to add configurable username and password.

Comment 17 Robert Rati 2011-03-01 14:58:10 UTC
Added the following params:
QMF_BROKER_AUTH_MECH - The authentication mechanism (PLAIN or ANONYMOUS)
QMF_BROKER_USERNAME - The user to authenticate to the broker
QMF_BROKER_PASSWORD_FILE - The location of the file containing the broker password (in clear text).  The security of the password file is the admin's responsibility.

Fixed upstream for 7.6.
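
How a daemon might turn these three parameters into connection credentials can be sketched as follows. This is an assumption for illustration, not the code that was committed to Condor: BrokerAuth and loadBrokerAuth are hypothetical names, and the std::map stands in for Condor's real configuration lookup.

```cpp
#include <fstream>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical container for the resolved broker credentials.
struct BrokerAuth {
    std::string mech;
    std::string username;
    std::string password;
};

// Hypothetical helper: `config` stands in for Condor's configuration
// lookup. Missing parameters resolve to empty strings. The password is
// taken from the first line of the password file, without the newline.
BrokerAuth loadBrokerAuth(const std::map<std::string, std::string>& config)
{
    auto get = [&config](const std::string& key) {
        auto it = config.find(key);
        return it == config.end() ? std::string() : it->second;
    };

    BrokerAuth auth;
    auth.mech = get("QMF_BROKER_AUTH_MECH");
    auth.username = get("QMF_BROKER_USERNAME");

    const std::string path = get("QMF_BROKER_PASSWORD_FILE");
    if (!path.empty()) {
        std::ifstream in(path.c_str());
        if (!in)
            throw std::runtime_error("cannot read password file: " + path);
        std::getline(in, auth.password);
    }
    return auth;
}
```

Reading only the first line means a password file written with echo, which appends a newline, yields the bare password. Because the file holds the password in clear text, restricting its ownership and permissions remains up to the administrator.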

Comment 18 Robert Rati 2011-03-15 17:41:52 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
C: The amqp broker (qpid) is configured to restrict access
C: The condor_triggerd, condor_job_server, and QMF plugins would not connect to the broker
F: New configuration parameters allow setting authentication information:
QMF_BROKER_AUTH_MECH - The authentication mechanism (PLAIN or ANONYMOUS)
QMF_BROKER_USERNAME - The user to authenticate to the broker
QMF_BROKER_PASSWORD_FILE - The location of the file containing the broker password (in clear text).  The security of the password file is the admin's responsibility.
R: The above daemons/plugins will be able to connect to a secured broker

Comment 20 Jan Sarenik 2011-05-05 15:06:28 UTC
I see the following message when I try to access the Grid tab in Cumin:

----------------------------
Collector not found.
  Please ensure the grid is setup correctly and that cumin-data
  is running. If cumin was just started, it may take a few minutes
  for the collector to become available.
----------------------------

Condor was set up with the following script:
------------
cat >> /etc/condor/config.d/99jasan.config << EOF
QMF_BROKER_HOST = 127.0.0.1
QMF_UPDATE_INTERVAL = 5
COLLECTOR_UPDATE_INTERVAL = 5
SCHEDD_INTERVAL = 5
QMF_BROKER_AUTH_MECH = PLAIN
QMF_BROKER_USERNAME = cumin
QMF_BROKER_PASSWORD_FILE = /etc/condor/pass
EOF
echo cumin > /etc/condor/pass
chown condor:condor /etc/condor/pass
------------

User cumin with password cumin is correctly set in qpidd.sasldb
and qpidd runs with default auth settings.

When I tried to reproduce this, it was working. I will report
more later.

Comment 21 Jan Sarenik 2011-05-10 13:06:19 UTC
cyrus-sasl-plain stayed uninstalled while testing ANONYMOUS.
With ANONYMOUS QMF auth set, I see only the following two
QMF agents from Grid:

    com.redhat.grid         condortriggerservice  object
    com.redhat.grid         master                object

With PLAIN, everything seems to work and all agents appear.

This bug was meant to allow the whole of Condor to work with
only ANONYMOUS SASL authentication to the AMQP broker, and
this requirement is not met yet. I suggest moving it to 2.1.

Comment 22 Jan Sarenik 2011-05-11 05:33:34 UTC
-------- /etc/condor/config.d/99jasan.config: -----------
QMF_BROKER_HOST = 127.0.0.1
QMF_UPDATE_INTERVAL = 5
COLLECTOR_UPDATE_INTERVAL = 5
SCHEDD_INTERVAL = 5

QMF_BROKER_AUTH_MECH = ANONYMOUS

DAEMON_LIST = $(DAEMON_LIST), TRIGGERD
DC_DAEMON_LIST = $(DC_DAEMON_LIST), TRIGGERD
STARTD_CRON_NAME = TRIGGER_DATA
STARTD_CRON_AUTOPUBLISH = If_Changed
TRIGGER_DATA_JOBLIST = GetData
TRIGGER_DATA_GETDATA_PREFIX = Triggerd
TRIGGER_DATA_GETDATA_EXECUTABLE = $(BIN)/get_trigger_data
TRIGGER_DATA_GETDATA_PERIOD = 5m
TRIGGER_DATA_GETDATA_RECONFIG = FALSE
---------------------------------------------------------

-------- /etc/qpidd.conf --------------------------------
auth=no
mgmt-pub-interval=5
---------------------------------------------------------

cyrus-sasl-plain is not installed

Comment 23 Jan Sarenik 2011-05-11 13:17:44 UTC
False alarm.

The only problem was that the DC_DAEMON_LIST setting does not
work as I expected. The following is the complete, working
configuration snippet which makes all Condor QMF services
operate correctly:

----------------------------------------------------------------
ALL_DEBUG = D_FULLDEBUG
CREATE_CORE_FILES = True
ABORT_ON_EXCEPTION = True

SLOT_TYPE_1 = cpus=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1 = 1

JASANLIMIT_LIMIT = 1

# Note that GRID.SUB_1.SUB_B is defined as a group but has no corresponding quota
GROUP_NAMES = MSG, GRID, MGMT, RT, GRID.SUB_1, GRID.SUB_1.SUB_A, GRID.SUB_1.SUB_B
GROUP_QUOTA_DYNAMIC_GRID = 0.1
GROUP_QUOTA_DYNAMIC_GRID.SUB_1 = 0.05
GROUP_QUOTA_DYNAMIC_GRID.SUB_1.SUB_A = 0.05
GROUP_QUOTA_DYNAMIC_MGMT = 0.01
GROUP_QUOTA_DYNAMIC_MSG = 0.70
GROUP_QUOTA_DYNAMIC_RT = 0.08
ENABLE_RUNTIME_CONFIG = TRUE

QMF_BROKER_HOST = 127.0.0.1
QMF_UPDATE_INTERVAL = 5
COLLECTOR_UPDATE_INTERVAL = 5
SCHEDD_INTERVAL = 5

QMF_BROKER_AUTH_MECH = ANONYMOUS

JOB_SERVER = $(SBIN)/condor_job_server
JOB_SERVER_LOG = $(LOG)/JobServerLog
SCHEDD.QMF_PUBLISH_SUBMISSIONS = FALSE

TRIGGER_DATA_JOBLIST = GetData
TRIGGER_DATA_GETDATA_PREFIX = Triggerd
TRIGGER_DATA_GETDATA_EXECUTABLE = $(BIN)/get_trigger_data
TRIGGER_DATA_GETDATA_PERIOD = 5m
TRIGGER_DATA_GETDATA_RECONFIG = FALSE
STARTD_CRON_NAME = TRIGGER_DATA
STARTD_CRON_AUTOPUBLISH = If_Changed

DAEMON_LIST = $(DAEMON_LIST), JOB_SERVER, TRIGGERD
DC_DAEMON_LIST =+ JOB_SERVER, TRIGGERD
----------------------------------------------------------------

PLAIN also works well:
------------------------------------------------
QMF_BROKER_AUTH_MECH = PLAIN
QMF_BROKER_USERNAME = cumin
QMF_BROKER_PASSWORD_FILE = /etc/condor/pass
------------------------------------------------

Verified on:
  cumin-0.1.4746-1.el6.noarch
  condor-qmf-7.6.1-0.4.el6.i686
  qpid-cpp-server-0.10-3.el6.i686

  cumin-0.1.4746-1.el5
  condor-qmf-7.6.1-0.4.el5
  qpid-cpp-server-0.10-5.el5

Comment 24 errata-xmlrpc 2011-06-23 15:41:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0889.html

