I reported this a long time ago, and now it has bitten me again. Although I run qpidd with --auth=no (which is nowadays also explicitly set in /etc/qpidd.conf after install), there are circumstances in which the broker rejects QMF agents. After installing the cyrus-sasl-plain SASL module, everything runs fine.

qpid-cpp-server-0.7.946106-4.el5 (the same on all the previous versions)

How reproducible: 100%

Steps to Reproduce:
1. yum remove cyrus-sasl-plain
2. Set up condor as described in Bug 563818 comment #18
3. Run qpidd from command line with trace enabled (qpidd -t --auth=no)
4. Run condor (service condor start)

Actual results (qpidd trace):
trace RECV [127.0.0.1:33691]: Frame[BEbe; channel=0; {ConnectionCloseBody: reply-code=501; reply-text=internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245); }]
warning Client closed connection with 501: internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)

Last time we closed this off with "cyrus-sasl-plain is part of base system". But the base system is not clearly defined. I am testing with
yum --installroot=$PWD/temproot install condor-qmf
so I always have a clean root and catch such weirdness.
What is the 'fix' you would like to see here? There is no requirement to have that specific plugin installed in order to run qpidd (nor should there be in my view). All that is required is that plugins for the desired mechanisms are available. When using auth=no, the options are anonymous or plain, so the clients must have one of these installed. Other deployments however may want to explicitly remove these two options in favour of something more secure.
I am not sure where in the code the problem lies, but I would like either the qpid packages to depend on cyrus-sasl-plain, or to be able to run a QMF agent against a qpid broker started with --auth=no when no cyrus-sasl-plain module is installed (a state that is currently possible without breaking dependencies).
Excuse me for the unclear wording. It is possible to have the qpid broker and client, along with condor-qmf, installed on a system without the cyrus-sasl-plain package. But it is not possible for the QMF console, condor_trigger_config, to connect to the broker while cyrus-sasl-plain is not installed, even though the broker is running with --auth=no.
I think the problem is the following, even when the broker is run with '--auth=no' (all of these are consecutive messages from qpidd running in trace mode):

debug RECV [127.0.0.1:33292] INIT(0-10)
debug SASL: No Authentication Performed
trace SENT 127.0.0.1:33292 INIT(0-10)

(1) Broker advertises authentication mechanisms:
trace SENT [127.0.0.1:33292]: Frame[BEbe; channel=0; {ConnectionStartBody: server-properties={qpid.federation_tag:V2:36:str16(a1d7dc5c-8ec0-4753-a75d-71a8892e3e5d)}; mechanisms=str16{V2:9:str16(ANONYMOUS), V2:5:str16(PLAIN)}; locales=str16{V2:5:str16(en_US)}; }]

(2) Client gets confused:
trace RECV [127.0.0.1:33292]: Frame[BEbe; channel=0; {ConnectionCloseBody: reply-code=501; reply-text=internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245); }]

(3) And the broker warns about it:
warning Client closed connection with 501: internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)

============================== AND ===================================

The problem actually has no connection with condor_trigger_config, which authenticates fine even with no cyrus-sasl-plain module installed. It just does not find condor_triggerd, which is a QMF agent as far as I understand and which is not able to connect to qpidd, as can be seen in the following lines:

# rpm -qf /usr/sbin/condor_triggerd
condor-qmf-7.4.3-0.21.el5
# condor_triggerd
warning Closing connection due to internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)

As soon as I install the cyrus-sasl-plain package (even without restarting the already running 'qpidd --auth=no -t') and start condor_triggerd again, it connects with no problem and 'condor_trigger_config -l localhost' lists the triggers.
The client should support anonymous, the plugin for which is part of the cyrus-sasl-lib package. Can you reproduce this error for any simple messaging/qmf client? E.g. qpid-config, qpid-client-test or the c++ qmf console examples? I cannot and suspect that condor-qmf/condor_triggerd is specifically requesting the PLAIN mechanism. (You see the same error if for example you force qpid-client-test to use --mechanism PLAIN).
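To make the failure mode described above concrete, here is a minimal sketch of mechanism selection. It is a simplified model, not the qpid or Cyrus SASL implementation; the function names and the "empty string means no preference / no match" convention are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of client-side SASL mechanism selection.
//   offered:   mechanisms the broker advertised in ConnectionStart
//   installed: mechanisms for which the client host has a plugin
//   forced:    mechanism the client insists on ("" = no preference)
// Returns the chosen mechanism, or "" when nothing usable is found
// (the situation reported as "No worthy mechs found").
std::string pickMechanism(const std::vector<std::string>& offered,
                          const std::vector<std::string>& installed,
                          const std::string& forced)
{
    for (const std::string& mech : offered) {
        if (!forced.empty() && mech != forced) {
            continue; // client refuses anything but the forced mechanism
        }
        if (std::find(installed.begin(), installed.end(), mech) != installed.end()) {
            return mech;
        }
    }
    return std::string();
}

// The three cases discussed in this report.
void mechanismExamples()
{
    const std::vector<std::string> offered = {"ANONYMOUS", "PLAIN"};

    // Only the ANONYMOUS plugin present, no preference: works.
    assert(pickMechanism(offered, {"ANONYMOUS"}, "") == "ANONYMOUS");

    // Only the ANONYMOUS plugin present, client forces PLAIN: fails.
    assert(pickMechanism(offered, {"ANONYMOUS"}, "PLAIN").empty());

    // cyrus-sasl-plain installed as well: forcing PLAIN now succeeds.
    assert(pickMechanism(offered, {"ANONYMOUS", "PLAIN"}, "PLAIN") == "PLAIN");
}
```

In this model, the broker's --auth=no setting never enters the picture: the failure depends only on what the client forces and which plugins are installed on the client host.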
Right, I have changed the product to grid-condor. Thank you, Gordon.
See Bug 601828 for an easy code example.
What program is having a problem here? condor_trigger_config was re-written in Python (from C++) to solve Bug 563818, and connects to the broker without a username/password. condor_triggerd connects to the broker through QMF only, also without a username/password.
The condor_trigger_config program used to set username and password on its ConnectionSettings. If that requires also installing cyrus-sasl-plain, I'd suggest making cyrus-sasl-plain a dep on the qpid-cpp-client to avoid complexity.
Setting a username/password does *not* require cyrus-sasl-plain. Setting PLAIN as the authentication mechanism does.
The console application is not aware that it is requesting the PLAIN mechanism via QMF. The client library should install with all the bits necessary for use ootb, possibly meaning a dep on -plain and -anonymous.
A dependency on plain is *incorrect*. Requiring plain is actually in itself a potential bug as the component in question would not work with an adequately secured broker. Why that is happening in the case of condor_triggerd is an open question (and may indeed be due to some quirk with QMF). Modifying the c++ console examples does *not* result in the same behaviour.
Created attachment 426503
short example

Here I am attaching the shortest example I was able to prepare for now. Compiled with qmf-devel-0.7.946106-4.el5 and the same version of qpid-cpp-client-devel, it outputs
----------------------------------------------------------------------
2010-06-24 11:16:38 warning Closing connection due to internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)
----------------------------------------------------------------------
when run against 'qpidd -t --auth=no'. The broker says:
----------------------------------------------------------------------
2010-06-24 11:17:11 debug RECV [127.0.0.1:49734] INIT(0-10)
2010-06-24 11:17:11 debug SASL: No Authentication Performed
2010-06-24 11:17:11 trace SENT 127.0.0.1:49734 INIT(0-10)
2010-06-24 11:17:11 trace SENT [127.0.0.1:49734]: Frame[BEbe; channel=0; {ConnectionStartBody: server-properties={qpid.federation_tag:V2:36:str16(5f1186d1-516e-4c56-ab5a-0d77e203c6a0)}; mechanisms=str16{V2:9:str16(ANONYMOUS), V2:5:str16(PLAIN)}; locales=str16{V2:5:str16(en_US)}; }]
2010-06-24 11:17:11 trace RECV [127.0.0.1:49734]: Frame[BEbe; channel=0; {ConnectionCloseBody: reply-code=501; reply-text=internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245); }]
2010-06-24 11:17:11 warning Client closed connection with 501: internal-error: Sasl error: SASL(-4): no mechanism available: No worthy mechs found (qpid/client/SaslFactory.cpp:245)
2010-06-24 11:17:11 trace SENT [127.0.0.1:49734]: Frame[BEbe; channel=0; {ConnectionCloseOkBody: }]
2010-06-24 11:17:11 trace SEND raiseEvent (v1) class=org.apache.qpid.broker.clientDisconnect
----------------------------------------------------------------------
The same happens with condor_triggerd. So please look at the code and the relevant functions to find out what should be fixed.
After all, it seems to me the problem is not on the Grid side, but rather in the client library which the agent links with.
ManagementAgent::init() is defined as:

    virtual void init(const std::string& brokerHost = "localhost",
                      uint16_t brokerPort = 5672,
                      uint16_t intervalSeconds = 10,
                      bool useExternalThread = false,
                      const std::string& storeFile = "",
                      const std::string& uid = "guest",
                      const std::string& pwd = "guest",
                      const std::string& mech = "PLAIN",
                      const std::string& proto = "tcp") = 0;

Note PLAIN as the default mechanism in the header. I would say that is a poor default (likewise the uid and password). However your test program can override the defaults in which case it should work. i.e.

    agent->init("localhost", 5672, 10, true, "", "", "", "", "tcp");

or use the version of init() that takes a ConnectionSettings instance.

Does condor_triggerd use the ManagementAgent interface as well as the console? If so does it rely on the defaults in the above method? This would certainly explain the issue.
In http://git.fedorahosted.org/git/?p=grid.git;a=blob;f=src/condor_triggerd/Triggerd.cpp;h=6c51a442411d9dee2a0531f78da3a85e28fceaf6;hb=44c83fb5302ea58d3d53f0f069f40368693e60fd, I suspect you will resolve the issue for condor_triggerd by changing line 214 from:

    agent->init(std::string(host), port, interval, true, storefile);

to e.g.:

    agent->init(std::string(host), port, interval, true, storefile, "guest", "guest", "");

Obviously the hardcoded use of the guest user (with default password) prevents use in a real secured environment. If that is acceptable, the following would be a nicer way to indicate it:

    agent->init(std::string(host), port, interval, true, storefile, "", "", "ANONYMOUS");

The defaults in that init method encourage users of the API to ignore important questions.
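The effect of those default arguments can be shown with a small stand-alone mock. This is a sketch only: MockAgent is a hypothetical stand-in, not the real qpid ManagementAgent class, and it merely records the credentials it receives. Only the default values are taken from the init() declaration quoted in this report.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the agent interface. It connects to nothing;
// it only remembers which uid/pwd/mech the caller ended up passing.
struct MockAgent {
    std::string uid;
    std::string pwd;
    std::string mech;

    void init(const std::string& brokerHost = "localhost",
              std::uint16_t brokerPort = 5672,
              std::uint16_t intervalSeconds = 10,
              bool useExternalThread = false,
              const std::string& storeFile = "",
              const std::string& uidArg = "guest",
              const std::string& pwdArg = "guest",
              const std::string& mechArg = "PLAIN",
              const std::string& proto = "tcp")
    {
        // Unused in this mock; listed to keep the signature shape.
        (void)brokerHost; (void)brokerPort; (void)intervalSeconds;
        (void)useExternalThread; (void)storeFile; (void)proto;
        uid = uidArg;
        pwd = pwdArg;
        mech = mechArg;
    }
};

void defaultArgumentExamples()
{
    // Five-argument call, as in Triggerd.cpp: PLAIN and guest/guest are
    // silently filled in by the defaults.
    MockAgent implicitAgent;
    implicitAgent.init("localhost", 5672, 10, true, "storefile");
    assert(implicitAgent.mech == "PLAIN");
    assert(implicitAgent.uid == "guest");

    // Spelling out the trailing arguments makes the choice explicit.
    MockAgent anonymousAgent;
    anonymousAgent.init("localhost", 5672, 10, true, "storefile", "", "", "ANONYMOUS");
    assert(anonymousAgent.mech == "ANONYMOUS");
    assert(anonymousAgent.uid.empty());
}
```

The caller in the first case never mentions PLAIN, which matches the observation that the application is not aware it is requesting that mechanism.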
I've created a separate BZ for addressing the defaults in the API: https://bugzilla.redhat.com/show_bug.cgi?id=607547 Unless you actually want condor_triggerd to not support authentication you will need to change that anyway, at least to add configurable username and password.
Added the following params:

QMF_BROKER_AUTH_MECH - The authentication mechanism (PLAIN or ANONYMOUS)
QMF_BROKER_USERNAME - The user to authenticate to the broker
QMF_BROKER_PASSWORD_FILE - The location of the file containing the broker password (in clear text). The security of the password file is the admin's responsibility.

Fixed upstream for 7.6.
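As a rough illustration of how a daemon might turn these three parameters into connection credentials, here is a self-contained sketch. It is not the Condor implementation: BrokerAuth, lookup(), readPassword() and resolveAuth() are hypothetical names, the configuration is modelled as a plain std::map rather than Condor's own configuration API, and treating an unset parameter as an empty string is an assumption of this sketch.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for the resolved settings.
struct BrokerAuth {
    std::string mech;
    std::string user;
    std::string password;
};

typedef std::map<std::string, std::string> Config;

// Returns the value for key, or "" when the parameter is not set.
std::string lookup(const Config& cfg, const std::string& key)
{
    Config::const_iterator it = cfg.find(key);
    return it == cfg.end() ? std::string() : it->second;
}

// Reads the first line of the clear-text password source and strips
// trailing whitespace (e.g. the newline left by `echo cumin > file`).
std::string readPassword(std::istream& in)
{
    std::string line;
    std::getline(in, line);
    while (!line.empty() &&
           (line.back() == '\r' || line.back() == ' ' || line.back() == '\t')) {
        line.pop_back();
    }
    return line;
}

BrokerAuth resolveAuth(const Config& cfg)
{
    BrokerAuth auth;
    auth.mech = lookup(cfg, "QMF_BROKER_AUTH_MECH");
    auth.user = lookup(cfg, "QMF_BROKER_USERNAME");
    const std::string path = lookup(cfg, "QMF_BROKER_PASSWORD_FILE");
    if (!path.empty()) {
        std::ifstream file(path.c_str());
        if (file) {
            auth.password = readPassword(file);
        }
    }
    return auth;
}

void authConfigExamples()
{
    std::istringstream passwordSource("cumin\n");
    assert(readPassword(passwordSource) == "cumin");

    Config cfg;
    cfg["QMF_BROKER_AUTH_MECH"] = "ANONYMOUS";
    const BrokerAuth auth = resolveAuth(cfg);
    assert(auth.mech == "ANONYMOUS");
    assert(auth.user.empty() && auth.password.empty());
}
```

Keeping the password in a separate file, rather than in the main configuration, means the admin can restrict that one file's permissions to the daemon's user.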
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
C: The AMQP broker (qpid) is configured to restrict access
C: The condor_triggerd, condor_job_server, and QMF plugins would not connect to the broker
F: New configuration parameters allow setting authentication information:
QMF_BROKER_AUTH_MECH - The authentication mechanism (PLAIN or ANONYMOUS)
QMF_BROKER_USERNAME - The user to authenticate to the broker
QMF_BROKER_PASSWORD_FILE - The location of the file containing the broker password (in clear text). The security of the password file is the admin's responsibility.
R: The above daemons/plugins will be able to connect to a secured broker
I see the following message when I try to access the Grid tab in Cumin:
----------------------------
Collector not found. Please ensure the grid is setup correctly and that cumin-data is running. If cumin was just started, it may take a few minutes for the collector to become available.
----------------------------
Condor was set up with the following script:
------------
cat >> /etc/condor/config.d/99jasan.config << EOF
QMF_BROKER_HOST = 127.0.0.1
QMF_UPDATE_INTERVAL = 5
COLLECTOR_UPDATE_INTERVAL = 5
SCHEDD_INTERVAL = 5
QMF_BROKER_AUTH_MECH = PLAIN
QMF_BROKER_USERNAME = cumin
QMF_BROKER_PASSWORD_FILE = /etc/condor/pass
EOF
echo cumin > /etc/condor/pass
chown condor:condor /etc/condor/pass
------------
User cumin with password cumin is correctly set in qpidd.sasldb and qpidd runs with default auth settings.

When I tried to repeat this, it worked. I will report more later.
cyrus-sasl-plain stayed uninstalled while testing ANONYMOUS. With ANONYMOUS QMF auth set, I see only the following two QMF agents from Grid:

com.redhat.grid condortriggerservice object
com.redhat.grid master object

PLAIN, on the other hand, seems to work, and all agents appear. This bug was meant to allow the whole of Condor to work with only ANONYMOUS SASL authentication to the AMQP broker, and this requirement is not met yet. I suggest moving it to 2.1.
-------- /etc/condor/config.d/99jasan.config: -----------
QMF_BROKER_HOST = 127.0.0.1
QMF_UPDATE_INTERVAL = 5
COLLECTOR_UPDATE_INTERVAL = 5
SCHEDD_INTERVAL = 5
QMF_BROKER_AUTH_MECH = ANONYMOUS
DAEMON_LIST = $(DAEMON_LIST), TRIGGERD
DC_DAEMON_LIST = $(DC_DAEMON_LIST), TRIGGERD
STARTD_CRON_NAME = TRIGGER_DATA
STARTD_CRON_AUTOPUBLISH = If_Changed
TRIGGER_DATA_JOBLIST = GetData
TRIGGER_DATA_GETDATA_PREFIX = Triggerd
TRIGGER_DATA_GETDATA_EXECUTABLE = $(BIN)/get_trigger_data
TRIGGER_DATA_GETDATA_PERIOD = 5m
TRIGGER_DATA_GETDATA_RECONFIG = FALSE
---------------------------------------------------------
-------- /etc/qpidd.conf --------------------------------
auth=no
mgmt-pub-interval=5
---------------------------------------------------------
cyrus-sasl-plain is not installed
False alarm. The only problem was that the DC_DAEMON_LIST setting does not work as I expected. The following is the complete, working configuration snippet which makes all Condor QMF services operate correctly:
----------------------------------------------------------------
ALL_DEBUG = D_FULLDEBUG
CREATE_CORE_FILES = True
ABORT_ON_EXCEPTION = True
SLOT_TYPE_1 = cpus=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1 = 1
JASANLIMIT_LIMIT = 1
# Note that GRID.SUB_1.SUB_B is defined as a group but has no corresponding quota
GROUP_NAMES = MSG, GRID, MGMT, RT, GRID.SUB_1, GRID.SUB_1.SUB_A, GRID.SUB_1.SUB_B
GROUP_QUOTA_DYNAMIC_GRID = 0.1
GROUP_QUOTA_DYNAMIC_GRID.SUB_1 = 0.05
GROUP_QUOTA_DYNAMIC_GRID.SUB_1.SUB_A = 0.05
GROUP_QUOTA_DYNAMIC_MGMT = 0.01
GROUP_QUOTA_DYNAMIC_MSG = 0.70
GROUP_QUOTA_DYNAMIC_RT = 0.08
ENABLE_RUNTIME_CONFIG = TRUE
QMF_BROKER_HOST = 127.0.0.1
QMF_UPDATE_INTERVAL = 5
COLLECTOR_UPDATE_INTERVAL = 5
SCHEDD_INTERVAL = 5
QMF_BROKER_AUTH_MECH = ANONYMOUS
JOB_SERVER = $(SBIN)/condor_job_server
JOB_SERVER_LOG = $(LOG)/JobServerLog
SCHEDD.QMF_PUBLISH_SUBMISSIONS = FALSE
TRIGGER_DATA_JOBLIST = GetData
TRIGGER_DATA_GETDATA_PREFIX = Triggerd
TRIGGER_DATA_GETDATA_EXECUTABLE = $(BIN)/get_trigger_data
TRIGGER_DATA_GETDATA_PERIOD = 5m
TRIGGER_DATA_GETDATA_RECONFIG = FALSE
STARTD_CRON_NAME = TRIGGER_DATA
STARTD_CRON_AUTOPUBLISH = If_Changed
DAEMON_LIST = $(DAEMON_LIST), JOB_SERVER, TRIGGERD
DC_DAEMON_LIST =+ JOB_SERVER, TRIGGERD
----------------------------------------------------------------
PLAIN also works well:
------------------------------------------------
QMF_BROKER_AUTH_MECH = PLAIN
QMF_BROKER_USERNAME = cumin
QMF_BROKER_PASSWORD_FILE = /etc/condor/pass
------------------------------------------------
Verified on:
cumin-0.1.4746-1.el6.noarch
condor-qmf-7.6.1-0.4.el6.i686
qpid-cpp-server-0.10-3.el6.i686

cumin-0.1.4746-1.el5
condor-qmf-7.6.1-0.4.el5
qpid-cpp-server-0.10-5.el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0889.html