Bug 789921

Summary: SSL with PLAIN or GSSAPI does not work with federation
Product: Red Hat Enterprise MRG
Reporter: Tim Powers <timp>
Component: qpid-cpp
Assignee: mick <mgoulish>
Status: NEW
QA Contact: MRG Quality Engineering <mrgqe-bugs>
Severity: high
Priority: medium
Version: Development
CC: iboverma, jross
Keywords: Reopened
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-02-22 19:56:34 UTC

Description Tim Powers 2012-02-13 09:50:47 UTC
Description of problem: Attempting to add a route for federation between two brokers which use SSL and require authentication (PLAIN or GSSAPI) results in an error and the link fails to be operational.


Version-Release number of selected component (if applicable):
qpid-cpp-server-0.14-3.el5

How reproducible:
Every time. 

Steps to Reproduce:
1. $ qpid-route --client-sasl-mechanism=GSSAPI -t ssl -s route add amqps://federation/federation.com:5671 amqps://sourcebroker.foo.com:5671 test.topic "test.1" "" PLAIN
2. You can also do this with the GSSAPI mechanism and we still get errors

Actual results:
Run qpid-route link list  amqps://sourcebroker.foo.com and notice that the link is always "connecting" and is never "operational"

Look at the logs in the dest broker:
info Inter-broker link disconnected from source broker.foo.com:5671 Failed: NSS error [-8172] (qpid/sys/ssl/SslSocket.cpp:162)

The result is that there is no functioning link and federation/routing between the brokers is non-operational.
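
When scanning a longer broker log for this kind of failure, a small filter can pull out just the NSS error codes. This is a minimal sketch that assumes the log uses the "NSS error [-8172]" form shown in the line above; the function name nss_error_codes is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print each distinct NSS error code found in a
# broker log read from stdin, one per line.
# Assumes the "NSS error [<code>]" form shown in the log line above.
nss_error_codes() {
    grep -oE 'NSS error \[-?[0-9]+]' |   # keep only the "NSS error [N]" fragments
        sed 's/^NSS error \[//; s/]$//' | # strip the wrapper, leaving the number
        sort -u                           # report each code once
}
```

It could be run against the dest broker's log file, for example `nss_error_codes < qpidd_dst.log` (the file name depends on how the broker was started).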

Expected results:
The link over SSL should be established and operational, and routing between brokers should function correctly.

Comment 1 mick 2012-02-22 19:51:16 UTC
I am separating out the GSSAPI component of this bug into a separate BZ -- 796372 -- because I believe the issues with getting those two mechs to work with SSL transport are different.

Comment 2 mick 2012-02-22 19:56:34 UTC
PLAIN + SSL does work -- but you have to do two things:

  1. don't use (like I did) the --ssl-sasl-no-dict flag to start up the brokers.  It disallows the PLAIN mech because PLAIN is susceptible to dictionary attacks.

  2. be sure to include a valid username/passwd in the qpid-route command that creates the inter-broker connection.

Here is the whole script I used for this.  Sorry, it's kind of large.  But there are big parts you can ignore.


#! /bin/bash 



# Test environment   ======================================================== 
# ( ignore this part, and scroll down to "end test environment" )


absdir() { echo `cd $1 && pwd`; }

# Environment variables substituted by configure/cmake.
srcdir=`absdir /home/mick/trunk/qpid/cpp/src/tests`
builddir=`absdir /home/mick/trunk/qpid/cpp/src/tests`
top_srcdir=`absdir /home/mick/trunk/qpid/cpp/src/tests/../..`
top_builddir=`absdir /home/mick/trunk/qpid/cpp/src/tests/../../.`
moduledir=$top_builddir/src/.libs
testmoduledir=$builddir/.libs
export QPID_INSTALL_PREFIX=/usr/local

# Python paths and directories
export PYTHON_DIR=$builddir/python
export QPID_PYTHON_TEST=$PYTHON_DIR/commands/qpid-python-test
if [ ! -d $PYTHON_DIR -a -d $top_srcdir/../python ]; then
    export PYTHON_DIR=$top_srcdir/../python
    export QPID_PYTHON_TEST=$PYTHON_DIR/qpid-python-test
fi
export QPID_TESTS=$top_srcdir/../tests
export QPID_TESTS_PY=$QPID_TESTS/src/py
export QPID_TOOLS=$top_srcdir/../tools
export QMF_LIB=$top_srcdir/../extras/qmf/src/py
export PYTHON_COMMANDS=$QPID_TOOLS/src/py
export PYTHONPATH=$srcdir:$PYTHON_DIR:$PYTHON_COMMANDS:$QPID_TESTS_PY:$QMF_LIB:$PYTHONPATH
export QPID_CONFIG_EXEC=$PYTHON_COMMANDS/qpid-config
export QPID_ROUTE_EXEC=$PYTHON_COMMANDS/qpid-route
export QPID_CLUSTER_EXEC=$PYTHON_COMMANDS/qpid-cluster

# Executables
export QPIDD_EXEC=$top_builddir/src/qpidd
export QPID_WATCHDOG_EXEC=$top_builddir/src/qpidd_watchdog

# Test executables
export QPID_TEST_EXEC_DIR=$builddir
export RECEIVER_EXEC=$QPID_TEST_EXEC_DIR/receiver
export SENDER_EXEC=$QPID_TEST_EXEC_DIR/sender

# Path
export PATH=$top_builddir/src:$builddir:$srcdir:$PYTHON_COMMANDS:$QPID_TEST_EXEC_DIR:$PYTHON_DIR/commands:$PATH

# Modules
export TEST_STORE_LIB=$testmoduledir/test_store.so

exportmodule() { test -f $moduledir/$2 && eval "export $1=$moduledir/$2"; }
exportmodule ACL_LIB acl.so
exportmodule CLUSTER_LIB cluster.so
exportmodule HA_LIB ha.so
exportmodule REPLICATING_LISTENER_LIB replicating_listener.so
exportmodule REPLICATION_EXCHANGE_LIB replication_exchange.so
exportmodule SSLCONNECTOR_LIB sslconnector.so
exportmodule SSL_LIB ssl.so
exportmodule WATCHDOG_LIB watchdog.so
exportmodule XML_LIB xml.so

# Qpid options
export QPID_NO_MODULE_DIR=1     # Don't accidentally load installed modules
export QPID_DATA_DIR=

# Use temporary directory if $HOME does not exist
if [ ! -e "$HOME" ]; then
    export QPID_DATA_DIR=/tmp/qpid
    export QPID_PID_DIR=/tmp/qpid
fi

# Options for boost test framework
export BOOST_TEST_SHOW_PROGRESS=yes
export BOOST_TEST_CATCH_SYSTEM_ERRORS=no


# end test environment ===================================================================




# sasl config setup =====================================================================
# ( ignore this part, too. )

SASL_PW=/usr/sbin/saslpasswd2
test -x $SASL_PW || { echo Skipping SASL test, saslpasswd2 not found; exit 0; }

mkdir -p sasl_config

# Create configuration file.
cat > sasl_config/qpidd.conf <<EOF
pwcheck_method: auxprop
auxprop_plugin: sasldb
sasldb_path: $PWD/sasl_config/qpidd.sasldb
sql_select: dummy select
mech_list: ANONYMOUS PLAIN DIGEST-MD5 EXTERNAL
EOF

# Populate temporary sasl db.
SASLTEST_DB=./sasl_config/qpidd.sasldb
rm -f $SASLTEST_DB
echo guest | $SASL_PW -c -p -f $SASLTEST_DB -u QPID guest
echo zig | $SASL_PW -c -p -f $SASLTEST_DB -u QPID zig
echo zag | $SASL_PW -c -p -f $SASLTEST_DB -u QPID zag

# end sasl config setup ====================================================================





# HERE is the actual test script =========================================================
# ( don't ignore this part )


qpid_route_method=route

# Debugging print. --------------------------
debug=1
function print {
  if [ "$debug" ]; then
    echo "sasl_fed_plain_ssl: $1"
  fi
}


CERT_DIR=`pwd`/test_cert_db
CERT_PW_FILE=`pwd`/cert.password
TEST_HOSTNAME=127.0.0.1



# Define a couple handy functions ==============================================

create_certs() {
    #create certificate and key databases with single, simple, self-signed certificate in it
    mkdir ${CERT_DIR}
    certutil -N -d ${CERT_DIR} -f ${CERT_PW_FILE}
    certutil -S -d ${CERT_DIR} -n ${TEST_HOSTNAME} -s "CN=${TEST_HOSTNAME}" -t "CT,," -x -f ${CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null
}

delete_certs() {
    if [[ -e ${CERT_DIR} ]] ;  then
        print "removing cert dir ${CERT_DIR}"
        rm -rf ${CERT_DIR}
    fi
}


CERTUTIL=$(type -p certutil)
if [[ ! -x "$CERTUTIL" ]] ; then
    echo "No certutil.  Quitting.";
    exit 0;
fi

delete_certs

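# Test the exit status of create_certs directly; "[ ! $? ]" is never true
# because $? always expands to a non-empty string.
if ! create_certs 2> /dev/null; then
  print "Could not create test certificate"
  exit 1
fi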

sasl_config_dir=$builddir/sasl_config

tmp_root=${builddir}/sasl_fed_ex_temp
print "results dir is ${tmp_root}"
rm -rf ${tmp_root}
mkdir -p $tmp_root

SRC_SSL_PORT=6667
DST_SSL_PORT=6666

SRC_SSL_PORT_2=6668
DST_SSL_PORT_2=6669

SRC_TCP_PORT=5801
DST_TCP_PORT=5807

SRC_TCP_PORT_2=5802
DST_TCP_PORT_2=5803

SSL_LIB=${moduledir}/ssl.so

export QPID_SSL_CERT_NAME=${TEST_HOSTNAME}

export QPID_NO_MODULE_DIR=1
export QPID_LOAD_MODULE=$SSLCONNECTOR_LIB
export QPID_SSL_CERT_DB=${CERT_DIR}
export QPID_SSL_CERT_PASSWORD_FILE=${CERT_PW_FILE}
export QPID_SSL_CERT_NAME=${TEST_HOSTNAME}



#######################################
# Understanding this Plumbing
#######################################
#  1. when you establish the route with qpid-route,
#     here is the best terminology to use:
#
#        qpid-route route add  DST  SRC
#
#  2. DST will connect to SRC through the ssl port of SRC.
#
#  3. sender client connects to the tcp port of SRC.
#
#  4. sender specifies mechanism ANONYMOUS.
#
#  5. DST pulls messages off the temp queue on SRC to itself.
#


# This broker flag will disallow PLAIN, because it is 
# vulnerable to dictionary attacks.
# removed --ssl-sasl-no-dict                         \

COMMON_BROKER_OPTIONS="                          \
      --sasl-config=$sasl_config_dir             \
      --ssl-require-client-authentication        \
      --auth yes                                 \
      --ssl-cert-db $CERT_DIR                    \
      --ssl-cert-password-file $CERT_PW_FILE     \
      --ssl-cert-name $TEST_HOSTNAME             \
      --no-data-dir                              \
      --no-module-dir                            \
      --load-module ${SSL_LIB}                   \
      --mgmt-enable=yes                          \
      --log-enable info+                         \
      --log-source yes                           \
      --daemon "


function start_brokers {
  print "Starting SRC broker"
  $QPIDD_EXEC                                  \
    --port=${SRC_TCP_PORT}                     \
    --ssl-port ${SRC_SSL_PORT}                 \
    ${COMMON_BROKER_OPTIONS}                   \
    --log-to-file $tmp_root/qpidd_src.log 2> /dev/null

  broker_ports[0]=${SRC_TCP_PORT}

  print "Starting DST broker"
  $QPIDD_EXEC                                  \
    --port=${DST_TCP_PORT}                     \
    --ssl-port ${DST_SSL_PORT}                 \
    ${COMMON_BROKER_OPTIONS}                   \
    --log-to-file $tmp_root/qpidd_dst.log 2> /dev/null

  broker_ports[1]=${DST_TCP_PORT}
}


function halt_brokers {
  print "Halting 2 brokers."
  for i in $(seq 0 1)
  do
    halt_port=${broker_ports[$i]}
    print "Halting broker $i on port ${halt_port}"
    $QPIDD_EXEC --port ${halt_port} --quit
  done

}


# OK, enough fooling around.  Let's do some work ==========================================
start_brokers

# TODO test this in script.
#print "Are the brokers up?"
#ps -aef | grep qpidd

QUEUE_NAME=sasl_fed_queue
ROUTING_KEY=sasl_fed_queue
EXCHANGE_NAME=sasl_fedex


print "add exchanges"
$QPID_CONFIG_EXEC -a localhost:${SRC_TCP_PORT} add exchange direct $EXCHANGE_NAME
$QPID_CONFIG_EXEC -a localhost:${DST_TCP_PORT} add exchange direct $EXCHANGE_NAME


print "add queues"
$QPID_CONFIG_EXEC -a localhost:${SRC_TCP_PORT} add queue $QUEUE_NAME
$QPID_CONFIG_EXEC -a localhost:${DST_TCP_PORT} add queue $QUEUE_NAME


print "create bindings"
$QPID_CONFIG_EXEC -a localhost:${SRC_TCP_PORT} bind $EXCHANGE_NAME $QUEUE_NAME $ROUTING_KEY
$QPID_CONFIG_EXEC -a localhost:${DST_TCP_PORT} bind $EXCHANGE_NAME $QUEUE_NAME $ROUTING_KEY


#
# NOTE: The SRC broker *must* be referred to as $TEST_HOSTNAME, and not as "localhost".
#       It must be referred to by the exact string given as the Common Name (CN) in 
#       the cert, which was created in the function create_certs, above.



# Here is the qpid-route command that does the magic.
print "route add"
$QPID_ROUTE_EXEC -t ssl route add localhost:${DST_TCP_PORT}   guest/guest@${TEST_HOSTNAME}:${SRC_SSL_PORT} $EXCHANGE_NAME $ROUTING_KEY "" "" PLAIN

# I don't know how to avoid this sleep yet.  It has to come after route-creation 
# to avoid false negatives.
sleep 5

# This should work the same whether or not we are running a clustered test.
# In the case of clustered tests, the status is not printed by qpid_route.
# So in either case, I will look only at the transport field, which should be "ssl".
print "check the link"
$QPID_ROUTE_EXEC link list localhost:${DST_TCP_PORT}
transport_type=$($QPID_ROUTE_EXEC link list localhost:${DST_TCP_PORT} | tail -1 | awk '{print $3}')
link_state=$($QPID_ROUTE_EXEC link list localhost:${DST_TCP_PORT} | tail -1 | awk '{print $5}')

print "transport_type ${transport_type}  link_state ${link_state}"

halt_brokers

sleep 1



# Now see if anything worked right... ================================================


if [ -z "${transport_type}" ]; then
  print "transport_type is empty"
  print "result: fail"
  exit 2
fi

if [ -z "${link_state}" ]; then
  print "link_state is empty"
  print "result: fail"
  exit 3
fi

if [ "${transport_type}" != "ssl" ]; then
  print "bad transport_type: ${transport_type}"
  print "result: fail"
  exit 4
fi
if [ "${link_state}" != "Operational" ]; then
  print "bad link_state: ${link_state}"
  print "result: fail"
  exit 5
fi

print "result: good"
# Only remove the tmp_root on success, to permit debugging.
print "Removing temporary directory $tmp_root"
rm -rf $tmp_root
exit 0
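
(End of script.) The fixed "sleep 5" after route creation could probably be replaced by polling the link state until it reaches the expected value or a timeout expires. The following is only a sketch of that idea, not something that was run against the brokers; the function name wait_for_output and the one-second retry interval are assumptions.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# wait_for_output TIMEOUT_SECONDS EXPECTED COMMAND [ARGS...]
# Re-runs COMMAND about once per second until its stdout contains the
# literal string EXPECTED, or until TIMEOUT_SECONDS attempts have failed.
# Returns 0 on a match, 1 on timeout.
wait_for_output() {
    local timeout=$1 expected=$2
    shift 2
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -F: treat EXPECTED as a fixed string, -q: only the exit status matters
        if "$@" 2>/dev/null | grep -qF -- "$expected"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

In the script above it would be called in place of the sleep, for example:
wait_for_output 30 Operational $QPID_ROUTE_EXEC link list localhost:${DST_TCP_PORT} || print "link never became Operational"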

Comment 3 mick 2012-02-24 14:17:01 UTC
The above comment, in spite of its massive size, was not weighty enough to close this bug.
The reporter is doing something very similar, but can still repro the problem at will, 100%.

Comment 4 mick 2012-02-24 15:08:28 UTC
The error that the reporter is seeing is:
   NSS error -8172
which means
   Peer's certificate issuer has been marked as not trusted by the user.

I have been able to reproduce exactly this error in my repro script (see above) by using the following cert-creation fn:

create_certs() {
    #create certificate and key databases with single, simple, self-signed certificate in it
    mkdir ${SRC_CERT_DIR}
    certutil -N -d ${SRC_CERT_DIR} -f ${SRC_CERT_PW_FILE}
    certutil -S -d ${SRC_CERT_DIR} -n ${TEST_HOSTNAME} -s "CN=${TEST_HOSTNAME}" -t "T,," -x -f ${SRC_CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null
    certutil -S -d ${SRC_CERT_DIR} -n ${TEST_OTHER_NAME} -s "CN=${TEST_OTHER_NAME}" -t "T,," -x -f ${SRC_CERT_PW_FILE} -z /usr/bin/certutil 2> /dev/null
}

And then I make both brokers use the same cert DB -- but the destination broker uses this in its command line:
    --ssl-cert-name $TEST_OTHER_NAME

Please note the trust argument in the calls to certutil which create these certs.  I removed the 'C' flag from it.  The 'C' flag means "Trusted CA to issue server certificates (SSL only)".

With that trust-flag removed I got the same behavior that the reporter is seeing.  Before removing that flag . . . . It Just Worked.

I will wait for confirmation from Tim -- but I think this is probably the issue.
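
To check a cert DB for this condition, the trust attributes can be listed with "certutil -L -d <cert dir>" and the SSL field inspected for the 'C' flag. The helper below is a minimal sketch of that check on a single trust string; the name has_ssl_ca_trust is made up, and it assumes the usual certutil layout of three comma-separated fields with SSL first.

```shell
#!/bin/bash
# Hypothetical helper for illustration.
# has_ssl_ca_trust TRUSTARGS
# TRUSTARGS is a certutil-style trust string such as "CT,," or "CTu,u,u"
# (assumed layout: SSL field first, then two further comma-separated fields).
# Succeeds only if the SSL field contains the upper-case 'C' flag.
has_ssl_ca_trust() {
    local ssl_field=${1%%,*}    # everything before the first comma
    case "$ssl_field" in
        *C*) return 0 ;;        # trusted CA for SSL server certs
        *)   return 1 ;;        # flag missing: expect the untrusted-issuer error
    esac
}
```

For example, has_ssl_ca_trust "T,," fails, which matches the cert-creation function above, while has_ssl_ca_trust "CT,," succeeds, which matches the original script. If the flag turns out to be missing, "certutil -M -d <cert dir> -n <nickname> -t "CT,,"" should set it on an existing cert.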

Comment 5 Justin Ross 2012-02-28 21:31:09 UTC
Mick's last comment suggests the issue is the manner in which the certs are created.  Tim, is that correct?

Comment 6 Tim Powers 2012-03-01 10:09:07 UTC
I have checked with our admins and they claim everything is in order, save for one broker which didn't have the CA cert imported into the cert db. Unfortunately fixing this only fixed the error we were receiving...

While I no longer see the error that was originally reported, the link is still listed as "Connecting" when running "qpid-route link list" (even after waiting 5-10 minutes) and no messages are routed to the receiving broker.

Tim

Comment 7 Justin Ross 2012-04-20 16:12:00 UTC
Mick, what's the status of this one?

Comment 8 mick 2012-04-20 19:27:56 UTC
I have reproduced Tim's behavior, but my script was not very realistic.  
I have a much better example now, and am making a new script from that.
If it reproduces the same behavior, then we have a bug.  If not, it's a setup problem with the certs.