Bug 749139 - Ricci frequently times out
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: ricci
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Chris Feist
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-10-26 10:00 UTC by Gonzalo Servat
Modified: 2016-04-26 13:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-04-30 23:11:31 UTC
Target Upstream Version:

Description Gonzalo Servat 2011-10-26 10:00:33 UTC
Description of problem:

Running luci in a dedicated VM (RH6.1) and ricci on a physical box (RH6.1). When clicking around in luci, some requests load fine and intermittently others just hang. It is clearly ricci hanging, as restarting ricci on the client makes luci return an error. Normally clicking a few more times on the same link eventually loads the page.

Luci was responding fine up until the point that I got my first 1 node cluster configured with a running service (MySQL).

Version-Release number of selected component (if applicable):

luci-0.23.0-13.el6.x86_64
ricci-0.16.2-35.el6.x86_64

How reproducible:

Always.

Steps to Reproduce:
1. Open luci (https://<ip>:8084)
2. Click on "Manage Clusters"
3. Even clicking on manage clusters often hangs. Otherwise any other link will intermittently hang.
  
Actual results:

Browser shows "Waiting for <luci host>..." forever (never loads).

Expected results:

Quickly load up the requested page.

Additional info:

Starting up ricci in debug mode shows the following for a successful request:

client added
ClientInstance.cpp:144: exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 299 milliseconds
client removed

Whenever a hanging request takes place, I see the following:

client added
ClientInstance.cpp:144: exception: Receive timeout
request completed in 121544 milliseconds
client removed

Luci simply shows "20:19:50,243 ERROR [luci.lib.ricci_communicator] An empty XML response was received from <ricci host>:11111"
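
To separate hanging requests from normal ones in a longer debug capture, the timing lines can be filtered by duration. The following is only a rough sketch: it assumes the debug output has been saved as text and that every request ends with a "request completed in N milliseconds" line like the ones quoted above. The function name `slow_requests` and the 10-second threshold are illustrative choices, not anything ricci defines.

```python
import re

# Matches the timing line that ricci prints in debug mode, as quoted above.
TIMING_RE = re.compile(r"request completed in (\d+) milliseconds")


def slow_requests(log_text, threshold_ms=10000):
    """Return the durations (in ms) of requests at or above threshold_ms.

    threshold_ms is an arbitrary cut-off chosen for illustration.
    """
    durations = [int(m.group(1)) for m in TIMING_RE.finditer(log_text)]
    return [d for d in durations if d >= threshold_ms]


# Sample built from the two excerpts quoted in this report.
sample = """client added
ClientInstance.cpp:144: exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 299 milliseconds
client removed
client added
ClientInstance.cpp:144: exception: Receive timeout
request completed in 121544 milliseconds
client removed
"""

print(slow_requests(sample))  # prints [121544]
```

With the two excerpts above, only the 121544 ms request is reported, which matches the "Receive timeout" case.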

Comment 1 Gonzalo Servat 2011-10-26 10:01:45 UTC
FWIW, starting up ricci shows:

# ricci -f -d -u ricci
failed to load authorized CAs
failed to load authorized CAs

Is this concerning?

Comment 3 Chris Feist 2011-10-26 14:34:26 UTC
Which version of modcluster do you have installed on both nodes?

Comment 4 Chris Feist 2011-10-26 14:37:20 UTC
Also, can you temporarily disable SELinux, reboot, and verify that you still get the same "failed to load authorized CAs" error?

Comment 6 Gonzalo Servat 2011-10-26 20:48:10 UTC
SELinux is already disabled, Chris.

As for modcluster, it is only installed on the ricci (client) side:

modcluster-0.16.2-10.el6.x86_64

Comment 7 Gonzalo Servat 2011-10-26 20:49:42 UTC
Interesting... yesterday I deleted the cluster and started again. Now I can't reproduce the problem! And starting up ricci doesn't show the "failed to load authorized CAs" anymore!?

Comment 10 Chris Feist 2011-10-27 16:27:03 UTC
I'm closing this bug now since we don't have a reproducer, but if it happens again, please re-open this bug and send the contents of your '/var/lib/ricci' directory.  This is where ricci attempts to open its certificates; if there are errors in there, we should be able to detect them.

Thanks!

Comment 11 Gonzalo Servat 2012-01-06 03:45:33 UTC
Hi Chris,

I am able to replicate this issue again; however, this time luci is running on RHEL 6.1 and the ricci/modcluster client is on RHEL 5.7.

When I try to create the cluster, it just hangs. Starting ricci in debug mode shows:

# ricci -d -f -u 102
failed to load authorized CAs
failed to load authorized CAs
client added
client added
exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 119 milliseconds
client added
exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 77 milliseconds
exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 141 milliseconds
client removed
client removed
client removed

The files in /var/lib/ricci/certs are:

# find .
.
./clients
./clients/client_cert_sJw8uX
./cacert.config
./privkey.pem
./cacert.pem

Any ideas?

Comment 13 euroford 2012-03-19 07:45:41 UTC
In ricci_defines.h
#define CLIENT_AUTH_CAs_PATH    "/var/lib/ricci/certs/auth_CAs.pem"

ricci reads the client CAs from this file.
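
A quick way to check whether that file is present on a node is sketched below. It assumes the path in the #define above applies to the installed build, and it runs as the invoking user, so the readability result may differ from what the ricci user sees. The helper name `check_ca_file` and its return strings are made up for this example, and nothing here establishes that a missing file is what causes the timeouts.

```python
import os

# Path taken from the ricci_defines.h line quoted above.
CLIENT_AUTH_CAS_PATH = "/var/lib/ricci/certs/auth_CAs.pem"


def check_ca_file(path=CLIENT_AUTH_CAS_PATH):
    """Classify the CA file as 'missing', 'unreadable', 'empty' or 'present'."""
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK):
        # Checked for the current user, not for the ricci user.
        return "unreadable"
    if os.path.getsize(path) == 0:
        return "empty"
    return "present"


print(check_ca_file())
```

The directory listing in comment 11 does not include auth_CAs.pem, so on that node this would be expected to report "missing".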

Comment 14 Chris Feist 2012-04-30 23:11:31 UTC
I don't believe running luci on 6.1 and the clients on 5.7 is supported.  If you're configuring 5.7 nodes you'll want to use conga on 5.7.