Description of problem:
Running luci in a dedicated VM (RH6.1) and ricci on a physical box (RH6.1). When clicking around in luci, some requests load fine and intermittently others just hang. It is clearly ricci hanging, as restarting ricci on the client makes luci return an error. Normally clicking a few more times on the same link eventually loads the page. Luci was responding fine up until the point that I got my first 1 node cluster configured with a running service (MySQL).

Version-Release number of selected component (if applicable):
luci-0.23.0-13.el6.x86_64
ricci-0.16.2-35.el6.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Open luci (https://<ip>:8084)
2. Click on "Manage Clusters"
3. Even clicking on manage clusters often hangs. Otherwise any other link will intermittently hang.

Actual results:
Browser shows "Waiting for <luci host>..." forever (never loads).

Expected results:
Quickly load up the requested page.

Additional info:
Starting up ricci in debug mode shows the following for a successful request:

client added
ClientInstance.cpp:144: exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 299 milliseconds
client removed

Whenever a hanging request takes place, I see the following:

client added
ClientInstance.cpp:144: exception: Receive timeout
request completed in 121544 milliseconds
client removed

Luci simply shows "20:19:50,243 ERROR [luci.lib.ricci_communicator] An empty XML response was received from <ricci host>:11111"
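For anyone comparing the two traces above, a small script can pick the slow requests out of ricci's debug output. This is only a sketch: it assumes the debug output was saved as text in the format quoted in this report, and the function name `slow_requests` and the 60-second threshold are made up for illustration, not taken from ricci or luci.

```python
import re

# Matches the timing line printed by ricci in debug mode, as quoted in this report.
COMPLETED = re.compile(r"request completed in (\d+) milliseconds")

def slow_requests(log_text, threshold_ms=60000):
    """Return the durations (in ms) of requests slower than threshold_ms.

    threshold_ms is an arbitrary illustrative cut-off, not a ricci setting.
    """
    durations = [int(m.group(1)) for m in COMPLETED.finditer(log_text)]
    return [d for d in durations if d > threshold_ms]

# Sample built from the two traces quoted in the description.
sample = """client added
ClientInstance.cpp:144: exception: Receive timeout
request completed in 121544 milliseconds
client removed
client added
ClientInstance.cpp:144: exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 299 milliseconds
client removed
"""

print(slow_requests(sample))
```

With the sample above, only the 121544 ms request is reported, which corresponds to the hanging page load.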
FWIW, starting up ricci shows:

# ricci -f -d -u ricci
failed to load authorized CAs
failed to load authorized CAs

Is this concerning?
Which version of modcluster do you have installed on both nodes?
Also, can you temporarily disable SELinux, reboot, and verify that you get the same "failed to load authorized CAs" error?
SELinux is already disabled, Chris. As for modcluster, it is only installed on the ricci (client) side: modcluster-0.16.2-10.el6.x86_64
Interesting... yesterday I deleted the cluster and started again. Now I can't reproduce the problem! And starting up ricci doesn't show the "failed to load authorized CAs" anymore!?
I'm closing this bug now since we don't have a reproducer, but if it happens again, please re-open this bug and send the contents of your '/var/lib/ricci' directory. This is where ricci attempts to open its certificates; if there are errors in there, we should be able to detect them. Thanks!
Hi Chris,

I am able to replicate this issue again, however the luci part is running on RHEL 6.1, and the ricci/modcluster client is RHEL 5.7. When I try to create the cluster, it just hangs and on starting up ricci in debug mode, it shows:

# ricci -d -f -u 102
failed to load authorized CAs
failed to load authorized CAs
client added
client added
exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 119 milliseconds
client added
exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 77 milliseconds
exception: SSL_read() error: SSL_ERROR_SYSCALL: Success
request completed in 141 milliseconds
client removed
client removed
client removed

The files in /var/lib/ricci/certs are:

# find .
.
./clients
./clients/client_cert_sJw8uX
./cacert.config
./privkey.pem
./cacert.pem

Any ideas?
In ricci_defines.h:

#define CLIENT_AUTH_CAs_PATH "/var/lib/ricci/certs/auth_CAs.pem"

ricci reads the client CAs from this file.
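Note that the find listing posted earlier does not include an auth_CAs.pem file. Whether that explains the "failed to load authorized CAs" message is not confirmed in this report, but a quick check of the path can at least rule it in or out. The following is a minimal sketch, assuming the path from the define above; the helper name `check_ca_file` and its status strings are hypothetical and are not part of ricci.

```python
import os

# Path taken from the CLIENT_AUTH_CAs_PATH define quoted above.
AUTH_CAS_PATH = "/var/lib/ricci/certs/auth_CAs.pem"

def check_ca_file(path=AUTH_CAS_PATH):
    """Return a short status string for the CA bundle at `path`.

    Hypothetical helper: it only checks that the file exists, is readable,
    and contains a PEM certificate header. It does not validate the certs.
    """
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    with open(path, "r", errors="replace") as fh:
        if "-----BEGIN CERTIFICATE-----" not in fh.read():
            return "no certificate found"
    return "ok"

print(AUTH_CAS_PATH, "->", check_ca_file())
```

Run it as the same user ricci runs as, since a file that root can read may still be unreadable for the ricci user.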
I don't believe running luci on 6.1 and the clients on 5.7 is supported. If you're configuring 5.7 nodes you'll want to use conga on 5.7.