Bug 819689
Summary: | mcollective client calls can hang if /etc/qpid directory perms aren't right | | |
---|---|---|---|
Product: | OKD | Reporter: | Thomas Wiest <twiest> |
Component: | Pod | Assignee: | William Henry <whenry> |
Status: | CLOSED WONTFIX | QA Contact: | libra bugs <libra-bugs> |
Severity: | low | Docs Contact: | |
Priority: | medium | | |
Version: | 2.x | CC: | bmeng, mmcgrath, rmillner, xtian |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2012-08-29 18:59:39 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 825075 | | |
Bug Blocks: | | | |
Description
Thomas Wiest
2012-05-08 01:25:26 UTC
Can we clarify how this affects the broker? Besides the client (mc-ping) command hanging, what other symptoms are we seeing that demonstrate that the broker itself is hung? It's not clear that the broker is actually "basically useless when this happens." Can you run the mc-ping command from a different machine with the correct permissions while the original mc-ping command is hung?

OK, so from Mike: broker != qpidd broker, but instead the OpenShift "broker". That makes more sense. So this is essentially the mcollective driver. The next step is to see whether the problem is in the qpid layers or in the mcollective driver layer.

I've reproduced the bug. We noticed in the log that the client isn't really hanging but is attempting reconnection indefinitely, despite the reconnect-timeout of 5. I added reconnect-limit=5 to the args for the connection in the hope that it would override the alleged Ruby 1.8 timeout issue. However, it had no effect. Working with the Qpid team for more ideas.

This might be related to the alleged Ruby 1.8 vs. Ruby 1.9 timeout.rb issues.

Output snippet from the client-side mcollective log:

D, [2012-05-16T18:48:17.828268 #28087] DEBUG -- : amqp.rb:69:in `connect' Connecting to localhost.localdomain:5671, {transport:ssl, reconnect:true, reconnect-timeout:5, reconnect-limit:5, heartbeat:1}

You can see that the reconnect timeout and limit are set. However, this message is written to the log continuously and indefinitely. (I added the reconnect-limit to see if it would override the timeout. I also tried this with only the limit and with the timeout removed. It had no effect.)

I've created a BZ for MRG Messaging: https://bugzilla.redhat.com/show_bug.cgi?id=825075

A workaround for this has been found and we continue to investigate our messaging setup.

(just doing bug cleanup)
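
For reference, the behaviour the comments above appear to expect from the reconnect timeout and limit (give up after a bounded number of attempts or a bounded amount of time, instead of retrying forever) can be sketched roughly as follows. This is only an illustration in Python, not Qpid's or mcollective's implementation: `connect_with_limit`, `ReconnectLimitExceeded`, and all parameter names are hypothetical, and treating the limit as a maximum number of attempts and the timeout as an overall deadline in seconds is an assumption.

```python
import time


class ReconnectLimitExceeded(Exception):
    """Raised when the retry budget (attempts or time) is used up."""


def connect_with_limit(connect, reconnect_limit=5, reconnect_timeout=5.0,
                       interval=0.5, clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or the retry budget is spent.

    Assumptions (not taken from Qpid): reconnect_limit is the maximum
    number of attempts, and reconnect_timeout is an overall deadline in
    seconds measured from the first attempt.
    """
    deadline = clock() + reconnect_timeout
    attempts = 0
    last_error = None
    while attempts < reconnect_limit and clock() < deadline:
        attempts += 1
        try:
            return connect()
        except OSError as exc:  # covers permission and connection errors
            last_error = exc
            sleep(interval)
    # Fail loudly so the caller does not appear to hang.
    raise ReconnectLimitExceeded(
        f"gave up after {attempts} attempt(s)") from last_error
```

A caller would wrap its own connection routine, for example `connect_with_limit(open_connection)`, where `open_connection` is whatever zero-argument function opens the client connection. With a guard like this, a misconfigured client fails with an error after a few attempts rather than logging the same connect message indefinitely.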