| Summary: | qpidd causes CPU usage spikes in corosync via an excessive number of cpg_mcast_joined calls | ||
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Fabio Massimo Di Nitto <fdinitto> |
| Component: | qpid-cpp | Assignee: | Alan Conway <aconway> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 2.3 | CC: | aconway, ccaulfie, gsim, jross, tross |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-03-14 16:32:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Fabio Massimo Di Nitto
2013-12-02 14:44:37 UTC
The cause of the problem is qpidd updating the "cluster clock" via CPG every 10ms. The cluster clock is used to synchronize expiry of messages with a Time To Live (TTL) setting. The default interval of 10ms is probably unreasonably low. The consequence of a longer interval is that expiry of TTL messages may be delayed by up to the interval, which is probably not a serious problem.

Unless openstack has a requirement for sub-second accuracy for TTL expiry, we can resolve this by setting a longer interval in qpidd.conf, e.g. cluster-clock-interval=1000. This will reduce the traffic to one message per second, which is probably acceptable. If not, put this Bugzilla back to ASSIGNED and I will look further.

Testing with cluster-clock-interval=100 seems to reduce CPU usage enough to be almost unnoticeable (down to 6~10% CPU vs 30-35%). On real bare metal with lots of horsepower, it's barely noticeable at 10 (4% CPU), 100 (1~2% CPU); at 1000 I can't detect the process moving at all ;) I'll check with the RHOS team what expectations they have on this functionality and let you know.

FYI: TTL is an optional feature of Qpid; it's not used in all applications. It allows you to specify that a message will be dropped if it is not consumed within a specified expiry time. Typically it is used for messages that become irrelevant if not consumed in time and/or to avoid build-up of non-critical messages when consumers are slow to drain a queue. TTL is specified in milliseconds, but in practice it is often set to multiple seconds - long in the context of this system. At each clock "tick" the cluster expires all messages that are due to expire up to the present time. A longer interval, e.g. 1 sec, means that a message may be delivered up to 1 sec after it should have expired. However, there is no long-term build-up of late messages: with a clock interval of 1 sec, at most 1 second's worth of "late-expired" messages are available for delivery at any time.

> openstack is basically using qpidd as its RPC implementation (give or
> take). The classic pattern: a client sends a message for servers to consume.
> Servers take the message out of the queue and act on it.. (news at 11 ;)).

I believe openstack uses TTL in the RPC interface to drop request messages if they are not processed inside a time limit [see topic_send() in https://github.com/openstack/oslo-incubator/blob/master/openstack/common/rpc/impl_qpid.py]. I think that is a use case that can tolerate some lateness.

> Would a change of that setting affect performance? Throughput? Will all
> servers still receive all the messages?

It would only affect the system when there is a backlog and requests are timing out. Assuming the interval is 1 sec, it could result in requests being processed up to 1 sec after they should have timed out, so it can result in more messages being delivered to the server under load. However, since they are using TTL, the system must already be coded to handle uncertainty about delivery of these messages in a situation of overload. So adding a little extra uncertainty is probably not a problem, provided it's not drastically out of line with the TTL values they are using. I don't know what TTL values they are using, so I'm not sure what's reasonable here.

Testing with 100 seems to work fine for the current workload. I don't expect any issue for now. The config change has been documented in the RHOS+RHEL-HA etherpad and will propagate around in due course.
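For reference, a minimal sketch of the change discussed above. The directive name and the tested value of 100 come from this thread; the file path is an assumption (the usual location for the broker config), and the interval is in milliseconds.

```
# /etc/qpid/qpidd.conf  (path assumed; standard broker config location)
# Raise the cluster clock tick from the 10ms default to cut CPG traffic.
# TTL expiry may then run late by up to this interval.
cluster-clock-interval=100
```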
It might be a good idea though to have a KB article on this topic for future generations :)

We will need a small RHOS+MRG task force to debug a series of issues I am seeing. It's not trivial to reproduce some of them, and some appear to be related to cluster timing.

https://bugzilla.redhat.com/show_bug.cgi?id=1036523
https://bugzilla.redhat.com/show_bug.cgi?id=1036518

Needinfo for Alan.

(In reply to Fabio Massimo Di Nitto from comment #5)
> We will need a small RHOS+MRG task force to debug a series of issues I am
> seeing.
>
> It's not trivial to reproduce some of them and some appear to be related to
> cluster timing.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1036523
> https://bugzilla.redhat.com/show_bug.cgi?id=1036518

Closing since this has been addressed via configuration. Please raise new bugzillas for any new problems.

clearing needinfo flag
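As a starting point for the KB article suggested above: a minimal sketch of how a client sets a per-message TTL, using the qpid.messaging Python API that the linked impl_qpid.py builds on. The broker URL, address, and payload here are hypothetical, and note that in this API the ttl argument is given in seconds.

```python
from qpid.messaging import Connection, Message

connection = Connection("localhost:5672")  # hypothetical broker URL
connection.open()
try:
    session = connection.session()
    # Hypothetical queue address for illustration only.
    sender = session.sender("rpc_requests; {create: always}")
    # Drop the request if no server consumes it within 5 seconds.
    # With cluster-clock-interval=1000 the broker may expire it up to
    # ~1 second late; late-expired messages do not build up over time.
    sender.send(Message(content="do_work", ttl=5))
finally:
    connection.close()
```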