Bug 682771
Field | Value
---|---
Summary | RFE: remove 1M message size limit
Product | Red Hat Enterprise Linux 7
Component | corosync
Version | 7.0
Status | CLOSED ERRATA
Severity | high
Priority | high
Hardware | All
OS | Linux
Reporter | Florian Haas <florian>
Assignee | Christine Caulfield <ccaulfie>
QA Contact | cluster-qe <cluster-qe>
CC | abeekhof, agk, ccaulfie, cfeist, cluster-maint, fdinitto, jfriesse, jkortus, wnix
Target Milestone | rc
Keywords | FutureFeature
Fixed In Version | corosync-2.3.4-6.el7
Doc Type | Enhancement
Clones | 975903
Last Closed | 2015-11-19 11:41:06 UTC
Bug Depends On | 975903
Bug Blocks | 1133060, 1174884, 1205796, 1251103

Doc Text:
Feature:
The maximum size of a message that could be transferred using corosync CPG messaging facility was previously limited to 1MB. This limit has now been lifted.
Reason:
Pacemaker uses corosync CPG messaging to communicate changes in cluster state, and with larger numbers of resources this amount of information could get quite large and, even with data compression, exceed the maximum size allowed by corosync.
Result:
There is now no limit on the size of the data packets sent using CPG messaging in corosync. It is still necessary to configure pacemaker in /etc/sysconfig/pacemaker to allow larger messages to be sent.
Description
Florian Haas
2011-03-07 14:50:39 UTC
The client->server ipc portion of this RFE could be addressed by using the zero-copy feature to allocate buffers when the requested buffer size is greater than 1MB (and then do a memcpy). From server to client, an additional message type could be added to indicate the buffer is a freshly mmapped buffer needing special attention by the dispatch code. The totempg code could then have a memory allocation that takes place if a new message is received that will be larger than 1MB. All sounds pretty complicated though and prone to breakage. Do you have customers that have run into this limit?

Regards
-steve

Angus, please comment on how this RFE would be achieved in the libqb corosync 2.0+ case.

Are you sending XML text? Is it possible to compress the text (it should compress well)? Another option is to automatically fragment the message between client and server. I'd need to look into it a bit more though.

It is XML that is being sent and we do compress it already. However, the status section can get really big, so hitting the limit is still conceivable. I don't think we necessarily need to remove the limit completely, just allow it to be tuned from corosync.conf (_before_ startup) by those that find it necessary. This would have the nice property of also allowing it to be tuned down, thus lowering corosync's memory footprint in situations not needing large messages.

Will propose as a 2.0 feature (rhel7 timeframe).

IPC is now handled by LibQB. According to https://github.com/asalkeld/libqb/issues/14, that problem still exists. There is also another problem, https://github.com/asalkeld/libqb/issues/71. After removing these two issues, support in corosync should be seamless.

Cloning this bug. This bug will be used for corosync, and the cloned one, Bug 975903, for LibQB.

commit 8cc8e513633a1a8b12c416e32fb5362fcf4d65dd
Author: Christine Caulfield <ccaulfie>
Date: Thu Mar 5 16:45:15 2015 +0000

cpg: Add support for messages larger than 1Mb

Created attachment 1009656
cpg: Add support for messages larger than 1Mb
If a cpg client sends a message larger than 1Mb (actually slightly
less to allow for internal buffers) cpg will now fragment that into
several corosync messages before sending it around the ring.
cpg_mcast_joined() can now return CS_ERR_INTERRUPT which means that the
cpg membership was disrupted during the send operation and the message
needs to be resent.
The new API call cpg_max_atomic_msgsize_get() returns the maximum size
of a message that will not be fragmented internally.
New test program cpghum was written to stress test this functionality,
it checks message integrity and order of receipt.
Signed-off-by: Christine Caulfield <ccaulfie>
Reviewed-by: Jan Friesse <jfriesse>
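
The commit message above says cpg_mcast_joined() can now return CS_ERR_INTERRUPT, in which case the message needs to be resent. A minimal caller-side sketch of that resend loop follows. It is not taken from the patch: to stay self-contained it does not link against libcpg, so the error enum, the callback type and the fake sender are all hypothetical stand-ins, and a real caller would wrap cpg_mcast_joined() and compare against the CS_* values from the corosync headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for corosync's cs_error_t values; real code would
 * use CS_OK and CS_ERR_INTERRUPT from the corosync headers instead. */
typedef enum {
    SKETCH_OK = 0,
    SKETCH_ERR_INTERRUPT, /* membership changed mid-send: resend the message */
    SKETCH_ERR_OTHER      /* any other failure: do not retry */
} sketch_err_t;

/* Hypothetical callback type; in real code it would wrap cpg_mcast_joined(). */
typedef sketch_err_t (*sketch_send_fn)(const void *msg, size_t len);

/* Send msg, resending the whole message whenever the send is interrupted.
 * Gives up after max_attempts tries. If attempts_out is non-NULL it receives
 * the number of tries actually made. */
sketch_err_t send_with_resend(sketch_send_fn send, const void *msg, size_t len,
                              int max_attempts, int *attempts_out)
{
    sketch_err_t rc = SKETCH_ERR_OTHER;
    int attempts = 0;

    assert(send != NULL);
    while (attempts < max_attempts) {
        attempts++;
        rc = send(msg, len);
        if (rc != SKETCH_ERR_INTERRUPT) {
            break; /* success, or an error that a resend will not fix */
        }
    }
    if (attempts_out != NULL) {
        *attempts_out = attempts;
    }
    return rc;
}

/* Demo-only fake sender: reports an interruption for the first
 * flaky_failures_left calls, then succeeds. */
int flaky_failures_left = 0;

sketch_err_t flaky_send(const void *msg, size_t len)
{
    (void)msg;
    (void)len;
    if (flaky_failures_left > 0) {
        flaky_failures_left--;
        return SKETCH_ERR_INTERRUPT;
    }
    return SKETCH_OK;
}
```

The attempt cap is a design choice of this sketch rather than anything the commit prescribes; it only keeps a caller from spinning forever if membership keeps changing. A caller that wants to avoid internal fragmentation altogether could instead keep messages under the size reported by cpg_max_atomic_msgsize_get().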
Created attachment 1041624
Really add cpghum
Signed-off-by: Jan Friesse <jfriesse>
Created attachment 1041839
Don't link with libz when not needed
Commit 8cc8e513633a1a8b12c416e32fb5362fcf4d65dd added a check for libz,
resulting in linking with libz for all libraries. This is not expected
behavior. The patch solves it by defining an automake conditional so that
cpghum is linked only if libz is available and the LIBS variable is not
modified at all.
Signed-off-by: Jan Friesse <jfriesse>
I'm not able to get through the test case David used in bug 1174462 comment 8. Is there a configuration change that's needed too?

[root@host-026 ~]# for x in `seq 1 40`; do pcs resource create FAKE$x Dummy meta target-role=Stopped fake="`openssl rand -hex 32000`" || break; echo $x done; done
1 done
2 done
3 done
4 done
Error: unable to get cib
Error: unable to get cib

[root@host-026 ~]# tail /var/log/messages -n 30
Aug 14 10:59:31 host-026 crmd[13434]: notice: Initiating action 16: monitor FAKE2_monitor_0 on host-027
Aug 14 10:59:31 host-026 crmd[13434]: notice: Initiating action 14: monitor FAKE2_monitor_0 on host-026 (local)
Aug 14 10:59:31 host-026 crmd[13434]: notice: Initiating action 15: probe_complete probe_complete-host-027 on host-027 - no waiting
Aug 14 10:59:31 host-026 crmd[13434]: notice: Initiating action 17: probe_complete probe_complete-host-028 on host-028 - no waiting
Aug 14 10:59:31 host-026 crmd[13434]: notice: Operation FAKE2_monitor_0: not running (node=host-026, call=43, rc=7, cib-update=479, confirmed=true)
Aug 14 10:59:31 host-026 crmd[13434]: notice: Initiating action 13: probe_complete probe_complete-host-026 on host-026 (local) - no waiting
Aug 14 10:59:31 host-026 crmd[13434]: notice: Transition 381 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-52.bz2): Complete
Aug 14 10:59:31 host-026 crmd[13434]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Aug 14 10:59:32 host-026 cibadmin[1527]: notice: Invoked: /usr/sbin/cibadmin --replace -o configuration -V --xml-pipe
Aug 14 10:59:32 host-026 crmd[13434]: notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Aug 14 10:59:32 host-026 pengine[13433]: notice: Calculated Transition 382: /var/lib/pacemaker/pengine/pe-input-53.bz2
Aug 14 10:59:32 host-026 crmd[13434]: notice: Initiating action 18: monitor FAKE3_monitor_0 on host-028
Aug 14 10:59:32 host-026 crmd[13434]: notice: Initiating action 16: monitor FAKE3_monitor_0 on host-027
Aug 14 10:59:32 host-026 crmd[13434]: notice: Initiating action 14: monitor FAKE3_monitor_0 on host-026 (local)
Aug 14 10:59:32 host-026 crmd[13434]: notice: Initiating action 15: probe_complete probe_complete-host-027 on host-027 - no waiting
Aug 14 10:59:32 host-026 crmd[13434]: notice: Initiating action 17: probe_complete probe_complete-host-028 on host-028 - no waiting
Aug 14 10:59:32 host-026 crmd[13434]: notice: Operation FAKE3_monitor_0: not running (node=host-026, call=47, rc=7, cib-update=481, confirmed=true)
Aug 14 10:59:32 host-026 crmd[13434]: notice: Initiating action 13: probe_complete probe_complete-host-026 on host-026 (local) - no waiting
Aug 14 10:59:32 host-026 crmd[13434]: notice: Transition 382 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-53.bz2): Complete
Aug 14 10:59:32 host-026 crmd[13434]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Aug 14 10:59:32 host-026 cibadmin[1547]: notice: Invoked: /usr/sbin/cibadmin --replace -o configuration -V --xml-pipe
Aug 14 10:59:33 host-026 cib[13429]: error: Compression of 329080 bytes failed: output data will not fit into the buffer provided (-8)
Aug 14 10:59:33 host-026 cib[13429]: error: Could not compress the message into less than the configured ipc limit (131072 bytes).Set PCMK_ipc_buffer to a higher value (658160 bytes suggested)
Aug 14 10:59:33 host-026 cib[13429]: notice: Notification failed: Message too long (-90)
Aug 14 10:59:33 host-026 cib[13429]: error: Compression of 286029 bytes failed: output data will not fit into the buffer provided (-8)
Aug 14 10:59:33 host-026 cib[13429]: error: Could not compress the message into less than the configured ipc limit (131072 bytes).Set PCMK_ipc_buffer to a higher value (1316320 bytes suggested)
Aug 14 10:59:33 host-026 cib[13429]: notice: Message to 0x18fdd00[1551] failed: Message too long (-90)
Aug 14 10:59:33 host-026 cib[13429]: warning: A-Sync reply to cibadmin failed: No message of desired type
Aug 14 11:00:01 host-026 systemd: Started Session 1541 of user root.
Aug 14 11:00:01 host-026 systemd: Starting Session 1541 of user root.

[root@host-026 ~]# rpm -q pacemaker corosync libqb
pacemaker-1.1.13-6.el7.x86_64
corosync-2.3.4-7.el7.x86_64
libqb-0.17.1-2.el7.x86_64

I found the setting in /etc/sysconfig/pacemaker to adjust the IPC buffer and set it to 2MB. This allowed me to get through the test cases in bug 1174462 comment 8 and 9.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2354.html
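
For reference, the IPC buffer change described in the verification comment above might look like the fragment below. The variable name PCMK_ipc_buffer and the byte units come from the logged error message. The exact value is an assumption about how "2MB" was entered (2 × 1024 × 1024 bytes), and Pacemaker most likely has to be restarted on each node before the setting takes effect.

```sh
# /etc/sysconfig/pacemaker
# Assumed value: 2 MB expressed in bytes (the logged default limit was 131072 bytes).
PCMK_ipc_buffer=2097152
```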