| Summary: | [RFE] disambiguate cpg_dispatch timeout behavior from function behavior enhacement | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Retired] Corosync Cluster Engine | Reporter: | dan clark <2clarkd> | ||||
| Component: | unknown | Assignee: | Jan Friesse <jfriesse> | ||||
| Status: | CLOSED UPSTREAM | QA Contact: | |||||
| Severity: | low | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 1.3 | CC: | 2clarkd, agk, asalkeld, cluster-maint, fdinitto, jfriesse, sdake | ||||
| Target Milestone: | --- | Keywords: | Reopened | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 700517 (view as bug list) | Environment: | |||||
| Last Closed: | 2012-02-08 15:06:41 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Bug Depends On: | |||||||
| Bug Blocks: | 700517 | ||||||
| Attachments: |
|
||||||
The cpg_dispatch routine masks the return values when used in the non-blocking mode. Ideally in a non-blocking mode the application could differentiate between a dispatch successfully occurring, especially in the DISPATCH_ONE case, and thus triggering the need to check again to see if there are more messages to process, and the case when there are no more messages to process, thus it is OK to register the FD associated with the connection for inbound messages in epoll and sleep for future events associated with corosync (or other aspects of the application). I believe this problem occurs due to the code in the library which over-writes the underlying error of CS_ERR_TRY_AGAIN with CS_OK.
if (error == CS_ERR_TRY_AGAIN) {
< error = CS_OK;
if (dispatch_types == CPG_DISPATCH_ALL) {
break; /* exit do while cont is 1 loop */
} else {
continue; /* next poll */
}
}
can this be addressed in the next release? Could there be some dialog about the future of the interface? At present it is difficult to appropriately test for new messages. This is additionally complicated by the fact that 'group change' messages are handled with the same dispatch routine as are message exchanges. It would be optimal to allow filtering of the dispatch by type to allow higher level layers to control the flow of messages thus providing copy free semantics back into the application domain. can this be re-categorized to fedora? *** Bug 700517 has been marked as a duplicate of this bug. *** Dan, Earliest we could address this issue is corosync 2.0. corosync 1.y is frozen. Created attachment 560271 [details]
Proposed patch for add of CS_DISPATCH_ONE_NONBLOCKING dispatch type
Dan,
I believe this bug has two parts:
1.) Incorrect comments. In reality, CS_DISPATCH_ONE (as correctly described in manpages) is BLOCKING. Comment was incorrect and included patch fixes that
2.) No non-blocking variant of CS_DISPATCH_ONE exists. This is what this patch is all about.
I hope this patch will fulfill your needs.
|
Description of problem: The cpg_dispatch routine overloads the 'dispatch_types' field to both provide the type of call DISPATCH_ALL vs DISPATCH_ONE or DISPATCH_BLOCK and the timeout behavior (-1 implying wait forever OR 0 implying return immediately). The code currently blocks on both DISPATCH_ONE and DISPATCH_BLOCK but returns immediately for DISPATCH_ALL (unlike the comment). Version-Release number of selected component (if applicable): 3.0 How reproducible: always Steps to Reproduce: 1. use cpg_dispatch(handle, CS_DISPATCH_ONE) Actual results: actual, block forever Expected results: return immediately Additional info: ideally a dispatch function would allow the behavior to be explicitly defined by the application instead of overloading the parameters cpg_dispatch_time(handle, CPG_DISPATCH_ONE, timeout) short of this it would be fine to have the code correspond with the comments and the perhaps intended functionality in the library: current: /* * Timeout instantly for CS_DISPATCH_ONE or CS_DISPATCH_ALL and * wait indefinately for CS_DISPATCH_BLOCKING */ if (dispatch_types == CPG_DISPATCH_ALL) { timeout = 0; } patch < if (dispatch_types == CPG_DISPATCH_ALL) { > if (dispatch_types != CPG_DISPATCH_BLOCKING) {