Bug 1286759 - Fix ERR_LIBRARY on finalize call in dispatch
Summary: Fix ERR_LIBRARY on finalize call in dispatch
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: corosync
Version: 6.8
Hardware: All
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jan Friesse
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-11-30 16:45 UTC by Jan Friesse
Modified: 2016-05-10 19:43 UTC (History)

Fixed In Version: corosync-1.4.7-4.el6
Doc Type: Bug Fix
Doc Text:
No doc text needed.
Clone Of:
Environment:
Last Closed: 2016-05-10 19:43:01 UTC
Target Upstream Version:
Embargoed:


Attachments
Revert orig patch (1.03 KB, patch)
2015-11-30 16:48 UTC, Jan Friesse
Proposed patch (5.93 KB, patch)
2015-11-30 16:48 UTC, Jan Friesse


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:0753 0 normal SHIPPED_LIVE corosync bug fix update 2016-05-10 22:32:07 UTC

Description Jan Friesse 2015-11-30 16:45:45 UTC
Description of problem:

This patch is a (hopefully) better version of the patch in bug 1141367. The original patch caused https://github.com/jfriesse/csts/blob/master/tests/start-cfgstop-with-load.sh to fail.
    
The main problem with the original patch was that it masked the error in the
wrong place. If ipc is closed, there can still be messages in a buffer (both
the ipc socket and shm). Because the error in ipc_sem_wait was masked, the
socket fd wasn't flushed and the pointer into the shm data wasn't incremented.
If the user application then called dispatch again, old data was read.
    
This patch handles this behavior in several steps:
- Add a new error code, CS_ERR_IN_SHUTDOWN, returned specifically and only
  if ipc is closed.
- coroipcc_dispatch_get now tests whether ipc is closed and, if so,
  returns CS_ERR_IN_SHUTDOWN.
- If ipc_sem_wait in the coroipcc_dispatch_put function returns
  CS_ERR_LIBRARY, it is checked whether ipc is closed. If so, the ipc fd is
  flushed and CS_ERR_IN_SHUTDOWN is returned.
- libcfg/confdb/cpg/evs/votequorum test the return code of
  coroipcc_dispatch_put. If it returns CS_ERR_IN_SHUTDOWN, the error
  is masked and the function terminates.
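
The steps above can be sketched roughly as follows. This is only an illustrative model, not the attached patch: the struct mock_ipc, the mock_* function names, and the enum values are hypothetical stand-ins defined locally, and the sketch assumes that ipc_sem_wait fails with CS_ERR_LIBRARY once ipc is closed. Only the names CS_ERR_LIBRARY, CS_ERR_IN_SHUTDOWN, ipc_sem_wait, coroipcc_dispatch_get and coroipcc_dispatch_put come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the error codes named in the description.
 * The numeric values are arbitrary here, not corosync's. */
typedef enum {
    CS_OK,
    CS_ERR_LIBRARY,
    CS_ERR_IN_SHUTDOWN  /* new code: returned only when ipc is closed */
} cs_error_t;

/* Hypothetical model of one client-side ipc connection. */
struct mock_ipc {
    bool closed;       /* the connection has been closed */
    int fd_pending;    /* bytes still queued on the socket fd */
    int shm_read_idx;  /* read position in the shm dispatch buffer */
};

/* Stand-in for ipc_sem_wait; assumed to fail once ipc is closed. */
static cs_error_t mock_ipc_sem_wait(const struct mock_ipc *ipc)
{
    return ipc->closed ? CS_ERR_LIBRARY : CS_OK;
}

/* Models coroipcc_dispatch_get: report shutdown instead of handing out data. */
static cs_error_t mock_dispatch_get(const struct mock_ipc *ipc)
{
    assert(ipc != NULL);
    return ipc->closed ? CS_ERR_IN_SHUTDOWN : CS_OK;
}

/* Models coroipcc_dispatch_put: on CS_ERR_LIBRARY from the semaphore wait,
 * check for a closed ipc, flush the fd and return the dedicated code. */
static cs_error_t mock_dispatch_put(struct mock_ipc *ipc)
{
    cs_error_t res;

    assert(ipc != NULL);
    res = mock_ipc_sem_wait(ipc);
    if (res == CS_ERR_LIBRARY && ipc->closed) {
        ipc->fd_pending = 0;  /* flush whatever is left on the socket */
        return CS_ERR_IN_SHUTDOWN;
    }
    if (res != CS_OK) {
        return res;           /* any other failure is passed through */
    }
    ipc->shm_read_idx++;      /* normal path: consume the message */
    return CS_OK;
}

/* Models the library layer (libcfg, cpg, ...): mask only the shutdown
 * code coming back from the put call; leave other errors visible. */
static cs_error_t mock_mask_put_result(cs_error_t put_res)
{
    return put_res == CS_ERR_IN_SHUTDOWN ? CS_OK : put_res;
}
```

A caller in this model would run mock_dispatch_get, then mock_dispatch_put, and pass the put result through mock_mask_put_result. The point of the dedicated code is that the masking happens at the library layer, after the fd has been flushed, rather than inside the semaphore wait where the original patch hid the error.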


Version-Release number of selected component (if applicable):
1.4.7-2

How reproducible:
Depends. On my machines it reproduces 25% of the time, but Chrissie, for example, was unable to reproduce the bug at all.

Steps to Reproduce:
1. Execute https://github.com/jfriesse/csts/blob/master/tests/start-cfgstop-with-load.sh


Actual results:
The test fails because at least one of the cpg-load clients dispatches the same message more than once.


Expected results:
The test succeeds.

Additional info:
1. We have to ensure that the behavior fixed by bug 1141367 is unchanged.
2. The bug is quite low priority (no Z needed) because the incorrect behavior happens only on a PROPER shutdown of corosync. This usually does not happen (a node is either running or fenced).

Comment 1 Jan Friesse 2015-11-30 16:48:11 UTC
Created attachment 1100550 [details]
Revert orig patch

Comment 2 Jan Friesse 2015-11-30 16:48:30 UTC
Created attachment 1100551 [details]
Proposed patch

Comment 7 errata-xmlrpc 2016-05-10 19:43:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0753.html

