1108525 – Fix corosync behavior when disk is full


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	corosync
Sub Component:
Version:	7.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Jan Friesse
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2014-06-12 08:01 UTC by Jan Friesse
Modified:	2015-03-05 08:27 UTC
CC List:	3 users
Fixed In Version:	corosync-2.3.3-3.el7
Doc Type:	Bug Fix
Doc Text:	Cause: The user's filesystem is full and corosync is unable to store a file. Consequence: Corosync can abort (assert) or fail to log a proper error. Fix: Corosync now correctly checks whether the blackbox can be stored. Also, a failed ringid store operation is no longer handled by an assert, but by logging an error and an (almost) graceful exit. Result: Corosync doesn't abort (assert) and always tries to log a proper error message.
Clone Of:
Environment:
Last Closed:	2015-03-05 08:27:07 UTC
Target Upstream Version:
Embargoed:

Attachments
logsys: Log error if blackbox cannot be created (1.23 KB, patch) 2014-06-12 08:02 UTC, Jan Friesse	no flags
logsys: Log warning if flightrecorder init fails (1.96 KB, patch) 2014-06-12 08:03 UTC, Jan Friesse	no flags
Introduce get_run_dir function (7.30 KB, patch) 2014-06-12 08:03 UTC, Jan Friesse	no flags
Move ringid store and load from totem library (11.05 KB, patch) 2014-06-12 08:03 UTC, Jan Friesse	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:0365	0	normal	SHIPPED_LIVE	corosync bug fix and enhancement update	2015-03-05 12:51:37 UTC

Description Jan Friesse 2014-06-12 08:01:36 UTC

Description of problem:
Counterpart of bug 1005179 for 7.1

Version-Release number of selected component (if applicable):
2.3.3

Comment 1 Jan Friesse 2014-06-12 08:02:59 UTC

Created attachment 908000
logsys: Log error if blackbox cannot be created

logsys: Log error if blackbox cannot be created

Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>

Comment 2 Jan Friesse 2014-06-12 08:03:03 UTC

Created attachment 908001
logsys: Log warning if flightrecorder init fails

logsys: Log warning if flightrecorder init fails

Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>

Comment 3 Jan Friesse 2014-06-12 08:03:09 UTC

Created attachment 908002
Introduce get_run_dir function

Introduce get_run_dir function

The run dir (LOCALSTATEDIR/lib/corosync) was hardcoded throughout the
codebase. Totemsrp tried to create it and chdir into it, but it also
took the COROSYNC_RUN_DIR environment variable into account, which
created an inconsistency.

get_run_dir correctly returns COROSYNC_RUN_DIR (when set) or
LOCALSTATEDIR/lib/corosync. All functions now use it instead of the
hardcoded string.

All occurrences of mkdir/chdir are removed from totemsrp, and chdir is
now called in the main function. The mkdir call is removed completely,
because it was never executed anyway (the check in main.c ran before
totemsrp init, so mkdir was never reached), and because make install
and/or the packaging system should take care of creating this directory
with the correct permissions/context.

Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
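
As a rough illustration of the lookup described in this patch, the following C sketch returns COROSYNC_RUN_DIR when it is set and falls back to LOCALSTATEDIR/lib/corosync otherwise. It is not the attached patch: the fallback definition of LOCALSTATEDIR as "/var" and the treatment of an empty variable as unset are assumptions made for the example.

```c
#include <stdlib.h>

/* Assumption: stand-in for the build-time LOCALSTATEDIR macro. */
#ifndef LOCALSTATEDIR
#define LOCALSTATEDIR "/var"
#endif

/*
 * Return the run directory: the value of COROSYNC_RUN_DIR when it is
 * set (and, as an assumption of this sketch, non-empty), otherwise the
 * compiled-in default LOCALSTATEDIR/lib/corosync.
 */
const char *get_run_dir(void)
{
	const char *env_run_dir = getenv("COROSYNC_RUN_DIR");

	if (env_run_dir != NULL && env_run_dir[0] != '\0') {
		return env_run_dir;
	}

	/* Adjacent string literals are concatenated at compile time. */
	return LOCALSTATEDIR "/lib/corosync";
}
```

With a single accessor like this, every caller resolves the same directory, which removes the inconsistency between the hardcoded string and the environment variable.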

Comment 4 Jan Friesse 2014-06-12 08:03:15 UTC

Created attachment 908003
Move ringid store and load from totem library

Move ringid store and load from totem library

The functions for storing and loading the ring id were in the totem
library. This raised the question of what to do when the ring id cannot
be loaded or stored. The easy solution seemed to be an assert, but sadly
that makes it hard for the user to find out what happened (because
corosync was simply aborted and logsys didn't flush).

The solution is to move these functions to main.c, where it is much
easier to handle the error. This also makes libtotem free of any file
system operations.

Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
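
The idea of reporting a failed store instead of asserting can be sketched in C as follows. This is a simplified illustration, not the attached patch: the function names store_ring_seq and store_or_report, the file name pattern ringid_<nodeid>, the file mode, and the use of fprintf in place of corosync's logging are all hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a ring sequence number to a file under
 * run_dir. Returns 0 on success and -1 on failure with errno set.
 * It never asserts, so the caller decides how to react.
 */
int store_ring_seq(const char *run_dir, unsigned int nodeid, uint64_t ring_seq)
{
	char path[4096];
	int len, fd, saved_errno;
	ssize_t written;

	len = snprintf(path, sizeof(path), "%s/ringid_%u", run_dir, nodeid);
	if (len < 0 || (size_t)len >= sizeof(path)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0700);
	if (fd == -1) {
		return -1;
	}

	written = write(fd, &ring_seq, sizeof(ring_seq));
	if (written != (ssize_t)sizeof(ring_seq)) {
		/* Treat a short write as "no space left" in this sketch. */
		saved_errno = (written == -1) ? errno : ENOSPC;
		close(fd);
		errno = saved_errno;
		return -1;
	}

	return close(fd) == -1 ? -1 : 0;
}

/*
 * Hypothetical caller: report the error in a readable form and return
 * a failure code so the program can exit in an orderly way.
 */
int store_or_report(const char *run_dir, unsigned int nodeid, uint64_t ring_seq)
{
	if (store_ring_seq(run_dir, nodeid, ring_seq) != 0) {
		fprintf(stderr, "Can't store ring id in %s: %s\n",
		    run_dir, strerror(errno));
		return -1;
	}
	return 0;
}
```

Because the helper only returns an error code, the caller stays in control of logging and shutdown, which is what makes the failure visible to the user on a full disk.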

Comment 10 errata-xmlrpc 2015-03-05 08:27:07 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0365.html
