Bug 1108525
Summary: | Fix corosync behavior when disk is full | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Jan Friesse <jfriesse> |
Component: | corosync | Assignee: | Jan Friesse <jfriesse> |
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.1 | CC: | ccaulfie, cluster-maint, jkortus |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | corosync-2.3.3-3.el7 | Doc Type: | Bug Fix |
Doc Text: |
Cause:
The user's filesystem is full and corosync is unable to store a file.
Consequence:
Corosync can abort (assert) or fail to log a proper error.
Fix:
Corosync now correctly checks whether the blackbox can be stored. Also, a failure of the ringid store operation is no longer handled by an assert, but by logging an error and an (almost) graceful exit.
Result:
Corosync doesn't abort (assert) and always tries to log a proper error message.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2015-03-05 08:27:07 UTC | Type: | Bug |
Description
Jan Friesse
2014-06-12 08:01:36 UTC
Created attachment 908000
logsys: Log error if blackbox cannot be created
Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
Created attachment 908001
logsys: Log warning if flightrecorder init fails
Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
Created attachment 908002
Introduce get_run_dir function
The run dir (LOCALSTATEDIR/lib/corosync) was hardcoded throughout the
codebase. Totemsrp tried to create it and chdir into it, but it also
took the environment variable COROSYNC_RUN_DIR into account, which
created an inconsistency.
get_run_dir correctly returns COROSYNC_RUN_DIR (when set) or
LOCALSTATEDIR/lib/corosync. All functions now use it instead of the
hardcoded string.
All occurrences of mkdir/chdir are removed from totemsrp, and chdir is
now called in the main function. The mkdir call is removed completely,
because it was never used anyway (the check in main.c ran before
totemsrp init, so mkdir was never called), and because make install
and/or the package system should take care of creating this directory
with the correct permissions/context.
Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
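The lookup order described in this commit message can be illustrated with a minimal sketch. This is not the code from the attachment: the function name sketch_get_run_dir, the fallback definition of LOCALSTATEDIR as "/var", the static buffer, and the treatment of an empty variable as unset are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stands in for the build-time LOCALSTATEDIR macro. */
#ifndef LOCALSTATEDIR
#define LOCALSTATEDIR "/var"
#endif

/*
 * Hypothetical sketch, not the corosync source: return the run
 * directory, preferring COROSYNC_RUN_DIR when it is set.
 * An empty value is treated as unset (an assumption).
 * The result points to a static buffer, so it is not thread safe.
 */
const char *sketch_get_run_dir(void)
{
    static char path[4096];
    const char *env = getenv("COROSYNC_RUN_DIR");

    if (env != NULL && env[0] != '\0') {
        snprintf(path, sizeof(path), "%s", env);
    } else {
        snprintf(path, sizeof(path), "%s/lib/corosync", LOCALSTATEDIR);
    }
    return path;
}
```

With one accessor of this kind, every caller sees the same directory, which is the inconsistency the commit message says it removes.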
Created attachment 908003
Move ringid store and load from totem library
The functions for storing and loading the ring id were in the totem
library. This raised the problem of what to do when it is impossible to
load or store the ring id. The easy solution seemed to be an assert, but
sadly this makes it hard for the user to find out what happened (because
corosync was just aborted and logsys didn't flush).
The solution is to move these functions to main.c, where it is much
easier to handle the error. This also makes libtotem free of any file
system operations.
Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
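The difference between asserting and reporting a failure can be sketched as follows. This is an illustration only, not the code from the attachment: the function names, the on-disk format (a raw 64-bit value), the file mode, and the messages printed to stderr are assumptions. The point is that each I/O step returns an error to the caller, which can then log it and exit, instead of aborting.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical sketch: write a 64-bit ring sequence number to `path`.
 * Returns 0 on success and -1 on failure after reporting the reason,
 * so the caller can log and exit instead of hitting an assert.
 */
int sketch_store_ring_seq(const char *path, uint64_t seq)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1) {
        fprintf(stderr, "cannot create %s: %s\n", path, strerror(errno));
        return -1;
    }

    ssize_t res = write(fd, &seq, sizeof(seq));
    if (res != (ssize_t)sizeof(seq)) {
        /* A short write leaves errno unset, so report it separately. */
        fprintf(stderr, "cannot write %s: %s\n", path,
                res == -1 ? strerror(errno) : "short write");
        close(fd);
        return -1;
    }

    if (close(fd) == -1) {
        fprintf(stderr, "cannot close %s: %s\n", path, strerror(errno));
        return -1;
    }
    return 0;
}

/* Hypothetical counterpart: read the value back; 0 on success, -1 on error. */
int sketch_load_ring_seq(const char *path, uint64_t *seq)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
        return -1;
    }

    ssize_t res = read(fd, seq, sizeof(*seq));
    close(fd);
    if (res != (ssize_t)sizeof(*seq)) {
        fprintf(stderr, "cannot read %s\n", path);
        return -1;
    }
    return 0;
}
```

A caller in a main.c-style context would check the return value, log through its normal logging path, and shut down, which is the behavior the commit message describes as easier to handle outside the library.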
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-0365.html