RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 619680 - cluster should not sync old cluster.conf during conf reload
Summary: cluster should not sync old cluster.conf during conf reload
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: cluster
Version: 6.0
Hardware: All
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Fabio Massimo Di Nitto
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 617161
Blocks: 625560
TreeView+ depends on / blocked
 
Reported: 2010-07-30 07:02 UTC by Fabio Massimo Di Nitto
Modified: 2016-04-26 15:04 UTC (History)
8 users (show)

Fixed In Version: cluster-3.0.12-21.el6
Doc Type: Bug Fix
Doc Text:
Clone Of: 617161
: 625560 (view as bug list)
Environment:
Last Closed: 2010-11-10 20:00:29 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Makes cman_tool fail on configuration regressions. (4.45 KB, patch)
2010-08-04 19:18 UTC, Lon Hohberger
fdinitto: review+
Details | Diff
proposed patch (4.54 KB, patch)
2010-08-05 10:15 UTC, Fabio Massimo Di Nitto
lhh: review+
Details | Diff

Comment 3 RHEL Program Management 2010-07-30 07:27:34 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 7 Chris Feist 2010-08-03 21:46:44 UTC
This isn't a bug, ccs_sync works exactly as defined.  It distributes the cluster.conf file that you specify to all the nodes (no matter what's in that cluster.conf file).

If there is a check, it should be done in cman_tool.  Also, cman_tool probably shouldn't be propagating the files if it can't load them into corosync. It may make more sense to only propagate the files after they've been successfully loaded into corosync, (or include an option to force propagation).

Comment 11 Lon Hohberger 2010-08-04 19:18:37 UTC
Created attachment 436640 [details]
Makes cman_tool fail on configuration regressions.

See patch for more explanation.

Comment 12 Lon Hohberger 2010-08-04 19:22:08 UTC
I believe the synopsis of this bug is incorrect.  I do not believe this is a bug in ccs_sync at all.

ccs_sync is "stupid" and it's job is not to safeguard the user; plenty of other tools do it already: cman_tool, luci, even cman itself.

cman_tool calls out to ccs_config_validate, which the patch noted in comment #11 can be overloaded to also check the configuration version #.

Since cman_tool already knows the current version # when we call ccs_config_validate, it's simply a matter of passing it in.

Comment 13 Fabio Massimo Di Nitto 2010-08-05 03:20:38 UTC
Comment on attachment 436640 [details]
Makes cman_tool fail on configuration regressions.

The patch looks functionally OK, comments inline:


>+		result = validate_config(comline, ver.cv_config);
>+		if (result == 2)
>+			/* Config regression = fail. */
>+			die("Not reloading, configuration version regression\n");

I would prefer a more explicit error message here rather than "regression".
"New config version is older or the same as currently running.." or something.

>diff --git a/config/tools/xml/ccs_config_validate.in b/config/tools/xml/ccs_config_validate.in
>index 55ab4aa..08bae37 100644
>--- a/config/tools/xml/ccs_config_validate.in
>+++ b/config/tools/xml/ccs_config_validate.in
>@@ -17,6 +17,15 @@ if [ -z "$COROSYNC_DEFAULT_CONFIG_IFACE" ]; then
> 	export COROSYNC_DEFAULT_CONFIG_IFACE=$CONFIG_LOADER:cmanpreconfig
> fi
> 
>+get_config_version() {
>+	local file=$1
>+
>+	echo "xpath /cluster/@config_version" | \
>+		xmllint --shell "$file" | \
>+		grep content | \
>+	cut -f2 -d=
>+}

I personally don't like this style, but I'll look if i can find
a better way to avoid forking that many commands for one op.

>@@ -181,6 +196,16 @@ if [ -z "$quiet" ] || [ "$res" != "0" ]; then
> 		-e 's#'$tempfile'#tempfile#g'
> fi
> 
>+# Configuration regression = 2
>+if [ -n "$old_version" ] && [ $old_version -ne 0 ] && [ $res -eq 0 ]; then
>+	new_version=$(get_config_version $tempfile)
>+	lecho "Old version: $old_version   New version: $new_version"
>+	if [ $new_version -le $old_version ]; then
>+		echo "Warning: Configuration version regression!"
>+		res=2
>+	fi
>+fi

AFAIR we try to avoid using -ne/-eq for some portability issues.
No biggie again. just minor detail.

Comment 14 Fabio Massimo Di Nitto 2010-08-05 10:15:30 UTC
Created attachment 436792 [details]
proposed patch

this is a new proposed fix. diff from the previous patch:

ccs_config_validates.in:

- change return code from 2 to 254 because we don´t want to overlap with
  xmllint return codes.
- check if we can get new_version and if not return 253
- return 254 if new_version is <= old_config
- send errors from version check to stderr
- check the version before printing the xml validation result otherwise
  the output can be confusing:
* Configuration validates
* Error: Configuration version is older than running config!
doesn´t really make sense.

cman_tool:

- propagate the quiet option correctly, so that in case a config reload error
  can be properly debugged
- return WXITSTATUS from validate_config so we get the real error to parse
- parse the current 3 possible errors from ccs_config_validate to print a proper
  error message.

Comment 15 Lon Hohberger 2010-08-05 13:38:19 UTC
Comment on attachment 436792 [details]
proposed patch

Signed-off-by: Lon Hohberger <lhh>

Comment 17 Lon Hohberger 2010-08-05 15:53:00 UTC
http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=3ac0bdfccc5c8b6883e3c331dfa3fca7eeb1b4c5

Corresponding man page update.

Comment 21 releng-rhel@redhat.com 2010-11-10 20:00:29 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.


Note You need to log in before you can comment on or make changes to this bug.