Description of problem:

When attempting to push an updated cluster.conf using 'cman_tool version -r', ricci fails with dbus errors:

=====
Dec 5 23:28:21 test-node-1 dbus[783]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.8" (uid=998 pid=2471 comm="/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/") interface="com.redhat.ricci" member="modcluster_rw" error name="(unset)" requested_reply="0" destination="com.redhat.ricci" (uid=0 pid=2359 comm="/usr/sbin/oddjobd -p /var/run/oddjobd.pid -t 300 ")
Dec 5 23:28:21 test-node-1 dbus-daemon[783]: dbus[783]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.8" (uid=998 pid=2471 comm="/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/") interface="com.redhat.ricci" member="modcluster_rw" error name="(unset)" requested_reply="0" destination="com.redhat.ricci" (uid=0 pid=2359 comm="/usr/sbin/oddjobd -p /var/run/oddjobd.pid -t 300 ")
Dec 5 23:28:21 test-node-1 corosync[1379]: [QUORUM] Members[3]: 1 2 3
=====

Each target node (every node other than the one pushing) starts emitting:

=====
Dec 6 00:46:42 test-node-2 dbus[848]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.12" (uid=998 pid=3239 comm="/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/") interface="com.redhat.ricci" member="modcluster_rw" error name="(unset)" requested_reply="0" destination="com.redhat.ricci" (uid=0 pid=1121 comm="/usr/sbin/oddjobd -p /var/run/oddjobd.pid -t 300 ")
Dec 6 00:46:42 test-node-2 dbus-daemon[848]: dbus[848]: [system] Rejected send message, 1 matched rules; type="method_call", sender=":1.12" (uid=998 pid=3239 comm="/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/") interface="com.redhat.ricci" member="modcluster_rw" error name="(unset)" requested_reply="0" destination="com.redhat.ricci" (uid=0 pid=1121 comm="/usr/sbin/oddjobd -p /var/run/oddjobd.pid -t 300 ")
Dec 6 00:46:42 test-node-2 corosync[1384]: [CMAN ] Unable to load new config in corosync: New configuration version has to be newer than current running configuration
Dec 6 00:46:42 test-node-2 corosync[1384]: [CMAN ] Can't get updated config version 4: New configuration version has to be newer than current running configuration#012.
Dec 6 00:46:42 test-node-2 corosync[1384]: [CMAN ] Activity suspended on this node
Dec 6 00:46:42 test-node-2 corosync[1384]: [CMAN ] Error reloading the configuration, will retry every second
Dec 6 00:46:43 test-node-2 corosync[1384]: [CMAN ] Unable to load new config in corosync: New configuration version has to be newer than current running configuration
Dec 6 00:46:43 test-node-2 corosync[1384]: [CMAN ] Can't get updated config version 4: New configuration version has to be newer than current running configuration#012.
Dec 6 00:46:43 test-node-2 corosync[1384]: [CMAN ] Activity suspended on this node
Dec 6 00:46:43 test-node-2 corosync[1384]: [CMAN ] Error reloading the configuration, will retry every second
=====

Version-Release number of selected component (if applicable):
ricci-0.18.7-1.fc15.x86_64
cluster 3.1.8 rc

How reproducible:
100%

Steps to Reproduce:
1. Update cluster.conf
2. Try to push it out using 'cman_tool version -r'
3.

Actual results:
Fails to push.

Expected results:
Pushes out the file.

Additional info:
Once I rsync the file to the other nodes, they pick up the changes and the cluster returns to normal.
digimer, given that D-Bus problem: did you have modclusterd installed at the time you ran cman_tool?
(modclusterd is in modcluster package)
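To tell this particular D-Bus rejection apart from other log noise, something like the following could be used. It is only a rough sketch: the function name has_ricci_dbus_rejection is made up for illustration, and the matched strings are taken from the log excerpt in the description.

```shell
#!/bin/sh
# Hypothetical helper (not part of ricci or cman): reads syslog text on stdin
# and returns 0 if it contains a D-Bus rejection of ricci-worker's
# modcluster_rw call, as quoted in the description of this bug.
has_ricci_dbus_rejection() {
    grep 'Rejected send message' \
        | grep 'interface="com.redhat.ricci"' \
        | grep -q 'member="modcluster_rw"'
}
```

Usage, assuming syslog ends up in /var/log/messages on the node (an assumption, adjust as needed): `has_ricci_dbus_rejection < /var/log/messages && rpm -q modcluster`. If the rejection is present and rpm reports the package as not installed, the missing modcluster package is the likely cause.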
Installing modcluster, starting modclusterd and restarting ricci solved the problem. I would recommend adding a more verbose error message. The dbus error is cryptic and might not mean much or be of much help to users trying to diagnose this issue. Cheers
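For reference, the three steps above could be scripted roughly as follows. This is a sketch under assumptions: it presumes yum and the service wrapper are available on the node, the run and apply_workaround helpers are hypothetical names, and by default it only prints what it would do.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 (the default) it only prints the
# command; with DRY_RUN=0 it actually executes it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Workaround from this comment: install modcluster, start modclusterd,
# then restart ricci so it opens a fresh D-Bus connection.
apply_workaround() {
    run yum -y install modcluster
    run service modclusterd start
    run service ricci restart
}
```

Calling `apply_workaround` prints the three commands; `DRY_RUN=0 apply_workaround` as root on each node would execute them. The order matters: ricci is restarted last, after modcluster is in place.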
As per the discussion on <irc://chat.freenode.net/linux-cluster>, the issue arises when the modcluster package is not installed before the ricci service is started on the respective nodes.

The same issue will show up when deploying a cluster through the luci interface with the "install packages" option selected: when the modcluster package is installed as part of that process, the updates to D-Bus policies are not propagated to ricci's already-existing D-Bus connection.

The D-Bus error being cryptic is a feature of the D-Bus side, not ours :)

As mentioned, the workaround is to install the modcluster package and then (re)start ricci before using ricci's cluster functionality.

The solution can be either (1) making the modcluster package a dependency of ricci, as discussed in bug 721109 (public), or (2) a patch that makes ricci able to restart its D-Bus connection when necessary (related but non-public bug 742345), although this is limited to the create-cluster-via-luci scenario only.

Thinking about it, something like adding "service ricci condrestart" to %post in modcluster's spec file should be enough. ricci is most probably robust enough to handle this, but it would require some (extensive) testing.

Anyway, reassigning to the same person as with the mentioned bugs.
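To make the %post idea concrete, the body of such a scriptlet might look like the line below. This is an untested sketch and not the actual modcluster spec file. It assumes ricci's init script implements condrestart as the comment above implies.

```shell
# Body of a hypothetical %post scriptlet for the modcluster package.
# condrestart restarts ricci only if it is already running, so a stopped
# ricci stays stopped; "|| :" keeps a failed restart from failing the
# package installation.
service ricci condrestart >/dev/null 2>&1 || :
```

After the restart, ricci would open a new D-Bus connection that sees the policies installed by modcluster.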
Another variation of the "service ricci condrestart" idea is to add a function to ricci's API that forces a D-Bus connection restart, and to invoke that instead. Still, (1) seems to be the sanest and safest way.
Fixed in ricci-0.18.7-2.fc16 and should be pushed live for fc16 in the next week or two.
This message is a notice that Fedora 15 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 15. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '15' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 15 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping