Red Hat Bugzilla – Bug 490449
domU's restart after cluster.conf update
Last modified: 2010-10-23 04:21:41 EDT
It's a 3-node cluster used exclusively for dom0 clustering. All nodes
are running RHEL 5.3 AP. The server hardware is HP DL-580: 4-socket
quad-core (16 cores total), 64 GB of RAM, and two 4 Gb/s Emulex HBAs.
All the VMs use block-device passthrough (no file-backed guests).
If you look at the logs from one of the nodes (NCLDL58016), you will
see that the reconfig occurred on March 4th at 23:09:26:
Mar 4 23:09:26 ncldl58016 ccsd: Update of cluster.conf complete
(version 1 -> 2).
And then, 33 seconds later, the vifs and block devices all start coming
down. Finally, at 23:11:09, the VM services start recovering.
In our test cluster, which very closely mirrors the production cluster
except for the number of guests, we have found that we can reproduce
this issue by simply incrementing the config_version in cluster.conf
and then updating the cluster with that config. So *any* cluster
reconfig will cause the VMs to get restarted.
Steps to Reproduce:
1. Increment the config_version in cluster.conf.
2. Save the changes and send the configuration to all nodes.
3. The VMs will restart.

Actual results:
The vm service restarts.

Expected results:
cluster.conf is saved and sent to all nodes, and VMs that are running do not restart.
Here's something worth noting:
Mar 4 23:09:41 ncldl58016 clurgmgrd: <info> Stopping changed resources.
Mar 4 23:12:15 ncldl58016 clurgmgrd: <info> Restarting changed resources.
Mar 4 23:12:15 ncldl58016 clurgmgrd: <info> Starting changed resources.
Mar 4 23:12:15 ncldl58016 clurgmgrd: <notice> Initializing vm:lnxp0006
Mar 4 23:12:15 ncldl58016 clurgmgrd: <notice> vm:lnxp0006 was added to the config, but I am not initializing it.
That's 2.5 minutes of 'idling'. If there's nothing to do in the stop-phase after a configuration update, rgmanager immediately moves to 'restarting' followed by 'starting'.
rgmanager *must* have gotten confused about the configuration update -- the question is how and why, and why does rg_test work?
There's a possibility that this particular issue was caused by a status-check vs. reconfiguration ordering problem, which I have fixed in the master and stable3 branches already.
Created attachment 335614 [details]
Patch ported from stable3 which disables status checks during reconfiguration
^^ Explanation of previously-attached patch.
There is another patch which is in stable3 (and I think this customer already has it) which also helped somewhat:
Previous link was for master branches. STABLE3 links follow:
Created attachment 335728 [details]
debug log from ncldl38076
Created attachment 335766 [details]
Patch encompassing four additional patches:
* Fix rare segfault in -USR1 dump code
* Ensure we don't reconfig on the same new config version twice
* Ensure we always call init_resource_groups() with the right # of parameters
* Block signals in worker threads so SIGHUP/SIGINT/etc. do not abort status checks erroneously.
Created attachment 335774 [details]
debug log from ncldl38077
This one includes a scratch debugging patch to see if rgmanager actually believed it should restart/stop a VM after a reconfiguration.
Created attachment 335786 [details]
debug log from ncldl38077
Ok, so the current status here is as follows:
With all the above (working) patches, rgmanager still occasionally thinks it needs to stop a VM (or VMs). It literally thinks that the VM has changed configuration after the configuration update, and for some reason flags it as needing a restart (or worse: just a stop).
I added a patch here which has debugging log messages.
Additionally, the patch zaps all flags during the tree delta in case any were set that we weren't aware of. This -should- prevent the last instance of the VMs restarting, but why the 'RF_NEEDSTOP' flag was set in the first place is still not known.
I'm pretty sure the NEEDSTOP flag is getting set because of this:
* node A starts vm:foo. Before starting vm:foo, it asks the rest of the cluster if they have seen vm:foo
* node B receives a status inquiry request from node A. It then executes a status check on that VM to see if it is running. It's not, so status returns 1. At this point, node B sets a NEEDSTOP flag.
* Suppose you disable the VM on node A and start it on node B now. At this point, the NEEDSTOP flag is still persisted on node B, but is ignored by the start/status checks.
* If you then do a configuration update, the NEEDSTOP flag is -still- there. After a configuration update (or during a special "recover" operation), the NEEDSTOP flag is used by rgmanager to decide which resources need to be stopped. Presence of this flag does NOT alter service state.
* Rgmanager does its reconfiguration, sees the NEEDSTOP flag, and stops the virtual machine. Because the state has not actually changed according to rgmanager (NEEDSTOP is succeeded by NEEDSTART if a resource's parameters have changed, for example), the next status check causes a recovery of the VM, and the VM is then restarted.
The previous patch masks this issue, but in the wrong way - it clears the NEEDSTOP flag during reconfiguration. The NEEDSTOP flag should either be cleared on resource start or not set during the special status inquiry operation.
Same package which *only* fixes the problem described in comment #21
confirming that the minimal patch did resolve the issue on our test cluster
This is a general problem which can occur on any cluster following the steps in comment #21.
Created attachment 336081 [details]
Fake virtual machine agent which can be used to reproduce/test issue
This agent fakes out rgmanager to make it think it's controlling a virtual machine. This is not for use in production environments. The only use for this attachment is testing the errant behavior present within the internals of rgmanager.
You can install this test utility in the following way:
chmod -x /usr/share/cluster/vm.sh
cp vm-test.sh /usr/share/cluster
chmod +x /usr/share/cluster/vm-test.sh
You must distribute ssh keys between all hosts in the cluster in order for 'migrate' to work.
How To Test:
* create a virtual machine on at least a 2 node cluster (or use the fake script provided)
* Start rgmanager on all nodes. The virtual machine will get started on one node.
* Disable the virtual machine using 'clusvcadm -d'.
* Enable the virtual machine explicitly on another node using 'clusvcadm -e'
* Change the configuration file by incrementing the config_version.
* Distribute the configuration file using ccs_tool update
* Ensure configuration version consistency in cman by running 'cman_tool version -r <new_config_version>'
Old (broken) behavior:
* Virtual machine which was disabled and enabled on another node will restart.
New (fixed) behavior:
* Virtual machine which was disabled and enabled on another node will not restart.
Created attachment 336082 [details]
Simple utility to bump configuration version and print out the new version
Created attachment 336083 [details]
Script which can be used to reproduce above behavior; edit as necessary
To compile the attachment noted in comment #30, perform the following:
gcc -o config-bump-xml config-bump-xml.c -I/usr/include/libxml2 -lxml2 -ggdb
You will need the libxml2-devel package installed.
I can also confirm that the minimal rpm seems to fix this problem.
Will this bugfix get a hotfix or go into Z-stream, or is it targeted for 5.4?
It's not yet flagged in any such way.
There's a z-stream bugzilla here:
~~ Attention - RHEL 5.4 Beta Released! ~~
RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!
If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug to NEEDINFO. If you encounter new issues, please clone this bug to open a new issue and request that it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.
Please do not flip the bug status to VERIFIED. Only post your verification results and, if available, update the Verified field with the appropriate value.
Questions can be posted to this bug or your customer or partner representative.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.